NoSQL inside SQL Server

  • Comments posted to this topic are about the item NoSQL inside SQL Server

  • Steve Jones - SSC Editor (9/5/2015)


    Comments posted to this topic are about the item <A HREF="/articles/Editorial/131238/">NoSQL inside SQL Server</A>

    Ok... thanks to your article, I finally decided to spend a bit of time researching what NoSQL is actually all about. I watched the following video from stem to stern. I was prepared to be amazed and impressed because of the hype that seems to be behind it.

    https://www.youtube.com/watch?v=px6U2n74q3g&feature=youtu.be

    Unfortunately, I'm still waiting to be impressed and I'm amazed the wrong way. What I saw were many claims as to how it was much faster than an "RDBMS" and more scalable and more flexible and... and... and... and absolutely no explanation as to how it was so except through wanton denormalization and rampant duplication of data. Yes, I saw the basic shift in storage from row based to column based but that's the only saving grace that I could see. If you wanted scalability, you still had to buy a shedload of hardware and it usually has to be more hardware because of the duplication of data. The really hilarious part of it all was that the speaker called that duplication of data "indexes". In other words, they were basically full covering indexes.

    One of the supposed advantages of NoSQL appears to be that it's "transactionless". Sorry folks... you can't change physics. Neither the read/write head of a physical disk nor the addressing system of an SSD can write two different pieces of information into the same physical bytes at the same time no matter how hard you try. You also can't have either read from the same physical bytes at the same time they're writing nor vice versa. One or the other action is going to have to wait (be blocked) until the first read or write completes (hopefully they're written in the correct order). You have transactions and locks whether you think you do or not. And, as a reminder, the video said to learn to embrace "writes" because, with all the duplication of data, there's going to be a ton of writes... kinda like having too many indexes on an RDBMS table. 😉 Also, if transactions are such a bad thing, why have they updated a product like Cassandra to suddenly support even light-weight transactions?

    I'll admit that I'm missing a lot of details just because it was only a quick dive of what NoSQL is but I believe I went in with an open mind and what I came away with is that the easy way to learn to what NoSQL can do is to let someone, that doesn't understand databases, design and write code on an RDBMS kinda like what many poor companies have to put up with right now.

    I'll just kinda skip through the idea of non-durable data when it comes to critical things like monetary transactions. It reminds me of an incident I had with one idiotic compliance officer that I had the great displeasure of meeting that said it was ok to use SSNs as the PK for the customer table and in all tables that needed to refer to the customer table. Her claim was that even the Social Security Administration said there was no rule against it (unfortunately, she's right on that although PII rules are beginning to override all that). She also claimed that her security was good enough so that it's not a concern (I was a contractor and had full privs to ALL the databases with sysadmin privs) and she wouldn't change (encrypt) it. I told her that since she was so confident that it would do the rest of us very well if she demonstrated that confidence by entering herself, along with her actual SSN, into the databases that she was so proud of. We all know how that went. Not so confident but still adamant. 😉

    Continuing in the video, the suggest of flexibility came from the use "name value pairs". SERIOUSLY??? Holy shades of hidden tag bloat, Batman! Just make everything XML and call it a day! On a good note, though, the presenter did say that data modeling wasn't dead because of NoSQL. Heh... I agree. It's just like in an RDBMS though... if the wrong people are doing the "modeling", then data modeling is actually dead for that application, whatever it is, especially when it comes to the "flexibility" of name-value pairs.

    Unfortunately, NoSQL seems to be coming very popular and, very much like the tag bloated mistake they call "XML", it'll likely become "the defacto standard" just because of the sheer ignorance as to how to correctly use an RDBMS and that millions of people will use it as the "easy way" out even it it might not be the right way.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • I think the first thing to do is look at your use case, identify the pain points and see if the proposed solution addresses more pain than it solves.

    OLTP does not produce the immense amount of data that renders RDBMS obsolete. I suggested years ago that NOSQL would be better described as NO-RDBMS. I got shot down in flames then but I haven't heard a counter argument. In fact I've seen SQL like language adoption across a wide range of platforms.

    NOSQL is anything that isn't a relational store. If you want a mainly read-only document store with sophisticated search capabilities then Elastic Search fits the bill nicely. SQL Server Full text indexing is something that needed a lot of TLC years ago and still hasn't got it.

    Git source code repositories use both Redis and MySql in combination.

    Some of the NOSQL solutions are specialist tools for a specific need and deserve to be taken seriously. Others strike me as those cheap tools you buy at a car boot sale that break if used for serious work.

  • I encountered a "NoSQL-within-SQL" use case on a project once. The table in question stored a variety of data documents. The documents consisted of several types of documents with differing fields and data. It was far easier to create a "Data" column which could hold a JSON object than to create tables for each individual document type.

    Jay Bienvenu | http://bienv.com | http://twitter.com/jbnv

  • I think you have to be careful to discard NoSQL as a thing, and also to discard it as applying to situations. NoSQL is a range of technologies, which include but aren't limited to

    - graph databases

    - document stores

    - key value stores

    - map/reduce based stores

    - more

    I think in many cases the RDBMS works very well, but there are domains that I think work better for some some flavor or NoSQL. Certainly I think most of these are some subset of what an RDBMS offers, but optimized for a situation.

    Most of these stores do support transactions, but they aren't guaranteeing consistency across all nodes. If you have multiple nodes, there is some physical time for the data to replicate across to other stores. That's accepted, but it's not unlike the time for SQL Server replication. One of the differences is many of these systems are much smoother to get working. That's more a sign of SQL Server immaturity than these do something amazing, but it is what it is.

    We should question the relational model. Not throw it away, but question where does it work well and is it worth the time spent mapping objects to relational storage. Certainly some of these perform some of what we would call denormalization, but they do it in increments and stages, and sometimes when/where you do the extra work can make a big difference in performance.

    I will tend to see many of the NoSQL systems as immature in terms of the work being done on them. We wouldn't necessarily want banking systems on some of these, but I do think that if the systems are easier to develop on, and the domain of database problem supports them, it's worth implementing one.

  • Steve Jones - SSC Editor (9/8/2015)


    I think you have to be careful to discard NoSQL as a thing, and also to discard it as applying to situations. NoSQL is a range of technologies, which include but aren't limited to

    - graph databases

    - document stores

    - key value stores

    - map/reduce based stores

    - more

    I think in many cases the RDBMS works very well, but there are domains that I think work better for some some flavor or NoSQL. Certainly I think most of these are some subset of what an RDBMS offers, but optimized for a situation.

    Most of these stores do support transactions, but they aren't guaranteeing consistency across all nodes. If you have multiple nodes, there is some physical time for the data to replicate across to other stores. That's accepted, but it's not unlike the time for SQL Server replication. One of the differences is many of these systems are much smoother to get working. That's more a sign of SQL Server immaturity than these do something amazing, but it is what it is.

    We should question the relational model. Not throw it away, but question where does it work well and is it worth the time spent mapping objects to relational storage. Certainly some of these perform some of what we would call denormalization, but they do it in increments and stages, and sometimes when/where you do the extra work can make a big difference in performance.

    I will tend to see many of the NoSQL systems as immature in terms of the work being done on them. We wouldn't necessarily want banking systems on some of these, but I do think that if the systems are easier to develop on, and the domain of database problem supports them, it's worth implementing one.

    I remember once casually remarking that it would be nice to decouple transactions from the relational model, simply to store non relational data and still maintain the reliability and predictability that a good transaction model would allow for. My mistake was making this remark in an RDBMS forum, and was met with much ridicule and scorn, and the sheer scale of the bad vibes was quite the surprise.

    I didn't bother mentioning that I spent a few years working on just such a system, we had a wide variety of data structures that would fit into the database, and you could begin the transaction and commit (or rollback) as appropriate, but the transaction mechanism was there for the confidence of the system design and durability of the data, rather than to enforce any sense of relationality.

    It even had a query language that worked pretty well, however it didn't have much of a query planner/optimizer, so folks had to strategize how best to "navigate" the database to get their results.

    https://en.wikipedia.org/wiki/Navigational_database

  • Well, here is an interesting case for me that makes looking at NoSQL as a possible solution or adaptation.

    Our data is large in volume (10+ million a day), in batch (daily log files), highly relational (OLAP) and is used for analytical reporting (big reads versus fast writes).

    The traditional RDBMS handles the volume good these days. Computers are a lot faster, SQL Server is pretty beast and storage is becoming cheap.

    Yet, when the data starts becoming so big that it can't easily be handled by a single machine, NoSQL starts becoming an option due to the ease of scaling out. While scaling up is possible, NoSQL seems to ease that pain a great deal.

    With the expensive license costs of traditional RDBMS, complex logic required to clean and fit data to rock hard models and the inability to be flexible to changing models, NoSQL becomes a likely option.

    However, when it comes to our needs, NoSQL is not good for data analysis with adhoc queries and more. It's not all that perfect when comparing to engines like SQL Server that makes it insanely easy to analyze the data once it's in there for analyzation.

    For me, I see both NoSQL and RDBMS serving entirely different purposes that are critical to our business. One as a flexible data lake that is the one source of truth for our data where the other is the BI platform that is refining and ensuring data integrity (which is critical) for analytical reporting.

  • For me it seems simple: NoSQL is not a replacement for where a RDBMS is most suitable. There are cases where applying a RDBMS is suboptimal but had still been the best available solution.

    The smarter people are suggesting that we all consider whether each data repository scenario is one of these and, if so, whether there is now a more optimal solution. If there is then apply the more suitable alternative otherwise select a RDBMS.

    The not so smart people are demanding that we throw out all our RDBMS solutions and replace them with NoSQL ones and never choose an RDBMS solution again.

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • FYI, I have create a PoC of NOSQL on SQL:  https://qa.sqlservercentral.com/articles/nosqlonsql

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply