Best Practice Help - Table Structures

  • So just because their simple process dumps everything into this one table, all of our stuff should be that way too.

    This looks to be a key argument, maybe they don't want to shred this XML before loading it from that simple process. I also suspect that when loading from this source, ie., the "simple process", the XML could be saved intact, and then shredded as needed.

    Heck, the reply you paraphrased essentially says as much. " Which yes maybe not for another system but for ours, yes it is the best way to do it." pretty much implies that there are particular reasons why they want this in this particular case. You certainly have good arguments for the general advantages of sticking with relational principles, but Microsoft themselves have acknowledged the proliferation of XML in today's IT by making it a datatype.

    I see replies like "tell them to ditch sql server", "quit your job and go elsewhere", so just wanted to offer an alternative interpretation of the situation, especially since it seems apparent that those folks did seem to express an interest in storing this XML and really, SQL Server as a product has certainly implemented features that are applicable here.

  • Sorry, the argument isn't over storing it as XML ... which was perhaps an explanation error on my part.  The argument is just over a single table with a blob field storing every value for a data element using string based values instead of keys etc.  They do put some XML into the blob field...but most of the values are just straight characters.  It is also pure laziness and no regard for proper design.  Especially after having me (for months now) labor to create all the said proper tables and then cause one developer complained "boo hoo I have to change my code", he apparently had enough clout to have it remain as is.  I don't really care if the data is stored as XML or not, its just this single table thing that is going to get SUPER huge and some of our data records are one to many relationships... so not quite sure how the hell they plan to handle that.

  • amy26 - Tuesday, September 11, 2018 10:29 AM

    Sorry, the argument isn't over storing it as XML ... which was perhaps an explanation error on my part.  The argument is just over a single table with a blob field storing every value for a data element using string based values instead of keys etc.  They do put some XML into the blob field...but most of the values are just straight characters.  It is also pure laziness and no regard for proper design.  Especially after having me (for months now) labor to create all the said proper tables and then cause one developer complained "boo hoo I have to change my code", he apparently had enough clout to have it remain as is.  I don't really care if the data is stored as XML or not, its just this single table thing that is going to get SUPER huge and some of our data records are one to many relationships... so not quite sure how the hell they plan to handle that.

    Maybe ask them how they propose writing a query to get all sales data for customers who had their first order in the month of June 2017. You also want to see how many orders have been placed and the average cost per order. That is not even remotely as much detail as goes into many ad hoc queries but in that system it would take a few hours to write the potential hundred or so lines of code with a couple dozen joins to the same table over and over.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Agreed!  One of their DBA's that was on the call sort of piped up and defended me ... he didn't say anything about my proposed design but he did say that the existing design could pose issues with querying and pushing data to other systems.  But nobody seemed phased by this... the PM actually laughed at me and said "not necessarily" when I said that it was best practice to do what I was proposing with the table structures.  Their system is written is for the UI and to have the application code do everything... so they aren't used to having to do backend DB stuff.  Which I understand and respect... I tried to explain I wasn't trying to impose my will just cause "this is how I did it and I wanna do it that way" but that it was my professional opinion (I do have multiple certifications in this field).  But nope... take a seat sister and let the developers dictate how the DB will be configured.  So... meh... I'm sort of ready to wash my hands.

  • Document everything, make sure all (ALL) changes are in writing.  Keep your proposed design handy.  At some point the data in the system will hit critical mass and performance will tank and the user community will complain.  They may also find that short of a redesign there may be nothing they can do fix the performance issues in their lack of design.

    I had a similar, yet different, issue where I currently work.  Found a serious issue with a trigger created on many tables, dynamic SQL with a cursor to step through the inserted and deleted virtual tables in the trigger to update main report tables.  Found the issue in 2013, documented it and proposed a solution.  Crickets.  New lead in 2014, showed the documentation, more crickets.  Back state side in 2015, showed the documentation to my new supervisor, more crickets.  Suddenly part of an urgent email about the very same triggers in 2016, dug out the documentation and forwarded it to the developer.  Was thanked, and less than two later more crickets.

    In 2017 assigned a project to add CONTEXT_INFO code to prevent a replication storm in an in-house developed replication process to these triggers.  Guess what else got fixed? Yes, the original issue I found back in 2013.

  • amy26 - Tuesday, September 11, 2018 10:29 AM

    Sorry, the argument isn't over storing it as XML ... which was perhaps an explanation error on my part.  The argument is just over a single table with a blob field storing every value for a data element using string based values instead of keys etc.  They do put some XML into the blob field...but most of the values are just straight characters.  It is also pure laziness and no regard for proper design.  Especially after having me (for months now) labor to create all the said proper tables and then cause one developer complained "boo hoo I have to change my code", he apparently had enough clout to have it remain as is.  I don't really care if the data is stored as XML or not, its just this single table thing that is going to get SUPER huge and some of our data records are one to many relationships... so not quite sure how the hell they plan to handle that.

    Are you telling me you spent months designing tables in isolation from your developers? Maybe we should drop the design questions here and you could go into more detail on the process you're burdened with operating within. How did they finish code that they cannot now change, if the tables weren't done? Clearly there should have been some agreement on table structure before either of you started, why didn't that happen?

  • amy26 - Friday, September 7, 2018 4:17 PM

    Hi there, I am being asked to provide a list of all the reasons that we should create actual normalized tables with data fields vice creating one table with a blob column that stores all sorts of data elements in one field as XML. I can’t believe I actually have to do this but wanted some help with my list to make sure I am not forgetting anything or off base. This is for a system that is heavy on database logic and data manipulation. It is also a transactional system that pushes data to other systems and ingests data.Pros of normalized rational tables:Data integrity Data validation capabilities Querability Query performanceACIDSystem to system interfaces (having like data)Ease of understanding Easier object-to-data mappingSome of my topics may overlap each other but I plan on fleshing it all out and describing each area better. Just wanted to get my thoughts down and then have some community input. I don’t have any other dbas to bounce ideas off of ... it’s just me! 🙂

    Just to pile on about how bad this is (and I know this was not your idea) starting with this sample data:
    IF OBJECT_ID('dbo.table1','U') IS NOT NULL DROP TABLE dbo.table1;
    IF OBJECT_ID('dbo.table2','U') IS NOT NULL DROP TABLE dbo.table2;

    CREATE TABLE dbo.table1 (nameId INT IDENTITY, phoneNbr VARCHAR(20), email VARCHAR(100));
    CREATE TABLE dbo.table2 (nameId INT IDENTITY, ContactInfo XML);

    INSERT dbo.table1(phoneNbr,email) VALUES ('555-444-3333','JoeBlow@gmail.com');
    INSERT dbo.table2(ContactInfo) VALUES
    ('<contactInfo>
    <phone>555-444-333</phone>
    <email>JoeBlow@gmail.com</email>
    </contactInfo>');

    Note these queries:
    -- relational
    SELECT t.nameId, t.phoneNbr, t.email
    FROM dbo.table1 AS t
    WHERE t.email IS NOT NULL

    -- XML:
    SELECT
    t.nameId,
    phoneNbr = t.ContactInfo.value('(contactInfo/phone/text())[1]', 'VARCHAR(20)'),
    email  = t.ContactInfo.value('(contactInfo/email/text())[1]', 'VARCHAR(100)')
    FROM dbo.table2 AS t
    WHERE t.ContactInfo.value('(contactInfo/email/text())[1]', 'VARCHAR(100)') IS NOT NULL;

    Before considering performance, just consider how much more complicated the XML-based solution is. Which is easier to understand? Which appears easier to troubleshoot? Which structure seems easier to maintain? 

    Now look at the execution plans:

    ... and this is just one record. Imagine trying to join some other tables, adding a GROUP BY or other basic SQL. The queries will be miserably slow and impossible to tune because the execution plan becomes insanely verbose. Just more food for thought.

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Jeff Moden - Monday, September 10, 2018 10:34 AM

    Couple o' more thoughts.  If your demonstrable code for the normalized methodology beats the XML, and I have no doubt that it will, and they play the "Well, can you prove that it will beat the XML in all instances?" card, don't go crazy.  Simply state that the normalized method just beat an XML method in at least one case and then ask them if they can prove that the XML method will win in all other cases.  😀

    Also, don't forget to measure the extra network traffic that the XML method can generate in both directions.  I've found that, because of the tag bloat, it's usually in the area of 8 to 16 times more network traffic.

    Don't you just have a couple articles you can point them to, Jeff? I'm sure I've seen you do this at least once, maybe in forum posts, I don't know. And again, sad I missed out on you last month, I love "beating them with their own bat".....

    -------------------------------------------------------------------------------------------------------------------------------------
    Please follow Best Practices For Posting On Forums to receive quicker and higher quality responses

  • patrickmcginnis59 10839 - Tuesday, September 11, 2018 12:25 PM

    amy26 - Tuesday, September 11, 2018 10:29 AM

    Sorry, the argument isn't over storing it as XML ... which was perhaps an explanation error on my part.  The argument is just over a single table with a blob field storing every value for a data element using string based values instead of keys etc.  They do put some XML into the blob field...but most of the values are just straight characters.  It is also pure laziness and no regard for proper design.  Especially after having me (for months now) labor to create all the said proper tables and then cause one developer complained "boo hoo I have to change my code", he apparently had enough clout to have it remain as is.  I don't really care if the data is stored as XML or not, its just this single table thing that is going to get SUPER huge and some of our data records are one to many relationships... so not quite sure how the hell they plan to handle that.

    Are you telling me you spent months designing tables in isolation from your developers? Maybe we should drop the design questions here and you could go into more detail on the process you're burdened with operating within. How did they finish code that they cannot now change, if the tables weren't done? Clearly there should have been some agreement on table structure before either of you started, why didn't that happen?

    OMG don't get me started.  NO!!!  We had meetings and discussions and "understandings".  I even have an email from one of the developers a couple days before this last meeting asking me at what point the tables should be populated and getting specifics on things.  But then suddenly, they turned the tables (pun intended).  A lot of the development they were doing they hard coded things and/or just populating this blob table as a "place holder".  Place holder my butt!!!  I am also appalled that they aren't following any sort of SDLC methodology... just sort of "get it done" approach.  But it will be me that gets thrown under the bus when everything tanks...

  • amy26 - Wednesday, September 12, 2018 2:45 PM

    OMG don't get me started.  NO!!!  We had meetings and discussions and "understandings".  I even have an email from one of the developers a couple days before this last meeting asking me at what point the tables should be populated and getting specifics on things.  But then suddenly, they turned the tables (pun intended).  A lot of the development they were doing they hard coded things and/or just populating this blob table as a "place holder".  Place holder my butt!!!  I am also appalled that they aren't following any sort of SDLC methodology... just sort of "get it done" approach.  But it will be me that gets thrown under the bus when everything tanks...

    Be sure that you have all your CYA documentation.  I have seen that turn the bus around on the people trying to throw someone under the bus.

  • amy26 - Wednesday, September 12, 2018 2:45 PM

    OMG don't get me started.  NO!!!  We had meetings and discussions and "understandings".  I even have an email from one of the developers a couple days before this last meeting asking me at what point the tables should be populated and getting specifics on things.  But then suddenly, they turned the tables (pun intended).  A lot of the development they were doing they hard coded things and/or just populating this blob table as a "place holder".  Place holder my butt!!!  I am also appalled that they aren't following any sort of SDLC methodology... just sort of "get it done" approach.  But it will be me that gets thrown under the bus when everything tanks...

    I actually am ok with placeholders, its sort of like "stubbing out" code but that's probably not helpful for you in any case. I do it, its a great way to get sections done in such a way to progress a project. Not saying there aren't costs but costs are ok if you can account for them via the benefits. I doubt you're interested in that discussion though, right?

    I would be really interested in hearing more from your counterparts about their plans. I don't have that great interest in "sargability" posts and the like that you're seeing here because honestly anybody should be able to read on that, even your counterpart / associates. I'm interested in what your counterparts are doing, but you are sort of coloring the discussion in such a way that I think you'd be better served by just going with the "choir" you are "preaching" to and in any case you aren't obliged to serve my curiosity and that's cool!

    I like what your counterpart said to you to be honest, and I'd like to hear more, but I think your posts indicate that you aren't in the mood and this isn't what you're after in this thread so for my part I'll have to let it go at that.

    I will say this however, you're implying that you'll get "thrown under the bus" even when documenting your contrary indications to their plans, so can we at least be honest on this one, ie., that it isn't a technical issue? If you think you'll get adversely affected despite your best efforts isn't that really the heart of the issue?

    I'd love to play devils advocate and examine your initial "list".  In particular I disliked your "ACID" line item and thought it plain that it did not apply. But in explaining my thoughts I'd probably just be typing gratuitiously LOL when in reality I don't think the issue is technical at all and in any case there's plenty of other posters who can make you feel better on this one so there's that.

  • Awe no its all good, I appreciate all feedback.  Also, my original post was intended for actual help coming up with legit arguments against the proposed topic.  A bit much venting made its way into the thread, so sorry for that.  Placeholders are indeed useful and purposeful when their intention is to be placeholders.

Viewing 12 posts - 16 through 26 (of 26 total)

You must be logged in to reply to this topic. Login to reply