Tally OH! An Improved SQL 8K “CSV Splitter” Function

  • SQL Kiwi (5/22/2012)


    Usman Butt (5/22/2012)


    Sadly, it is one of the shops where CLR is not permissible 🙁 Otherwise, for sure, I would have chosen your CLR without any reluctance 🙂

    That is sad. Not least because CLR was new *seven years ago*. 🙂

    Gosh... I'm really torn on things like this. When I first heard that MS was making CLR possible, my head just swam in the great possibilites. Finally, we could make up for all the things that MS didn't include like splitters, building an even better file importer than BULK INSERT, some file handling for Express "Bricks" for ETL, and much, much more.

    Then I had idiots write things like a modulo function and a merge in CLR because they didn't know how to do it in SQL and saw a ton of other horror stories like the moroffs that wrote a generic trigger as a CLR (I'm actually getting rid of such triggers that someone put in my databases). Because of those things and more, I really DO understand why many DBAs simply refuse to turn it on.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • SQL Kiwi (5/22/2012)


    Usman Butt (5/22/2012)


    SQL Kiwi (5/22/2012)


    Usman Butt (5/22/2012)


    Sadly, it is one of the shops where CLR is not permissible 🙁 Otherwise, for sure, I would have chosen your CLR without any reluctance 🙂

    That is sad. Not least because CLR was new *seven years ago*. 🙂

    *seven years ago* SQL 2k5 was also new, and they are still on SQL 2k5. Things have not changed much for them 😉

    :laugh: I see. But still, they've had *seven years* to get over their fears...

    Heh... or 7 years to have them reinforced by folks who shouldn't ever be allowed near a computer. 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • peter-757102 (5/22/2012)


    The new licensing model for SQL Server 2012 model isn't very encouraging to the company I work for to make a switch to this version. But we did made it to SQL Server 2008 at least and in doing so, got to use some nice features along the way. The step up from 2005 did pay off in terms of usable features!

    The change in licencing was only a matter I time, I guess. The free cores per socket thing could only go on so long. The changes are quite complex and subtle in some areas though, so it might be worth talking to someone who knows it really well, to make sure your company's assessment is correct. SQL Server 2012 is quite a big release, with many good things to offer (and not all are documented or marketed).

  • Jeff Moden (5/22/2012)


    Then I had idiots write things like a modulo function and a merge in CLR because they didn't know how to do it in SQL and saw a ton of other horror stories like the moroffs that wrote a generic trigger as a CLR (I'm actually getting rid of such triggers that someone put in my databases). Because of those things and more, I really DO understand why many DBAs simply refuse to turn it on.

    This is a bit overstated: a 'moroff' can do damage in any language, with any feature. It's up to the DBA to work with the Dev to ensure only vaguely sensible stuff is done. Bit of common sense, is all.

  • SQL Kiwi (5/22/2012)


    Jeff Moden (5/22/2012)


    Then I had idiots write things like a modulo function and a merge in CLR because they didn't know how to do it in SQL and saw a ton of other horror stories like the moroffs that wrote a generic trigger as a CLR (I'm actually getting rid of such triggers that someone put in my databases). Because of those things and more, I really DO understand why many DBAs simply refuse to turn it on.

    This is a bit overstated: a 'moroff' can do damage in any language, with any feature. It's up to the DBA to work with the Dev to ensure only vaguely sensible stuff is done. Bit of common sense, is all.

    I think the issue here though Paul is that you just stuck a language into the database realm where the people that support it had no knowledge of said language. It was a potential recipe for disaster, that as you stated, could be avoided with wise practices, if utilized.

    Know what I mean? 😛

    David

    @SQLTentmaker

    “He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot

  • David Benoit (5/22/2012)


    I think the issue here though Paul is that you just stuck a language into the database realm where the people that support it had no knowledge of said language. It was a potential recipe for disaster, that as you stated, could be avoided with wise practices, if utilized.

    Know what I mean? 😛

    Certainly. That why I emphasise DBAs and Devs working together (knowledge transfer works both ways) and the use of common sense.

  • SQL Kiwi (5/22/2012)


    David Benoit (5/22/2012)


    I think the issue here though Paul is that you just stuck a language into the database realm where the people that support it had no knowledge of said language. It was a potential recipe for disaster, that as you stated, could be avoided with wise practices, if utilized.

    Know what I mean? 😛

    Certainly. That why I emphasise DBAs and Devs working together (knowledge transfer works both ways) and the use of common sense.

    Unfortunately, common sense isn't common. 😉

  • Lynn Pettis (5/22/2012)

    Unfortunately, common sense isn't common. 😉

    ...and that would be our job to make sure that the "common sense" principles are upheld. 😛 Not always easy though.

    David

    @SQLTentmaker

    “He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot

  • Jeff Moden (5/22/2012)


    Gosh... I'm really torn on things like this. When I first heard that MS was making CLR possible, my head just swam in the great possibilites. Finally, we could make up for all the things that MS didn't include like splitters, building an even better file importer than BULK INSERT, some file handling for Express "Bricks" for ETL, and much, much more.

    Once upon a time I my head swam too - there were these amazing things called "Extended Stored Procedures" which were going to solve all our performance problems by allowing us to write bad C++ instead of bad SQL.

    Then I had idiots write things like a modulo function and a merge in CLR because they didn't know how to do it in SQL and saw a ton of other horror stories like the moroffs that wrote a generic trigger as a CLR (I'm actually getting rid of such triggers that someone put in my databases). Because of those things and more, I really DO understand why many DBAs simply refuse to turn it on.

    Then I discovered what people could do with XPs. Horror stories? Well, not at all - in 2000/2001 the guy in my division who wrote the XPs was extremely competent (he could write better C++ blindfold and shackled than I could with no handicaps, and understood databases well enough to request that I chuck DB2 out of the shop in favour of SQL Server). But it was pretty clear that in less competent hands it could be a complete disaster.

    A few years later, was the introduction of CLR things into T-SQL any different? Good thing for competent people. Potential disaster if average programmers were involved.

    edit: Essentially I agree with Paul's comments. However I recognise (as I am sure he does) that there are DBAs out there who take it as anathema to learn from or teach to developers, the latter being beyond the pale, and equally there are developers out there who think DBAs are and inferior breed of jumped-up pseudo-managers. In a shop where those views are held, CLR is guaranteed to be a disaster: but so is JavaScript (or, even worse, VB) trying to communicate with SQL via ADO.NET - and C# doesn't make things any better when thiose attitudes prevail. I suspect Jeff has seen too many such shops - I've been luckier, so for me the disaster is something that could happen rather than something that did happen.

    Tom

  • L' Eomot Inversé (5/22/2012)


    edit: Essentially I agree with Paul's comments. However I recognise (as I am sure he does) that there are DBAs out there who take it as anathema to learn from or teach to developers, the latter being beyond the pale, and equally there are developers out there who think DBAs are and inferior breed of jumped-up pseudo-managers. In a shop where those views are held, CLR is guaranteed to be a disaster: but so is JavaScript (or, even worse, VB) trying to communicate with SQL via ADO.NET - and C# doesn't make things any better when thiose attitudes prevail. I suspect Jeff has seen too many such shops - I've been luckier, so for me the disaster is something that could happen rather than something that did happen.

    Quite so.

  • SQL Kiwi (5/22/2012)


    David Benoit (5/22/2012)


    I think the issue here though Paul is that you just stuck a language into the database realm where the people that support it had no knowledge of said language. It was a potential recipe for disaster, that as you stated, could be avoided with wise practices, if utilized.

    Know what I mean? 😛

    Certainly. That why I emphasise DBAs and Devs working together (knowledge transfer works both ways) and the use of common sense.

    I have some concerns on CLR management. But I have posted them as a new question here

    http://qa.sqlservercentral.com/Forums/Topic1304740-386-1.aspx

  • SQL Kiwi (5/22/2012)


    Jeff Moden (5/22/2012)


    Then I had idiots write things like a modulo function and a merge in CLR because they didn't know how to do it in SQL and saw a ton of other horror stories like the moroffs that wrote a generic trigger as a CLR (I'm actually getting rid of such triggers that someone put in my databases). Because of those things and more, I really DO understand why many DBAs simply refuse to turn it on.

    This is a bit overstated: a 'moroff' can do damage in any language, with any feature. It's up to the DBA to work with the Dev to ensure only vaguely sensible stuff is done. Bit of common sense, is all.

    I agree... but a DBA has to have some backing from managment to have common sense prevail. Some DBAs don't enjoy such backing and so they have to a bit nasty about certain things.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Jeff Moden (5/22/2012)


    I'm pretty sure that I stated that the code in the article is NOT the correct solution for VARCHAR(MAX) because just changing the input datatype to VARCHAR(MAX) causes the code to run twice as slow.

    My primary objective is/was indeed the 8K splitter. My bad, I added the varchar(max) solutions as well 🙁 May be the latter was triggered due to already posted tally table varchar(max) solution. But my concern was to get feedback on any anomalies/shortfalls in 8k solution which somehow got no attention.

  • Jeff - I've been a fan of this splitter for a long time even though I've been a slow adopter, 🙂 recommending it highly around my office at any opportunity.

    I just went to use it today and realized that it only supports 1 character delimiter.

    I'm sure I could modify it to handle a longer delimiter but I'm afraid if I bumble around and make those modifications I'm going to do something nasty to its performance.

    Any suggestions on how to do this and not impact the performance to any significant degree?


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • dwain.c (5/24/2012)


    Jeff - I've been a fan of this splitter for a long time even though I've been a slow adopter, 🙂 recommending it highly around my office at any opportunity.

    I just went to use it today and realized that it only supports 1 character delimiter.

    I'm sure I could modify it to handle a longer delimiter but I'm afraid if I bumble around and make those modifications I'm going to do something nasty to its performance.

    Any suggestions on how to do this and not impact the performance to any significant degree?

    Why does it need to handle more than a single character delimiter? Replace multiple character delimiters with 1 before you pass it to the splitter. Then, beat the tar out of the person that designed the data with a multi-character delimiter. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

Viewing 15 posts - 316 through 330 (of 981 total)

You must be logged in to reply to this topic. Login to reply