A varchar(MAX) STRING_SPLIT function for SQL 2012 and above

  • Kristen-173977 wrote:

    Jonathan AC Roberts wrote:

    Or you could just add a WHERE clause to filter them out as it's a table valued function.

    Yes ... but my Splitter returns an Item No, and if I use WHERE afterwards I have gaps in the Item No, so I would then also have to add code to solve that, every time I have that requirement, whereas if I get my Splitter to remove them that function returns contiguous Item Nos.

    I still have the option of the Splitter not removing them, so I can have the data "raw" and post-process if that is more appropriate.

    I'm always interested in what other folks come up with in the area of splitters, especially those that can handle LOB datatypes.  Would you post your splitter, please?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Jeff Moden wrote:

    Would you post your splitter, please?

    Sadly that requires more time than I have readily available to sanitise and remove any naming etc. that is "internal" ... which in turn would mean testing to make sure that I hadn't goofed something up ...

    What I can do more easily is to catalogue the splitters I have / use

    CREATE FUNCTION dbo.FN_DelimSplitChar
    (
    @strSource varchar(8000)
    , @strDelimiter varchar(10) = ',' -- NOTE: Delimiter is CASE SENSITIVE
    , @strBlank varchar(8000) = '[DELETE]' -- What to return if an item is blank? (e.g. '', NULL or '[BLANK]')
    -- Use '[DELETE]' to delete any NULL items
    )
    -- Split a string based on delimiter. Returns a Character String (which is NOT Trimmed)
    RETURNS @tblArray TABLE
    (
    S_Item int IDENTITY(1,1) NOT NULL PRIMARY KEY
    , S_StrValue varchar(8000) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
    )

    Standard string splitter. Optimised for <= 8,000 char

    CREATE FUNCTION dbo.FN_DelimSplitInt
    (
    @strSource varchar(8000)
    , @strDelimiter varchar(10) = ',' -- NOTE: Delimiter is CASE SENSITIVE
    , @strBlank varchar(8000) = '[DELETE]' -- What to return if an item is blank? (e.g. NULL or '-1')
    -- Use '[DELETE]' to delete any NULL items
    )
    -- Split a string based on delimiter. Returns an array of INTs.
    RETURNS @tblArray TABLE
    (
    S_Item int IDENTITY(1,1) NOT NULL PRIMARY KEY
    , S_IntValue int NULL
    )

    Same as CHAR, optimised for INTs

    CREATE FUNCTION dbo.FN_DelimSplitIntMAX
    (
    @strSource varchar(MAX)
    , @strDelimiter varchar(10) = ',' -- NOTE: Delimiter is CASE SENSITIVE
    , @strBlank varchar(8000) = '[DELETE]' -- What to return if an item is blank? (e.g. NULL or '-1')
    -- Use '[DELETE]' to delete any NULL items
    )
    -- Split a string based on delimiter. Returns an array of INTs.
    RETURNS @tblArray TABLE
    (
    S_Item int IDENTITY(1,1) NOT NULL PRIMARY KEY
    , S_IntValue int NULL
    )

    Same ... used if Source could exceed [and is optimised for] 8,000+ chars

    CREATE FUNCTION dbo.FN_DelimSplitNVarcharMax
    (
    @strSource nvarchar(MAX)
    , @strDelimiter nvarchar(10) = ',' -- NOTE: Delimiter is CASE SENSITIVE
    , @strBlank nvarchar(4000) = '[DELETE]' -- What to return if an item is blank? (e.g. '', NULL or '[BLANK]')
    -- Use '[DELETE]' to delete any NULL items
    )
    -- Split a TEXT string based on delimiter. Returns a Character String.
    RETURNS @tblArray TABLE
    (
    S_Item int IDENTITY(1,1) NOT NULL PRIMARY KEY
    , S_StrValue nvarchar(4000) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
    )

    Same, only used when Wide-Chars need to be accommodated (Input up to 8,000 char)

    CREATE FUNCTION dbo.FN_DelimSplitVarcharMax
    (
    @strSource varchar(MAX)
    , @strDelimiter varchar(10) = ',' -- NOTE: Delimiter is CASE SENSITIVE
    , @strBlank varchar(8000) = '[DELETE]' -- What to return if an item is blank? (e.g. '', NULL or '[BLANK]')
    -- Use '[DELETE]' to delete any NULL items
    )
    -- Split a TEXT string based on delimiter. Returns a Character String.
    RETURNS @tblArray TABLE
    (
    S_Item int IDENTITY(1,1) NOT NULL PRIMARY KEY
    , S_StrValue varchar(8000) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
    )

    Wide-Chars, more than 8,000 char

    CREATE FUNCTION dbo.FN_FixedSplitChar
    (
    @strValue varchar(8000) -- String Value to split
    , @intWidth int -- Width (of each element)
    , @intOptions smallint = 0 -- RESERVED
    )
    -- Fixed-width string splitter
    RETURNS TABLE
    ...
    DECLARE @tblArray TABLE
    (
    T_Item int NOT NULL
    , T_StrValue varchar(8000) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
    , PRIMARY KEY
    (
    T_Item
    )
    )

    Split fixed width strings (i.e. no delimiter)

    CREATE FUNCTION dbo.FN_FixedSplitInt
    (
    @strValue varchar(8000) -- String Value to split
    , @intWidth int -- Width (of each element)
    , @intStuffStart int -- Offset for STUFF
    , @intStuffLen int -- Length for STUFF
    , @intOptions smallint = 0 -- 0=Default, 2=Last item blank=ignore
    )
    -- Fixed-width string splitter
    RETURNS TABLE
    AS
    ...
    DECLARE @tblArray TABLE
    (
    T_Item int NOT NULL
    , T_IntValue int NULL
    , PRIMARY KEY
    (
    T_Item
    )
    )

    Same, but for INTs. The STUFF option is where the INTs are fixed length, but also have delimiter - strings such as:

    '/123/45 /7 /0 /567'

     

    CREATE FUNCTION dbo.FN_PathSplit
    (
    @strSource varchar(8000)
    , @strDelimiter varchar(10) = '/'
    , @strBlank varchar(8000) = '[DELETE]' -- What to return if an item is blank? (e.g. '', NULL or '[BLANK]')
    -- Use '[DELETE]' to delete any NULL items
    )
    -- Split a string into its Path components
    -- e.g. /L1/L2/L3 will be split into /L1, /L1/L2 and /L1/L2/L3
    RETURNS @tblArray TABLE
    (
    S_Item int IDENTITY(1,1) NOT NULL
    , S_Value varchar(8000) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
    , PRIMARY KEY
    (
    S_Item
    )
    )
    CREATE FUNCTION dbo.FN_SplitWords
    (
    @strValue varchar(8000) -- String Value to split into words
    , @intOptions smallint = 0 -- 0=Normal, 1=Append plurals
    )
    RETURNS TABLE

    Dictionary table available for significant number of English words, aliased for root/singular words and so on. Not a substitute for a proper lexicon! but we find it works well enough for general text searching where we need "story" to also match "stories", and ""tell" / "told" etc

    So take a search string, of one/multiple words, split it, run it through the dictionary to get "root" words, and then use the "word lookup table" to find (weighted) matches for one/many/all words in the search string, and then join that back to the original table/record. The central Word Lookup Table is maintained by trigger, splitting e.g. a "Description" into words and storing them (the "root" variant) in the Word Lookup Table (WordID, TableID, RecordPKeyID).

    For searching where no pre-processing to Word Lookup Table exists, then convert the words in the search string into all possible singular/plural, present/past tense etc. and then use THAT list (somehow!) to match strings - using LIKE/whatever

    CREATE FUNCTION dbo.FN_Trigram_Split
    (
    @string varchar(8000)
    )
    -- Find matching records in TrigramData Table
    RETURNS TABLE

    We use Trigrams for text searching in some circumstances

     

  • Comments posted to this topic are about the item A varchar(MAX) STRING_SPLIT function for SQL 2012 and above

Viewing 3 posts - 31 through 32 (of 32 total)

You must be logged in to reply to this topic. Login to reply