Split function input be column of a table

  • That script doesn't even have 29 lines. If you double click the error message, which line does it jump to in the query pane?


    My blog: SQL Soldier[/url]
    SQL Server Best Practices:
    SQL Server Best Practices
    Twitter: @SQLSoldier
    My book: Pro SQL Server 2008 Mirroring[/url]
    Microsoft Certified Master: SQL Server, Data Platform MVP
    Database Engineer at BlueMountain Capital Management[/url]

  • There were blank lines between each statement. Removed them

    Create Table TableA (ID varchar(2), Nodes varchar(5), LT int)

    Create Table TableB (ID int, [Desc] varchar(20), Qty int)

    Insert Into TableA Values ('M1', '1;2;3', 4)

    Insert Into TableA Values ('M2', '4;5;6', 5)

    Insert Into TableB Values (1, 'Desc of 1', 2)

    Insert Into TableB Values (2, 'Desc of 2', 13)

    Insert Into TableB Values (3, 'Desc of 3', 43)

    Insert Into TableB Values (4, 'Desc of 4', 52)

    Insert Into TableB Values (5, 'Desc of 5', 66)

    Insert Into TableB Values (6, 'Desc of 6', 77)

    Select A.ID As TableAID, A.LT, B.ID As TableBID, B.[Desc], B.Qty

    From TableA A

    Inner Join TableB B On B.ID IN

    (Select Element From

    dbo.SplitToInt(A.Nodes, ';'))

    Now the error is on line 15 which is the last line. If I remove the "dbo." in the function call I still get the error. So I guess the error is in the dot in the red text.

  • That makes it seem like the error is in the function. Do you get an error if you run this:

    Select Element From

    dbo.SplitToInt('1;2;3', ';'))


    My blog: SQL Soldier[/url]
    SQL Server Best Practices:
    SQL Server Best Practices
    Twitter: @SQLSoldier
    My book: Pro SQL Server 2008 Mirroring[/url]
    Microsoft Certified Master: SQL Server, Data Platform MVP
    Database Engineer at BlueMountain Capital Management[/url]

  • There was an extra ) at the end. Removed it and the function runs fine with no error. This function is in production and is called from multiple places.

     

  • ... 'course, if you have a Tally table, you don't even need a function (although they are convenient)... included the original data for convenience...

    Create Table TableA (ID varchar(2), Nodes varchar(5), LT int)

    Create Table TableB (ID int PRIMARY KEY, [Desc] varchar(20), Qty int)

    Insert Into TableA Values ('M1', '1;2;3', 4)

    Insert Into TableA Values ('M2', '4;5;6', 5)

    Insert Into TableB Values (1, 'Desc of 1', 2)

    Insert Into TableB Values (2, 'Desc of 2', 13)

    Insert Into TableB Values (3, 'Desc of 3', 43)

    Insert Into TableB Values (4, 'Desc of 4', 52)

    Insert Into TableB Values (5, 'Desc of 5', 66)

    Insert Into TableB Values (6, 'Desc of 6', 77)

     SELECT a.ID AS TableAID,a.LT,b.ID AS TableBID,b.[Desc],b.QTY
       FROM dbo.TableB b,
            (
             SELECT ID, 
                    LTRIM(SUBSTRING(ta.Nodes, t.N+1, CHARINDEX(';', ta.Nodes, t.N + 1) - t.N - 1)) AS ParsedNode,
                    LT
               FROM dbo.Tally t,
                    (SELECT ID,';'+Nodes+';' AS Nodes, LT FROM dbo.TableA) ta 
              WHERE SUBSTRING(ta.Nodes, t.N, 1) = ';' 
                AND t.N < LEN(ta.Nodes)
            ) a
      WHERE a.ParsedNode = b.ID
      ORDER BY TableAID,a.LT,TableBID

    If you still want to use a split function, suggest you convert it to use the Tally table instead of a while-loop... you can extract the necessary code from above to make the function...

    And, if you still don't have a Tally table, now's probably a good time to make one...

    --===== Create and populate the Tally table on the fly

     SELECT TOP 11000 --equates to more than 30 years of dates

            IDENTITY(INT,1,1) AS N

       INTO dbo.Tally

       FROM Master.dbo.SysColumns sc1,

            Master.dbo.SysColumns sc2

    --===== Add a Primary Key to maximize performance

      ALTER TABLE dbo.Tally

            ADD CONSTRAINT PK_Tally_N PRIMARY KEY CLUSTERED (N)

    --===== Allow the general public to use it

      GRANT SELECT ON dbo.Tally TO PUBLIC

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Seems like a huge waste of time and effort. Not to mention what happens when the ID's reach 11001.


    My blog: SQL Soldier[/url]
    SQL Server Best Practices:
    SQL Server Best Practices
    Twitter: @SQLSoldier
    My book: Pro SQL Server 2008 Mirroring[/url]
    Microsoft Certified Master: SQL Server, Data Platform MVP
    Database Engineer at BlueMountain Capital Management[/url]

  • Thanks Jeff. Works great. Didn't have a Tally table.

    Too late in the day to look at this query in detail now.

    Thanks again.

  • >> why would you need the ID's to go above 11,000 for splitting varchars?

    I stand corrected. I thought the Tally value would need to be able to join to the ID. My mistake.

    However, if I compare the actual and estimated execution plans of the query I posted and the query you posted, your version has a 94% total cost vs. a 6% total cost for mine.

    If you view the query plans, you will see that SQL Server still performs nested loops to get the data. Just because you don't use the WHILE keyword doesn't mean you're not doing a loop.

    When analyzing the cost of the individual steps in the query plan of your query, what do you think is the largest performance hit?

    If you guessed the Clustered Index Seek on the Tally table ..... you are correct!!!! It accounted for 33% of the total cost of your query.

     

    When analyzing the cost of the individul steps in the query plan of my query, what do you think is the smallest performance hit?

    If you guessed the Split UDF, you are correct again. The UDF accounted for less than 1% of the total cost of my query. The execution plan actually shows 0%, but it's rounding off.


    My blog: SQL Soldier[/url]
    SQL Server Best Practices:
    SQL Server Best Practices
    Twitter: @SQLSoldier
    My book: Pro SQL Server 2008 Mirroring[/url]
    Microsoft Certified Master: SQL Server, Data Platform MVP
    Database Engineer at BlueMountain Capital Management[/url]

  • (Not sure what happened to the post you responded to)

    Yep... I could be wrong... please post your function and your test code some I might do some testing... I will, of course, post all the test code.

    For starters, here're the test tables I intend to use for performance and resource testing... I've always been curious as to whether a good While loop function will beat a Tally table... Who knows? Could be incentive for me not to use one anymore...

    SET NOCOUNT ON

    --===== Create and populate a small test table as TableA.

     SELECT TOP 10000

            ID         = 'M'+LTRIM(STR(RAND(CAST(NEWID() AS VARBINARY))*100,3)),

            Nodes      = LTRIM(STR(RAND(CAST(NEWID() AS VARBINARY))*100,3))+';'

                       + LTRIM(STR(RAND(CAST(NEWID() AS VARBINARY))*100,3))+';'

                       + LTRIM(STR(RAND(CAST(NEWID() AS VARBINARY))*100,3))+';'

                       + LTRIM(STR(RAND(CAST(NEWID() AS VARBINARY))*100,3))+';'

                       + LTRIM(STR(RAND(CAST(NEWID() AS VARBINARY))*100,3))+';'

                       + LTRIM(STR(RAND(CAST(NEWID() AS VARBINARY))*100,3))+';'

                       + LTRIM(STR(RAND(CAST(NEWID() AS VARBINARY))*100,3))+';'

                       + LTRIM(STR(RAND(CAST(NEWID() AS VARBINARY))*100,3))+';'

                       + LTRIM(STR(RAND(CAST(NEWID() AS VARBINARY))*100,3))+';'

                       + LTRIM(STR(RAND(CAST(NEWID() AS VARBINARY))*100,3)),

            LT         = CAST(RAND(CAST(NEWID() AS VARBINARY))*100 AS INT)

       INTO dbo.TableA

       FROM Master.dbo.SysColumns sc1,

            Master.dbo.SysColumns sc2

    --===== Create single position CSVs

     UPDATE dbo.TableA

        SET Nodes = LTRIM(STR(RAND(CAST(NEWID() AS VARBINARY))*100,3))

      WHERE RIGHT(ID,1) = '3'

    --===== Create 2 position CSVs

     UPDATE dbo.TableA

        SET Nodes = LTRIM(STR(RAND(CAST(NEWID() AS VARBINARY))*100,3))+';'

                    + LTRIM(STR(RAND(CAST(NEWID() AS VARBINARY))*100,3))

      WHERE RIGHT(ID,1) = '5'

    --===== Create places where the position is blank for CSVs

     UPDATE dbo.TableA

        SET Nodes = REPLACE(REPLACE(REPLACE(Nodes,';0;',';;'),'0;',';'),';0',';')

    --===== Create places where there are nulls for all columns

     UPDATE dbo.TableA

        SET Nodes    = NULL

      WHERE RIGHT(ID,1) = '7'

    --===== Create and populate TableB

     SELECT TOP 101

            ID     = IDENTITY(INT,0,1),

            [Desc] = CAST(NULL AS VARCHAR(20)),

            Qty    = CAST(RAND(CAST(NEWID() AS VARBINARY))*100 AS INT)

       INTO TableB

       FROM Master.dbo.SysColumns sc1,

            Master.dbo.SysColumns sc2

     UPDATE TableB

        SET [Desc] = 'Desc of ' + CAST(ID AS VARCHAR(10))

    --===== Display partial content of both tables just for grins

    --  SELECT TOP 10 * FROM TableA

    --  SELECT TOP 10 * FROM TableB

    DECLARE @StartTime DATETIME

        SET @StartTime = GETDATE()

    ---------------------------------------------------------------------------

    ... insert code to be tested here...

    ---------------------------------------------------------------------------

    PRINT DATEDIFF(ms,@StartTime,GETDATE())

     

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • quote

    But I still get a Incorrect Syntax near '.' error when I run the query by Robert Davis.

    Select A.ID As TableAID, A.LT, B.ID As TableBID, B.[Desc], B.Qty

    From TableA A

    Inner Join TableB B On B.ID IN

    (Select Element From dbo.SplitToInt(A.Nodes, ';'))

    SQL Junkie...

    I tested your function and it works fine like this...

     SELECT *

       FROM dbo.SplitToInt('1;2;3;4;5;6;7',';')

    ...But I don't believe you can use it in a correlated subquery where one of the operands is a column in the external query as in Robert's example... hence, the consistent error at the ".".   In fact, I don't believe you can ever use a column name in association with a function that returns a TABLE data type.

    Could be wrong but I can't get his example to work... still waiting for him to post the function so we can see what he's talking about and how he got it to work.  It would be great if it worked 'cause I can think of lots of uses for it... I just don't think it can be made to work with a column name as an operand and a TABLE data type as a return.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Hey Robert!

    Man, I just tried your fnStripTags function and I converted it to use a Tally table and tested both... I'm very well impressed!  In this case, the WHILE loop version (your version) runs twice as fast as the Tally table version (wrote it 2 different ways{one with STUFF and 1 with concatenation}, to boot!).

    Now I really would like to see what you came up with for a split function!  It's gotta be better than the Tally table version...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

Viewing 11 posts - 16 through 25 (of 25 total)

You must be logged in to reply to this topic. Login to reply