find the position of the charcter in delimited string

  • Would someone please doublecheck my results? For comparison, Peter's original code is shown in a slightly modified version below. It now takes the sequence string as an input variable, rather than a constant. Both versions are using dbo.Tally, and not the "virtual" tally cte. Accepting an input variable appears to slow things down quite a bit, and the presence of multiple strings of all zeroes appears (to me) to confuse Peter's results.

    You're absolutely right, Bob. The change to an input variable caused wrong results, not only with the presence of strings with only zeroes. As I already noticed in my previous post, the result of the SELECT ... FOR XML may not be in the right order. When using a virtual tally table or an input variable instead of a constant value, I see the query plan is changed. In my original solution a Streaming Aggregate was used, which, I assume, preserves order. When using a virtual tally table or a input variable the query plan is changed to use a HashMatch to aggregate. I suppose the HashMatch does not preserve order, hence the incorrect results. Here's the fixed code:

    set statistics time off

    set nocount on;

    declare @input varchar(1000)

    set @input = '0000,0001,1000,0000,01010,000101,0000'

    print '----- Peters Original Solution (fixed) ------'

    set statistics time on;

    select stuff(replace('*' +

    (

    select

    case when t.c = '1' then ',' + t.p else '*' end

    from

    (

    select

    n.N,

    substring(s.Sequence, n.N, 1) as c,

    cast(n.N - max(coalesce(t.N, 0)) as varchar(10)) p

    from

    (

    select @input as Sequence

    ) s

    cross join

    dbo.Tally n

    left join

    dbo.Tally t on t.N <= n.N and substring(s.Sequence, t.N, 1) = ','

    where

    n.N <= len(s.Sequence)

    group by

    n.N, substring(s.Sequence, n.N, 1)

    ) t

    where

    t.c '0'

    order by -- absolutely required

    t.N

    for xml path('')

    )

    , '*,', '*'), 1, 1, '')

    I'm afraid (well, not really :-)) my machine is too fast to say something useful about performance based on the timing statistics and a single input value. Both solutions run in 0ms most of the time. Only elapsed time may vary. But based on these elapsed times, it looks like your solution is slightly faster.

    Peter

  • Thanks for posting back, Peter. Your results make sense when I look at them now. Still using my creaky old laptop, I'm seeing the following times for the new code.

    ----- Peters Original Solution (fixed) ------

    SQL Server Execution Times:

    CPU time = 265 ms, elapsed time = 813 ms.

    If you are getting 0ms response times, I envy you and covet your machine. 😉

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • You don't have to envy me. Based on the figures you posted, obviously this machine is making me producing very slow code, although I have not figured out yet what causes the huge difference on your machine. Maybe I will spend some more time to dig deeper into that. Anyway, a funny excercise.

    Peter

  • Haven't tested the following code for other scenarios, but works for the one mentioned in the post:

    declare @STR varchar(50)

    declare @str2 varchar(50)

    declare @str3 varchar(50)

    declare @str4 varchar(50)

    set @str4='0001,01010,000101'

    set @STR=@str4

    set @str2=''

    set @str2=@str2+charindex('1', @STR)

    while charindex(',',@str)!=0

    begin

    set @str2=@str2+'*'

    set @str3=substring(@str, charindex(',', @STR)+1, len(@str))

    set @str2=@str2+cast(charindex('1', @str3) as varchar)+','

    set @STR=@str3

    set @str3=substring(@str3, charindex('1',@str3)+1, len(@str3))

    set @str2=@str2+cast(charindex('1', @str3)+charindex('1',@str) as varchar)

    end

    select @str4,@str2

  • A straightforward procedural solution, but it really needs to be tested against other patterns. Unless the OP (original poster) tells us otherwise, we can't assume that there will only be three elements, that the first element will only have a single '1', that the second element will only have two '1's, etc.

    You could create a WHILE loop to build a more open ended solution, but if you did I believe you would find that performance would still be slower than either of the set based solutions. Good effort, though.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Yes Bob I did make the following assumptions while writing the code;

    1. First set has one '1'

    2. Every other set has 2 '1's

    3. Last set doesnt have a comma

    Thought not to spend more time unless the "OP" is satisfied with this... 😉

  • select '4*2,4*4,6'

    Might as well make some more assumptions. 😉

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Friday afternoon fun with Tally tables! I decided to go the table route rather than a single variable route (As the original post seems to indicate that he needs to perform this random and puzzling action on an entire column). For that we need some test data.

    Figuring out how to generate random test data without looping actually took about 10x as long as the solution. This isn't terribly efficient and could probably be improved upon, but it met my goals of making it easy to add rows or alter density.

    ------------ Generate some Test Data ----------------------

    CREATE TABLE TD082809(

    TDID INT IDENTITY(1,1) PRIMARY KEY,

    CV VARCHAR(200))

    ;WITH

    RND2 (N,R2)AS (

    SELECT N, RAND(CHECKSUM(NEWID())) R2

    FROM Tally

    WHERE N <=200),

    RND5 (N,R5) AS (

    SELECT N, RAND(CHECKSUM(NEWID())) R5

    FROM Tally

    WHERE N <=500)

    INSERT INTO TD082809(CV)

    SELECT (

    SELECT ''+

    CASE

    WHEN (R2 + R5)/2<.65 THEN '0'

    WHEN (R2 + R5)/2<.85 THEN '1'

    ELSE ','

    END

    FROM Tally T

    INNER JOIN RND2 ON T.N = RND2.N

    ,( SELECT RND5.N, R5

    FROM RND5

    WHERE TOUT.N = RND5.N) RND5

    WHERE T.N <= 200

    FOR XML PATH('')) CV

    FROM Tally TOUT

    WHERE TOUT.N <=500

    UPDATE TD082809

    SET CV = REPLACE(CV,',,',',0')

    WHERE CV LIKE '%,,%'

    UPDATE TD082809

    SET CV = REPLACE(CV,',,',',0')

    WHERE CV LIKE '%,,%'

    -- SELECT * FROM TD082809

    Then the solution. Not terribly happy with the speed of this either, the outer reference kinda kills it.

    SELECT CV,(

    SELECT CASE WHEN SUBSTRING(CV,N,1) = '1'

    THEN CAST(N AS VARCHAR(3)) + ','

    WHEN SUBSTRING(CV,N,1) = '*'

    THEN '*'

    END

    FROM Tally,(SELECT REPLACE(CV,',','*') CV

    FROM TD082809 WHERE TDID = Tout.N) TD

    WHERE N < DATALENGTH(CV)

    FOR XML PATH(''))

    FROM Tally Tout

    LEFT JOIN TD082809 ON Tout.N = TD082809.TDID

    WHERE N < (SELECT COUNT(*) FROM TD082809)

    I'll second the props to Jeff for learnin' me all about Tally tables.

    Seth Phelabaum


    Consistency is only a virtue if you're not a screwup. 😉

    Links: How to Post Sample Data[/url] :: Running Totals[/url] :: Tally Table[/url] :: Cross Tabs/Pivots[/url] :: String Concatenation[/url]

  • Hey Seth 🙂

    You inspired me to convert my CTE solution to an inline table valued function (at the bottom). Use it with a cross apply and it's not too shabby.

    use sandbox

    set nocount on;

    DECLARE @input varchar(100)

    DECLARE @output varchar(100)

    DECLARE @inputTbl Table (rowID int identity(1,1), Input varchar(100))

    INSERT into @inputTbl

    select '0000,0001,1000,0000,01010,000101,0000' union all

    select '0110,0001,1000,0000,01010,000101,10000' union all

    select '011011,01001,1000,0000,01010,000101,00001' union all

    select '10110,01001,1000,00100,01010,000101,0000' union all

    select '0110,0001,1000,011000,01010,000101,01000' union all

    select '01110,0001,10010,0000,01010,000101,0000' union all

    select '011000,0001,1000,001100,01010,0010101,1111'

    set statistics time on;

    select *

    from @inputTbl

    cross apply dbo.BuildString(input)

    set statistics time off;

    SQL Server Execution Times:

    CPU time = 31 ms, elapsed time = 84 ms.

    use sandbox;

    GO

    CREATE FUNCTION dbo.BuildString

    (

    @input varchar(8000)

    )

    RETURNS TABLE

    AS

    RETURN

    (

    -- using predefined tally table

    -- ;with tally (N) as (select row_number() over (order by ID) from master..syscolumns)

    with cte1 (position,value) as

    (select N,substring('*'+ replace(@input,',','*'),N,1)

    from tally

    where substring('*'+ replace(@input,',','*'),N,1) in ('1','*')

    and N<=len('*'+ replace(@input,',','*'))

    )

    ,cte2 (c1Pos,c1Val,c2Pos,c2Val,strValue) as

    (select *,

    case when c.Value = '*' and c2.Value = '*' then '0*'

    when c.Value = '*' and c2.Value '*' then '*'

    else null end as strValue

    from cte1 c

    join cte1 c2 on c2.position < c.position

    where c.value = '*' or (c.value '*' and c2.value = '*')

    )

    ,cte3 as (

    select c1Pos,c1Val, min(strValue) as strValue, min(c1Pos-c2Pos) as bitPos

    from cte2

    group by c1Pos,c1Val

    )

    ,cte4 (c1Pos,strValue) as (

    select c1pos,case when strValue is not null then strValue else cast(bitPos as varchar(3))+',' end from cte3

    )

    ,cte5 (result) as

    (select ( SELECT top 100 percent '' + strValue

    FROM cte4

    ORDER BY c1Pos

    FOR XML PATH('')

    )

    )

    ,cte6 (result) as

    (select replace(replace(result,'**','*0*'),',*','*') from cte5)

    select case when right(result,1) = '*' then result+'0'

    when right(result,1) = ',' then stuff(result,len(result),1,'') else result end as result

    from cte6

    )

    GO

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Garadin (8/28/2009)


    I'll second the props to Jeff for learnin' me all about Tally tables.

    Heh... thanks, Seth. Folks like you, Bob, and a couple of others have made it a household name and have taken it to the next level, as well. I do have to remind folks, though... I didn't invent it... "I jist 'splained howzit wurks". 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • This is a plain looping solution that reads the string character by character. It can handle input string of 2000 chars. String can start with 0, 1, or comma.

    Bob, can you tell me how inefficient it is compared to some of your solutions that I'm trying to understand?

    BTW, seems like the SQL formatting is not maintained 🙁

    declare

    @input nvarchar(2000),

    @inputlength tinyint,

    @readpos tinyint,

    @readchar nvarchar(1),

    @onepos tinyint,

    @output nvarchar(2000)

    select

    @input='0001,01010,000101',

    @inputlength = len(@input),

    @readpos = 1,

    @onepos = 1,

    @output = ''

    while @readpos 0 and right(@output,1) = ','

    begin

    set @output = substring(@output,1,len(@output)-1)

    end

    select

    @output = @output + '*',

    @onepos = 0

    end

    else if @readchar = '1'

    begin

    set @output = @output + cast(@onepos as nvarchar(4)) + ','

    end

    else

    begin

    set @output = @output + ''

    end

    select

    @readpos = @readpos + 1,

    @onepos = @onepos + 1

    end

    if len(@output) > 0 and right(@output,1) = ','

    begin

    set @output = substring(@output,1,len(@output)-1)

    end

    select @output as 'Output'

    Riz

  • Glad to, Riz 🙂 On my machine, the loop is taking 29-30 milliseconds, on the average, to run. A couple of times it dropped to 16, and several times it soared to 50-60. By comparison, the inline table valued function using the revised CTE solution is doing 50 rows (of 7 elements apiece) in 155-170 milliseconds.

    If the concept of a tally table is new to you, you can find Jeff's most excellent article on the subject here[/url]. Essentially, tally tables enable you to do set-based processing of tasks that would appear to require loops. They can be deadly efficient at doing it too.

    Setting statistics time on for the loop, you see the following. (Yes, there really are that many lines.)

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    (1 row(s) affected)

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 1 ms.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

Viewing 12 posts - 16 through 26 (of 26 total)

You must be logged in to reply to this topic. Login to reply