Count number commas in string

  • Is there a way to count the number of commas in a record?

    I'm assuming CHARINDEX is used?

     

    EXAMPLE

    Data                                       Count

    Test,123,abc,,,                            5

    Test,,                                     2

     

  • You would have to create a UDF that counts the commas and returns the count.  If this data is being returned to an application, I think it might be more efficient to let the app do the work instead of SQL. 

    Here is an example function:

    use Pubs
    Go
    create function dbo.countChar(@value varchar(50), @char char(1))

    Returns int

    as

    begin
         declare @count int

         set @count = 0

         declare @pointer int

         set @pointer = charindex(@char,@value)

         while @pointer > 0

         begin

                set @count = @count + 1

                set @pointer = charindex(@char,@value,@pointer + 1)

         end

     return @count

    end

    select au_lname, dbo.countchar(au_lname,'a') from authors

    Hope this helps,

    Kathi

     

    Aunt Kathi Data Platform MVP
    Author of Expert T-SQL Window Functions
    Simple-Talk Editor

  • Sorry Kathi, but this is not the best solution. I've posted a similar one to the German newsgroup some days ago. And after thinking a night over it, came up with a better and more effective one.

    SELECT

     au_lname

     , LEN(au_lname)-LEN(REPLACE(au_lname,'a',''))

    FROM

     authors

    Run profiler and measure the duration and reads columns. You should notice a huge difference!

    There is, however, a drawback you need to be aware of. Consider this:

    DECLARE @a CHAR

    SET @a = '-'

    SELECT

    LEN('a1sdsad12---325143gffdfd4dgsf1 -')

    -

    LEN(REPLACE('a1sdsad12---325143gffdfd4dgsf1 -',@a,''))

               

    -----------

    5

    (1 row(s) affected)

    This is wrong, because LEN cuts of trailing blanks. So, if the underlying data, by any chance, contains such stupid data you need to workaround this. Probably the easiest one is:

    DECLARE @a CHAR

    SET @a = '-'

    SELECT

     LEN('a1sdsad12---325143gffdfd4dgsf1 -  ' + 'X')

     -

     LEN(REPLACE('a1sdsad12---325143gffdfd4dgsf1 -  ' + 'X', @a, ''))

               

    -----------

    4

    (1 row(s) affected)

    And finally, if you're using unicode, this will work:

    DECLARE @a NCHAR

    SET @a = '-'

    SELECT

     (DATALENGTH(N'a1sdsad12---325143gffdfd4dgsf1 -')

     -

     DATALENGTH(REPLACE(N'a1sdsad12---325143gffdfd4dgsf1 -',@a,'')))/2

               

    -----------

    4

    (1 row(s) affected)

     

    HTH

     

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • Frank,

    I agree that your solution would be more efficient.  I don't have to use profiler to see that.  I knew my solution would work, but suggested that this be done in the application, not in SQL, because of performance.

    Cool solution, I never would have thought of it.  I'll have to start thinking a little more out of the box.

     

    Kathi

    Aunt Kathi Data Platform MVP
    Author of Expert T-SQL Window Functions
    Simple-Talk Editor

  • Yes, doing this at the client is surely the best advise.

    So far for my pseudo answer just to get the chance to post another smilie

    I'm in smilie mood.

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • Thanks all

  • to overcome the LEN issue you could use DATALENGHT instead


    * Noel

  • Okay, here's a riddle...

    DECLARE @a NCHAR

    SET @a = '-'

    SELECT DATALENGTH('a1sdsad12---325143gffdfd4dgsf1 -')

          -

          DATALENGTH(REPLACE('a1sdsad12---325143gffdfd4dgsf1 -',@a,''))

               

    -----------

    -24

    (1 row(s) affected)

    Can you tell me why?

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • Don't know why but this solves it....

    DECLARE @a NCHAR

    SET @a = '-'

    SELECT datalength('a1sdsad12---325143gffdfd4dgsf1 -')

          -

          datalength(convert(varchar,REPLACE('a1sdsad12---325143gffdfd4dgsf1 -',@a,'')))

  • The answer lies within what you have done.

    REPLACE apparently converts the whole second string to NCHAR, since NCHAR has a higher datatype precedence.

    DECLARE @a NCHAR

    SET @a = '-'

    SELECT datalength('a1sdsad12---325143gffdfd4dgsf1 -')

          ,

          datalength(REPLACE('a1sdsad12---325143gffdfd4dgsf1 -',@a,''))

          , datalength('a1sdsad12---325143gffdfd4dgsf1 -')

          -

          datalength(REPLACE('a1sdsad12---325143gffdfd4dgsf1 -',@a,''))

                                       

    ----------- ----------- -----------

    32          56          -24

    (1 row(s) affected)

    Now, while this is somehow intuitive, the nice  thing about this is, that it isn't explicitely explained in BOL. At least I haven't found it.

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • I like this for getting the count of a the number of occurrances of a single character.

    DECLARE @a VARCHAR

    SET @a = '-'

    SELECT

    DATALENGTH('a1sdsad12---325143gffdfd4dgsf1 -')

    -

    DATALENGTH(REPLACE('a1sdsad12---325143gffdfd4dgsf1 -',@a,''))

    Or for higher safety do

    DECLARE @a VARCHAR

    SET @a = '-'

    SELECT

    LEN(REPLACE('a1sdsad12---325143gffdfd4dgsf1 -',' ',''))

    -

    LEN(REPLACE(REPLACE('a1sdsad12---325143gffdfd4dgsf1 -',' ',''),@a,''))

    Now for the issue with the NCHAR

    DECLARE @a NCHAR

    SET @a = '-'

    SELECT DATALENGTH('a1sdsad12---325143gffdfd4dgsf1 -')

          -

          DATALENGTH(REPLACE('a1sdsad12---325143gffdfd4dgsf1 -',@a,''))

    DATALENGTH counts the number of bytes and you are running into an implicit typing issue here.

    With DATALENGTH('a1sdsad12---325143gffdfd4dgsf1 -') it is treating is as a CHAR type datatype and thus each character is 1 byte or 32 bytes total.

    But with DATALENGTH(REPLACE('a1sdsad12---325143gffdfd4dgsf1 -',@a,'')) because @a is defined as NCHAR the other string is implicitly cast to a NCHAR type datatype, thus each character is 2 bytes long and 64 total - 8 total for the 4 -'s and final is 56

    So now in the first case we have 32 and in the second it is 56 or 32 - 56 which is -24.

    To correct be explicit in your datatyping.

    DECLARE @a NCHAR

    SET @a = '-'

    SELECT (DATALENGTH(N'a1sdsad12---325143gffdfd4dgsf1 -')

          -

          DATALENGTH(REPLACE(N'a1sdsad12---325143gffdfd4dgsf1 -',@a,''))) / 2

  • To correct be explicit in your datatyping. ...

    Yes, like I stated somewhere above...

    But it was nice trick to confuse some people in the German newsgroup. Hey, with those postings you can easily debunk those pseudo-techies Ken Henderson meant in hia addendum.

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • This should give you your exact requirements.  I modified it for you.

    Regards,

    J. McG.

     

    CREATE TABLE #TextSearch (

     [Text] [varchar] (3000) NULL ,

     [Count] [int] NULL

    )

    DECLARE @String  varchar(1000),  @Len   int,   

     @MatchStr  varchar(50),  @StartIndex  int,  

     @Pos   int,    @Chunk   varchar(50),

     @Count  int

    DECLARE cur_FindString CURSOR FAST_FORWARD FOR SELECT body FROM pagecontent

    OPEN  cur_FindString FETCH NEXT FROM cur_FindString INTO @STRING

    WHILE @@FETCH_STATUS = 0

    BEGIN

    SET @Pos = 0

    SET @Len = LEN(@String)

    SET @Count = 0

       

     WHILE CHARINDEX('www', @STRING, @Pos) > 0

      BEGIN   

       SET @StartIndex = CHARINDEX('www', @STRING, @Pos) 

       IF @StartIndex > 0

        BEGIN

         INSERT INTO #TextSearch ( [Text])

         VALUES (@String)

         SET @Count = @Count + 1

        END    

       UPDATE #TextSearch SET [COUNT] = @Count WHERE [Text] = @String

       SET @Pos = @StartIndex +4

       SET @MatchStr = ''   

       END

     FETCH NEXT FROM cur_FindString INTO @STRING  

    END

    CLOSE cur_FindString

    DEALLOCATE cur_FindString   

    SELECT * FROM #TextSearch

    DROP TABLE #TextSearch

  • This version posted earlier doesn't work properly:

    DECLARE @a NCHAR

    SET @a = '-'

    SELECT datalength('a1sdsad12---325143gffdfd4dgsf1 -')

          -

          datalength(convert(varchar,REPLACE('a1sdsad12---325143gffdfd4dgsf1 -',@a,'')))

     

    It returns the right result in this case, but if you search for a 'W' I got a result of 2, even though there is no 'W' in the string.

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply