CHARINDEX Not Finding String

  • As part of a bigger query, where it pulls a substring out of a big, jumbled mess of other characters, this particular part is not working:

    CHARINDEX('<t2>', Data)

    I have triple checked that <t2> is indeed part of the data in the data column. Any ideas why it would return a zero and not find the string?

  • Please post an example string and usage...

    ______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience

  • MyDoggieJessie (10/3/2011)


    Please post an example string and usage...

    SELECT id,

    SUBSTRING(Data, CHARINDEX('<t2>', Data)+4, (CHARINDEX('</t2>', Data)- CHARINDEX('<t2>', Data))-4)

    FROM Tablename

    String Example:

    sdfjksdhfksdhkfhsdkfhksdhfksdhfksdhf<t2>net.rim.blackberry</t2><t3>1</t3><t4>sadfsfsdfdsf</t4><t5>asdasdasdsa</t5><t6>7.0.0</t6><t7>asdasdasdsa</t7><t8>asdasdasdas</t8><t9>asdasdasd;</t9></R>

    Should return: net.rim.blackberry

    Instead getting, "Invalid length parameter passed to the SUBSTRING function", because it is returning 0, because it cannot find string

  • I checked your code and ran something similar on SQL 2000/2005/and 2008 servers and it worked perfectly. Please run the code below on your system to see if you get the same result so we can perhaps pinpoint the source of your issue. What version of SQL Server and Service Packs are you running?

    CREATE TABLE #tmp (id int IDENTITY(1,1),col1 varchar(500))

    INSERT INTO #tmp

    SELECT 'sdfjksdhfksdhkfhsdkfhksdhfksdhfksdhf<t2>net.rim.blackberry</t2><t3>1</t3>

    <t4>sadfsfsdfdsf</t4><t5>asdasdasdsa</t5><t6>7.0.0</t6><t7>asdasdasdsa</t7><t8>asdasdasdas</t8>

    <t9>asdasdasd;</t9></R>'

    SELECT id,

    SUBSTRING(col1, CHARINDEX('<t2>', col1)+4, (CHARINDEX('</t2>', col1)- CHARINDEX('<t2>', col1))-4)

    FROM #tmp

    DROP TABLE #tmp

    Returns:

    (1 row(s) affected)

    id col1

    ----------- --------------

    1 net.rim.blackberry

    ______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience

  • MyDoggieJessie (10/3/2011)


    I checked your code and ran something similar on SQL 2000/2005/and 2008 servers and it worked perfectly. Please run the code below on your system to see if you get the same result so we can perhaps pinpoint the source of your issue. What version of SQL Server and Service Packs are you running?

    CREATE TABLE #tmp (id int IDENTITY(1,1),col1 varchar(500))

    INSERT INTO #tmp

    SELECT 'sdfjksdhfksdhkfhsdkfhksdhfksdhfksdhf<t2>net.rim.blackberry</t2><t3>1</t3>

    <t4>sadfsfsdfdsf</t4><t5>asdasdasdsa</t5><t6>7.0.0</t6><t7>asdasdasdsa</t7><t8>asdasdasdas</t8>

    <t9>asdasdasd;</t9></R>'

    SELECT id,

    SUBSTRING(col1, CHARINDEX('<t2>', col1)+4, (CHARINDEX('</t2>', col1)- CHARINDEX('<t2>', col1))-4)

    FROM #tmp

    DROP TABLE #tmp

    Returns:

    (1 row(s) affected)

    id col1

    ----------- --------------

    1 net.rim.blackberry

    That works on my system as is and so does the "real" data (I changed it a little in this post for privacy reasons). Then I realized that your temp table used a varchar field and the table I'm querying is text (third party). So I changed your query to use a text field , and the data I used for the example works. The "real" data has a lot more characters and does not work. Could length of the text field be the problem?

    ETA: It is a SQL Server 2005 SP2

  • I don't believe the length of it matters. I used your code and made up a string of nearly 6400 characters and it still ran just fine...

    Either there's an issue with encoding (i.e. your local server and the one you're getting the data from) or there's something in the string itself that's causing it...

    Sorry wish I could help more!

    ______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience

  • MyDoggieJessie (10/3/2011)


    I don't believe the length of it matters. I used your code and made up a string of nearly 6400 characters and it still ran just fine...

    Either there's an issue with encoding (i.e. your local server and the one you're getting the data from) or there's something in the string itself that's causing it...

    Sorry wish I could help more!

    I think the limit might actually be 8,000 char. My string has 8500 characters; it is huge! I wonder if I can code around it?!? Hmm...

    Thanks for all your help! It really helped me troubleshoot!

  • http://bytes.com/topic/sql-server/answers/80571-sql-server-text-field

    ______________________________________________________________________________Never argue with an idiot; Theyll drag you down to their level and beat you with experience

  • DBAgal (10/3/2011)


    MyDoggieJessie (10/3/2011)


    I don't believe the length of it matters. I used your code and made up a string of nearly 6400 characters and it still ran just fine...

    Either there's an issue with encoding (i.e. your local server and the one you're getting the data from) or there's something in the string itself that's causing it...

    Sorry wish I could help more!

    I think the limit might actually be 8,000 char. My string has 8500 characters; it is huge! I wonder if I can code around it?!? Hmm...

    Thanks for all your help! It really helped me troubleshoot!

    The 8,000 character limit is the expression to search for, not what you are searching. I tried this mimicing your data type (text) to be sure:

    CREATE TABLE #tmp (id int IDENTITY(1,1),col1 text)

    INSERT INTO #tmp

    SELECT 'sdfjksdhfksdhkfhsdkfhksdhfksdhfksdhf<t2>net.rim.blackberry</t2><t3>1</t3>

    <t4>sadfsfsdfdsf</t4><t5>asdasdasdsa</t5><t6>7.0.0</t6><t7>asdasdasdsa</t7><t8>asdasdasdas</t8>

    <t9>asdasdasd;</t9></R>'

    SELECT id,

    SUBSTRING(col1, CHARINDEX('<t2>', col1)+4, (CHARINDEX('</t2>', col1)- CHARINDEX('<t2>', col1))-4)

    FROM #tmp

    DROP TABLE #tmp

    I also got the expected results. Are you sure this column is text? Try something for me... select the column, copy the value, and paste it into notepad. What happens? Also, have you tried using PATINDEX instead of CHARINDEX?

    Thanks,

    Jared

    Jared
    CE - Microsoft

  • does this return the expected data?

    SELECT

    id,

    SUBSTRING(CONVERT(VARCHAR(max), Data),

    CHARINDEX('<t2>', CONVERT(VARCHAR(max), Data)) + 4,

    ( CHARINDEX('</t2>', CONVERT(VARCHAR(max), Data)) - CHARINDEX('<t2>', CONVERT(VARCHAR(max), Data)) ) - 4)

    FROM

    Tablename

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Also, SQL Server will return the error you are getting if even 1 row returns an invalid argument for the substring. So, if one of your strings does not contain '</t2>' and you try to subtract from 0, that will give an invalid length parameter.

    Try this:

    SELECT CHARINDEX('</t2>', data)

    FROM TableName

    WHERE CHARINDEX('</t2>', data) < 4

    If any data is returned, that is your first problem.

    Thanks,

    Jared

    Jared
    CE - Microsoft

  • jared-709193 (10/3/2011)


    DBAgal (10/3/2011)


    MyDoggieJessie (10/3/2011)


    I don't believe the length of it matters. I used your code and made up a string of nearly 6400 characters and it still ran just fine...

    Either there's an issue with encoding (i.e. your local server and the one you're getting the data from) or there's something in the string itself that's causing it...

    Sorry wish I could help more!

    I think the limit might actually be 8,000 char. My string has 8500 characters; it is huge! I wonder if I can code around it?!? Hmm...

    Thanks for all your help! It really helped me troubleshoot!

    The 8,000 character limit is the expression to search for, not what you are searching. I tried this mimicing your data type (text) to be sure:

    CREATE TABLE #tmp (id int IDENTITY(1,1),col1 text)

    INSERT INTO #tmp

    SELECT 'sdfjksdhfksdhkfhsdkfhksdhfksdhfksdhf<t2>net.rim.blackberry</t2><t3>1</t3>

    <t4>sadfsfsdfdsf</t4><t5>asdasdasdsa</t5><t6>7.0.0</t6><t7>asdasdasdsa</t7><t8>asdasdasdas</t8>

    <t9>asdasdasd;</t9></R>'

    SELECT id,

    SUBSTRING(col1, CHARINDEX('<t2>', col1)+4, (CHARINDEX('</t2>', col1)- CHARINDEX('<t2>', col1))-4)

    FROM #tmp

    DROP TABLE #tmp

    I also got the expected results. Are you sure this column is text? Try something for me... select the column, copy the value, and paste it into notepad. What happens? Also, have you tried using PATINDEX instead of CHARINDEX?

    Thanks,

    Jared

    Yes, the column is text. It pasted fine when I pasted the string in Notepad. PATINDEX doesn't make a difference; I receive the same error.

    I know that it isn't finding the string. This is the query I used for troubleshooting:

    select Data, CHARINDEX('<t2>', Data)+4, CHARINDEX('</t2>', Data) -CHARINDEX('<t2>', Data)-4

    FROM tablename where id in ( 4524263) --troublesome id

    The results where the <text string>, 4, -4; which means that CHARINDEX('<t2>', Data) evaluated to zero.

  • Lowell (10/3/2011)


    does this return the expected data?

    SELECT

    id,

    SUBSTRING(CONVERT(VARCHAR(max), Data),

    CHARINDEX('<t2>', CONVERT(VARCHAR(max), Data)) + 4,

    ( CHARINDEX('</t2>', CONVERT(VARCHAR(max), Data)) - CHARINDEX('<t2>', CONVERT(VARCHAR(max), Data)) ) - 4)

    FROM

    Tablename

    Yes, thank you. I was thinking of doing the same thing, just didn't have the time to write it up.

  • jared-709193 (10/3/2011)


    Also, SQL Server will return the error you are getting if even 1 row returns an invalid argument for the substring. So, if one of your strings does not contain '</t2>' and you try to subtract from 0, that will give an invalid length parameter.

    Try this:

    SELECT CHARINDEX('</t2>', data)

    FROM TableName

    WHERE CHARINDEX('</t2>', data) < 4

    If any data is returned, that is your first problem.

    Thanks,

    Jared

    Rows are returned for the query, but I verified that their is indeed a </t2> tag. Hmmm...

  • it's pretty well documented that SQL will keep data for a TEXT datatype in a varchar(8000) as long as it's less than 8000 chars, otherwise it's stored a different way,and then cannot be accessed directly...you have to use textpointers and that makes it a lot harder to fiddle with;

    so some rows are shorter than 8000 chars and can be searched via substrings, and the longer rows cannot.

    add a DATALENGTH(Data) to your results, and see which how large the results are when you are successful...i'm sure they are under 8K in characters

    converting to varchar(max) lets you get them all.

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

Viewing 15 posts - 1 through 15 (of 21 total)

You must be logged in to reply to this topic. Login to reply