Data Extract

  • I have a table witha column as clientid. the client id datais as follows

    '55566',

    '00055566'

    'abcd'

    'ijkl'

    '1234'

    '8890'

    '05566'

    '008890'

    '8765'

    '9021'

    IN the above list 55566 and 8890 is a client which gets repeated a few times with superceding 0's.is there a way we extract ony thoose clientids and see if these exists as substring data in another record.if there a straight sql which could give me the results set as

    '55566'

    '00055566'

    '05566'

    '8890'

    '008890'

    ANy help on this wil lbe greatly appreciated. TIA

  • where the sample values 'abcd' and 'ijkl' there to show that the clientid is alphanumeric, or are all the id's really numeric?

    if they are numeric, you can convert them to INT, for example.

    converting to int will strip preceeding zeros...

    WHERE CONVERT(int,client_id) IN(55556,8890)

    if they are numeric and have to be varchars, recovert again to varchar:

    WHERE CONVERT(varchar,CONVERT(int,client_id)) IN('55556','8890')

    if the values are sometimes alphanumeric, you can use a case statement:

    WHERE

    CASE

    WHEN ISNUMERIC(client_id) = 1

    THEN CONVERT(varchar,CONVERT(int,client_id))

    ELSE client_id

    END IN('55556','8890')

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • I'm not entirely clear on what you want here. Do you want the string-literal, or do you want the numeric equivalent so you can tie them together?

    If the numeric equivalent, then insert into a temp table using "Where PatIndex '%[^0-9]%' = 0" (that will get you the ones that are strictly numeric), and the convert the values in the temp table to integers. That will give you the ones that can be converted to integers, and you can then pull equivalances from there.

    Something like:

    create table #T (

    RawVal varchar(100),

    NumVal int);

    insert into #T (RawVal)

    select MyColumn

    from dbo.MyTable

    where PatIndex '%[^0-9]%' = 0;

    update #T

    set NumVal = RawVal;

    select *

    from #T as T1

    inner join #T as T2

    on T1.RawVal != T2.RawVal

    and T1.NumVal = T2.NumVale;

    That will give you the original values that would be equivalent if they were integers, like "8890" and "0008890". You can join that back to the original table to get rows that have "the same value" numerically but different values as strings.

    Is that what you need? Or am I misreading something?

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • You can try something like this and but it will also match ABC in 0ABC (if this is what you want):

    DECLARE @clients TABLE (clientno VARCHAR(20))

    INSERT INTO @clients

    SELECT '55566' As clientno

    UNION ALL SELECT '00055566'

    UNION ALL SELECT 'abcd'

    UNION ALL SELECT 'ijkl'

    UNION ALL SELECT '1234'

    UNION ALL SELECT '8890'

    UNION ALL SELECT '05566'

    UNION ALL SELECT '008890'

    UNION ALL SELECT '8765'

    UNION ALL SELECT '9021'

    UNION ALL SELECT '0ABC'

    UNION ALL SELECT 'ABC'

    SELECT DISTINCT x.clientno

    FROM @clients c

    CROSS APPLY (

    SELECT clientno, REPLACE(LTRIM(REPLACE(clientno, '0', ' ')),' ', '0') as clientno1

    FROM @clients) x

    WHERE CHARINDEX(x.clientno1, c.clientno) > 0 and x.clientno <> c.clientno


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • IsNumeric won't necessarily do the job.

    Try this to see why:

    SELECT ISNUMERIC('1d3');

    SELECT CAST('1d3' AS INT);

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Sorry i should have mentioned it before . its a varchar column which can have client ids from [0-9] or [a-z].

  • dwain.c (3/26/2012)


    You can try something like this and but it will also match ABC in 0ABC (if this is what you want):

    DECLARE @clients TABLE (clientno VARCHAR(20))

    INSERT INTO @clients

    SELECT '55566' As clientno

    UNION ALL SELECT '00055566'

    UNION ALL SELECT 'abcd'

    UNION ALL SELECT 'ijkl'

    UNION ALL SELECT '1234'

    UNION ALL SELECT '8890'

    UNION ALL SELECT '05566'

    UNION ALL SELECT '008890'

    UNION ALL SELECT '8765'

    UNION ALL SELECT '9021'

    UNION ALL SELECT '0ABC'

    UNION ALL SELECT 'ABC'

    SELECT DISTINCT x.clientno

    FROM @clients c

    CROSS APPLY (

    SELECT clientno, REPLACE(LTRIM(REPLACE(clientno, '0', ' ')),' ', '0') as clientno1

    FROM @clients) x

    WHERE CHARINDEX(x.clientno1, c.clientno) > 0 and x.clientno <> c.clientno

    Despite what your signature says, this has hidden RBAR. If you take out the DISTINCT, you'll see that your table with 12 records produces 144 rows. You're essentially doing a cross join to itself and producing the square of the number of rows. Needless to say, this will bog down a server very quickly. This can be easily fixed by removing the FROM clause from inside the CROSS APPLY.

    While your formula works, I think the following formula more closely matches the description of the problem (and is therefore easier to follow) and I was unable to find any performance difference between the two approaches. (Of course, I didn't test on Jeff's million row table, either.)

    SELECT c.clientno, x.ClientNo1

    FROM @clients c

    CROSS APPLY (

    SELECT SUBSTRING(c.ClientNo, PATINDEX('%[^0]%', c.ClientNo), Len(c.ClientNo)) AS ClientNo1

    ) AS x

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Drew - Yes I am aware of hidden RBAR gotchas and if I can I usually try to remove them. In this case, I didn't see that way but your post made me see the light.

    Good solution.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply