Need help on TSQL

  • Hello All,

    I have a table with value and the item in a table. Currently, some item has different value. I need to find the all the item which has different value. But i dont have to do anything with the item which has same value multiple times.

    Here is the sample of the table:

    Value Item

    1 A

    1 A

    1 B

    2 B

    5 C

    6 C

    Desired output look like below

    Value Item

    1 B

    2 B

    5 C

    6 C

    How do i write a query to get the desired output.

    Thanks in advance.

    W.

  • Hi Walton,

    Here's some example code, with comments.

    CREATE TABLE #t1 (Value int, Item char(1))

    INSERT INTO #t1

    SELECT 1, 'A'

    UNION ALL SELECT 1, 'A'

    UNION ALL SELECT 1, 'B'

    UNION ALL SELECT 2, 'B'

    UNION ALL SELECT 5, 'C'

    UNION ALL SELECT 6, 'C'

    -- Check we've got the sample data correct

    SELECT * FROM #t1

    -- First, we need to figure out which Item groups have more than one value.

    -- Divide #t1 into groups based on Item (GROUP BY), and remove any groups that don't have 2 or more different values (HAVING).

    SELECT Item, COUNT(DISTINCT Value) AS NumValues

    FROM #t1

    GROUP BY Item

    HAVING COUNT(DISTINCT Value) > 1;

    -- Using the query above, get all the rows from #t1 that are part of these groups.

    -- Make sure that the previous statement before WITH is terminated with a ;

    WITH cteDifferences AS (

    SELECT Item, COUNT(DISTINCT Value) AS NumValues

    FROM #t1

    GROUP BY Item

    HAVING COUNT(DISTINCT Value) > 1

    )

    SELECT #t1.Value, #t1.Item

    FROM #t1

    JOIN cteDifferences ON #t1.Item = cteDifferences.Item

    DROP TABLE #t1

  • Walton,

    Please make a note of how Jim set up the problem with scripts to create and populate the tables. The volunteers who help out here are much likelier to jump on your problem and code a solution if they don't have to first translate your cut-and-paste example into something they can actually work on. Since you are asking them to help you out, it is simple courtesy to do the create/populate script yourself. Thanks. 🙂

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Thanks a lot Jim. It did work exactly as expected.

    Thanks a lot again

    W.

  • There is an interesting alternative that avoids the join:

    SELECT Grouped.Item,

    Grouped.Value

    FROM (

    SELECT Ranked.Item,

    Ranked.Value,

    max_seq = MAX(Ranked.seq) OVER (PARTITION BY Item)

    FROM (

    SELECT T1.Item,

    T1.Value,

    seq = RANK() OVER (PARTITION BY Item ORDER BY Item, Value)

    FROM #t1 T1

    ) Ranked

    ) Grouped

    WHERE Grouped.max_seq > 1;

    Paul

  • Here's a Q&D alternative from a TSQL grunt:

    SELECT

    Value,

    Item

    FROM (

    SELECT

    Value,

    Item,

    [Instances] = COUNT(*) OVER (PARTITION BY Value, Item)

    FROM #t1

    ) d

    WHERE Instances = 1

    I like it because it has a pleasing shape on the page when you turn it upside down.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Anyone have any idea why I felt the need to try to reinvent COUNT(*) OVER using RANK and MAX?

    Thanks Chris.

  • Paul White NZ (9/23/2010)


    Anyone have any idea why I felt the need to try to reinvent COUNT(*) OVER using RANK and MAX?

    Thanks Chris.

    Sure - avoiding the JOIN in the code isn't quite what it seems. Funny though, the plans for your query and mine are very similar. Well, far more so than I'd expect.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • select *

    from #t1

    where item in (select item from #t1 group by item having COUNT(distinct value) > 1)

    Talk about your low hanging fruit...

    Mine could be just a bit longer.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • The Dixie Flatline (9/23/2010)


    select *

    from #t1

    where item in (select item from #t1 group by item having COUNT(distinct value) > 1)

    Talk about your low hanging fruit...

    Mine could be just a bit longer.

    :blush:

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Chris, I think your query returns incorrect results if we add a single extra item to the sample data:

    CREATE TABLE #t1

    (

    Value INTEGER NOT NULL,

    Item CHAR(1) NOT NULL

    );

    GO

    INSERT #t1

    (Value, Item)

    VALUES (1, 'A'),

    (1, 'A'),

    (1, 'B'),

    (2, 'B'),

    (5, 'C'),

    (6, 'C'),

    (9, 'D'); -- New!

    Interestingly, Bob's query, while undeniably compact, still scans the source table twice.

    It's estimated cost is 0.0287261

    My RANK and MAX query has an estimated cost of 0.0148543

    Interesting isn't it?

    By the way Chris, the reason your query and mine have such similar execution plans is because they both use something called a Common Subexpression Spool. I posted a detailed explanation of those on my blog: http://sqlblog.com/blogs/paul_white/archive/2010/07/28/partitioning-and-the-common-subexpression-spool.aspx

    Paul

  • Paul White NZ (9/24/2010)


    Chris, I think your query returns incorrect results ...

    :blush: I don't see it Paul...

    DROP TABLE #t1

    CREATE TABLE #t1

    (

    Value INTEGER NOT NULL,

    Item CHAR(1) NOT NULL

    );

    GO

    INSERT #t1

    (Value, Item)

    VALUES (1, 'A'), (1, 'A'), -- dupe, omit from output

    (1, 'B'), (1, 'B'), -- dupe, omit from output

    (2, 'B'),

    (5, 'C'),

    (6, 'C'), (6, 'C'), -- dupe, omit from output

    (9, 'D')

    SELECT

    Value,

    Item

    FROM (

    SELECT

    Value,

    Item,

    [Instances] = COUNT(*) OVER (PARTITION BY Value, Item)

    FROM #t1

    ) d

    WHERE Instances = 1

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Chris Morris-439714 (9/24/2010)


    :blush: I don't see it Paul...

    Yours is the only one to return the new row (item 'D').

    As I read the first post, the task is to return rows where an item has more than one value.

    Make sense, or am I being dumb here? (It's quite possible)

  • Paul White NZ (9/24/2010)


    Chris Morris-439714 (9/24/2010)


    :blush: I don't see it Paul...

    Yours is the only one to return the new row (item 'D').

    As I read the first post, the task is to return rows where an item has more than one value.

    Make sense, or am I being dumb here? (It's quite possible)

    It makes sense Paul (and there's a first time for everything).

    I read it then as "get everything which doesn't have a dupe", obviously, but I think you're right. It's just kinda hard to be sure.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Chris Morris-439714 (9/24/2010)


    It makes sense Paul (and there's a first time for everything). I read it then as "get everything which doesn't have a dupe", obviously, but I think you're right. It's just kinda hard to be sure.

    Hmm, you might be right - it is a bit ambiguous. I'm relying on the statement "I need to find the all the item which has different value." but it isn't clear what is to be done about items with only one value. Oh well, on to the next thing...

Viewing 15 posts - 1 through 15 (of 22 total)

You must be logged in to reply to this topic. Login to reply