DISTINCT AND COUNT

  • I have an issue I am using to count the number of records in a table. When i execute the following statement:

    SELECT COUNT (DISTINCT [Name])FROM [zzWembley WIC 2nd QTR 2008]

    I get 2640 records

    WHEN I EXECUTE the following statement:

    SELECT DISTINCT [Name]

    FROM [zzWembley WIC 2ND QTR 2008]

    I get 2641 records with one record null.

    WHEN I EXECUTE the following statement:

    SELECT DISTINCT [Name],[Date Of Birth]

    FROM [zzWembley WIC 2ND QTR 2008]

    I get 2664 records with one record null

    WHEN I EXECUTE the following statement:

    I get 2677 records with one record as null

    Can anybody explain the discrepancy between the figures?

  • Hi,

    SELECT COUNT(DISTINCT ...)

    Will return a count unique of none-null values

    SELECT DISTINCT COUNT(...)

    Will return a distinct value for the count of all none-null values

    Add an extra column to a DISTINCT query I would normally expect more rows as the more columns you look at the more chance of non-uniqueness there is.

    I hope this makes sense?

    Thanks

    Chris

    ----------------------------------------------
    Try to learn something about everything and everything about something. - Thomas Henry Huxley

    :w00t:
    Posting Best Practices[/url]
    Numbers / Tally Tables[/url]

    SQL-4-Life
  • SELECT DISTINCT [Name],[Date Of Birth]

    FROM [zzWembley WIC 2ND QTR 2008]

    I get 2664 records with one record null

    this implies that there are people who have the same name, but are born on different days, hence the 23 additional records..

    quite probable you have 2 "Bob Smith" records with two different birthdays, for example.

    1 distinct name "Bob Smith"

    2 distinct records with Bob Smith and different dates.

    the last query, which got lost in the paste, problably adds additional columns to teh distinct, so you get a similar expansion in the number of records.

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • What is the best method for identifying those duplicate records when i add extra colums as that iniformaiton would be quite importnat for my manager.

  • try something like this:

    SELECT

    [Col1],

    [Col2],

    COUNT(*) as [Number of Dups]

    FROM [MyTable]

    GROUP BY --Use this for the rows you are comparing

    [Col1],

    [Col2]

    HAVING COUNT(*) > 1 --Only return rows where there are more than 1

    ----------------------------------------------
    Try to learn something about everything and everything about something. - Thomas Henry Huxley

    :w00t:
    Posting Best Practices[/url]
    Numbers / Tally Tables[/url]

    SQL-4-Life

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply