selecting TOP row from a result set

  • [font="Comic Sans MS"]Hi all.

    I have question in a query that is returning me a set of data....with following columns:

    account_no / emply_no / dept_no / address / Balance / phone / route / city / ....and 10 more columns...

    now i am getting with some repeated account numbers in my result...

    I want to select only the first row for every account reegardless of values in other columns,,,

    the values in other columns may or may not be same ...

    for example dept no can be D1 for the same account and can also be D2 for the same account..I just want the first row that is returned from the main query...

    any help

    thanks a lot

    [/font]

    [font="Comic Sans MS"]
    ---------------------------------------------------

    Thanks [/font]

  • SQL Learner-684602 (1/9/2010)


    [font="Comic Sans MS"]Hi all.

    I have question in a query that is returning me a set of data....with following columns:

    account_no / emply_no / dept_no / address / Balance / phone / route / city / ....and 10 more columns...

    now i am getting with some repeated account numbers in my result...

    I want to select only the first row for every account reegardless of values in other columns,,,

    the values in other columns may or may not be same ...

    for example dept no can be D1 for the same account and can also be D2 for the same account..I just want the first row that is returned from the main query...

    any help

    thanks a lot

    [/font]

    How about

    SELECT DISTINCT account_no FROM mytable

    If it doesn't matter what the values are in the other columns, then why retrieve them?

    Okay, that doesn't answer your question but it might set you thinking about the problem.

    You could use ROW_NUMBER() OVER(PARTITION BY account_no ORDER BY somecolumn).

    Write this into your query as the first i.e. leftmost column of your SELECT list and eyeball the result.

    It should be clear at this point: if not, please post the new query, your original query, plus some sample data which they will run against. Sample data should be prepared as a create table statement with a series of INSERTS to populate with data.

    Cheers

    ChrisM@home


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • Please post table definitions, sample data and desired output. Read this to see the best way to post this to get quick responses.

    http://qa.sqlservercentral.com/articles/Best+Practices/61537/

    You said you want the 'first row'. First by what criteria? Bear in mind that there's no order of rows in a table.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I think one of the solution is to use Row_number() function but for more precise result, please post table structure and some sample data as GilaMonster said!

    ============================================================
    SELECT YOUR PROBLEM FROM SSC.com WHERE PROBLEM DESCRIPTION =
    http://www.sqlservercentral.com/articles/Best+Practices/61537/[/url]

  • Here's one way of obtaining the sort of results you seem to be after. It is based on data in the AdventureWorks sample database.

    SELECT TOP (10)

    E.EmployeeID,

    EA.Rate,

    EA.ModifiedDate,

    EA.RateChangeDate

    FROM AdventureWorks.HumanResources.Employee E

    CROSS

    APPLY (

    -- One row returned for each row from Employee

    -- The ORDER BY clause determines which row is returned

    -- Omit the ORDER BY if you don't care which row is returned

    SELECT TOP (1)

    EPH.ModifiedDate,

    EPH.Rate,

    EPH.RateChangeDate

    FROM AdventureWorks.HumanResources.EmployeePayHistory EPH

    WHERE EPH.EmployeeID = E.EmployeeID

    ORDER BY

    EPH.RateChangeDate DESC

    ) EA

    ORDER BY

    E.EmployeeID ASC;

  • This is a great example but what if the employee table had duplicate entries. and one or more columns had non-distinct values and you wanted to return the the first record regardless of whether they had a PayHistory Record?

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Again I'm going to ask for your definition of 'first row'. First by what criteria? Of do you just mean 'any one row, reagardless'?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • While we are waiting for more information, this is how I'd tackle the problem as I understand it... (actually, I'd make sure there was a primary key before I'd do anything else!)

    [font="Courier New"]

    --we are assuming that there is no unique key to this table

    --from what has been said, Department and Account_No together

    --seem to provide a unique key but this has never been provided

    --with an index or constraint

    SELECT

      account_no, emply_no, dept_no, [address], Balance, phone, [route], city --etc.

    FROM

      BadlyDesignedTable

      INNER JOIN (SELECT

                    account_no, MIN(Dept_no) AS FirstDeptNo

                  FROM

                    BadlyDesignedTable

                  GROUP BY

                    account_no

                 ) ChosenRecord

        ON ChosenRecord.account_no=BadlyDesignedTable.account_no

           AND ChosenRecord.FirstDeptNo=BadlyDesignedTable.Dept_no[/font]

    Best wishes,
    Phil Factor

  • Welsh Corgi


    This is a great example but what if:

    Welsh Corgi


    ...the employee table had duplicate entries.

    Use a CTE or derived table in place of the Employee table. A DISTINCT or GROUP BY is all that is required.

    Welsh Corgi


    one or more columns had non-distinct values and you wanted to return the the first record regardless of whether they had a PayHistory Record?

    Change the CROSS APPLY to OUTER APPLY.

    As Gail points out, and as I tried to hint in my code comments, it all rather hinges on the details - especially what criteria one chooses to define 'first record'.

    Phil's example is another example of a valid approach - though personally I do tend to prefer TOP with an APPLY and ORDER BY...but it's often just personal preference.

  • Hi All[font="Comic Sans MS"]

    thank you so much for all your Replies.....

    The criteria is to get the first Row...

    SELECT

    Area

    ,StateName

    ,CityName

    ,ACCOUNT_NO

    ,CUSTOMER_NAME

    ,CounterNo

    ,ROUTE_No

    ,Address

    ,TeleNumber

    FROM FACT F

    ,Dim1 1

    ,Dim2 2

    ,Dim3 3

    WHERE

    F.DW_1_KEY = 1.DW_1_KEY

    AND F.DW_2_KEY = 2.DW_2_KEY

    AND F.DW_3_KEY = 3.3_KEY

    and CAL.[year] = year(DATEADD(month, -1, GETDATE()))

    and CAL.[month] = month(DATEADD(month, -1, GETDATE()))

    order by Area

    Now as I said earlier,

    the output may contain repeated account numbers, but I want to select only the top row for every account number.

    I cant use group by as Same account number may have different Count number or address...I just want to show the first row that comes from the table for every account number....

    I cannot Ignore other columns as I will show the value for other columns too from the first row for that account number...

    I dont know how to use RowNumber...

    Let me try CROSS and OUTER APPLY...

    .Any more help?

    Thanku

    [/font]

    [font="Comic Sans MS"]
    ---------------------------------------------------

    Thanks [/font]

  • Ok, so the 'first' row is the one with the lowest Area?

    If so, try this.

    SELECT

    Area

    ,StateName

    ,CityName

    ,ACCOUNT_NO

    ,CUSTOMER_NAME

    ,CounterNo

    ,ROUTE_No

    ,Address

    ,TeleNumber

    FROM

    (SELECT

    Area

    ,StateName

    ,CityName

    ,ACCOUNT_NO

    ,CUSTOMER_NAME

    ,CounterNo

    ,ROUTE_No

    ,Address

    ,TeleNumber

    , Row_Number() OVER (PARTITION BY ACCOUNT_NO ORDER BY AREA) AS RowNo

    FROM FACT F

    INNER JOIN Dim1 D1 on F.DW_1_KEY = D1.DW_1_KEY

    INNER JOIN Dim2 D2 ON F.DW_2_KEY = D2.DW_2_KEY

    INNER JOIN Dim3 D3 ON F.DW_3_KEY = D3.3_KEY

    WHERE

    CAL.[year] = year(DATEADD(month, -1, GETDATE()))

    and CAL.[month] = month(DATEADD(month, -1, GETDATE()))) sub

    WHERE RowNo = 1

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I had similar need and here is how I got it done. I assume that even though there will be duplication account numbers, there will be unique ACCOUNT_NO and AREA combinations. If not then you will need to find the unique combination that you are interested and change the order by with in the row_number() call to have the unique columns.

    ;WITH My_INFO AS (

    SELECT

    Area

    ,StateName

    ,CityName

    ,ACCOUNT_NO

    ,CUSTOMER_NAME

    ,CounterNo

    ,ROUTE_No

    ,Address

    ,TeleNumber

    ,row_number() over (Partition by ACCOUNT_NO order by Area) as row_num

    FROM FACT F

    ,Dim1 1

    ,Dim2 2

    ,Dim3 3

    WHERE

    F.DW_1_KEY = 1.DW_1_KEY

    AND F.DW_2_KEY = 2.DW_2_KEY

    AND F.DW_3_KEY = 3.3_KEY

    and CAL.[year] = year(DATEADD(month, -1, GETDATE()))

    and CAL.[month] = month(DATEADD(month, -1, GETDATE()))

    )

    SELECT a.Area

    ,a.StateName

    ,a.CityName

    ,a.ACCOUNT_NO

    ,a.CUSTOMER_NAME

    ,a.CounterNo

    ,a.ROUTE_No

    ,a.Address

    ,a.TeleNumber

    FROM My_INFO WHERE row_num = 1

    ORDER BY Area

  • Not to create more confusion, but what is the business need here? If you only want to return one row of data per account number, then why return the address or phone number or any other data? It would be kind of strange that there is a business requirement to obtain random data per customer or account?

    I know there 'are' some needs every once in a while the require this but very seldom. Sometimes you can get around building tricky and excessively advanced code by refining the business requirements 😉

    Link to my blog http://notyelf.com/

  • shannonjk (1/13/2010)


    If you only want to return one row of data per account number, then why return the address or phone number or any other data? It would be kind of strange that there is a business requirement to obtain random data per customer or account?

    But the queries with Row number don't return random data. It's one account with that account's telephone number, address, etc in it.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I just walked into the situation.

    The Databases were created by Non-IT Professionals.

    I have experience with the entire SDLC.

    I appreciate your comments and I hope to change the way things are done but that remains to be seen.

    I'm sort of in a void at this time. I was ask if I could do that so until I can eliminate duplicates, I have done it before. The business owner is out of town and I do not yet have permissions to the Server.

    I believe all that I need is an outer and an inner sub query with a SELECT TOP and an Order by clause to get the first record.

    After I examine each table in the database I hope the database is designed in a way and the process that will afford me the opportunity to create constraints, triggers and what ever else is needed to prevent this from happening in the future so that.

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Viewing 15 posts - 1 through 15 (of 27 total)

You must be logged in to reply to this topic. Login to reply