Getting the distinct Rows/row group based on the condition

  • I've the below table and script

    --Create Table dbo.Customer

    CREATE TABLE dbo.Customer(

    FlightNbr varchar(50) NULL,

    DeptDateTime datetime NULL,

    Origin varchar(50) NULL,

    Destination varchar(50) NULL,

    FirstName varchar(50) NULL,

    CustomerID varchar(50) NULL,

    SeatRow varchar(50) NULL,

    HistoryAction char(3) NULL

    ) ON [PRIMARY]

    --INSERT 3 rows into dbo.Customer table

    GO

    INSERT INTO dbo.Customer (FlightNbr, DeptDateTime, Origin, Destination,FirstName, CustomerID, SeatRow)

    SELECT 0111, '2012-05-06 12:00:00.000','LAS','LGA','John','ASDF',1,'SC'

    UNION ALL

    SELECT 0111, '2012-05-06 12:00:00.000','LAS','LGA','John','ASDF',1,'AC'

    UNION ALL

    SELECT 0112, '2012-05-03 12:00:00.000','DEN','LAS','John','ASDF',1, 'HK'

    GO

    Now, when I SELECT * FROM dbo.Customer, I get three rows. However, I want only the third row where the value for HistoryAction is 'HK'. Here, the logic is if the value for the HistoryAction is 'SC' then I don't want to see the entire group of rows even with any other values for HistoryAction column.

    In the above, I want to see the 'HK' because that row has the different FlightNumber and DeptDatetime

    There are millions of rows like this in the actual table.

    Please let me know if this is possible to do.

    Thanks

  • If you just want to exclude the HistoryAction 'SC' records, you can do something like:

    SELECT FlightNbr, DeptDateTime, Origin, Destination, FirstName, CustomerId, SeatRow, HistoryAction

    WHERE HistoryAction <> 'SC'

    ORDER BY FlightNbr

    It sounds like you may want to group these records somehow, though I'm not sure from your initial description.

    HTH,

    Rob

  • Do you want to exclude rows where the value is 'SC'? Or do you only want the rows where the value is 'HK'?

    Rob's query will return 2 rows in your sample data.

  • at this juncture, it may be prudent to provide some further sample detail to fully describe the

    possible permutations of the millions of records you have.

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • Yes, I want to GROUP BY on the first 7 columns. However, I'm not sure if that will resolve my issue.

    In my above example, here are the scenarios,

    - if I get distinct values for the first 7 columns and on the 8th column(HistoryAction) as 'SC', I dont want that row to be displayed. If I've other rows which has the similar values for the first 7 columns and just the different HistoryAction values as 'AC' or 'HK' and so on, I don't want to see those rows as well

    -if I get distinct values for the first 7 columns and on the 8th column(HistoryAction) as 'AC' or 'HK', I want that row to be displayed in my output.

    I'm not sure about other HistoryAction, but 'SC' indicates the Schedule Change for the business. Therefore, if there has ever been any schedule change for the distinct group of records (1st 7 columns in this case) the that shouldn't be displayed in the report even if the same group of records have other HistoryActions as AC or HK.

    Sorry for the confusion above.

    Thanks.

  • What are the valid codes for HistoryAction?

  • You are probably looking for something like this, at least based on what you provided.

    CREATE TABLE dbo.Customer(

    FlightNbr varchar(50) NULL,

    DeptDateTime datetime NULL,

    Origin varchar(50) NULL,

    Destination varchar(50) NULL,

    FirstName varchar(50) NULL,

    CustomerID varchar(50) NULL,

    SeatRow varchar(50) NULL,

    HistoryAction char(3) NULL

    );

    --INSERT 3 rows into dbo.Customer table

    GO

    INSERT INTO dbo.Customer (FlightNbr, DeptDateTime, Origin, Destination,FirstName, CustomerID, SeatRow, HistoryAction)

    SELECT 0111, '2012-05-06 12:00:00.000','LAS','LGA','John','ASDF',1,'SC'

    UNION ALL

    SELECT 0111, '2012-05-06 12:00:00.000','LAS','LGA','John','ASDF',1,'AC'

    UNION ALL

    SELECT 0112, '2012-05-03 12:00:00.000','DEN','LAS','John','ASDF',1, 'HK'

    GO

    SELECT

    c1.FlightNbr,

    c1.DeptDateTime,

    c1.Origin,

    c1.Destination,

    c1.FirstName,

    c1.CustomerID,

    c1.SeatRow,

    c1.HistoryAction

    FROM

    dbo.Customer c1

    go

    SELECT

    c1.FlightNbr,

    c1.DeptDateTime,

    c1.Origin,

    c1.Destination,

    c1.FirstName,

    c1.CustomerID,

    c1.SeatRow,

    c1.HistoryAction

    FROM

    dbo.Customer c1

    WHERE

    NOT EXISTS(SELECT 1

    FROM

    dbo.Customer c2

    WHERE

    c2.FlightNbr = c1.FlightNbr and

    c2.DeptDateTime = c1.DeptDateTime and

    c2.Origin = c1.Origin and

    c2.Destination = c1.Destination and

    c2.FirstName = c1.FirstName and

    c2.CustomerID = c1.CustomerID and

    c2.SeatRow = c1.SeatRow and

    c2.HistoryAction = 'SC');

    GO

    DROP TABLE dbo.Customer;

    GO

  • An alternate to Lynn's solution may be something like this:

    SELECT a.FlightNbr, a.DeptDateTime, a.Origin, a.Destination, a.FirstName, a.CustomerID

    ,a.SeatRow, HistoryAction

    FROM (

    SELECT FlightNbr, DeptDateTime, Origin, Destination,FirstName, CustomerID, SeatRow

    FROM dbo.Customer

    GROUP BY FlightNbr, DeptDateTime, Origin, Destination,FirstName, CustomerID, SeatRow

    HAVING COUNT(CASE WHEN HistoryAction = 'SC' THEN 1 END) = 0 AND

    COUNT(CASE WHEN HistoryAction = 'HK' THEN 1 END) > 0) a

    INNER JOIN dbo.Customer b

    ON a.FlightNbr = b.FlightNbr AND a.DeptDateTime = b.DeptDateTime AND a.Origin = b.Origin AND

    a.Destination = b.Destination AND a.FirstName = b.FirstName AND a.CustomerID = b.CustomerID AND

    a.SeatRow = b.SeatRow

    Again, more test data may be useful in flushing out what's appropriate and which performs to your needs.

    I'm not sure whether this part of mine is really needed or not though:

    AND COUNT(CASE WHEN HistoryAction = 'HK' THEN 1 END) > 0


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply