Select first grandchild row of first child

  • I need some help creating a query in SQL Server 2005 that will give me the first (TOP 1 is fine) grandchild row of the first child row of each parent.

    For example, say we have a trip object (that will be the parent). Each trip object can have zero or more track objects. And each track object can have zero or more trackPoint objects. So our table definitions are as follows:

    create table Trip(id int, name varchar(255))

    create table Track(id int, tripID int, name varchar(255))

    create table TrackPoint(id int, trackID int, lat float, lon float)

    Populate those with some rows:

    insert into Trip values(1, 'trip 1')

    insert into Trip values(2, 'trip 2')

    insert into Trip values(3, 'trip 3')

    insert into Trip values(4, 'trip 4')

    insert into Track values(1, 1, 'track 1.1')

    insert into Track values(2, 1, 'track 1.2')

    insert into Track values(3, 2, 'track 2.1')

    insert into Track values(4, 2, 'track 2.2')

    insert into Track values(5, 3, 'track 3.1')

    insert into TrackPoint values(1, 1, 33.1, -111.1)

    insert into TrackPoint values(2, 1, 33.2, -111.2)

    insert into TrackPoint values(3, 1, 33.3, -111.3)

    insert into TrackPoint values(4, 2, 45.1, -75.1)

    insert into TrackPoint values(5, 3, 37.1, -112.1)

    insert into TrackPoint values(6, 3, 37.2, -112.2)

    Writing a query for one trip is easy:

    select top 1 Trip.id, TrackPoint.lat, TrackPoint.lon

    from TrackPoint

    join Track on Track.id = TrackPoint.trackID

    join Trip on Trip.id = Track.tripID

    where Trip.id = 1

    But I'd like to do it for several trips at once. For example, change the WHERE clause to:

    IN (1, 2)

    But that obviously doesn't work. The desired output of that, however, would be:

    1, 33.1, -111.1

    2, 37.1, -112.1

    Can someone help me write a query that will allow me to specify multiple tripIDs and get back the first track point of each trip.

    Thanks,

    Kevin

  • You can use ROW_NUMBER() for the same

    ; WITH cte_Track_Details AS

    (

    SELECT ROW_NUMBER() OVER( PARTITION BY Trip.id ORDER BY Trip.id, TrackPoint.lat, TrackPoint.lon ) RowNum,

    Trip.id, TrackPoint.lat, TrackPoint.lon

    FROM TrackPoint

    INNER JOIN Track ON Track.id = TrackPoint.TrackID

    INNER JOIN Trip on Trip.id = Track.TripID

    )

    SELECT *

    FROM cte_Track_Details

    WHERE RowNum = 1

    AND id IN (1,2)

    1, 33.1, -111.1

    2, 33.3, -111.3

    Please check your expected output, i think its wrong


    Kingston Dhasian

    How to post data/code on a forum to get the best help - Jeff Moden
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

  • One method would be

    with cteTracks

    as(

    select Trip.id, TrackPoint.lat, TrackPoint.lon,ROW_NUMBER() over ( partition by Trip.id order by trackpoint.id) as RowN

    from TrackPoint

    join Track on Track.id = TrackPoint.trackID

    join Trip on Trip.id = Track.tripID

    where Trip.id in(1,2)

    )

    select * from cteTracks where RowN = 1



    Clear Sky SQL
    My Blog[/url]

  • You can also use a CROSS APPLY for the same

    select T.*

    from Trip Tr

    cross apply (

    select top 1 Trip.id, TrackPoint.lat, TrackPoint.lon

    from TrackPoint

    join Track on Track.id = TrackPoint.trackID

    join Trip on Trip.id = Track.tripID

    where Trip.id = Tr.id ) T

    where Tr.id in (1,2)


    Kingston Dhasian

    How to post data/code on a forum to get the best help - Jeff Moden
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

  • Can you please explain the logic of your desired output,

    I am not getting what you want.

    Vaibhav K Tiwari
    To walk fast walk alone
    To walk far walk together

  • Thanks guys. I'm sorry, I did have the wrong value in my desired output. I've edited my post to correct it.

    Kingston, I appreciate the help. Your first solution does produce the desired output, but unfortunately the CTE query is selecting all the trips in the system and is inefficient.

    Dave, your solution appears to produce the desired output and does so efficiently by having the IN clause of the trip IDs on the CTE. Thank you very much.

    But now I want to get greedy. 🙂 Say I pass in 100 trip IDs, if each trip has 1 track and each track has 1,000 track points, then the CTE query is selecting 100,000 rows, but I'm ultimately only returning 100. Is there any way to put in a TOP 1 clause for the track points?

    I'll be messing around with it to try and figure it out. Again, thanks for all the help from each of you.

  • kevin77 (4/17/2010)


    Is there any way to put in a TOP 1 clause for the track points?

    Can you explain it ? i got confused with it

    -------Bhuvnesh----------
    I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

  • Sorry , i lost track of this...

    Can you post some code that demonstrates the problem you are having?



    Clear Sky SQL
    My Blog[/url]

  • Thanks, guys. There is no problem per se, just wondering if it could be better. Using Dave's query, with 5 trips in the list, we get the attached execution plan. Dave's query...

    with cteTracks

    as(

    select Trip.id, TrackPoint.lat, TrackPoint.lon, ROW_NUMBER() over ( partition by Trip.id order by trackpoint.id) as RowN

    from TrackPoint

    join Track on Track.id = TrackPoint.trackID

    join Trip on Trip.id = Track.tripID

    where Trip.id IN (224652,225026,225025,225019,225038)

    )

    select * from cteTracks where RowN = 1

    On the index seek of the TrackPoint table there is an estimated 1,665 rows being returned. If we had say 100 trips in the IN clause, the number of rows returned by the CTE could be in the hundreds of thousands, yet only 100 rows are truly needed for the result.

    Obviously I can just work around this "problem" by not querying for 100 trips at a time and instead just keep it to 5 or something. But if you guys have any other ideas, that would be cool.

    Again, doing the query for one trip we use the following query:

    select top 1 Trip.id, TrackPoint.lat, TrackPoint.lon

    from TrackPoint

    join Track on Track.id = TrackPoint.trackID

    join Trip on Trip.id = Track.tripID

    where Trip.id = 225038

    The explain plan for this says that the estimated number of rows returned from the TrackPoint table will be 1, which obviously makes sense.

  • Well this would fall under the great caveat of "It depends".

    The cross apply the kingston gave *could* perform better against a wider selection of rows.

    What indexes do you have on trackpoint ?

    A clustered index on trackID , id would be the obvious one to help this query.

    The other tables would also benefit from indexes.



    Clear Sky SQL
    My Blog[/url]

  • Yeah, indexes are good. There is a clustered index on the id column of each table and there is an index on the trackID column of the TrackPoint and on the tripID column of the Track table (for the foreign keys).

    It's not that the query runs really slow, I was just trying to be greedy.

    The CROSS APPLY method from Kingston is interesting, but it doesn't produce the correct results. For example,

    select T.*

    from Trip Trip

    cross apply (

    select top 1 Trip.id, TrackPoint.lat, TrackPoint.lon

    from TrackPoint

    join Track on Track.id = TrackPoint.trackID

    join Trip on Trip.id = Track.tripID

    where Trip.id = Trip.id ) T

    where Trip.id IN (224652,225026,225025,225019,225038)

    produces the following results:

    59788 37.3997226666667 -122.080576

    59788 37.3997226666667 -122.080576

    59788 37.3997226666667 -122.080576

    59788 37.3997226666667 -122.080576

    59788 37.3997226666667 -122.080576

    Ah, so messing around with that query, I think I see the problem. I've changed it to

    select T.*

    from Trip TripOuter

    cross apply (

    select top 1 Trip.id, TrackPoint.lat, TrackPoint.lon

    from TrackPoint

    join Track on Track.id = TrackPoint.trackID

    join Trip on Trip.id = Track.tripID

    where Trip.id = TripOuter.id ) T

    where TripOuter.id IN (224652,225026,225025,225019,225038)

    That appears to produce the correct results, but I will have to do some more testing. The query plan also looks good.

    Yeah, the above is looking pretty good. I will read up on it and work with it some more.

    Thanks all!

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply