Select first grandchild row of first child

Question

Post reply

Select first grandchild row of first child

kevin77

SSCarpal Tunnel

Points: 4041
More actions
April 16, 2010 at 9:24 pm

#217973

I need some help creating a query in SQL Server 2005 that will give me the first (TOP 1 is fine) grandchild row of the first child row of each parent.

For example, say we have a trip object (that will be the parent). Each trip object can have zero or more track objects. And each track object can have zero or more trackPoint objects. So our table definitions are as follows:

create table Trip(id int, name varchar(255)) create table Track(id int, tripID int, name varchar(255)) create table TrackPoint(id int, trackID int, lat float, lon float)

Populate those with some rows:

insert into Trip values(1, 'trip 1') insert into Trip values(2, 'trip 2') insert into Trip values(3, 'trip 3') insert into Trip values(4, 'trip 4') insert into Track values(1, 1, 'track 1.1') insert into Track values(2, 1, 'track 1.2') insert into Track values(3, 2, 'track 2.1') insert into Track values(4, 2, 'track 2.2') insert into Track values(5, 3, 'track 3.1') insert into TrackPoint values(1, 1, 33.1, -111.1) insert into TrackPoint values(2, 1, 33.2, -111.2) insert into TrackPoint values(3, 1, 33.3, -111.3) insert into TrackPoint values(4, 2, 45.1, -75.1) insert into TrackPoint values(5, 3, 37.1, -112.1) insert into TrackPoint values(6, 3, 37.2, -112.2)

Writing a query for one trip is easy:

select top 1 Trip.id, TrackPoint.lat, TrackPoint.lon from TrackPoint join Track on Track.id = TrackPoint.trackID join Trip on Trip.id = Track.tripID where Trip.id = 1

But I'd like to do it for several trips at once. For example, change the WHERE clause to:

IN (1, 2)

But that obviously doesn't work. The desired output of that, however, would be:

1, 33.1, -111.1 2, 37.1, -112.1

Can someone help me write a query that will allow me to specify multiple tripIDs and get back the first track point of each trip.

Thanks,

Kevin

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply

Kingston Dhasian SSCoach Points: 19795 More actions · Answer 1

You can use ROW_NUMBER() for the same

; WITH cte_Track_Details AS



(

	SELECT	ROW_NUMBER() OVER( PARTITION BY Trip.id ORDER BY Trip.id, TrackPoint.lat, TrackPoint.lon ) RowNum,

		Trip.id, TrackPoint.lat, TrackPoint.lon

	FROM	TrackPoint

	INNER JOIN Track ON Track.id = TrackPoint.TrackID

	INNER JOIN Trip on Trip.id = Track.TripID

)

SELECT	*

FROM	cte_Track_Details

WHERE	RowNum = 1

AND id IN (1,2)

1, 33.1, -111.1

2, 33.3, -111.3

Please check your expected output, i think its wrong

Kingston Dhasian

How to post data/code on a forum to get the best help - Jeff Moden
http://www.sqlservercentral.com/articles/Best+Practices/61537/

Dave Ballantyne SSC-Dedicated Points: 33556 More actions · Answer 2

One method would be



with cteTracks

as(

select Trip.id, TrackPoint.lat, TrackPoint.lon,ROW_NUMBER() over ( partition by Trip.id order by trackpoint.id) as RowN

  from TrackPoint

  join Track on Track.id = TrackPoint.trackID

  join Trip on Trip.id = Track.tripID

 where Trip.id in(1,2)

 )

select * from cteTracks where RowN = 1

Clear Sky SQL
My Blog[/url]

Kingston Dhasian SSCoach Points: 19795 More actions · Answer 3

You can also use a CROSS APPLY for the same

select T.*



from	Trip Tr

cross apply (

select	top 1 Trip.id, TrackPoint.lat, TrackPoint.lon

  from TrackPoint

  join Track on Track.id = TrackPoint.trackID

  join Trip on Trip.id = Track.tripID

 where Trip.id = Tr.id ) T

where Tr.id in (1,2)

Kingston Dhasian

How to post data/code on a forum to get the best help - Jeff Moden
http://www.sqlservercentral.com/articles/Best+Practices/61537/

vaibhav.tiwari SSCommitted Points: 1946 More actions · Answer 4

vaibhav.tiwari

SSCommitted

Points: 1946

April 17, 2010 at 3:13 am

#1153292

Can you please explain the logic of your desired output,

I am not getting what you want.

Vaibhav K Tiwari
To walk fast walk alone
To walk far walk together

kevin77 SSCarpal Tunnel Points: 4041 More actions · Answer 5

Thanks guys. I'm sorry, I did have the wrong value in my desired output. I've edited my post to correct it.

Kingston, I appreciate the help. Your first solution does produce the desired output, but unfortunately the CTE query is selecting all the trips in the system and is inefficient.

Dave, your solution appears to produce the desired output and does so efficiently by having the IN clause of the trip IDs on the CTE. Thank you very much.

But now I want to get greedy. 🙂 Say I pass in 100 trip IDs, if each trip has 1 track and each track has 1,000 track points, then the CTE query is selecting 100,000 rows, but I'm ultimately only returning 100. Is there any way to put in a TOP 1 clause for the track points?

I'll be messing around with it to try and figure it out. Again, thanks for all the help from each of you.

Bhuvnesh SSC Guru Points: 59351 More actions · Answer 6

Bhuvnesh

SSC Guru

Points: 59351

April 19, 2010 at 7:54 am

#1153837

kevin77 (4/17/2010)

Is there any way to put in a TOP 1 clause for the track points?

Can you explain it ? i got confused with it

-------Bhuvnesh----------
I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

Dave Ballantyne SSC-Dedicated Points: 33556 More actions · Answer 7

Dave Ballantyne

SSC-Dedicated

Points: 33556

April 19, 2010 at 8:16 am

#1153863

Sorry , i lost track of this...

Can you post some code that demonstrates the problem you are having?

Clear Sky SQL
My Blog[/url]

kevin77 SSCarpal Tunnel Points: 4041 More actions · Answer 8

Thanks, guys. There is no problem per se, just wondering if it could be better. Using Dave's query, with 5 trips in the list, we get the attached execution plan. Dave's query...



with cteTracks

as(

select Trip.id, TrackPoint.lat, TrackPoint.lon, ROW_NUMBER() over ( partition by Trip.id order by trackpoint.id) as RowN

  from TrackPoint

  join Track on Track.id = TrackPoint.trackID

  join Trip on Trip.id = Track.tripID

 where Trip.id IN (224652,225026,225025,225019,225038)

 )

 select * from cteTracks where RowN = 1

On the index seek of the TrackPoint table there is an estimated 1,665 rows being returned. If we had say 100 trips in the IN clause, the number of rows returned by the CTE could be in the hundreds of thousands, yet only 100 rows are truly needed for the result.

Obviously I can just work around this "problem" by not querying for 100 trips at a time and instead just keep it to 5 or something. But if you guys have any other ideas, that would be cool.

Again, doing the query for one trip we use the following query:



select top 1 Trip.id, TrackPoint.lat, TrackPoint.lon

  from TrackPoint

  join Track on Track.id = TrackPoint.trackID

  join Trip on Trip.id = Track.tripID

 where Trip.id = 225038

The explain plan for this says that the estimated number of rows returned from the TrackPoint table will be 1, which obviously makes sense.

Dave Ballantyne SSC-Dedicated Points: 33556 More actions · Answer 9

Well this would fall under the great caveat of "It depends".

The cross apply the kingston gave *could* perform better against a wider selection of rows.

What indexes do you have on trackpoint ?

A clustered index on trackID , id would be the obvious one to help this query.

The other tables would also benefit from indexes.

Clear Sky SQL
My Blog[/url]

kevin77 SSCarpal Tunnel Points: 4041 More actions · Answer 10

Yeah, indexes are good. There is a clustered index on the id column of each table and there is an index on the trackID column of the TrackPoint and on the tripID column of the Track table (for the foreign keys).

It's not that the query runs really slow, I was just trying to be greedy.

The CROSS APPLY method from Kingston is interesting, but it doesn't produce the correct results. For example,



select  T.*

from    Trip Trip

cross apply (

select  top 1 Trip.id, TrackPoint.lat, TrackPoint.lon

  from TrackPoint

  join Track on Track.id = TrackPoint.trackID

  join Trip on Trip.id = Track.tripID

 where Trip.id = Trip.id ) T

where Trip.id IN (224652,225026,225025,225019,225038)

produces the following results:



59788	37.3997226666667	-122.080576

59788	37.3997226666667	-122.080576

59788	37.3997226666667	-122.080576

59788	37.3997226666667	-122.080576

59788	37.3997226666667	-122.080576

Ah, so messing around with that query, I think I see the problem. I've changed it to



select  T.*

from    Trip TripOuter

cross apply (

select  top 1 Trip.id, TrackPoint.lat, TrackPoint.lon

  from TrackPoint

  join Track on Track.id = TrackPoint.trackID

  join Trip on Trip.id = Track.tripID

 where Trip.id = TripOuter.id ) T

where TripOuter.id IN (224652,225026,225025,225019,225038)

That appears to produce the correct results, but I will have to do some more testing. The query plan also looks good.

Yeah, the above is looking pretty good. I will read up on it and work with it some more.

Thanks all!