Icon View Thread

The following is the text of the current message along with any replies.
Messages 1 to 10 of 10 total
Thread How to get Rcd_Id from Grouped rows?
Mon, Dec 22 2014 3:18 PMPermanent Link

Barry

I have a Trees table that tracks the growth of trees in a nursery, with three columns:

Trees Table:
Rcd_Id GUID
Location_Id GUID
Day_Num Integer

I would like to get the Rcd_Id's for the rows with the highest Day_Num for each Location_Id. There is one row for each day the tree grows in a location. So Day_Num starts at 0 and increments by 1 for each day. There is only one row for each Location_Id, Day_Num pair.

Select Rcd_Id, Max(Day_Num) from Trees Group By Location_Id

seems to work but it might just be because EDB traverses the rows using a particular index. Am I guaranteed that Rcd_Id will always be for the row with the Max(Day_Num)? Or is there a better way to do it (without using a view)?

TIA

Barry
Mon, Dec 22 2014 5:18 PMPermanent Link

Malcolm Taylor

Hi Barry

It is a good question.  Smile
I asked Tim a very similar question many years ago.

I think that in your specific case it may well work, as you have found.
Back then, however, Tim said there was no *guarantee* if you could end
up with multiple rows.

To be sure, when dealing with aggregates and equivalent, I think you
have first to identify the max row(s), then read the rec-id(s).
In other words you probably have to do it in two stages.

But maybe someone more expert than I can contradict me.  Smile

Malcolm
Tue, Dec 23 2014 3:12 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

Barry


As Malcolm says I'm pretty sure its not guaranteed.

If you want some help coming up with the sql to achieve this then post some sample data and I'll have a go.

I'm not sure just what you're trying to achieve with this query.

You're grouping by Location_Id which does not necessarily produce a single row but your statement "Am I guaranteed that Rcd_Id will always be for the row with the Max(Day_Num)" indicates that you are expecting a single row, or does it?

It may well be just that I don't understand what's going on.


Roy Lambert
Tue, Dec 23 2014 8:19 AMPermanent Link

Raul

Team Elevate Team Elevate

On 12/22/2014 3:18 PM, Barry wrote:
> I would like to get the Rcd_Id's for the rows with the highest Day_Num for each Location_Id. There is one row for each day the tree grows in a location. So Day_Num starts at 0 and increments by 1 for each day. There is only one row for each Location_Id, Day_Num pair.
> Select Rcd_Id, Max(Day_Num) from Trees Group By Location_Id
> seems to work but it might just be because EDB traverses the rows using a particular index. Am I guaranteed that Rcd_Id will always be for the row with the Max(Day_Num)? Or is there a better way to do it (without using a view)?

Logically this does not do what you want - the Rcd_Id and Day_Num will
not necessarily be from the same row in the table. They might be but
that is not guaranteed.

Will you have duplicate entries ? (i.e. same Location_Id and Day_Num
values bu different rcd_id).

Standard SQL solution is one of these except it gives you dupes if
Day_Num values are same:

select T.Rcd_Id, t.Location_Id, T.Day_Num
from Trees T
inner join(
    select Location_Id, max(Day_Num) MaxDay from Trees group by
Location_Id
) ss on t.Location_Id = ss.Location_Id and t.Day_Num = ss.maxday


or

select T.Rcd_Id, t.Location_Id, T.Day_Num from Trees T1
left outer join trees T2
on (t1.Location_Id = t2.Location_Id and t1.Day_Num < t2.Day_Num)
where t2.Location_Id is null


If you do have dupes then you need to do bit more work

Raul
Wed, Dec 24 2014 3:00 PMPermanent Link

Barry

Roy Lambert wrote:

>I'm not sure just what you're trying to achieve with this query.<

I am trying to find the Rcd_Id for the row with the Max(Day_Num) for each Location_Id. If there are 4 unique Location_Id's, then  it will return 4 rows; one row with the Rcd_Id corresponding to Max(Day_Num) of each Location_Id.

>You're grouping by Location_Id which does not necessarily produce a single row but your statement "Am I guaranteed that Rcd_Id will always be for the row with the Max(Day_Num)" indicates that you are expecting a single row, or does it?<

There is only one row returned for each Location_Id because Day_Num is unique within each Location_Id.

>It may well be just that I don't understand what's going on.<
No problem. I couldn't wrap my head around the SQL to do it in one statement.

It looks like Raul has solved it. Smile

Barry
Wed, Dec 24 2014 3:13 PMPermanent Link

Barry

Raul wrote:

>Will you have duplicate entries ? (i.e. same Location_Id and Day_Num  values bu different rcd_id).<
No duplicates. Day_Num is unique for each Location_Id.

Both of your solutions work! I will do some further testing on large tables to see which one is faster.

Thanks again and Merry XMAS to you and Roy.

Barry
Thu, Dec 25 2014 9:04 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

Barry



>There is only one row returned for each Location_Id because Day_Num is unique within each Location_Id.
OK, you are really baffling me. Taking this and your other statements it seems to me that you are saying there is only ever one tree in each location at a time, or that you never put two or more trees into a location at the same time.

I know it doesn't matter but I hate having something like this dangled in front of me - my brain just refuses to give up on it.

Roy
Thu, Dec 25 2014 11:20 PMPermanent Link

Barry

Roy Lambert wrote:

>>There is only one row returned for each Location_Id because Day_Num is unique within each Location_Id.<<
>OK, you are really baffling me. Taking this and your other statements it seems to me that you are saying there is only ever one tree in each location at a time, or that you never put two or more trees into a location at the same time.<

Not quite. Maybe it would make more sense if I said "plants" instead of "trees". Each record represents a group of plants grown at a specofic location. There is a Day_Num for each location,starting at Day_Num=0 to Day_Num=99 etc.. The last Day_Num will be for when the plants are "harvested".

>I know it doesn't matter but I hate having something like this dangled in front of me - my brain just refuses to give up on it.>

Hopefully this clears it up better.

There are several locations that the plants can be grown. It can be in Greenhouse#1, Greenhouse#2, ... Greenhouse#8 etc..The plants are planted at a particular greenhouse (Location_Id GUID) at the same time, and harvested days later on the same day. So Greenhouse #2 could have the plants planted on July 1st 2014 and be harvested on Aug 15 2014. There is a record created for each Greenhouse (Location_Id) and Day_Num in the life of the crop. This record counts the # of plants that died that day, the humidity and temperature range that day etc..

Since each greenhouse is planted at a different Date, each greenhouse will have a different Day_Num for a given date. Greenhouse #1's crop may be at Day_Num 10 whereas Greenhouse#2's crop may be at Day_Num 35.

To get the current status of all the crops, I need to get the Rcd_Id for Max(Day_Num) for each Location_Id. This is easy enough to do with TEDBTable and Delphi, but much harder to achieve with SQL in one statement because I am not guaranteed that the Rcd_Id returned by:

  Select Rcd_Id, Max(Day_Num) from Trees Group By Location_Id;

would be the Rcd_Id of the row with the Max(Day_Num) for that location_Id. It just so happens EDB *usually* returns the proper Rcd_Id, but other SQL databases would not be so considerate.

Raul's solution works nicely and guarantees the correct Rcd_Id is always returned when using an Aggregate function in a Group By.

Barry
Fri, Dec 26 2014 3:45 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

Barry


Thank ypu - I can now sleep again.

Roy Lambert
Fri, Dec 26 2014 3:31 PMPermanent Link

Richard Harding

Wise Nutrition Coaching

The SELECT list should only consist of Grouping columns and aggregate function expressions.

SO, it is not valid to have the Rec_ID column in the SELECT list and there is no guarantee that it will give the correct result.  It should result in an error.

Another solution is:

SELECT
  Rcd_Id,
  Location_ID,
  Day_Num
FROM Trees T1
WHERE Day_Num =
  (SELECT MAX(Day_Num) FROM Trees WHERE Location_Id = T1.Location_Id)


Richard
Image