Icon View Thread

The following is the text of the current message along with any replies.
Messages 1 to 10 of 18 total
Thread Looking for fast efficient SQL to create 7 period trailing sums or ratios
Sat, Oct 6 2012 6:01 PMPermanent Link

Barry

I don't think there is an efficient way to do this in SQL and I have resigned myself that I may have to do this using Delphi code. But I thought I'd ask here just in case someone has a simple solution.

Here is my table:

Table1:
 Ndx : Integer;
 DateX: Date
 ColA: Decimal;
 ColB: Decimal;
 ColC...ColX: Decimal;

I need to create trailing ratios of several column pairs in a table. A trailing ratio is simply the sum of the last 'n' values of column A  divided by the sum of column B for the same set of rows. Both columns, "A" & "B" are in the same table. The rows are sorted in ascending order by date. So if 'n' is 7, then I'd like like to sum the first 7 "ColA" divided by the sum of the first 7 "ColB" as in Sum(ColA)/Sum(ColB). The next ratio will be Sum(ColA)/SUM(ColB) for rows 2 to 8, then SUM(ColA)/SUM(ColB) for rows 3 to 9 etc..

I will even make things easier by adding an Ndx column that is sequentially numbered from 1 to # of rows in table without any holes in the sequence.

Solution #1:
The obvious solution would be to do a subselect like:

select DateX, (select Sum(ColA)/Sum(ColB) from Table1 t2 where t2.Ndx> t1.Ndx-7 and t2.Ndx<=t1.Ndx) as RatioAB from Table1 t1;

This should work. But I have to calculate a dozen or more ratios like "RatioCD", "RatioEF", .. "RatioZY" etc..

I am concerned the number of subselects for each ratio pair will slow things down considerably.

Solution #2:
I would rather not use Subselects because of their speed. I'd prefer to use RunSum and have it reset after every 7 rows. But I don't see any way to do that. RunSum keeps accumulating values and never resets. I'd like to see RunSum with a 2nd parameter like RunSum(ColA,7) so it resets after every 7 rows.

Then I could do something like this:

Select Max(Ndx), Max(Datex), RunSum(ColA,7)/RunSum(ColB,7) as RatioAB from Table1 group by Ndx;

Solution #3:
   Select Max(Ndx), Max(Datex), Sum(ColA)/Sum(ColB) as RatioAB from Table1 group by Floor((Ndx-1)/7);
This will generate the ratios for days: 7,14,31,28 ....

But I would have to run this SQL 7 times to get a trailing ratio for all days 8,9,10,11,12,13 as in:
   Select Max(Ndx), Max(Datex), Sum(ColA)/Sum(ColB) as RatioAB from Table1 where Ndx >=2 group by Floor((Ndx-2)/7);

followed by:
   Select Max(Ndx), Max(Datex), Sum(ColA)/Sum(ColB) as RatioAB from Table1 where Ndx >=3 group by Floor((Ndx-3)/7);

etc.

Solution #3 seems to be the most efficient even though I have to run it 7 times. (Remember I have to calculate around a dozen ratios). I would probably create a SP and build a table out of the 7 runs of the SQL.

Solution#4:

Write Delphi code to load the data into a TList and do my own sum over the last 'n' rows for the ratio pairs that I need. This will be faster because it requires only 1 pass through the table. But if I need to calculate more ratios, then it means changing the code and recompiling the program.

So there you have it. 4 imperfect solutions.
Can anyone think of solution #5? Something that is fast and can be done using SQL?

TIA
Barry
Tue, Oct 9 2012 4:20 AMPermanent Link

Adam Brett

Orixa Systems

I would really like your idea of:

>>I'd like to see RunSum with a 2nd parameter
>>like RunSum(ColA,7) so it resets after every 7 rows.

There are some other features which could be added to RunSum (such as resetting it when a value, such as a customer-ID changed)

I might post it on the Suggestions Group for Tim to see.

--

A way around your problem which would work & not be too slow would be to add a dummy column in the table, "PeriodCounter" INTEGER, and increment it every 7 rows, put an INDEX on the column & then GROUP BY it. Setting the value of this column would slow things down a tiny bit, but once set the GROUP BY would be fast.

You should be able to write a TRIGGER in EDB to set the value of this PeriodCounter row, so that there were 7 values of "1" then 7 of "2" etc.

This could be done with

COUNT (*) FROM ***** WHERE PeriodCounter = (SELECT Max(PeriodCounter) FROM ****)

IF COUNT(*) >=7 then
SET PeriodCounter = Max(PeriodCounter)
ELSE SET PeriodCounter = Max(PeriodCounter)+1

... sorry the above SQL is a mess & won't really run in a trigger, you have to use Row.NewValue & stuff ... but all the key steps are there I think & the Help should make it clear.

--

As a first step you could write a similar procedure to update the PeriodCounter column for all existing data, using identical logic.

If you don't have the right to change the data-structure, create a DummyTable with a ForeignKey based on the primary key of the main table.
Tue, Oct 9 2012 4:53 AMPermanent Link

John Hay


Barry

> I don't think there is an efficient way to do this in SQL and I have resigned myself that I may have to do this using
Delphi code. But I thought I'd ask here just in case someone has a simple solution.
>
> Here is my table:
>
> Table1:
>   Ndx : Integer;
>   DateX: Date
>   ColA: Decimal;
>   ColB: Decimal;
>   ColC...ColX: Decimal;
>
> I need to create trailing ratios of several column pairs in a table. A trailing ratio is simply the sum of the last
'n' values of column A  divided by the sum of column B for the same set of rows. Both columns, "A" & "B" are in the same
table. The rows are sorted in ascending order by date. So if 'n' is 7, then I'd like like to sum the first 7 "ColA"
divided by the sum of the first 7 "ColB" as in Sum(ColA)/Sum(ColB). The next ratio will be Sum(ColA)/SUM(ColB) for rows
2 to 8, then SUM(ColA)/SUM(ColB) for rows 3 to 9 etc..
>
> I will even make things easier by adding an Ndx column that is sequentially numbered from 1 to # of rows in table
without any holes in the sequence.

How about

SELECT Datex,T1.* FROM Table1
JOIN
(SELECT TRUNC(Ndx-1)/7) AS PeriodNum,SUM(ColA)/SUM(Colb) FROM Table1
GROUP BY PeriodNum) T1
ON (T1.PeriodNum*7)+1 = Table1.Ndx

John

Wed, Oct 10 2012 4:59 PMPermanent Link

Barry

Adam Brett wrote:

>I would really like your idea of:

>>I'd like to see RunSum with a 2nd parameter
>>like RunSum(ColA,7) so it resets after every 7 rows.

>There are some other features which could be added to RunSum (such as resetting it when a value, such as a >customer-ID changed)

I like your idea of resetting the RunSum when a column value, like CustId changes.

In MySQL we had the ability to create session variables like:
set @LastCustId := '';
set @Total := 0;
select CustId, IF(CustId=@LastCustId, @Total:=@Total+InvoiceAmt, @Total:=InvoiceAmt) as RunningTotal from Invoices order by CustId, InvoiceDate;

This SQL statement would reset the @Total when the CustId changes. I'm not suggesting Tim do this, because this is more suited to using an EDB script. But it was one way we worked around the problem in MySQL.

>I might post it on the Suggestions Group for Tim to see.

Sure, no problem.


--

>A way around your problem which would work & not be too slow would be to add a dummy column in the table, >"PeriodCounter" INTEGER, and increment it every 7 rows, put an INDEX on the column & then GROUP BY it. >Setting the value of this column would slow things down a tiny bit, but once set the GROUP BY would be fast.

>You should be able to write a TRIGGER in EDB to set the value of this PeriodCounter row, so that there were 7 >values of "1" then 7 of "2" etc.

>This could be done with

>COUNT (*) FROM ***** WHERE PeriodCounter = (SELECT Max(PeriodCounter) FROM ****)

>IF COUNT(*) >=7 then
>SET PeriodCounter = Max(PeriodCounter)
>ELSE SET PeriodCounter = Max(PeriodCounter)+1

>... sorry the above SQL is a mess & won't really run in a trigger, you have to use Row.NewValue & stuff ... but all >the key steps are there I think & the Help should make it clear.
--
>As a first step you could write a similar procedure to update the PeriodCounter column for all existing data, using >identical logic.
>If you don't have the right to change the data-structure, create a DummyTable with a ForeignKey based on the >primary key of the main table.

Thanks for the suggestions. Table1 would hold at most a few hundred rows at present. If the query does slow down, I will definitely do as you suggested and add an index to PeriodCounter. This should help quite a bit.

Thanks
Barry
Wed, Oct 10 2012 5:05 PMPermanent Link

Barry

"John Hay" wrote:

>How about

>SELECT Datex,T1.* FROM Table1
>JOIN
>(SELECT TRUNC(Ndx-1)/7) AS PeriodNum,SUM(ColA)/SUM(Colb) FROM Table1
> GROUP BY PeriodNum) T1
>ON (T1.PeriodNum*7)+1 = Table1.Ndx

That actually worked. Wow! I will play with it some more to see if I can tweak it. The example below doesn't even need to reference Table1 in the column list.

SELECT T1.* FROM Table1
JOIN
(SELECT min(datex), max(datex), TRUNC((Ndx-1)/7) AS PeriodNum,SUM(ColA)/SUM(Colb) FROM Table1 GROUP BY PeriodNum) T1
ON (T1.PeriodNum*7)+1 = Table1.Ndx

Barry
Wed, Oct 10 2012 5:38 PMPermanent Link

Fernando Dias

Team Elevate Team Elevate

Barry,

Have you considered SQL/PSM to do it in a stored procedure using SQL cursors?

--
Fernando Dias
[Team Elevate]
Wed, Oct 10 2012 6:08 PMPermanent Link

Barry

Barry wrote:

"John Hay" wrote:

>How about

>SELECT Datex,T1.* FROM Table1
>JOIN
>(SELECT TRUNC(Ndx-1)/7) AS PeriodNum,SUM(ColA)/SUM(Colb) FROM Table1
> GROUP BY PeriodNum) T1
>ON (T1.PeriodNum*7)+1 = Table1.Ndx

Well, I thought it would work. Here is the problem.

The average for Ndx=7 is made up of rows with Ndx=1 to 7. Simple enough.
The average for Ndx=8 is the LAST 7 rows or rows with Ndx=2 to 8,
The average for Ndx=9 is the LAST 7 rows or rows with Ndx=3 to 9,
The average for Ndx=n is the LAST 7 rows or rows with Ndx=n-7+1 to n,

So the join would have to be ON(T1.Ndx <= Table1.Ndx and T1.Ndx>Table1.Ndx-7) without a Group By on T1. This is where Adam's index on Ndx comes in.

I will have to brainstorm this some more.

Barry
Wed, Oct 10 2012 6:12 PMPermanent Link

Barry

Fernando Dias wrote:

>Have you considered SQL/PSM to do it in a stored procedure using SQL cursors?


Yes I have. But I would rather do it in Delphi because it will be easier to create moving averages with TList. In fact, I've already done this on a much simpler scale and it works. I thought there should be an easy SQL equivalent before I commit to doing it all in Delphi. It looks like Delphi will be much faster because there is only 1 pass through the table. I'm still brainstorming a fast SQL solution.

Barry
Wed, Oct 10 2012 7:11 PMPermanent Link

Fernando Dias

Team Elevate Team Elevate

Barry,

With a cursor you would also need only one scan of the table.
As well as if you use navigational methods in Delphi, but if you are using Client/Server a stored procedure would be way better as you would only send the results to the client, not the whole table.

--
Fernando Dias
[Team Elevate]
Wed, Oct 10 2012 7:15 PMPermanent Link

Fernando Dias

Team Elevate Team Elevate

Barry,

Thinking better, the results *are* columns in the table, right?
So what I have said about sending the results to the client is not correct ...

--
Fernando Dias
[Team Elevate]
Page 1 of 2Next Page »
Jump to Page:  1 2
Image