Icon View Thread

The following is the text of the current message along with any replies.
Messages 11 to 18 of 18 total
Thread Looking for fast efficient SQL to create 7 period trailing sums or ratios
Wed, Oct 10 2012 9:12 PMPermanent Link

Barry

Fernando Dias wrote:

>Thinking better, the results *are* columns in the table, right?

Yes, the results are moving averages of the columns. Every row in Table1 will generate a row in TableMA, which is a row that has the 12 column averages for the past 7 rows. If I was averaging only 1 or 2 columns, then using a subselect would be simple enough. But with at least 12 columns to average, and maybe twice that in the next 6 months, subselects will be too slow.

>So what I have said about sending the results to the client is not correct ...

The tables will have only a few hundred rows in them so sending the results to the client won't matter. I was thinking of building a table out of the summarized columns so it only has to be done occasionally (once a day). I have one more SQL trick to try tomorrow to see if I can pull a rabbit out of my hat, otherwise it is back to Delphi.

Barry
Thu, Oct 11 2012 4:53 AMPermanent Link

John Hay

Hi Barry,

Unless I'm missing something, I don't actually see what the problem is with a shorter last period.

However, your adjustment has shown that the query I posted was overcomplicated - the subquery on its own should do what
is required.

SELECT min(datex) StartDate, max(datex) EndDate, TRUNC((Ndx-1)/7) AS PeriodNum,SUM(ColA)/SUM(ColB),SUM(ColC)/SUM(ColD)
etc FROM Table1 GROUP BY PeriodNum

John







Thu, Oct 11 2012 11:19 AMPermanent Link

Barry

John,

>Unless I'm missing something, I don't actually see what the problem is with a shorter last period.

It is not obvious at first, but your SQL statement will have to be executed 7 times, once for every row of the Period. Since the period is rather short (only 7), then this may not be a problem. Remember, the SUM  for the row Ndx=n, will sum the rows n-period+1 to n. There is a sum calculated for *every* row of Table1 (1:1 correspondence). Your SQL statement will sum the rows for every 7th row of Table1. We need to repeat that by tweaking the offset of the query so every row has a sum, not just every 7th row. In my case, I would create a temporary table and append each run to it. I will then end up with a table that has the same number of rows as Table1.

From my original post:

Solution #3:
  Select Max(Ndx), Max(Datex), Sum(ColA)/Sum(ColB) as RatioAB from Table1 group by Floor((Ndx-1)/7);
This will generate the ratios for days: 7,14,31,28 ....

But I would have to run this SQL 7 times to get a moving ratio for all days 8,9,10,11,12,13 as in:
  Select Max(Ndx), Max(Datex), Sum(ColA)/Sum(ColB) as RatioAB from Table1 where Ndx >=2 group by Floor((Ndx-2)/7);

followed by:
  Select Max(Ndx), Max(Datex), Sum(ColA)/Sum(ColB) as RatioAB from Table1 where Ndx >=3 group by Floor((Ndx-3)/7);

etc.

The solution to this problem is a bit harder than it looks. I call it a bit of an SQL mind bender. Smile

If Period was 100, then we would have to execute the query 100 times which would be impractical. I could do it in a script or Delphi code in one pass of the table. I can't see an efficient way to do it in SQL though.

The best solution Adam and I could come up with is to persuade Tim to modify RUNSUM() or come up with RunSumPeriod() so it accepts two more parameters, a period parameter and a break column(s) parameter that would reset the sum when the column(s) change.

The syntax would look like this: RunSum(Colx [,Period] [,(BreakCol1, BreakCol2,..)]). If I had this function, then it could all be done in one SQL statement. (I don't need the break columns for this problem, but it will come in handy elsewhere.)

All I would have to do is execute the following SQL statement once:

    select Ndx, RunSumPeriod(ColA,7)/RunSumPeriod(ColB,7) as RatioAB from Table1 order by Ndx;

And this would give me a running sum for all the rows in Table1. A Group By is not needed and the table would still be sensitive if Ndx is an indexed column.

So there you have it. Creating running sums for a period of rows (7), is not that simple in SQL. Either a new function has to be created/modified, or it has to be done in script or Delphi code.

Barry
Thu, Oct 11 2012 12:24 PMPermanent Link

John Hay

Barry

I do apologise for not reading the original posting carefully enough.  Thanks for the explanation.

The average for Ndx=7 is made up of rows with Ndx=1 to 7. Simple enough.
The average for Ndx=8 is the LAST 7 rows or rows with Ndx=2 to 8,
The average for Ndx=9 is the LAST 7 rows or rows with Ndx=3 to 9,
The average for Ndx=n is the LAST 7 rows or rows with Ndx=n-7+1 to n,

So presuming ndx starts at 1 for a period of 7 we will have max(ndx)-6 rows .  eg for 1-16 we have 10 rows (7 to 16).
Ignoring dates etc at the moment does the following give the correct ratios?

SELECT MAX(Ndx) Ndx,SUM(ColA)/SUM(ColB) AvgAB FROM Table1 WHERE Ndx < 8 /* edge case for first 7 rows */
UNION ALL
SELECT T2.Ndx,(T2.TotA-T1.TotA)/(T2.TotB-T1.TotB) FROM
(SELECT Ndx,RUNSUM(ColA) TotA,RUNSUM(ColB) TotB FROM Table1) T1
JOIN
(SELECT Ndx,RUNSUM(ColA) TotA,RUNSUM(ColB) TotB FROM Table1) T2
ON T2.Ndx=T1.Ndx+7

John

Thu, Oct 11 2012 12:27 PMPermanent Link

John Hay

oops missed group by ndx

SELECT MAX(Ndx) Ndx,SUM(ColA)/SUM(ColB) AvgAB FROM Table1 WHERE Ndx < 8 /* edge case for first 7 rows */
UNION ALL
SELECT T2.Ndx,(T2.TotA-T1.TotA)/(T2.TotB-T1.TotB) FROM
(SELECT Ndx,RUNSUM(ColA) TotA,RUNSUM(ColB) TotB FROM Table1 GROUP BY Ndx) T1
JOIN
(SELECT Ndx,RUNSUM(ColA) TotA,RUNSUM(ColB) TotB FROM Table1 GROUP BY Ndx) T2
ON T2.Ndx=T1.Ndx+7

Thu, Oct 11 2012 3:40 PMPermanent Link

Barry

John,

That's brilliant! You've taken the RunSum(ColA) of the current row and subtracted off the RunSum(ColA) from 7 rows ago to get the trailing 7 row sum for ColA. With this I can now create trailing ratios and moving averages in one pass.

Many thanks. Smile

Barry
Fri, Oct 19 2012 8:26 AMPermanent Link

Fernando Dias

Team Elevate Team Elevate

Barry,

Here is another suggestion for moving averages using a stored procedure and
cursors.
I've just seen John Hay's procedure in another thread, that can be used with
any table/column, and this one can't because the table and column names are
fixed, but it wouldn't be hard to change them. Also, because it uses arrays,
there must me a maximum period allowed, in my example it's 100.
I believe that this would be more or less what you would do with Delphi
code, but using PSM and thus running on the server.

CREATE PROCEDURE "MAVG" (IN "N" INTEGER)
BEGIN

DECLARE DataCur SENSITIVE CURSOR WITHOUT RETURN FOR SqlData ;
DECLARE MAvgCur SENSITIVE CURSOR WITH RETURN FOR SqlMAvg ;

DECLARE Buf FLOAT ARRAY[100] DEFAULT 0;         -- Circular list : latest N
values
DECLARE Head INTEGER DEFAULT 0;                 -- Head position in the list
DECLARE I INTEGER DEFAULT 0;                    -- Counter
DECLARE NRows INTEGER DEFAULT 0;                -- N. of rows in the source
table
DECLARE TblVersion DECIMAL(19,2) DEFAULT NULL ; -- Table version: NULL if
temp table doesnt exist
DECLARE mId INTEGER ;                           -- "Id" column buffer
DECLARE mX FLOAT ;                              -- "X" column buffer
DECLARE MovSum FLOAT DEFAULT 0;                 -- Moving Sum

SET PROGRESS TO 0;
-- Maximum period is 100;
IF N < 1 OR N > 100 THEN
 RAISE ERROR CODE 10000 MESSAGE 'N must be between 1 and 100';
END IF ;

-- Check if temporary table exists and create it or delete all rows
EXECUTE IMMEDIATE
 'SELECT Version INTO ? FROM Information.TemporaryTables WHERE Name =
''tmp__mavg''' USING TblVersion ;
IF TblVersion IS NULL THEN
 EXECUTE IMMEDIATE 'CREATE TEMPORARY TABLE "tmp__mavg" (LIKE DataTbl)';
 EXECUTE IMMEDIATE 'ALTER TABLE "tmp__mavg" ADD CONSTRAINT "PrimaryKey"
PRIMARY KEY (Id) ' ;
ELSE
 EXECUTE IMMEDIATE 'DELETE FROM "tmp__mavg"';
END IF;

-- Calculate moving averages and store them to a temporary table
-- The temporary table is also returned as a sensitive cursor
PREPARE SqlMAvg FROM 'SELECT * FROM tmp__mavg ORDER BY Id';
PREPARE SqlData FROM 'SELECT * FROM DataTbl ORDER BY Id';
OPEN MAvgCur;
OPEN DataCur;
SET NRows = ROWCOUNT(DataCur) ;
FETCH FIRST FROM DataCur INTO mId, mX ;
WHILE NOT EOF(DataCur) DO
 SET I = I + 1;
 SET Head = Head + 1 ; IF Head > N THEN SET Head = 1 ; END IF;
 SET MovSum = MovSum - Buf[Head];
 SET Buf[Head] = mX ;
 SET MovSum = MovSum + mX;
 IF I >= N THEN
    INSERT INTO MAvgCur VALUES (mId, MovSum/N) ;
 END IF;
 FETCH NEXT FROM DataCur INTO mId, mX ;
 SET PROGRESS TO TRUNC(100 * I / NRows) ;
END WHILE;
CLOSE DataCur;
SET PROGRESS TO 100;
END

--
Fernando Dias
[Team Elevate]
Mon, Nov 5 2012 4:56 PMPermanent Link

Tim Young [Elevate Software]

Elevate Software, Inc.

Avatar

Email timyoung@elevatesoft.com

Fernando,

<< Here is another suggestion for moving averages using a stored procedure
and cursors. >>

Ha !  I read/answered the suggestions newsgroup first and suggested this
exact thing and voila !, here it is.  Anyone want me to do some more neat
magic tricks ? Wink

In general, to Barry and others - if you need "session variables", the best
way to emulate this in EDB is to use a temporary table.  In fact, you can
just use one temporary table for the entire session called
"SessionVariables" and give it all of the columns that you want to use as
variables in the application SQL (at least until I implement the
equivalent).

Thanks,

Tim Young
Elevate Software
www.elevatesoft.com
« Previous PagePage 2 of 2
Jump to Page:  1 2
Image