Icon View Thread

The following is the text of the current message along with any replies.
Messages 1 to 10 of 13 total
Thread RunSum enhancements
Tue, Oct 9 2012 4:27 AMPermanent Link

Adam Brett

Orixa Systems

It would be great if the capability of RUNSUM was expanded in a couple of ways.

At the moment RUNSUM just creates a running sum for any column in a table. It would be brilliant if it could be reset (i.e. restart from zero) in circumstances picked by the user. The ideal would be to have a RESET CONDITION which could be any valid SQL Statement. It would also be great to be able to put a simple COUNTER variable in so that the RUNSUM reset after, say 5 or 7 rows.

This would allow some net stuff:

1. Resetting when a Date field moves to a new year.
2. Resetting when records start to deal with a new customer.

Adam
Tue, Oct 9 2012 3:03 PMPermanent Link

Tim Young [Elevate Software]

Elevate Software, Inc.

Avatar

Email timyoung@elevatesoft.com

Adam,

<< At the moment RUNSUM just creates a running sum for any column in a
table. It would be brilliant if it could be reset (i.e. restart from zero)
in circumstances picked by the user. The ideal would be to have a RESET
CONDITION which could be any valid SQL Statement. It would also be great to
be able to put a simple COUNTER variable in so that the RUNSUM reset after,
say 5 or 7 rows.

This would allow some net stuff:

1. Resetting when a Date field moves to a new year.
2. Resetting when records start to deal with a new customer. >>

The question is: when do you expect this condition to be evaluated ?  For
every group in a grouped query, or for every row ?  Right now, RUNSUM is
evaluated *after* groups are calculated and according to the active ORDER
BY, if one is present.

Thanks,

Tim Young
Elevate Software
www.elevatesoft.com


Mon, Oct 15 2012 3:04 PMPermanent Link

Barry

>The question is: when do you expect this condition to be evaluated ?  For
>every group in a grouped query, or for every row ?  Right now, RUNSUM is
>evaluated *after* groups are calculated and according to the active ORDER
>BY, if one is present.

Tim,

I've given this a lot of thought lately and I'm wondering if it would be easier to add new functions instead of "fixing" RunSum.

I've been through an SQL nightmare for the past week trying to use SQL to calculate weighted moving averages efficiently. A simple moving average function like SMA(ColA,NumPeriods [,ResetColName]) would solve a lot of problems, and a weighted moving average  WMA(ColA,ColWeight,NumPeriods[,ResetColName]) would be even better. The ResetColName would reset the average if the ResetColName changes value like Cust_Id. If #Rows < NumPeriods, then the result is NULL.

To accomplish SMA or WMA in SQL without these functions is a real headache. Even something "simple" as SumPeriod(ColA,NumPeriods[,ResetColName]) to sum ColA for the past 'NumPeriod' rows is difficult. I have to do this for more than 12 different columns (actually 24 columns if I count the weighted divisor) to create 12 weighted moving averages.

In the example I posted in another thread http://www.elevatesoft.com/forums?action=view&category=edb&id=edb_sql&msg=5464&page=1

all I needed to do was to calculate a Sum on a column for the last 7 rows including the current row, for each row of the table. This is not so simple with SQL without resorting to slow subselects. John Hay finally stepped in and gave me an elegant solution (thanks John!):

SELECT MAX(Ndx) Ndx,SUM(ColA)/SUM(ColB) AvgAB FROM Table1 WHERE Ndx < 8 /* edge case for first 7 rows */
UNION ALL
SELECT T2.Ndx,(T2.TotA-T1.TotA)/(T2.TotB-T1.TotB) FROM
(SELECT Ndx,RUNSUM(ColA) TotA,RUNSUM(ColB) TotB FROM Table1 GROUP BY Ndx) T1
JOIN
(SELECT Ndx,RUNSUM(ColA) TotA,RUNSUM(ColB) TotB FROM Table1 GROUP BY Ndx) T2
ON T2.Ndx=T1.Ndx+7

Of course in my case, this means orchestrating the summing of 24 different columns over 3 Select statements whose columns all have to match. I did get it to work, but maintaining it is going to be difficult because my client will want more averages added over time. All of the averages have to be checked by hand to make sure there are no mistakes.

All of this could have been eliminated if the functions I mentioned earlier had existed. None of these functions should require a Group By clause.  I should be able to execute:

 select Stock_Price, SMA(Stock_Price, 10, Stock_Symbol) from StockHistory order by Stock_Symbol, Stock_Date;

and this gives me a simple moving average for Stock_Price and the average gets reset when the Stock_Symbol changes. The first 9 rows will have SMA as Null because there are not enough rows (10) to calculate the SMA.

A weighted moving average would look something like:
 select CollegeName, Year, WMA(GPA, NumStudents, CollegeName) from Colleges order by CollegeName, Year
 
Of course the problem with any moving average (or moving 'anything') is you have to store the last 'n' values in an array and replace the oldest value with the newest value to get the new moving average. So I'm not sure if storing an array is even possible in an external module. (I've never written an external module before.)

That's the problem as I see it.
I hope it helps.
 
Barry  
Tue, Oct 16 2012 3:56 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

Barry


Can you write the function you want in Delphi? If so, and its doesn't use any functions beyond D2006 I can easily produce the external module code BUT I suspect its going to be more performant to produce as pure DElphi and not integrate into SQL.

The problem comes with the fact that either as an SQL based or Delphi based external module I can see no alternative to reading x rows from the table for each row. Built into the SQL engine core Tim can pass data between iterations. I know of no method of doing that in an external module (well I can dor a Delphi based one but initialising for each run may prove tricky - very tricky).

Roy Lambert [Team Elevate]
Tue, Oct 16 2012 12:56 PMPermanent Link

John Hay

Barry

In general I suspect that this will require a different approach in the engine using OVER/PARTITION type syntax which
looks like a pretty big task.

>   select Stock_Price, SMA(Stock_Price, 10, Stock_Symbol) from StockHistory order by Stock_Symbol, Stock_Date;
>
> and this gives me a simple moving average for Stock_Price and the average gets reset when the Stock_Symbol changes.
The first 9 rows will have SMA as Null because there are not enough rows (10) to calculate the SMA.

Having a think about it I am pretty sure that a Stored Procedure could generically handle the SMA task.  The parameter
list would be

TableName (eg StockHistory)
Order Columns (eg Stock_Symbol,Stock_Date)
Reset Column (first column of order columns?)
Value Column (eg Stock_Price)
RowCount (eg 10)

The dataset/cursor returned would be

Order Columns,Value Column,Moving Average

Would this be useful?

>
> A weighted moving average would look something like:
>   select CollegeName, Year, WMA(GPA, NumStudents, CollegeName) from Colleges order by CollegeName, Year
>

I would have to see a worked example of this to understand what needs to be reset where to see if it can also be handled
generically.

John

Tue, Oct 16 2012 1:17 PMPermanent Link

Barry

Roy Lambert wrote:

>Can you write the function you want in Delphi? If so, and its doesn't use any functions beyond D2006 I can easily >produce the external module code BUT I suspect its going to be more performant to produce as pure DElphi and >not >integrate into SQL.

Yes, but if I do it in pure Delphi that means creating a temporary table (or use calculated fields to look up the average in a TStringList etc.). By keeping it in SQL I can have it as a view. John Hay's solution is quite fast, but of course the SQL is complicated. I guess I can't have my cake and eat it too. Smile

>The problem comes with the fact that either as an SQL based or Delphi based external module I can see no >alternative to reading x rows from the table for each row.
>Built into the SQL engine core Tim can pass data between >iterations. I know of no method of doing that in an external module (well I can dor a Delphi based one but initialising >for each run may prove tricky - very tricky).

That's what I was afraid of. The ability to store the last 'n' row values is going to be a pain to try and achieve in an external module. 'n' is only 7 in my case, but it could be 25, 50, or even 250. It is much better if it were done at the database level, under the hood sort of speak. I don't have my mechanics license, so I better leave it to the mechanic Tim to see if it is feasible.

Barry
Wed, Oct 17 2012 2:54 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

Barry

>Yes, but if I do it in pure Delphi that means creating a temporary table (or use calculated fields to look up the average in a TStringList etc.). By keeping it in SQL

No it doesn't. An external module would have a function in it which can be called from SQL. In that function you'd simply query the appropriate table to get the necessary information to calculate the average, calculate it and return it as a float.

If you look in the extensions ng you'll see samples of functions written in Delphi that I call as SQL functions.

Not as fast as raw SQL though Frown

Roy Lambert [Team Elevate]
Wed, Oct 17 2012 12:41 PMPermanent Link

Barry

Roy,
>>Yes, but if I do it in pure Delphi that means creating a temporary table (or use calculated fields to look up the
>>average in a TStringList etc.). By keeping it in SQL

>No it doesn't. An external module would have a function in it which can be called from SQL. In that function you'd >simply query the appropriate table to get the necessary information to calculate the average, calculate it and >return it as a float.

Sorry Roy. I meant the "pure Delphi" solution would be to write all of the code to calculate the column averages in a Delphi procedure, inside of my application. I'd use an EDBTable to average the table columns in one pass, building the array/stringlist as I go, and either:

1) write these rows and averages out to a temporary table,
2) or use a calculated field on a query/table to pull in the moving average from my array/stringlist using the primary key of the row to do a lookup into the array/stringlist.

It would be too slow to summarize 7 rows for each row in the table (via external module or subselects). If the average period increases from 7 to say to 50 or 100 or 250, the process would slow down dramatically. But doing it in one pass as I described above, the length of the period won't affect the speed of the calculations.

The only SQL solution I see (other than John Hay's) is to have SMA, WMA, PeriodSum functions in the SQL language so it can be calculated at the database level (under the hood) in one pass.

Now if there was a way to have an external module retain values in an array from one call to the next, then that would be something. This would of course eliminate the need to re-query the prior rows. But I think this is impossible.

Off of the top of my head, the only "solution" I see with for external Delphi module is to create a temporary (memory) table and store the last 'n' values in that, probably as a comma delimited string in a CLOB field. Then when a new row is processed, drop off the first column in the string and add the new one to the end of the string, store it back to the table, then sum the comma delimited values of the string and return the average. Actually the sum only has to be calculated once; when the number of elements='n' the number of periods. After that we only need to subtract the number being dropped from the list and add the new number being added and store that as a running sum.

For a weighted moving average WMA there would be 2 pairs of numbers, the column value and the column weight. The function would have a unique id parameter so multiple averages can be done at the same time in the same SQL statement.

Hmmmm. Do you see any problem doing it this way?

Barry
Thu, Oct 18 2012 8:30 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

Barry


>Now if there was a way to have an external module retain values in an array from one call to the next, then that would be something. This would of course eliminate the need to re-query the prior rows. But I think this is impossible.

>Off of the top of my head, the only "solution" I see with for external Delphi module is to create a temporary (memory) table and store the last 'n' values in that, probably as a comma delimited string in a CLOB field. Then when a new row is processed, drop off the first column in the string and add the new one to the end of the string, store it back to the table, then sum the comma delimited values of the string and return the average. Actually the sum only has to be calculated once; when the number of elements='n' the number of periods. After that we only need to subtract the number being dropped from the list and add the new number being added and store that as a running sum.
>
>For a weighted moving average WMA there would be 2 pairs of numbers, the column value and the column weight. The function would have a unique id parameter so multiple averages can be done at the same time in the same SQL statement.
>
>Hmmmm. Do you see any problem doing it this way?

The big problem I can see with any approach I can come up with, and with the approach you've suggested is making it general.

With your approach I think I can see two problems. Firstly you have to create and subsequently free the in-memory table and you can only have one instance running at a time in a given session otherwise there will be a name clash unless you com up with a way to pass the memory table name in. Secondly you have to have a strategy that will return zero until the magic number of rows has passed which has implications for the table structure.

Roy Lambert [Team Elevate]
Fri, Oct 19 2012 7:32 AMPermanent Link

John Hay

Roy
> The big problem I can see with any approach I can come up with, and with the approach you've suggested is making it
general.

I think the main problem is that it is not really a function - more set manipulation.

Just for fun (I know I need to get a life!) I have made an attempt at a generalised solution in a Stored Proc

For a stock price history the parameters would be something like (Identity field must be unique)

StockHistory
Ticker,StockPriceDate
Open,Close,High,Low,Volume
Ticker,StockPriceDate
10
StockPriceDate BETWEEN Date'2012-01-01' AND Date'2012-03-31'

CREATE PROCEDURE "SimpleMovingAverage" (IN "TableName" VARCHAR COLLATE ANSI, IN "OrderFields" VARCHAR COLLATE ANSI, IN
"ValueField" VARCHAR COLLATE ANSI, IN "IdentityField" VARCHAR COLLATE ANSI, IN "Period" INTEGER,  IN "WhereClause"
VARCHAR COLLATE ANSI)
BEGIN
DECLARE cur cursor WITH RETURN FOR stmt;
DECLARE selectlist VARCHAR;
DECLARE joinfield VARCHAR;
DECLARE temp VARCHAR;
DECLARE temp1 VARCHAR;
DECLARE periodst VARCHAR;
DECLARE valuefieldlist VARCHAR;
DECLARE averagevaluefieldlist VARCHAR;

BEGIN
 EXECUTE IMMEDIATE 'drop table temp_sma';
EXCEPTION
END;
IF POS(',',OrderFields) = 0 THEN
 SET joinfield=OrderFields;
ELSE
 SET joinfield=SUBSTRING(OrderFields,1,POS(',',OrderFields)-1);
END IF;
SET periodst = CAST(period as VARCHAR);
SET temp=valuefield;
SET valuefieldlist='';
SET averagevaluefieldlist='';

WHILE POS(',',temp) > 0 do
 SET temp1 = SUBSTRING(temp,1,POS(',',temp)-1);
 SET valuefieldlist = valuefieldlist+'RUNSUM('+temp1+') '+temp1+'_Avg,';
 SET averagevaluefieldlist = averagevaluefieldlist+'(tmp1.'+temp1+'_Avg-IF(ndx1-ndx2+1 >= '+periodst+' THEN
coalesce(tmp2.'+temp1+'_Avg,0) ELSE null))/'+periodst+' '+temp1+'_Avg,';
 SET temp = SUBSTRING(temp,POS(',',temp)+1,1000);
END WHILE;
SET valuefieldlist = valuefieldlist+'RUNSUM('+temp+') '+temp+'_Avg';
SET averagevaluefieldlist = averagevaluefieldlist+'(tmp1.'+temp+'_Avg-IF(ndx1-ndx2+1 >= '+periodst+' THEN
coalesce(tmp2.'+temp+'_Avg,0) ELSE null))/'+periodst+' '+temp+'_Avg';
SET selectlist='tmp1.'+OrderFields;
SET selectlist=replace(',' WITH ',tmp1.' in selectlist);
EXECUTE IMMEDIATE 'CREATE TEMPORARY TABLE temp_sma as SELECT CAST(NULL AS integer) ndx1,* FROM '+tablename+' WITH NO
data';
EXECUTE IMMEDIATE 'INSERT INTO temp_sma SELECT RUNSUM(1) AS NDX1,* FROM '+tablename+
IF(whereclause <> '' THEN ' WHERE '+whereclause ELSE '')+' GROUP BY '+IdentityField+' ORDER BY '+OrderFields;

PREPARE stmt FROM 'SELECT '+SELECTlist+','+ValueField+','+averagevaluefieldlist+' FROM '+
'(SELECT NDX1,'+OrderFields+','+ValueField+','+valuefieldlist+' FROM temp_sma '+
'GROUP BY NDX1,'+OrderFields+','+ValueField+') tmp1 '+
' LEFT OUTER JOIN '+
'(SELECT NDX1,'+OrderFields+','+ValueField+','+valuefieldlist+' FROM temp_sma '+
'GROUP BY NDX1,'+OrderFields+','+ValueField+') tmp2 on '+
'tmp2.NDX1=tmp1.NDX1-'+periodst+
' LEFT OUTER JOIN '+
'(SELECT '+joinfield+',min(ndx1) as ndx2 FROM temp_sma GROUP BY '+joinfield+') tmp3 '+
' on tmp3.'+joinfield+'=tmp1.'+joinfield;

OPEN cur;

END

Page 1 of 2Next Page »
Jump to Page:  1 2
Image