Support Forums - View Thread

View Thread

The following is the text of the current message along with any replies.

Messages 1 to 7 of 7 total

Sun, Nov 19 2006 11:45 AM	Permanent Link
Roy Lambert NLH Associates Team Elevate	I'm probably doing something wrong. I thought sql would be faster but navigationaly I can do the job far faster (sql c6s table c94ms). Anyone have any suggestion? UPDATE Bayesian SET _Spam = _Spam + _Count, _NonSpam = _NonSpam - _Count, _Total = _Spam + _NonSpam, _Probability = COALESCE((_Spam / 453 / ((2 * _NonSpam )/ 3705 + _Spam / 453)),0) FROM Bayesian JOIN "Memory\TokenCounts" on "Memory\TokenCounts"._Token = _Token while not bay2.Eof do begin <<< disk based version of "Memory\TokenCounts" if Bayesian.FindKey([bay2.fieldbyname('_Token').asstring]) then begin Bayesian.Edit; Bayesian.FieldByName('_Spam').AsInteger := Bayesian.FieldByName('_Spam').AsInteger + Bay2.fieldbyname('_count').AsInteger; Bayesian.FieldByName('_NonSpam').AsInteger := Bayesian.FieldByName('_NonSpam').AsInteger - Bay2.fieldbyname('_count').AsInteger; Bayesian.FieldByName('_Total').AsInteger := Bayesian.FieldByName('_Spam').AsInteger + Bayesian.FieldByName('_NonSpam').AsInteger; if Bayesian.FieldByName('_Total').AsInteger > 0 then begin Bayesian.FieldByName('_probability').AsFloat := (Bayesian.FieldByName('_Spam').AsInteger / 453) / ((2 * Bayesian.FieldByName('_NonSpam').AsInteger) / 3705 + Bayesian.FieldByName('_Spam').AsInteger / 453) end else Bayesian.FieldByName('_probability').AsInteger := 0; Bayesian.Post; end; Bay2.Next; end; Time taken c94 ms ================================================================================ SQL statement (Executed with 4.24 Build 1) ================================================================================ UPDATE Bayesian SET _Spam = _Spam + _Count, _NonSpam = _NonSpam - _Count, _Total = _Spam + _NonSpam, _Probability = COALESCE((_Spam / 453 / ((2 * _NonSpam )/ 3705 + _Spam / 453)), 0) FROM Bayesian JOIN "Memory\TokenCounts" on "Memory\TokenCounts"._Token = _Token Tables Involved --------------- Bayesian (Bayesian) table opened shared, has 165352 rows TokenCounts (TokenCounts) table opened shared, has 384 rows Join Ordering ------------- The driver table is the Bayesian table (Bayesian) The Bayesian table (Bayesian) is joined to the TokenCounts table (TokenCounts) with the INNER JOIN expression: _Token = "Memory\TokenCounts"._Token Optimizer will attempt to re-order the joins to a more optimal order Use the NOJOINOPTIMIZE clause at the end of the SQL statement to force the optimizer to leave the joins in their declared order The joins are already in optimal order and cannot be optimized any further The expression: _Token = "Memory\TokenCounts"._Token is OPTIMIZED ================================================================================ >>>>> 384 rows affected in 6.172 seconds ================================================================================ Roy Lambert
Mon, Nov 20 2006 5:02 PM	Permanent Link
Tim Young [Elevate Software] Elevate Software, Inc. timyoung@elevatesoft.com	Roy, << I'm probably doing something wrong. I thought sql would be faster but navigationaly I can do the job far faster (sql c6s table c94ms). Anyone have any suggestion? UPDATE Bayesian SET _Spam = _Spam + _Count, _NonSpam = _NonSpam - _Count, _Total = _Spam + _NonSpam, _Probability = COALESCE((_Spam / 453 / ((2 * _NonSpam )/ 3705 + _Spam / 453)),0) FROM Bayesian JOIN "Memory\TokenCounts" on "Memory\TokenCounts"._Token = _Token >> The SQL has to process the UPDATE by joining the Bayesian table to the in-memory table, i.e. it has to scan all 165352 rows, whereas the navigational code only has to navigate 384 rows. This is a limitation of UPDATEs with joins, i.e. the first driver table in the joins must be the UPDATE table, and the optimizer cannot flip them around. -- Tim Young Elevate Software www.elevatesoft.com
Tue, Nov 21 2006 4:30 AM	Permanent Link
Roy Lambert NLH Associates Team Elevate	Tim That's pretty much the conclusion I'd come to. Thank TIM for TDBISAMTables Roy Lambert
Tue, Nov 21 2006 8:07 AM	Permanent Link
"Robert"	"Tim Young [Elevate Software]" <timyoung@elevatesoft.com> wrote in message news:891212AA-C06F-4324-BFEB-96805271D3F5@news.elevatesoft.com... > Roy, > > << I'm probably doing something wrong. I thought sql would be faster but > navigationaly I can do the job far faster (sql c6s table c94ms). Anyone > have any suggestion? > > UPDATE Bayesian SET _Spam = _Spam + _Count, _NonSpam = _NonSpam - _Count, > _Total = _Spam + _NonSpam, > _Probability = COALESCE((_Spam / 453 / ((2 * _NonSpam )/ 3705 + _Spam / > 453)),0) > FROM Bayesian JOIN "Memory\TokenCounts" on "Memory\TokenCounts"._Token = > _Token >> > > The SQL has to process the UPDATE by joining the Bayesian table to the > in-memory table, i.e. it has to scan all 165352 rows, whereas the Is this true even if both tables have the correct indexes? Robert
Tue, Nov 21 2006 8:23 AM	Permanent Link
Roy Lambert NLH Associates Team Elevate	Robert They're both indexed on _Token Roy Lambert
Tue, Nov 21 2006 10:03 AM	Permanent Link
"Robert"	"Roy Lambert" <roy.lambert@skynet.co.uk> wrote in message news:8132682A-7867-4365-AE8B-51E517B14F6A@news.elevatesoft.com... > Robert > > > They're both indexed on _Token > Shame, it would be nice if DBISAM optimized the JOIN. Robert
Tue, Nov 21 2006 1:30 PM	Permanent Link
Tim Young [Elevate Software] Elevate Software, Inc. timyoung@elevatesoft.com	Robert, << Is this true even if both tables have the correct indexes? >> Yes. The join re-ordering is turned off for the driver table with respect to the UPDATE and DELETE statements. -- Tim Young Elevate Software www.elevatesoft.com