Support Forums - View Thread

View Thread

The following is the text of the current message along with any replies.

Messages 1 to 10 of 11 total

Unicode Table joins using GUID are extremely slow

Sat, Aug 25 2012 9:17 PM	Permanent Link
Barry	I have 2 tables (14k rows each) and any time I try and join 2 columns of the tables together by using the GUID columns File_Id (indexed), it will take a few hours! select t1.file_id, t2.file_id, from table1 t1, table2 t2 where t1.file_id=t2.file_id This should only take 5 seconds. Both tables have an index on file_id. I have also optimized the tables. It seems when GUID's are used in a table join, I might as well go home. Maybe the problem is Unicode related? Has anyone else run into this problem with GUID in table joins on Unicode tables? TIA Barry v2.09 with 3gb RAM on i5 machine
Sat, Aug 25 2012 11:29 PM	Permanent Link
Barry	Here is the Execution Plan. What sticks out is the Total Rows Visited=215,752,032 which = 14689 x 14688 which are the # of rows in each table. So it appears it is NOT using the ix_FileId index at all (on the File_Id GUID column), and for each File_Id in Table1, it is searching for a corresponding File_Id in Table2 by traversing the entire table row by row (without using the ix_FileId index). No wonder it is slowww. I have rebuilt the index in each table using: CREATE INDEX "ix_FileId" ON "Table1" ("File_Id" COLLATE "UNI"); CREATE INDEX "ix_FileId" ON "Table2" ("File_Id" COLLATE "UNI"); Still it takes over an hour, or 14688x slower than it should be. How do I force it to use the index ix_FileId for the table join? TIA Barry EDBMgr v2.09 B1 ================================================================================ SQL Query (Executed by ElevateDB 2.09 Build 1) Note: The SQL shown here is generated by ElevateDB and may not be exactly the same as the SQL that was originally entered. However, none of the differences alter the execution results in any way. ================================================================================ SELECT ALL "t1"."file_id" AS "file_id", "t2"."file_id" AS "file_id1" FROM "Table1" AS "t1", "Table2" AS "t2" WHERE "t1"."file_id" = "t2"."file_id" Source Tables ------------- Table1 (t1): 14689 rows Table2 (t2): 14688 rows Result Set ---------- The result set was insensitive and read-only The result set consisted of zero or more rows Filtering --------- The following filter condition was applied to the result set rows as they were generated: "t1"."file_id" = "t2"."file_id" Joins ----- The driver table was the Table1 (t1) table The optimizer attempted to re-order the joins to a more optimal order The joins were already in the most optimal order Result set I/O statistics ------------------------- Total rows visited: 215752032 Row buffer manager Max buffer size: 1048432 Buffer size: 2585088 Hits: 14688 Misses: 0 Hit ratio: 1 Reads: 0 Bytes read: 0 Writes: 0 Bytes written: 0 Index Page buffer manager Max buffer size: 2097152 Buffer size: 442368 Hits: 28883 Misses: 0 Hit ratio: 1 Reads: 0 Bytes read: 0 Writes: 0 Bytes written: 0 ================================================================================ 14688 row(s) returned in 4409.821 secs ================================================================================
Sun, Aug 26 2012 12:03 AM	Permanent Link
Raul Team Elevate	Barry, This is a full join so no wonder it's slow. How about this SQL : SELECT "t1"."file_id" AS "file_id", "t2"."file_id" AS "file_id1" FROM "Table1" AS "t1" INNER JOIN "Table2" AS "t2" ON "t1"."file_id" = "t2"."file_id" Raul On 8/25/2012 11:29 PM, Barry wrote: > Here is the Execution Plan. What sticks out is the Total Rows Visited=215,752,032 which = 14689 x 14688 which are the # of rows in each table. So it appears it is NOT using the ix_FileId index at all (on the File_Id GUID column), and for each File_Id in Table1, it is searching for a corresponding File_Id in Table2 by traversing the entire table row by row (without using the ix_FileId index). No wonder it is slowww. > > I have rebuilt the index in each table using: > CREATE INDEX "ix_FileId" ON "Table1" ("File_Id" COLLATE "UNI"); > CREATE INDEX "ix_FileId" ON "Table2" ("File_Id" COLLATE "UNI"); > > Still it takes over an hour, or 14688x slower than it should be. How do I force it to use the index ix_FileId for the table join? > > TIA > Barry > > EDBMgr v2.09 B1 > > ================================================================================ > SQL Query (Executed by ElevateDB 2.09 Build 1) > > Note: The SQL shown here is generated by ElevateDB and may not be exactly the > same as the SQL that was originally entered. However, none of the differences > alter the execution results in any way. > ================================================================================ > > SELECT ALL > "t1"."file_id" AS "file_id", > "t2"."file_id" AS "file_id1" > FROM "Table1" AS "t1", "Table2" AS "t2" > WHERE "t1"."file_id" = "t2"."file_id" > > Source Tables > ------------- > > Table1 (t1): 14689 rows > Table2 (t2): 14688 rows > > Result Set > ---------- > > The result set was insensitive and read-only > The result set consisted of zero or more rows > > Filtering > --------- > > The following filter condition was applied to the result set rows as they were > generated: > > "t1"."file_id" = "t2"."file_id" > > Joins > ----- > > The driver table was the Table1 (t1) table > > The optimizer attempted to re-order the joins to a more optimal order > > The joins were already in the most optimal order > > Result set I/O statistics > ------------------------- > > Total rows visited: 215752032 > > Row buffer manager > > Max buffer size: 1048432 Buffer size: 2585088 > > Hits: 14688 Misses: 0 Hit ratio: 1 > > Reads: 0 Bytes read: 0 > > Writes: 0 Bytes written: 0 > > Index Page buffer manager > > Max buffer size: 2097152 Buffer size: 442368 > > Hits: 28883 Misses: 0 Hit ratio: 1 > > Reads: 0 Bytes read: 0 > > Writes: 0 Bytes written: 0 > > ================================================================================ > 14688 row(s) returned in 4409.821 secs > ================================================================================ >
Sun, Aug 26 2012 12:22 AM	Permanent Link
Barry	Raul wrote: >Barry, >This is a full join so no wonder it's slow. >How about this SQL : >SELECT >"t1"."file_id" AS "file_id", >"t2"."file_id" AS "file_id1" >FROM "Table1" AS "t1" INNER JOIN "Table2" AS "t2" >ON "t1"."file_id" = "t2"."file_id" Raul, I had tried that and was just going to post the results when I saw your reply. Using the inner join dropped it down to 17 seconds which is pretty good. Not as fast as the 5 seconds I was hoping for, but fast enough. The inner-join on the GUID column was almost 2x faster than an inner-join on an Integer column (32 seconds), which was surprising. The reason I used: select t1.col1, t2.col2 from table1 t1, table2 t2 where t1.col1=t2.col2 is because that is how MySQL executes an equi-join. It is somewhat similar to an inner-join except MySQL can use either the t2.col2 index or t1.col1 index. From reading the EDB manual a few minutes ago, I see the EDB inner-join can also choose the optimal index for the inner-join which is great. All I have to do now is remember the "new" inner join syntax and stop "equi-joins" like I've been doing for the past 10 years. Barry BTW, MySQL doesn't have a full outer join so I think that's what threw me.
Sun, Aug 26 2012 1:29 PM	Permanent Link
Raul Team Elevate	Hi Barry, The join syntax is pretty standard so i do use the "JOIN table ON ..." syntax with the MySQL on the times i need to. This way all my sql always remains fairly similar between different back ends. I'm not sure but i think the MySQL optimizer likely resolves it back to proper inner join join if you just use where clause like you did - it might be worthwhile to email elevate support and see if Tim can implement something like this in optimizer as well. There is no full outer join in EDB either - only left or right. Raul On 8/26/2012 12:22 AM, Barry wrote: > All I have to do now is remember the "new" inner join syntax and stop "equi-joins" like I've been doing for the past 10 years. > > Barry > > BTW, MySQL doesn't have a full outer join so I think that's what threw me. >
Tue, Aug 28 2012 1:20 AM	Permanent Link
Barry	Raul wrote: >>This is a full join so no wonder it's slow. Why would a full join need to traverse 215,752,032 rows? Maybe if it were trying to do a cartesian product but my join only returned 14,688 rows when it finished. Is it a bug? Barry
Tue, Aug 28 2012 4:08 AM	Permanent Link
Fernando Dias Team Elevate	Barry, In fact, this syntax produces a CROSS JOIN, that is the Cartesian product of the 2 tables. -- Fernando Dias [Team Elevate]
Tue, Aug 28 2012 6:05 AM	Permanent Link
John Hay	Barry > I had tried that and was just going to post the results when I saw your reply. Using the inner join dropped it down to 17 seconds which is pretty good. Not as fast as the 5 seconds I was hoping for, but fast enough. The inner-join on the GUID column was almost 2x faster than an inner-join on an Integer column (32 seconds), which was surprising. Is this with tables with circa 14K rows? If so this seems to be very slow indeed. I tried it on an ancient (2003!!) laptop with 17K records and the guid query took just over 2 seconds and the integer less than 1.5. John
Tue, Aug 28 2012 10:24 AM	Permanent Link
Raul Team Elevate	What Fernando said - the join is cartesian but the where clause then filters it down again to only matching rows. So if there are 14,688 matching GUID's between tables it's not a bug but just result of where filter. Raul On 8/28/2012 1:20 AM, Barry wrote: > Raul wrote: > >>> This is a full join so no wonder it's slow. > > Why would a full join need to traverse 215,752,032 rows? Maybe if it were trying to do a cartesian product but my join only returned 14,688 rows when it finished. Is it a bug? > > Barry >
Tue, Aug 28 2012 10:43 AM	Permanent Link
Barry	Raul wrote: >What Fernando said - the join is cartesian but the where clause then >filters it down again to only matching rows. >So if there are 14,688 matching GUID's between tables it's not a bug but >just result of where filter. Fernando & Raul, Ok, that makes sense. Thanks. Barry

	Page 1 of 2	Next Page »
Jump to Page: 1 2