Icon View Thread

The following is the text of the current message along with any replies.
Messages 1 to 8 of 8 total
Thread Impossible Problem?
Sun, Jan 22 2017 8:47 AMPermanent Link

Adam Brett

Orixa Systems

I am trying to speed up a search.

Obviously it would be easiest to create an index on the table, but in this case the search covers 2 tables.

I have a "People" table (with a "Name" field) and a "Farmers" table with a "Code" field (a letters / numbers combination).

Users like to be able to type into a single box with EITHER the Name OR the Code (or part of either).

The tables are both > 500,000 records.

Searching just the Name or Code returns data in a couple of seconds. With the more complex search it runs to 30+ seconds everytime!

--

I use the following SQL, subbing in the Search to replace the STR PARAM text shown below.

SELECT
 --some fields
FROM People P
LEFT JOIN Farmers F ON F.ID = P.ID
WHERE P.Name + ' ' + F.Code LIKE '%[STR PARAM]%'

--

Can I reformulate the SQL so it is faster, or can I add indexes or something to the database to speed up the process?
Sun, Jan 22 2017 11:12 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

Adam


I can think of two ways to add that sort of compound index and neither are brilliant. First a cross reference table with one column = P.Name + ' ' + F.Code. Second (and one of my favourite approaches) have a full text index (I think Tim has built in wildcard for both ends now). For the second idea you'll need a custom word generator / Text Filter, and it will be quite slow for building at the size of table you have.

However, I just came up with another approach - a search on either table only takes a couple of seconds - why not simply search one, and if not found - search the other?

I can't think of a way to write that as a simple sql statement but as SQL/PSM or Delphi its a doddle. If you want to be clever you can keep track of which is more successful and arrange to do that first, and it might also be possible to preprocess [STR PARAM] to guess if it better matches a name or a code.


Roy Lambert
Sun, Jan 22 2017 12:29 PMPermanent Link

Adam Brett

Orixa Systems

Thanks Roy,

The "2 bites" route is interesting. I might even be able to work it with a simple "OR" in the SQL. I will try this.

Not sure about Full Text ... haven't been good at using it with mixed text + numbers + other characters (which appear in the Code).

I also _tried_ creating a VIEW, which combined the 2 tables, and adding indexes to this view ... but it worked out virtually as slow as the simple SQL.

Have just tried simple "OR" and it is about 6x faster ... I think that is probably good for now. Thanks!

(Still curious about other ways to solve the problem though!)
Mon, Jan 23 2017 2:48 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

Adam


>The "2 bites" route is interesting. I might even be able to work it with a simple "OR" in the SQL. I will try this.

>Not sure about Full Text ... haven't been good at using it with mixed text + numbers + other characters (which appear in the Code).

Its easy when you roll your own filter/generator but not as standard

>I also _tried_ creating a VIEW, which combined the 2 tables, and adding indexes to this view ... but it worked out virtually as slow as the simple SQL.

One the view has been created it should be pretty good. A separate table would probably be better though

>Have just tried simple "OR" and it is about 6x faster ... I think that is probably good for now. Thanks!

Can you post the sql you came up - just curious. I think that if you can get the most frequently used (my guess would be name) first and physically split the queries it should be a bit faster "most" of the time

>(Still curious about other ways to solve the problem though!)

I have had a wild and whacky idea this morning, I need to test it to see if it will work though - but before I do - do you need to return any columns from FARMERS or are they all from PEOPLE?

Also how is the code structured (I can guess at names)?

Roy Lambert



Mon, Jan 23 2017 3:06 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

Adam


Just had a quick play with a lot smaller tables than yours

SELECT * FROM
Contacts P
LEFT JOIN Career F ON P._ID = F._fkContacts
WHERE

COALESCE(_Surname LIKE '%WATLI%',_fkJobCodes LIKE '%WATLI%')

c0.8 seconds

SELECT * FROM
Contacts P
LEFT JOIN Career F ON P._ID = F._fkContacts
WHERE

_Surname LIKE '%WATLI%'
OR
_fkJobCodes LIKE '%WATLI%'

c 0.2 seconds

SELECT * FROM
Contacts P
LEFT JOIN Career F ON P._ID = F._fkContacts
WHERE

_Surname+' '+F._fkJobCodes LIKE '%WATLI%'

c 2.4 seconds


I assume that you get multiple results returned and present a list to select from.


Roy Lambert
Mon, Jan 23 2017 5:46 AMPermanent Link

Adam Brett

Orixa Systems

Thanks Roy

Your SQL trials on your db, were very similar to mine & came back more-or-less the same time-variations as mine.

The "OR" means you search twice, but on well-formed Indexes ... so each search is really fast. If you concatenate the fields the index goes out the window, making the process slow.

For ease and speed I think the "OR" is the best option.

It would be possible to add the "FarmerCode" field to the People table as a Generated field which would speed up the search much more, as you could then create an index on both fields. However there would be a bit of a performance hit returning the FarmerCode to the People Table.
Tue, Jan 24 2017 3:15 PMPermanent Link

Tim Young [Elevate Software]

Elevate Software, Inc.

Avatar

Email timyoung@elevatesoft.com

Adam,

<< Searching just the Name or Code returns data in a couple of seconds. With the more complex search it runs to 30+ seconds everytime! >>

I think you may be making things more complicated than you need to, but I could be wrong...

How about this:

SELECT
 --some fields
FROM People P
WHERE P.Name LIKE '%[STR PARAM]%'
UNION ALL
SELECT
 --some fields
FROM Farmers F
WHERE F.Code LIKE '%[STR PARAM]%'
ORDER BY.....

The reason why the LOJ is so slow is because EDB can't apply the WHERE clause *before* the join is executed due to the <F.Code> reference.  Therefore, it has to join all 500k+ rows first, and then filter them down using the WHERE clause.

Tim Young
Elevate Software
www.elevatesoft.com
Wed, Jan 25 2017 4:31 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

Tim


My reading of Adam's initial code was that he's pulling fields from the People table or both the People and Farmers tables so he'll need to have a JOIN or two in there somewhere.

Also if he drops the ALL and just goes with UNION it will prevent duplicates - won't it?


Roy Lambert

Image