Support Forums - View Thread

View Thread

The following is the text of the current message along with any replies.

Messages 1 to 10 of 12 total

Tue, Mar 9 2010 11:32 AM	Permanent Link
Roy Lambert NLH Associates Team Elevate	I do hope John Hay reads the ElevateDB ngs as well as the DBISAM ones. I have four tables Contacts, Career, Companies, Sites Contacts is a one to many with Career Companies is a one to many with Sites Career is one to one with Sites and one to many with Companies With some data loading from external DBs I now have multiple occurrences of the same person (sites as well but I've sussed that and companies can wait). I can almost get the SQL to identify duplicate contacts BUT I can only get one of them. My logic is if two people with the same forename and same surname exist in the same site of a company its a good guess they are the same person so SELECT _fkContacts, _fkCompanies, _fkSites, _Forename, _Surname FROM Career JOIN Contacts C ON _fkContacts = C._ID GROUP BY _Surname,_Forename,_fkCompanies,_fkSites HAVING COUNT(_Surname + _Forename + CAST(_fkCompanies AS VARCHAR(16)) + CAST(_fkSites AS VARCHAR(16))) > 1 The top one on the list is Anthony Allison - ID 1008199. What I need is the other ID (1000095) as well. Any ideas? Roy Lambert
Tue, Mar 9 2010 1:19 PM	Permanent Link
Roy Lambert NLH Associates Team Elevate	I do enjoy feeling like a moron SELECT _ID FROM Contacts WHERE _ID IN (SELECT..... Roy Lambert
Tue, Mar 9 2010 1:24 PM	Permanent Link
Roy Lambert NLH Associates Team Elevate	I now feel an even bigger buffoon. If the original query doesn't produce the full list adding the extra bit won't help. Roy Lambert
Tue, Mar 9 2010 10:00 PM	Permanent Link
Raul Team Elevate	Roy, I think it's the group by that's limiting your output. How about starting with getting duplicate entries from one table first ? Something like this : select C1._fkContacts,C1._Surname,C1._Forename,C2._fkContacts as _fkContacts2, C2._Surname,C2._Forename from Career C1,Career C2 where c1._fkContacts<c2._fkContacts and c1._Surname=c2._Surname and c1._Forename=c2._Forename and C1._fkCompanies = C2._fkCompanies and C1._fkSites=C2._fkSites Then you should be able to use one of the _fk fields to join to other table. Raul
Wed, Mar 10 2010 3:01 AM	Permanent Link
Roy Lambert NLH Associates Team Elevate	Raul >I think it's the group by that's limiting your output. Me to. >How about starting with getting duplicate entries from one table first ? I'd love to, unfortunately 1) _Forename, _Surname are in Contacts, _fkCompanies, _fkSites are in Career so I have to use two tables and 2) Career contains c20k records (Contacts a bit less) and a cartesian join on that takes forever Thanks for the suggestion though, The C1._fkContacts < C2._fkContacts is something I never thought of but feels as if it should help. I did come up with a solution but it doesn't work (I suspect a bug) Roy Lambert
Wed, Mar 10 2010 3:16 AM	Permanent Link
Roy Lambert NLH Associates Team Elevate	Raul Altering the SQL to select _Surname, _Forename FROM Contacts C1, Contacts C2 where c1._ID<c2._ID and c1._Surname=c2._Surname and c1._Forename=c2._Forename just to try and Forever = 1274.887 secs Roy Lambert
Wed, Mar 10 2010 9:07 AM	Permanent Link
Tim Young [Elevate Software] Elevate Software, Inc. timyoung@elevatesoft.com	Roy, << I did come up with a solution but it doesn't work (I suspect a bug) >> What is it that you suspect is a bug ? -- Tim Young Elevate Software www.elevatesoft.com
Wed, Mar 10 2010 9:08 AM	Permanent Link
Tim Young [Elevate Software] Elevate Software, Inc. timyoung@elevatesoft.com	Roy, << just to try and Forever = 1274.887 secs >> http://www.elevatesoft.com/supportfaq?action=view&category=edb&question=joins_where_clause -- Tim Young Elevate Software www.elevatesoft.com
Wed, Mar 10 2010 10:12 AM	Permanent Link
Raul Team Elevate	Roy, Hopefully you get your solution working (bug and all) but question : is this a one time operation or do you need to do this regularly. If one time then creating a temp intermediary tables (do all joins so you're down to single table and then eliminate non-duplicates using your group by logic) might be faster as you'd have all the relevant IDs at the end that just need to be merged. Raul
Wed, Mar 10 2010 11:11 AM	Permanent Link
Roy Lambert NLH Associates Team Elevate	Tim I started a new thread for it: Is this In comparison a bug Roy Lambert

	Page 1 of 2	Next Page »
Jump to Page: 1 2