Support Forums - View Thread

View Thread

The following is the text of the current message along with any replies.

Messages 11 to 15 of 15 total

Filter optimization stumper

Thu, Feb 8 2007 1:46 PM	Permanent Link
Tim Young [Elevate Software] Elevate Software, Inc. timyoung@elevatesoft.com	Chris, << I have to agree with Sam, as it took me quite a bit of swallowing to get my head around this UPPER(xxx) = UPPER(yyy) after Informix and MSSQL where I used WHERE <this field> = <'this value'> on case-insensitive indexes or case-sensitive indexes depending on the way the application needs to work, and got the appropriate result set. >> There's no way that they allowed the use of either type of index with the same field comparison. That would break existing SQL statements just by modifying the index definition. << Indeed, this does work differently for apparently identical SQL, but only because the index says it should. This is exactly how specifying UNIQUE works. >> UNIQUE is a constraint and a completely different thing altogether. << What Sam wants, ISTM, and what I initially expected (and keep on forgetting I won't get) is an index that silently takes a peek at its own definition and acts accordingly, without the programmer having to add these unnecessary and obfuscating constructs, just as it does to decide whether it can accept a new row or not on a unique index. >> How does that help ? The index can't read the developers mind, so it has no way of knowing what type of comparison is desired. << There is also the problem of collation for e.g. French accented characters, where é=è=ê=e=E in a case-insensitive index, except in DBISAM where they are only the same if included in UPPER(). If you use LOWER() they should not be seen as the same - I haven't tried that one on a case- insensitive index in DBISAM. >> If that's the case then it is a bug and needs to be corrected by removing the LOWER() function's ability to force a case-insensitive comparison. However, most people don't use the LOWER() function and opt for the UPPER() function anyways. -- Tim Young Elevate Software www.elevatesoft.com
Thu, Feb 8 2007 5:22 PM	Permanent Link
Sam Karl	"Tim Young [Elevate Software]" <timyoung@elevatesoft.com> wrote: Sam, << I noticed a lot of Delphi databases you have to apply "Upper(Col)='ABC'" on a case insensitive index, and I just don't get it. MySQL indexes are case insenstive by default and we can use "Where col='ABC'" or "col='AbC'" and it returns the same rows. So why can't the database engine be smart enough to realize the index is case insensitive, and add the Upper() part for us behind the scene. Otherwise why would someone want to perform a case sensitive search on a case insensitive index which means going through the entire table? >> <Indexes don't even exist in the SQL standard, so forget them when it comes to how an SQL statement should work. SQL statements should work exactly as written, which is not something that MySQL and others are doing if they take a case-sensitive search and make it case-insensitive simply because an index exists on the column being compared. In fact, I have a hard-time believing that this is even the case. Perhaps what you meant to say was that, by default, all MySQL string column comparisons are case-insensitive ?> Yes, that's what I meant. <EDB, for example, will allow you to perform a case-insensitive search on a column without the UPPER() provided that the collation assigned to the column includes the case-insensitive attribute. However, that is different from an index definition because it is standard part of a table definition and results in the same rows being returned regardless of whether an index exists or not. DBISAM doesn't have column-level collations, so it has to rely on the UPPER() construct for you to tell it that the comparison is case-insensitive. This is the key principle - searches and comparisons should perform in the same fashion and return the exact same rows irrespective of any defined indexes. The SQL standard dictates that all comparions are case-sensitive unless the collation used dictates otherwise.> I understand your point. You're adhering to the SQL standard for compatibility reasons. But to me, after using case insensitive searches for years without having to resort to Upper() functions, it looks like the SQL standard is just plain wrong. At least it is more labor intensive for the programmer and he is more apt to make a mistake. Maybe I should start a letter writing campaign? Write my congressman or hire a lobbyist? Sam
Fri, Feb 9 2007 3:48 PM	Permanent Link
Tim Young [Elevate Software] Elevate Software, Inc. timyoung@elevatesoft.com	Sam, << I understand your point. You're adhering to the SQL standard for compatibility reasons. But to me, after using case insensitive searches for years without having to resort to Upper() functions, it looks like the SQL standard is just plain wrong. At least it is more labor intensive for the programmer and he is more apt to make a mistake. Maybe I should start a letter writing campaign? Write my congressman or hire a lobbyist? >> Actually the standard is to use a forced collation to effect a case-insensitive comparison. EDB had this in there along with the UPPER() / LOWER() constructs to be compatible with DBISAM. However, after Chris's remarks regarding the LOWER() issue, I realized that it would be best if EDB only used the forced collation route. Basically, you would use something like this to force a case-insensitive comparison in EDB: SELECT * FROM customer WHERE Company COLLATE "ANSI_CI" LIKE 'Silicon%' The SQL standard dictates that collations be exactly compatible on both sides of a binary operation like the above or else an error is raised. However, EDB relaxes this a bit and only uses the left side to determine what collation should be used to affect the result of the binary operation. -- Tim Young Elevate Software www.elevatesoft.com
Mon, Feb 12 2007 4:56 AM	Permanent Link
Chris Erdal	"Tim Young [Elevate Software]" <timyoung@elevatesoft.com> wrote in news:C811E1D8-FEB6-4C77-96A4-FA93D21B4E78@news.elevatesoft.com: Tim, I sense a little agro in your response, and I think it's probably justified: > ... after Informix and MSSQL > where I used WHERE <this field> = <'this value'> on case-insensitive > indexes or case-sensitive indexes depending on the way the application > needs to work, and got the appropriate result set. >> > > There's no way that they allowed the use of either type of index with > the same field comparison. That would break existing SQL statements > just by modifying the index definition. I had a look at one MSSQL 6.5 app I was thinking of, and in fact I had only one index that was liable to be affected by case (because the others were all numbers or uppercase codes) and that index was case-insensitive because the WHOLE DATABASE was declared so by its collation when created, so the field on which the index was based was case-insensitive. So what remains of my point is simply that when collation specifies case- insensitivity there is no need to add UPPER() around both sides for it to work. The corollary of that is of course that there's no way to select different versions of a string for that field in index order, which meant that the user complained he couldn't find mis-spelt distinct strings such as "THis wOrd" to correct them because only "This Word" was visible when ordering by this field. So in fact your implementation is much more flexible, and from now on I'll be thanking you for the added possibilities for the programmer, instead of muttering "bloody UPPERs again" under my breath as I type! > ><< Indeed, this does work differently for apparently identical SQL, but >only > because the index says it should. This is exactly how specifying > UNIQUE works. >> > > UNIQUE is a constraint and a completely different thing altogether. From the DB engineer's point of view, yes. From my point of view it's just another special word that adds special characteristics to the way things happen. > ><< What Sam wants, ISTM, and what I initially expected (and keep on > forgetting I won't get) is an index that silently takes a peek at its > own > definition and acts accordingly, without the programmer having to add > these > unnecessary and obfuscating constructs, just as it does to decide > whether it can accept a new row or not on a unique index. >> > > How does that help ? The index can't read the developers mind, so it > has no way of knowing what type of comparison is desired. what I meant was that one could specify what type of index one was creating, and from then on it would be that type without our needing to add UPPER() around each side for case-insensitivity. But that of course removes the possibility of searching case-sensitively on a field if there is a case-insensitive index on it. > ><< There is also the problem of collation for e.g. French accented > characters, where é=è=ê=e=E in a case-insensitive index, except in > DBISAM where they are only the same if included in UPPER(). If you use > LOWER() they should not be seen as the same - I haven't tried that one > on a case- insensitive index in DBISAM. >> > > If that's the case then it is a bug and needs to be corrected by > removing the LOWER() function's ability to force a case-insensitive > comparison. However, most people don't use the LOWER() function and > opt for the UPPER() function anyways. > I was asserting that off the top of my head - sorry - I haven't tested it. And anyway, using UPPER all the time is simpler anyway, now I've understood why it's required. -- Chris (XP-Pro + Delphi 7 Architect + DBISAM 4.25 build 3 + EDB 1.00 build 8)
Mon, Feb 12 2007 3:44 PM	Permanent Link
Tim Young [Elevate Software] Elevate Software, Inc. timyoung@elevatesoft.com	Chris, << I sense a little agro in your response, and I think it's probably justified: >> Not at all, I'm just trying to be clear. My apologies if it seemed otherwise. << I had a look at one MSSQL 6.5 app I was thinking of, and in fact I had only one index that was liable to be affected by case (because the others were all numbers or uppercase codes) and that index was case-insensitive because the WHOLE DATABASE was declared so by its collation when created, so the field on which the index was based was case-insensitive. >> Yes, that would be acceptable in that case because the case-sensitivity is not index-dependent. << So what remains of my point is simply that when collation specifies case-insensitivity there is no need to add UPPER() around both sides for it to work. >> Correct. << The corollary of that is of course that there's no way to select different versions of a string for that field in index order, which meant that the user complained he couldn't find mis-spelt distinct strings such as "THis wOrd" to correct them because only "This Word" was visible when ordering by this field. >> Good point. << So in fact your implementation is much more flexible, and from now on I'll be thanking you for the added possibilities for the programmer, instead of muttering "bloody UPPERs again" under my breath as I type! >> Well, the UPPER/LOWER stuff is now gone in EDB in favor the standard COLLATE clause, so it is even easier: MyColumn COLLATE "ANSI_CI" = 'MyValue' << From the DB engineer's point of view, yes. From my point of view it's just another special word that adds special characteristics to the way things happen. >> True. << I was asserting that off the top of my head - sorry - I haven't tested it. And anyway, using UPPER all the time is simpler anyway, now I've understood why it's required. >> You are correct - using LOWER() won't work in some cases because of the collation rules. At any rate, that way is gone so it won't be an issue anymore. -- Tim Young Elevate Software www.elevatesoft.com

« Previous Page	Page 2 of 2
Jump to Page: 1 2