Icon View Thread

The following is the text of the current message along with any replies.
Messages 11 to 20 of 33 total
Thread Creating a new datatype in the elevatedb
Wed, Jul 1 2015 4:15 AMPermanent Link

Matthew Jones

PeterBeyer wrote:

> As you can see, the numbers is not sorted correct, because the field
> is a string field. With the Admind field type, it will be sorted this
> way
>
> 1
> 2
> 11
> DKK
> EURO
> NOK
> USD
>
> That makes this fieldtype very powerful, because you can use it for
> many things (and we do that).

What you need is a suitable "collation". That's where I'd focus my
attention. I'm sort of surprised this is not a standard sort available
on anything, but it isn't hard to do. I looked at the collations in
EDB, and it seems to have the standard system ones. What I'd want to do
is look at whether you can do a custom collation, and if so, just do
that and set all character columns to use it.

--

Matthew Jones
Wed, Jul 1 2015 4:58 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

Matthew

>What you need is a suitable "collation". That's where I'd focus my
>attention. I'm sort of surprised this is not a standard sort available
>on anything, but it isn't hard to do. I looked at the collations in
>EDB, and it seems to have the standard system ones. What I'd want to do
>is look at whether you can do a custom collation, and if so, just do
>that and set all character columns to use it.

Fascinating idea and I'll be interested to see what Tim says about it. From the manual

<<The available collations in ElevateDB are dynamic and reflect the available installed locales in the
operating system. In addition, ElevateDB includes one default collation>>

so if Windows has it you should be able to use it. However, I don't know how Windows Explorer handles natural sorting but I suspect its not through a built in collation.

Roy
Wed, Jul 1 2015 5:09 AMPermanent Link

Matthew Jones

Indeed - I have no idea how to make it work, but the collation is the
best place to attack it. The addition of another column to mimic the
sorting is just hacking the collation.

I am not saying this is easy, but it is the right solution, if possible.

--

Matthew Jones
Wed, Jul 1 2015 7:19 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

Matthew

>Indeed - I have no idea how to make it work, but the collation is the
>best place to attack it. The addition of another column to mimic the
>sorting is just hacking the collation.
>
>I am not saying this is easy, but it is the right solution, if possible.

As I think about it I have to disagree with you. The collation tells you how characters sort in relation to one another. That is a relatively trivial thing. What we're talking about here is actually processing a string and altering it. OK collations already do that a little bit with the _CI_AI_KI_WI flags so it should be possible to add a _NS  (natural numbers sort) to those but I'd hesitate. The other flags are dead simple to understand and shouldn't give any cause for confusion. If natural number sorting is restricted to the equivalent of left padding numbers only with 0 or #32 again no problem but what about the case where the values are:

ABC1
ABC101
ABC11

and that's how it will sort unless you left pad the number portion.

And what about the nastier instance where the value is a mixed arrangement (eg A12/3/A-6)

OK you could program those into the collation handling bit of code but , much as I'm generally in favour of making life easier, I think its better handled  at the application level

Roy Lambert
Wed, Jul 1 2015 7:53 AMPermanent Link

Matthew Jones

I think that the _WI flags are a good place for this. I just checked
the definition of "collation" to check I'm not being mad. "the
verification of the number and order of the leaves and signatures of a
volume." which fits this perfectly - it about the order of things.

FWIW, my code for this is simple. You have a simple state machine which
has either character or number mode, starting in character. Look at the
next character to compare. If a digit, switch to number mode, otherwise
compare as usual. In number mode, build up a string of the numbers for
each string, convert to numeric, use that as compare. Repeat to end.

So complex numbers like AB07/90/37T6 are just switches between the
states as you go. As you rightly pointed out, Windows has the natural
sort order now, but I did this back in 1989 and it wasn't hard.

Heck, here is the code. All ANSI based and C, but it works:


--

Matthew Jones
Wed, Jul 1 2015 7:53 AMPermanent Link

Matthew Jones

Matthew Jones wrote:

> Heck, here is the code. All ANSI based and C, but it works:

Darned accidental Enter key hits!

// Routine to compare strings with numbers in. It results in Fred10
being greater than
// Fred5, to produce a more natural sort.
int gl_strcmpx(const char GLOBAL *lpFirst, const char GLOBAL *lpSecond)
{
   int iDiff, index;
   char caFirstNum[50], caSecondNum[50];

   while(*lpFirst)
   {
      if(isdigit(*lpFirst) && isdigit(*lpSecond))
      {
         index = 0;
         while(isdigit(*lpFirst) && index < (50 - 1))
            caFirstNum[index++] = *(lpFirst++);
         caFirstNum[index] = '\0';

         index = 0;
         while(isdigit(*lpSecond) && index < (50 - 1))
            caSecondNum[index++] = *(lpSecond++);
         caSecondNum[index] = '\0';

         iDiff = atoi(caFirstNum) - atoi(caSecondNum);
         if(iDiff != 0)
            return(iDiff);
         // else carry on, already from after the numbers
      }
      if(*lpFirst == *lpSecond || (gl_toupper(*lpFirst) ==
gl_toupper(*lpSecond)))
      {
         lpFirst++;
         lpSecond++;
      }
      else
      {
         break;
      }
   }
   return(gl_toupper(*lpFirst) - gl_toupper(*lpSecond));
}

--

Matthew Jones
Wed, Jul 1 2015 9:17 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

Matthew


I> think that the _WI flags are a good place for this. I just checked
>the definition of "collation" to check I'm not being mad. "the
>verification of the number and order of the leaves and signatures of a
>volume." which fits this perfectly - it about the order of things.


Naughty boy - you missed of the fact that that definition applies to bibilography or publishing. This one from (sudder) wikipedia sort of supports you.

<<Collation is the assembly of written information into a standard order. Many systems of collation are based on numerical order or alphabetical order, or extensions and combinations thereof. Collation is a fundamental element of most office filing systems, library catalogs and reference books.>>

>FWIW, my code for this is simple.

Never said it wasn't simple what I'm talking about is user comprehension (user could be an individual or an organisation). In very general terms people expect numbers to sort as numbers so strings as numbers - no problem - apart from this which I'm ashamed to say I quote from wikipedia

<<Strings representing numbers may be sorted based on the values of the numbers that they represent. For example, "-4", "2.5", "10", "89", "30,000". Note that pure application of this method may provide only a partial ordering on the strings, since different strings can represent the same number (as with "2" and "2.0", or when scientific notation is used, "2e3" and "2000").>>

and if you really want to be freaky add in hex, octal etc - way back in my mainframe days I was trying to save space (processing speed didn't matter to much) and converted everything to base 36.


Mostly people will also expect a and A to be together. I can't comment about accents etc cos I'm a monoglot Frown

Compound stuff is more difficult - not to process but to hit expectations.

>Heck, here is the code. All ANSI based and C, but it works:

As I said I'm a monoglot - I don't speak C Smiley

Roy
Wed, Jul 1 2015 9:27 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

Matthew


Just read a bit further down the google list and came up with

The sort order to use for Unicode data types (nchar, nvarchar, and ntext). A sort order defines the sequence in which characters are sorted, and the way characters are evaluated in comparison operations.

from https://technet.microsoft.com/en-us/library/aa174903(v=sql.80).aspx

Roy Lambert
Wed, Jul 1 2015 10:28 AMPermanent Link

Matthew Jones

Roy Lambert wrote:

> Just read a bit further down the google list and came up with

Indeed, quite a lot on SQL Server, but I wasn't sure how relevant it
was. But that made me find this:

http://stackoverflow.com/questions/16935989

And this, which discusses the types of ordering

http://stackoverflow.com/questions/6810619

I think we need Tim though, to tell us how hard adding a new Collation
type would be. Me, I think this would be widely appreciated, and a nice
extension option. But I don't have to do the work of course. 8-)

--

Matthew Jones
Wed, Jul 1 2015 11:26 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

Matthew

I love these sort of discussions

>Indeed, quite a lot on SQL Server, but I wasn't sure how relevant it
>was. But that made me find this:
>
>http://stackoverflow.com/questions/16935989

Nice this one - it agrees with me Smile

>And this, which discusses the types of ordering
>
>http://stackoverflow.com/questions/6810619

Very good explanation - especially the German bit

>I think we need Tim though, to tell us how hard adding a new Collation
>type would be. Me, I think this would be widely appreciated, and a nice
>extension option. But I don't have to do the work of course. 8-)

I don't think adding another switch in should be a major problem (he says without having, and not wanting, the source) since, as you say, its relatively simply programming. It depends on, for example, wether the _CI switch is implemented by simply using UpperCase/LowerCase on the data or what. If it is implemented my vote is on a very simple approach - totally ignore compound data

Roy
« Previous PagePage 2 of 4Next Page »
Jump to Page:  1 2 3 4
Image