Thread Migrating from DBISAM to ElevateDB: codepage question
Tue, Mar 11 2008 4:50 PM

Dan Rootham
Tim,

Almost all the tables which we have so far migrated from DBISAM
use the "default" Windows 1252 ANSI / West European codepage,
and these tables have migrated without any problem.

For Chinese and Japanese we held the text in a blob field in DBISAM,
and it's no trouble to decode that separately after migrating to EDB.

But our Russian DBISAM tables use codepage Win-1251 (Cyrillic), and
have this as part of their definition:
   LANGUAGE "Russian" SORT "Default Order"
When migrating these tables, we found that the migrator just transferred
the data to the equivalent codepoint which would have been used for
Win-1252 West European. The data wasn't converted to the correct
Unicode codepoints for the Cyrillic alphabet.
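
To make the symptom concrete, here is a minimal sketch (not taken from the migrator) that decodes a single byte under both codepages with the Windows API. The program name and the DecodeByte helper are hypothetical, and a Windows Delphi console application is assumed. Byte $CF is the Win-1251 encoding of the Cyrillic capital letter Pe (U+041F); read as Win-1252 it becomes U+00CF instead, which is the kind of codepoint that ends up in the migrated table.

```pascal
program MisMapDemo;
{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;

// Hypothetical helper, for illustration only: decodes one byte under the
// given Windows codepage and returns the resulting Unicode codepoint.
function DecodeByte(B: Byte; CodePage: Cardinal): Word;
var
  Src: AnsiChar;
  Dst: WideChar;
begin
  Src := AnsiChar(B);
  Dst := #0;
  // convert exactly one input byte into a one-character output buffer
  MultiByteToWideChar(CodePage, 0, @Src, 1, @Dst, 1);
  Result := Ord(Dst);
end;

begin
  // the same byte, interpreted under the two codepages
  Writeln('As Win-1252: U+', IntToHex(DecodeByte($CF, 1252), 4));
  Writeln('As Win-1251: U+', IntToHex(DecodeByte($CF, 1251), 4));
end.
```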

To convert the data fully, we had to undertake a separate conversion
step on the EDB table, as shown in the procedure below which uses
the Windows API MultiByteToWideChar.

Would you expect the EDB migrator to convert DBISAM tables which
used any of the following codepages:
 Win-1250  Central European
 Win-1251  Cyrillic
 Win-1253  Greek
 Win-1254  Turkish
 Win-1255  Hebrew

Or is it down to the user to do the required extra conversion?
Does any other DBISAM user have any migration experience here
with codepages other than Win-1252 (ANSI / West European)?

Many thanks,
Dan
--  Lexicon Software Ltd, UK --

procedure TfrmConvert.btnConvertClick(Sender: TObject);
const
 CP_WINDOWS_1251 = 1251;
var
 iSize: integer;
 cpCyrillic: integer;
 sCyrillic: AnsiString;
 wsCyrillic: WideString;
begin
 // assign the codepage value for Win-1251 (Cyrillic)
 cpCyrillic := CP_WINDOWS_1251;

 // tblDictionary is the EDB table which holds the migrated data
 dm.tblDictionary.First;

 while not dm.tblDictionary.EOF do
 begin
   // assign the migrated Russian text as if still a codepage Win-1251 string
   sCyrillic := Trim(dm.tblDictionary.FieldByName('TermWin1251').AsString);

   // obtain the number of wide characters required for the converted text
   iSize := MultiByteToWideChar(cpCyrillic, 0, PAnsiChar(sCyrillic), Length(sCyrillic), nil, 0);

   // if there is data, do the conversion from Cyrillic codepage Win-1251 to Unicode
   if iSize > 0 then
   begin
     // size the widestring to fit, so that text of any length is converted
     SetLength(wsCyrillic, iSize);
     MultiByteToWideChar(cpCyrillic, 0, PAnsiChar(sCyrillic), Length(sCyrillic), PWideChar(wsCyrillic), iSize);
   end
   else
   begin
     wsCyrillic := '';
   end;

   // write the Unicode text as a widestring to a new field in the ElevateDB table
   dm.tblDictionary.Edit;
   dm.tblDictionary.FieldByName('TermUnicode').Value := wsCyrillic;
   dm.tblDictionary.Post;

   dm.tblDictionary.Next;
 end;
end;
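
The same two-call pattern should apply to the other codepages listed above (1250, 1253, 1254, 1255), so the conversion could be factored into a helper that takes the codepage as a parameter. The following is only a sketch under that assumption: the unit name and the CodePageToWide function are hypothetical and not part of ElevateDB or DBISAM.

```pascal
unit CodePageConv;

interface

// Hypothetical helper: converts raw single-byte text held in an AnsiString
// from the given Windows codepage (e.g. 1250..1255) to a WideString.
function CodePageToWide(const S: AnsiString; CodePage: Cardinal): WideString;

implementation

uses
  Windows;

function CodePageToWide(const S: AnsiString; CodePage: Cardinal): WideString;
var
  Len: Integer;
begin
  Result := '';
  if S = '' then
    Exit;

  // first call: ask how many wide characters the converted text needs
  Len := MultiByteToWideChar(CodePage, 0, PAnsiChar(S), Length(S), nil, 0);
  if Len <= 0 then
    Exit;

  // second call: convert straight into the result buffer
  SetLength(Result, Len);
  MultiByteToWideChar(CodePage, 0, PAnsiChar(S), Length(S), PWideChar(Result), Len);
end;

end.
```

With such a helper, the loop body would reduce to reading the migrated field into an AnsiString, calling CodePageToWide(sCyrillic, 1251), and posting the result to the Unicode field; a Greek table would pass 1253 instead.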



Tue, Mar 11 2008 5:04 PM

Tim Young [Elevate Software]

Elevate Software, Inc.

Email timyoung@elevatesoft.com

Dan,

<< But our Russian DBISAM tables use codepage Win-1251 (Cyrillic), and have
this as part of their definition:
   LANGUAGE "Russian" SORT "Default Order"
When migrating these tables, we found that the migrator just transferred
the data to the equivalent codepoint which would have been used for Win-1252
West European. The data wasn't converted to the correct Unicode codepoints
for the Cyrillic alphabet. >>

We are missing that final step that is required for non-US/European
languages in the migrators, so for now you'll have to do so manually.

If you could send me the Russian table, however, I'll see about getting this
added in the next release (if it's possible).

--
Tim Young
Elevate Software
www.elevatesoft.com

Tue, Mar 11 2008 5:13 PM

Dan Rootham
Tim, and everyone,

My apologies!

I forgot to mention that this migration question refers specifically
to the **Unicode** version of EDB: we are currently using v 1.08.

What I'm trying to do is to migrate all our various language tables
to EDB Unicode. In other words, not to use codepages any more....

Regards,
Dan

Tue, Mar 11 2008 6:07 PM

Dan Rootham
Tim,

<< If you could send me the Russian table, however, I'll see about getting this
added in the next release (if it's possible). >>

That would be great, but I know you have a raft of things to do already.

Is it the DBISAM Russian table which you need, or the migrated EDB table?

Thanks,
Dan

Wed, Mar 12 2008 3:38 PM

Tim Young [Elevate Software]

Elevate Software, Inc.

Email timyoung@elevatesoft.com

Dan,

<< That would be great, but I know you have a raft of things to do already. >>

It's okay, really.  Having a workload sure beats not having anything to do. :-)

<< Is it the DBISAM Russian table which you need, or the migrated EDB table? >>

The DBISAM Russian table will be fine, thanks.

--
Tim Young
Elevate Software
www.elevatesoft.com

Thu, Mar 13 2008 9:51 AM

Dan Rootham
Tim,

<< The DBISAM Russian table will be fine, thanks. >>

I actually emailed both the DBISAM and the EDB tables to you before I read this.
Sorry... :-)

Regards,
Dan

Thu, Mar 13 2008 2:59 PM

Tim Young [Elevate Software]

Elevate Software, Inc.

Email timyoung@elevatesoft.com

Dan,

<< I actually emailed both the DBISAM and the EDB tables to you before I
read this. Sorry... :-) >>

No problem at all.

Thanks,

--
Tim Young
Elevate Software
www.elevatesoft.com