
Thread CLOB field sometimes returned as garbage
Sun, Dec 27 2015 9:53 AM

Raul

Team Elevate

On 12/26/2015 4:01 PM, Malcolm wrote:
> I used the CLOB Editor to export the garbage from the EDB Manager into
> a file then opened it with a Hex Editor.
> Yay!  It showed a BOM (FF FE) followed by plain ANSI text.
> But it should have been UNICODE .. unless one of the two editors

AFAIK this is not a valid BOM - UTF16 encoding uses FE FF so if you're
seeing FF FE then it looks like you have the wrong endianness somewhere.


Raul
Sun, Dec 27 2015 9:59 AM

Raul

Team Elevate

On 12/27/2015 9:53 AM, Raul wrote:
> AFAIK this is not a valid BOM - UTF16 encoding uses FE FF so if you're
> seeing FF FE then looks like you have wrong endianness somewhere.

Never mind - not enough coffee.

This is a legit Windows little-endian BOM, but if your content is pure
ANSI then it's not legit UTF-16 (which should have a 00 byte added after
each ANSI character to get the 2 bytes).
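
To make the byte layout concrete, here is a minimal sketch in Python (not EDB or Delphi code; the sample string is made up). It builds a proper UTF-16LE payload and the BOM-plus-single-byte payload described from the hex editor, then decodes the second one as UTF-16 to show why it reads as garbage.

```python
import codecs

text = "Hello!"  # made-up sample, pure ASCII

# Proper UTF-16LE: BOM (FF FE), then two bytes per character,
# so every ASCII character is followed by a 00 byte.
good = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

# The pattern seen in the hex editor: same BOM, but single-byte text.
bad = codecs.BOM_UTF16_LE + text.encode("ascii")

print(good.hex(" "))  # ff fe 48 00 65 00 6c 00 6c 00 6f 00 21 00
print(bad.hex(" "))   # ff fe 48 65 6c 6c 6f 21

# A reader that trusts the BOM pairs the bytes up, so six ASCII
# characters collapse into three unrelated UTF-16 code units.
garbage = bad.decode("utf-16")
print(len(text), len(garbage))  # 6 3
```

With an odd number of single-byte characters the last byte has no partner, so a strict UTF-16 decoder would raise an error instead of returning text.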

Raul
Sun, Dec 27 2015 11:08 AM

Malcolm Taylor

Hi Raul

I can see the problem but I can't replicate it.
If I clear the CLOB and then save something in it using either the EWB
Manager or my program, it is correctly written with the BOM and the
UTF-16LE text.
I have then tried Backup and Restore but it all works as expected.
Frustrating!

However, from time to time it ends up corrupted and I can't figure out
why. Fortunately the data is not crucial so if users see the garbage
they tend to delete/replace it and don't bother to report it.  But that
is not a real solution.
Anecdotally it seems to happen after a Restore. But I can't be sure and
I have never managed to replicate it to order, 'though once corrupted
it stays corrupted.   Grrrr.

I had wondered if the Backup compression had stripped the zero bytes
then the Restore had sometimes failed to restore them.  But I can't
demonstrate that and I am inclined to take Tim at his word when he says
the CLOBs are not touched by compression.

I am starting to think I will have to read the first few bytes or so to
check for valid encoding so that I can try to rescue the contents if
the encoding appears wrong.  Or maybe I should give up on the CLOB
(sorry Tim) and see if I can make do with another data type.
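
A check like that could look roughly like the following Python sketch. It is only an illustration of the idea, not EDB or Delphi code: the function name `rescue_clob`, the `cp1252` default and the "at least half the high bytes are 00" threshold are all assumptions, and the heuristic would misfire on text that is mostly non-Latin.

```python
import codecs

BOM = codecs.BOM_UTF16_LE  # FF FE


def rescue_clob(raw: bytes, ansi_codec: str = "cp1252") -> str:
    """Decode CLOB bytes, falling back to a single-byte codec when the
    bytes after the BOM do not look like UTF-16LE (hypothetical helper)."""
    body = raw[len(BOM):] if raw.startswith(BOM) else raw
    if not body:
        return ""
    # In UTF-16LE Western text, the second byte of each pair is usually 00.
    high_bytes = body[1::2]
    looks_utf16 = len(body) % 2 == 0 and high_bytes.count(0) * 2 >= len(high_bytes)
    if looks_utf16:
        return body.decode("utf-16-le", errors="replace")
    # Assumption: the stray bytes are single-byte ANSI text.
    return body.decode(ansi_codec, errors="replace")


print(rescue_clob(BOM + "Hello!".encode("utf-16-le")))  # Hello!
print(rescue_clob(BOM + b"Hello!"))                      # Hello!
```

The same test could run when the field is loaded, so a corrupted value is repaired and rewritten in the correct encoding rather than shown to the user.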

The original purpose of raising this matter was to see if anyone else
had seen the behaviour.  But I appear to be unique!

Malcolm
Sun, Dec 27 2015 2:11 PM

Raul

Team Elevate

On 12/27/2015 11:08 AM, Malcolm wrote:
> I can see the problem but I can't replicate it.
> If I clear the CLOB and then save something in it using either the EWB
> Manager or my program, it is correctly written with the BOM and the
> UTF-16LE text.

Any chance something in your app uses UTF-8 and then for some reason
puts a UTF-16 BOM on? ANSI text (at least the plain ASCII part of it)
would be legit in UTF-8.
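
One way to test that theory on a corrupted value is to look at any non-ASCII characters in it, since those are the only bytes where UTF-8 and ANSI differ. The Python sketch below shows the difference; the helper name `guess_single_byte_source` and the choice of `cp1252` as "ANSI" are assumptions, and the result is only a guess because some ANSI byte sequences also happen to be valid UTF-8.

```python
ascii_only = "plain text"
accented = "caf\u00e9"  # "cafe" with an acute accent on the e

# Pure ASCII text is byte-identical in UTF-8 and in an ANSI code page.
assert ascii_only.encode("utf-8") == ascii_only.encode("cp1252")

# Non-ASCII characters are where the two encodings diverge.
print(accented.encode("cp1252").hex(" "))  # 63 61 66 e9
print(accented.encode("utf-8").hex(" "))   # 63 61 66 c3 a9


def guess_single_byte_source(body: bytes) -> str:
    """Guess what produced the bytes found after the BOM (hypothetical helper)."""
    if all(b < 0x80 for b in body):
        return "ascii (UTF-8 and ANSI are indistinguishable)"
    try:
        body.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "ansi"
```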

> The original purpose of raising this matter was to see if anyone else
> had seen the behaviour.  But I appear to be unique!

Can't help you there unfortunately. In our case our main app is still
DBISAM, and on the EDB side of things we have not run into this.

Raul
Tue, Dec 29 2015 12:49 PM

Tim Young [Elevate Software]

Elevate Software, Inc.



Malcolm,

<< I can see the problem but I can't replicate it. If I clear the CLOB and then save something in it using either the EWB Manager or my program, it is correctly written with the BOM and the UTF-16LE text. I have then tried Backup and Restore but it all works as expected. Frustrating! >>

You should try to isolate all areas where data is assigned/written to the CLOB field navigationally.  That's the only way that such data could sneak into a Unicode CLOB field.  At the VCL TField/TDataSet level, EDB has no control over how the data is streamed into the field and, because a CLOB is just a BLOB with some special SQL handling, it will just store it like a BLOB.

<< I had wondered if the Backup compression had stripped the zero bytes then the Restore had sometimes failed to restore them.  But I can't demonstrate that and I am inclined to take Tim at his word when he says the CLOBs are not touched by compression. >>

The backup and restore have no idea what a CLOB is. :-) They are strictly raw operations on files. In fact, you could, with a little modification, use them as a general backup utility.

<< I am starting to think I will have to read the first few bytes or so to check for valid encoding so that I can try to rescue the contents if the encoding appears wrong.  Or maybe I should give up on the CLOB (sorry Tim) and see if I can make do with another data type. >>

If you switch to a plain BLOB, I *guarantee* that you'll have the same problem (see my first comment above).

Tim Young
Elevate Software
www.elevatesoft.com