Thread IMPORT with QUOTE CHAR #0
Tue, Dec 1 2009 9:48 AM

Tim Young [Elevate Software]

Elevate Software, Inc.

Email timyoung@elevatesoft.com


<< Never mind - I found out what the issue is.  The export is still screwed
up a bit, and is including #0 delimiters in the output data.   A fix will be
in the next build. >>

Hold a second - I was testing it with the delimiter character as #0, which
is why the #0 characters were in the output file.

The problem with the import is most likely due to the data that is coming
in.  If you tell EDB to use #0 as the quote character, then it really has no
way of dealing properly with string values that have spaces in them.   Is
this the case with the file that you're trying to import ?

--
Tim Young
Elevate Software
www.elevatesoft.com

Tue, Dec 1 2009 12:26 PM

"Malcolm"
>
> << Never mind - I found out what the issue is.  The export is still
> screwed up a bit, and is including #0 delimiters in the output
> data.   A fix will be in the next build. >>
>
> Hold a second - I was testing it with the delimiter character as
> #0, which is why the #0 characters were in the output file.
>
Tim, are you quite sure about the above?
Remember that I am using the unicode version.
What I see when using #0 for export is that the output file is
'perfectly' formed.  There are *no* extra #0 chars in the output, so
with a delimiter char '|' and quote char #0 I see something like this
for a varchar column value <#=null for readability>:

 |#T#i#m# #Y#o#u#n#g#|#

In other words, using only ascii characters, each char is two bytes
comprising the ascii code followed by a null.  There are *no* extra
(double) nulls for the quote char #0.

So I think your output is just fine.  Well done. :)

> The problem with the import is most likely due to the data that is
> coming in.  If you tell EDB to use #0 as the quote character, then
> it really has no way of dealing properly with string values that
> have spaces in them.   Is this the case with the file that you're
> trying to import ?

(As an aside, can't EDB just read from delimiter to delimiter or does
your parser *require* quoting when handling any textual data?)

Whatever build I have here (reports 2.03 b6) seems to behave oddly
only with the IMPORT.

What has surprised me, and has allowed me to get my whole import
function working, is the undoubted fact that an import file with *no*
quoting *and* with multiple word text values will happily import if I
do not specify a quote char and thus default to '"'.  This may not be
WAD(?) - it is expecting quotes but works just fine without them! But
this may be a special case for absent quotes. (would be an acceptable
convention for me, and probably others.)

But if I specify #0 it does load all records .. but with NULL values
for all fields no matter the target column type.  I suspect this is
where you need to look for the real issue.
Do you expect the import to succeed with the default quote char even
if no textual column values are quoted?
And yes, some of my import columns have multi-word content - but
they import fine so long as I don't specify #0.

I hope this helps.  But please don't change the export unless you
know something I am not seeing!  It works really well for me. <bg>

Malcolm
--
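
For anyone following along, the two-byte layout Malcolm describes above is what UTF-16 little-endian encoding produces for ASCII text. The snippet below is only an illustrative sketch in Python, not anything taken from EDB; the helper name `show_utf16le` is made up for this example, and it assumes the unicode export writes UTF-16LE without a byte order mark in the part of the file shown.

```python
def show_utf16le(text: str, null_marker: str = "#") -> str:
    """Encode text as UTF-16LE and render it byte by byte.

    NUL bytes are shown as null_marker, matching the '#=null'
    convention used in the post. Hypothetical helper for illustration.
    """
    raw = text.encode("utf-16-le")
    # Each ASCII character becomes its code byte followed by a 0x00 byte.
    return "".join(null_marker if b == 0 else chr(b) for b in raw)


print(show_utf16le("|Tim Young|"))
```

The rendering is only meaningful for ASCII input; non-ASCII characters have a non-zero second byte and would show up as two arbitrary characters.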
Wed, Dec 2 2009 12:20 PM

Tim Young [Elevate Software]

Elevate Software, Inc.

Email timyoung@elevatesoft.com

Malcolm,

<< Tim, are you quite sure about the above? >>

Yes.  Note that I said *delimiter* character, and not quote character.  I
was mistakenly using #0 for the delimiter character, not the quote
character.  The changes that I made were only for the quote character.

<< (As an aside, can't EDB just read from delimiter to delimiter or does
your parser *require* quoting when handling any textual data?) >>

Yes, that is what it does.  I was mistaken again, my statement should have
been:

"The problem with the import is most likely due to the data that is coming
in.  If you tell EDB to use #0 as the quote character, then it really has no
way of dealing properly with string values that have *the delimiter
character* in them."

<< Whatever build I have here (reports 2.03 b6) seems to behave oddly only
with the IMPORT. >>

If you can send me the database catalog that you're using and an import
file, I can clear this up very quickly.  I'm kind of just guessing as to
what you're doing at this point.

<< What has surprised me, and has allowed me to get my whole import function
working, is the undoubted fact that an import file with *no* quoting *and*
with multiple word text values will happily import if I do not specify a
quote char and thus default to '"'.  This may not be WAD(?) - it is
expecting quotes but works just fine without them! >>

Yes, EDB backs off to just examining everything between the delimiters if
there aren't any quotes.  Quotes are only absolutely necessary if there are
embedded characters that are also used as delimiters, etc.

--
Tim Young
Elevate Software
www.elevatesoft.com
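
The behaviour described in this reply (quotes are optional, and only matter when a value contains the delimiter) can be sketched roughly as follows. This is a minimal illustration in Python under stated assumptions, not EDB's actual parser: the function `split_fields` and its parameters are hypothetical, and it ignores details such as doubled quotes inside quoted values.

```python
def split_fields(line: str, delimiter: str = "|", quote: str = '"') -> list[str]:
    """Split one line into fields; quoting is optional (illustrative only)."""
    fields, current, in_quotes = [], [], False
    for ch in line:
        if ch == quote:
            # Toggle quoted mode; the quote character itself is dropped.
            in_quotes = not in_quotes
        elif ch == delimiter and not in_quotes:
            # An unquoted delimiter ends the current field.
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    return fields


# Unquoted multi-word values are read from delimiter to delimiter.
print(split_fields("1|Tim Young|Elevate Software"))
# Quotes are only needed to protect an embedded delimiter.
print(split_fields('1|"a|b"|x'))
# Without quotes, an embedded delimiter splits the value in two.
print(split_fields("1|a|b|x"))
```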

Wed, Dec 2 2009 12:44 PM

"Malcolm"
Hi Tim

OK, then that is all fine and dandy.
 *  Export is WAD
 *  Import, it would seem, handles unquoted strings fine so long as
the delimiter char is never embedded in the string data.

So it is all working fine so far as I am concerned now, thanks.

On your side, do you need to be sure that the quote char #0 is
illegal for import unless you can spot the reason for it failing
(assuming you see the failure I see)?

I think my 2.03 b6 installation is OK in the light of all this.

Malcolm

--
Wed, Dec 2 2009 10:41 PM

Tim Young [Elevate Software]

Elevate Software, Inc.

Email timyoung@elevatesoft.com

Malcolm,

<< On your side, do you need to be sure that the quote char #0 is illegal
for import unless you can spot the reason for it failing (assuming you see
the failure I see)? >>

The #0 character for import is legal when used as the quote character, per
the change that I made.  It is interpreted as "there are no quotes around
any strings".

--
Tim Young
Elevate Software
www.elevatesoft.com
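
As a rough analogy for "#0 means there are no quotes around any strings", Python's standard `csv` module has a comparable switch, `csv.QUOTE_NONE`. The sketch below is only an analogy and says nothing about how EDB implements it; the wrapper `parse_lines` and the convention of passing "\x00" as the quote character are assumptions made for this example.

```python
import csv


def parse_lines(lines, quote_char='"', delimiter="|"):
    """Hypothetical wrapper: a quote_char of chr(0) means 'no quoting at all'."""
    if quote_char == "\x00":
        # Quote characters get no special treatment; fields run
        # strictly from delimiter to delimiter.
        reader = csv.reader(lines, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    else:
        reader = csv.reader(lines, delimiter=delimiter, quotechar=quote_char)
    return list(reader)


# With "no quoting", any text between delimiters is taken as-is.
print(parse_lines(["1|Владимр Путин|x"], quote_char="\x00"))
# With the default quote character, quotes protect an embedded delimiter.
print(parse_lines(['1|"a|b"|x']))
```

In "no quoting" mode a literal `"` in the data is kept as part of the value, which is the trade-off of turning quoting off.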

Thu, Dec 3 2009 6:09 AM

"Malcolm"
Tim Young [Elevate Software] wrote:

> The #0 character for import is legal when used as the quote
> character, per the change that I made.  It is interpreted as "there
> are no quotes around any strings".

OK, Tim.

As I am still seeing the import misbehaving when #0 is specified as
the quote char, I will send you a sample file and the catalog as you
suggested.

Malcolm

--
Thu, Dec 3 2009 12:53 PM

"Malcolm"
Malcolm wrote:

>
> OK, Tim.
>
> As I am still seeing the import misbehaving when #0 is specified as
> the quote char, I will send you a sample file and the catalog as you
> suggested.
>
> Malcolm

For any lurkers, Tim has replicated the problem and the fix will be
in the next build.

Malcolm

--
Fri, Dec 25 2009 5:05 PM

"Malcolm"
Malcolm wrote:

> Malcolm wrote:
>
> For any lurkers, Tim has replicated the problem and the fix will be
> in the next build.
>
> Malcolm

Tim, here is a problem I see when using the workaround of using the
default QUOTE CHAR when the input data is unquoted.

If an input field *starts* with a Cyrillic character (and possibly
many or even any other non-ascii or non-ansi characters) the whole
field is imported as a NULL value.

But if I add an ascii char at the front (or use quotes!) it is
imported OK.

So this unquoted value ends up as a NULL:- Владимр Путин  
while this one starting with U+0061 ('a') is imported OK:-
aВладимр Путин

Hopefully this will be covered by your fix but you might want to test
before release.

Sorry, I had no idea what a can of worms I had opened .. but
importing unquoted strings is still an important feature for me.

Malcolm

--
Mon, Dec 28 2009 9:48 AM

Tim Young [Elevate Software]

Elevate Software, Inc.

Email timyoung@elevatesoft.com

Malcolm,

<< Tim, here is a problem I see when using the workaround of using the
default QUOTE CHAR when the input data is unquoted.

If an input field *starts* with a Cyrillic character (and possibly many or
even any other non-ascii or non-ansi characters) the whole
field is imported as a NULL value.

But if I add an ascii char at the front (or use quotes!) it is imported OK.

So this unquoted value ends up as a NULL:- Владимр Путин while this one
starting with U+0061 ('a') is imported OK:-
aВладимр Путин

Hopefully this will be covered by your fix but you might want to test
before release. >>

The fix covers this issue also.  The problem, as it currently exists, is
that it is mistaking leading #0 characters as quote characters.

--
Tim Young
Elevate Software
www.elevatesoft.com

Fri, Jan 15 2010 7:38 AM

"Malcolm"
Tim Young [Elevate Software] wrote:

> Malcolm,
>
> The fix covers this issue also.  The problem, as it currently
> exists, is that it is mistaking leading #0 characters as quote
> characters.

Hi Tim, the eagerly awaited fix .. doesn't seem to work for me.

With build 7(U) and in the Manager, I have tried this on a table
containing a single record with several columns containing Cyrillic
text:

EXPORT TABLE "ATest"
TO "Vladimir.txt"
IN STORE "DRBackup"
DELIMITER CHAR #124
QUOTE CHAR #0;
The output file appears to be as expected.
Then I simply edited the key value to avoid an index error, and:

IMPORT TABLE "ATest"
FROM "Vladimir.txt"
IN STORE "DRBackup"
DELIMITER CHAR #124
QUOTE CHAR #0;

All the text column values starting with a non-ascii char(code point)
are imported as NULLs.

Same problem as before.
It seems EDB import just can't cope with unquoted text.
Do you see the same?
--