Icon View Thread

The following is the text of the current message along with any replies.
Messages 1 to 6 of 6 total
Thread Determination of column types as the result of a union
Wed, Mar 28 2007 1:12 PMPermanent Link

"Ole Willy Tuv"
create table t1
(
 col1 char(1),
 col2 char(1),
 col3 smallint,
 col4 integer
);

create table t2
(
 col1 varchar(2),
 col2 varchar(2),
 col3 decimal(18,2),
 col4 float
);

create table t3
(
 col1 clob,
 col2 char(3),
 col3 integer,
 col4 double precision
);

insert into t1 values ('a','a',32767,2147483647);
insert into t2 values ('ab','ab',123456789012345.67,1.7E308);
insert into t3 values ('abc','abc',2147483647,1.7E308);

create table test as
(
 select * from t1
 union all
 select * from t2
 union all
 select * from t3
)
with data;

select
 name column_name,
 type data_type,
 "length",
 scale
from information.tablecolumns
where tablename = 'test'

It seems that EDB picks the column types from the first table in the union
as the data types for the resulting columns. This approach has implications,
such as silent data loss since EDB currently truncates data overflow in
assignments, as seen by querying the resulting "test" table:

select * from test

The general rule/approach for how to determine the data types of a union
result, is to pick the data type with the highest precedence for each
column, and set the length and scale to the maximum of the applicable column
types.

This would lead to the following column types:

col1 - CLOB
col2 - VARCHAR(3)
col3 - DECIMAL(18,2)
col4 - FLOAT or DOUBLE PRECISION (equivalent types in EDB)

This approach also asserts that there is no data loss in a union of
compatible columns.

General rules for evaluating the precedence of data types:

1) Character string types

a) If any of the columns is CLOB, then the result is CLOB.
b) If any of the columns is VARCHAR, then the result is VARCHAR.
c) Otherwise the result is CHAR.
d) The length is equal to the maximum length of the participating column
types.

2) Numeric types

a) If all the columns are exact numeric, then the result is exact numeric.
b) If any of the columns is approximate numeric, then the result is
approximate numeric.
c) The scale is equal to the maximum of the scales of the participating
column types.

Ole Willy Tuv

Thu, Mar 29 2007 5:37 AMPermanent Link

"Ole Willy Tuv"
Tim,

<< select
 name column_name,
 type data_type,
 "length",
 scale
from information.tablecolumns
where tablename = 'test' >>

Just a suggestion - I think the Information.TableColumns.Scale column should
have the value 0 instead of null for exact numeric types with scale 0, i.e.
SMALLINT, INTEGER, BIGINT and DECIMAL/NUMERIC (with implicit or explicit
scale 0).

Ole Willy Tuv

Thu, Mar 29 2007 6:45 AMPermanent Link

Tim Young [Elevate Software]

Elevate Software, Inc.

Avatar

Email timyoung@elevatesoft.com

Ole,

<< It seems that EDB picks the column types from the first table in the
union as the data types for the resulting columns. >>

Yep, that's exactly what it does.

<< This approach has implications, such as silent data loss since EDB
currently truncates data overflow in
assignments, as seen by querying the resulting "test" table: >>

Noted.  A CAST() will solve the problem for now.

--
Tim Young
Elevate Software
www.elevatesoft.com

Thu, Mar 29 2007 6:45 AMPermanent Link

Tim Young [Elevate Software]

Elevate Software, Inc.

Avatar

Email timyoung@elevatesoft.com

Ole,

<< Just a suggestion - I think the Information.TableColumns.Scale column
should have the value 0 instead of null for exact numeric types with scale
0, i.e. SMALLINT, INTEGER, BIGINT and DECIMAL/NUMERIC (with implicit or
explicit scale 0). >>

Noted.

--
Tim Young
Elevate Software
www.elevatesoft.com

Thu, Mar 29 2007 4:59 PMPermanent Link

"Ole Willy Tuv"
Tim,

<< Noted.  A CAST() will solve the problem for now. >>

select
 cast(col1 as clob) col1,
 cast(col2 as varchar(3)) col2,
 cast(col3 as decimal(18,2)) col3,
 cast(col4 as double precision) col4
from t1
union all
select
 cast(col1 as clob) col1,
 cast(col2 as varchar(3)) col2,
 col3,
 col4
from t2
union all
select * from t3

Works as expected.

select
 cast(col1 as clob) col1,
 cast(col2 as varchar(3)) col2,
 cast(col3 as decimal(18,2)) col3,
 cast(col4 as double precision) col4
from t1
union all
select
 cast(col1 as clob) col1,
 cast(col2 as varchar(3)) col2,
 col3,
 col4
from t2

Error:
ElevateDB Error #400 The column col1 already exists

Ole Willy Tuv


Fri, Mar 30 2007 11:05 AMPermanent Link

Tim Young [Elevate Software]

Elevate Software, Inc.

Avatar

Email timyoung@elevatesoft.com

Ole,

<< Error:
ElevateDB Error #400 The column col1 already exists >>

I'll check it out and make sure that its fixed.

--
Tim Young
Elevate Software
www.elevatesoft.com

Image