View Thread

The following is the text of the current message along with any replies.
Messages 1 to 10 of 16 total
Thread: optimize table: smaller # records
Mon, May 8 2006 8:44 AM

"Harry de Boer"
LS

I have a table with 100,000+ records that is taking quite some time to
process queries (requests come from a browser to an intraweb server with
kbmMW middleware). A lot of records contain the same (generated) info, so I
wonder if there is a way to gain speed by creating another table (or tables)
with a different structure. The table structure below, with some example
data, shows that some records are the same (type/period/etc.) except for the
ID, but not all! The records that are the same must be grouped somehow,
while still making it possible to know which IDs are involved. So, is there
a way of creating a better structure that results in smaller table(s), in
SQL?

ID,TYPE,PERIOD,STARTDATE,HARBOUR
"FOX (07)","Fox","wk",2006-09-02,"Terherne"
"FOX (08)","Fox","wk",2006-09-02,"Terherne"
"FOX (07)","Fox","wk",2006-09-04,"Terherne"
"FOX (08)","Fox","wk",2006-09-04,"Terherne"
"FOX (07)","Fox","wk",2006-09-06,"Terherne"
"FOX (07)","Fox","wk",2006-09-07,"Terherne"
"FOX (08)","Fox","wk",2006-09-08,"Terherne"
"FOX (07)","Fox","wk",2006-09-10,"Terherne"
"FOX (08)","Fox","wk",2006-09-10,"Terherne"

Regards, Harry
dbIsam 3.3, kbmMW 2.5+, IW7.18
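The grouping described above can be sketched like this, with SQLite standing in for DBISAM and illustrative table and column names: rows that are identical except for ID collapse into one row of a `departure` table, and a link table records which IDs share each description, so the IDs stay recoverable.

```python
import sqlite3

# A sketch of the grouping described above, using SQLite in place of
# DBISAM (table and column names are illustrative). Rows identical
# except for ID collapse into one "departure" row; a link table keeps
# track of which IDs belong to each group.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE departure (
        dep_id    INTEGER PRIMARY KEY,
        type      TEXT,
        period    TEXT,
        startdate TEXT,
        harbour   TEXT,
        UNIQUE (type, period, startdate, harbour)
    );
    CREATE TABLE departure_object (
        dep_id INTEGER REFERENCES departure (dep_id),
        id     TEXT,
        PRIMARY KEY (dep_id, id)
    );
""")

flat_rows = [
    ("FOX (07)", "Fox", "wk", "2006-09-02", "Terherne"),
    ("FOX (08)", "Fox", "wk", "2006-09-02", "Terherne"),
    ("FOX (07)", "Fox", "wk", "2006-09-04", "Terherne"),
]
for oid, typ, per, start, harbour in flat_rows:
    con.execute(
        "INSERT OR IGNORE INTO departure (type, period, startdate, harbour)"
        " VALUES (?, ?, ?, ?)", (typ, per, start, harbour))
    dep_id = con.execute(
        "SELECT dep_id FROM departure"
        " WHERE type = ? AND period = ? AND startdate = ? AND harbour = ?",
        (typ, per, start, harbour)).fetchone()[0]
    con.execute("INSERT INTO departure_object VALUES (?, ?)", (dep_id, oid))

# Three flat records become two departure rows plus three link rows.
print(con.execute("SELECT COUNT(*) FROM departure").fetchone()[0])  # 2
```

Whether this saves much depends on how many flat rows actually share a full description; queries then need a join between the two tables.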

Mon, May 8 2006 2:45 PM

Tim Young [Elevate Software]

Elevate Software, Inc.

Email timyoung@elevatesoft.com

Harry,

<< I have a table with 100,000+ records that is taking quite some time to
process queries (requests come from a browser to an intraweb server with
kbmMW middleware). A lot of records contain the same (generated) info, so I
wonder if there is a way to gain speed by creating another table (or tables)
with a different structure. The table structure below, with some example
data, shows that some records are the same (type/period/etc.) except for the
ID, but not all! The records that are the same must be grouped somehow,
while still making it possible to know which IDs are involved. So, is there
a way of creating a better structure that results in smaller table(s), in
SQL? >>

Sure, it can be done, and it is normally recommended for proper
normalization.  However, you will then be forcing the use of joins that
weren't required before; that may or may not be a problem, depending upon
how many records can be trimmed away by the reorganization.  The bottom line
is that you should always have a unique primary key for a table, and
determining that will help determine whether the table can be normalized
further.  In your eyes, what fields make up the primary key for this table?

--
Tim Young
Elevate Software
www.elevatesoft.com
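The primary-key question above can be checked mechanically: group by the candidate fields and look for duplicate groups. A minimal sketch, with SQLite standing in for DBISAM and the table mirroring the example data:

```python
import sqlite3

# Checking whether a field combination could serve as a unique primary
# key: GROUP BY it and look for groups with more than one row. SQLite
# stands in for DBISAM; the table mirrors the example data above.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE boats"
            " (id TEXT, type TEXT, period TEXT, startdate TEXT, harbour TEXT)")
con.executemany("INSERT INTO boats VALUES (?, ?, ?, ?, ?)", [
    ("FOX (07)", "Fox", "wk", "2006-09-02", "Terherne"),
    ("FOX (08)", "Fox", "wk", "2006-09-02", "Terherne"),
    ("FOX (07)", "Fox", "wk", "2006-09-04", "Terherne"),
])

dupes = con.execute("""
    SELECT type, period, startdate, COUNT(*)
    FROM boats
    GROUP BY type, period, startdate
    HAVING COUNT(*) > 1
""").fetchall()

# A non-empty result means (TYPE, PERIOD, STARTDATE) alone is not
# unique, so ID (or more fields) must be part of the key.
print(dupes)  # [('Fox', 'wk', '2006-09-02', 2)]
```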

Tue, May 9 2006 3:03 AM

"Harry de Boer"
> In your eyes, what fields make up the primary key for this table ?

Tim,

Well, that's the question that's hard to answer. The table now contains a
field COUNTER of type autoinc as PK. I guess that TYPE, PERIOD and STARTDATE
could be the PK if the rest of the fields contain the same info, but that's
not always true, hence my question. In fact, there are more fields (like
PRICE etc.) than I mentioned here in the example. The rule is this: only
IDs within a certain TYPE with the same PERIOD and STARTDATE, where all of
the other fields' data is equal, can be grouped. If the other fields' data
is not equal, then I must see them separately. For example, one ID could be
given another price on a certain date. So the question is whether
normalization would really reduce the number of records or just divide
records amongst a couple of tables. The hard part is that the websites built
on this have no particular order of requests: a first question could be what
type of boat, then the period. But it could be the other way round too (and
there are more than those fields involved).

Regards, Harry




Tue, May 9 2006 6:09 AM

Chris Erdal

It would appear that this is a log file of some sort, perhaps of the boats
you're hiring out from a number of harbours.

I wonder if your application could be sped up by transferring older
records to an archive table with an identical structure? Say, all rentals
that terminated over 3 months ago?

You could use the first table (with only a few thousand records) most of
the time, and do a UNION with the same select on both tables for
analytical needs.
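Sketched concretely, with SQLite standing in for DBISAM and illustrative table names and cut-off date:

```python
import sqlite3

# A sketch of the archive-table idea (SQLite in place of DBISAM; table
# names and the cut-off date are illustrative). Old rows move to an
# identically structured archive table; a UNION ALL recombines the two
# only when a query needs full history.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE departures (id TEXT, startdate TEXT);
    CREATE TABLE departures_archive (id TEXT, startdate TEXT);
""")
con.executemany("INSERT INTO departures VALUES (?, ?)", [
    ("FOX (07)", "2006-02-01"),   # old rental
    ("FOX (08)", "2006-09-02"),   # current rental
])

cutoff = "2006-06-01"
con.execute("INSERT INTO departures_archive"
            " SELECT * FROM departures WHERE startdate < ?", (cutoff,))
con.execute("DELETE FROM departures WHERE startdate < ?", (cutoff,))

# Day-to-day queries hit only the small live table...
live = con.execute("SELECT COUNT(*) FROM departures").fetchone()[0]
# ...while analytical queries UNION ALL both tables.
total = con.execute("""
    SELECT COUNT(*) FROM (
        SELECT id, startdate FROM departures
        UNION ALL
        SELECT id, startdate FROM departures_archive)
""").fetchone()[0]
print(live, total)  # 1 2
```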

--
Chris
Tue, May 9 2006 7:42 AM

"Harry de Boer"
Chris,

It's not a log file, but a generated table (for rental companies) that holds
all possible departure info about their rental objects (mostly boats), which
is used as a base for internet orders.

Regards, Harry


Tue, May 9 2006 8:30 AM

Roy Lambert

NLH Associates

Team Elevate

Harry


I presume the problem is the speed of response to a user of the website, but are you sure where the root problem lies?

A few questions:

1. What is "quite some time"?
2. How long does the same operation take if done purely locally?
3. How much data is being transferred across the web?

Can we have a look at the website to see what's happening?

Roy Lambert
Tue, May 9 2006 1:59 PM

"Harry de Boer"
Roy,

The time it takes to execute some of the involved SQL statements is ca. 12
to 20 seconds on a broadband connection; for slower connection types it
takes a lot longer. It seems that in Germany, as we heard from our customer,
a lot of folks have slow connections, and they are the target audience for
the site(s). Locally it takes a couple of seconds. The website is down at
the moment until we have solved this matter. We are currently thinking of
breaking it down into smaller separate tables (e.g. per period or type).

Regards, Harry


Tue, May 9 2006 4:13 PM

Tim Young [Elevate Software]

Elevate Software, Inc.

Harry,

<< Well, that's the question that's hard to answer. The table now contains a
field COUNTER of type autoinc as PK. I guess that TYPE, PERIOD and STARTDATE
could be the PK if the rest of the fields contain the same info, but that's
not always true, hence my question. In fact, there are more fields (like
PRICE etc.) than I mentioned here in the example. The rule is this: only
IDs within a certain TYPE with the same PERIOD and STARTDATE, where all of
the other fields' data is equal, can be grouped. >>

Okay, in that case the primary key would be ID, TYPE, PERIOD, STARTDATE.
Unfortunately, that doesn't lend itself to any other normalization, as far
as I can see.

<< If the other fields' data is not equal, then I must see them separately.
For example, one ID could be given another price on a certain date. >>

Hmm, so that means that PRICE is also a candidate column, which means that
now you're getting into territory where a good primary key might be just an
AutoInc column.

<< So the question is whether normalization would really reduce the number
of records or just divide records amongst a couple of tables. >>

Based upon what you've stated here, I wouldn't think there is much
normalization that can be done.  There doesn't seem to be much redundant
data in most cases, so the benefits might be very small, if any at all.

--
Tim Young
Elevate Software
www.elevatesoft.com

Tue, May 9 2006 4:14 PM

Tim Young [Elevate Software]

Elevate Software, Inc.

Harry,

<< The time it takes to execute some of the involved SQL statements is ca.
12 to 20 seconds on a broadband connection; for slower connection types it
takes a lot longer. It seems that in Germany, as we heard from our customer,
a lot of folks have slow connections, and they are the target audience for
the site(s). Locally it takes a couple of seconds. The website is down at
the moment until we have solved this matter. We are currently thinking of
breaking it down into smaller separate tables (e.g. per period or type). >>

How long does the SQL take locally?  IOW, is the majority of the time due
to latency in the connection, or is it the actual SQL processing time?

--
Tim Young
Elevate Software
www.elevatesoft.com

Wed, May 10 2006 3:02 AM

Roy Lambert

NLH Associates

Team Elevate

Harry


My bet is that the big problem is the time taken to send the result set back over the wire. I think there should be a law requiring all web-based applications to be tested over a 19.2 kbps dial-up connection, not the in-house gigabit Ethernet. :-)

The first thing I'd suggest would be to show the top 10 results and offer to show the rest, with a warning that it might take a bit of time...
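That top-10-first idea amounts to paging the result set so only a handful of rows cross the slow connection per request. A sketch using SQLite's LIMIT/OFFSET syntax; DBISAM's SQL dialect may spell this differently (e.g. a TOP clause), so treat the syntax as illustrative:

```python
import sqlite3

# Paging sketch: fetch one small page of results per request instead of
# the whole result set. SQLite's LIMIT/OFFSET stands in for whatever
# the DBISAM dialect offers; names are illustrative.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE departures (id TEXT, startdate TEXT)")
con.executemany("INSERT INTO departures VALUES (?, ?)",
                [("FOX (%02d)" % n, "2006-09-02") for n in range(25)])

PAGE_SIZE = 10

def fetch_page(page):
    """Return one page of IDs, smallest first."""
    return con.execute(
        "SELECT id FROM departures ORDER BY id LIMIT ? OFFSET ?",
        (PAGE_SIZE, page * PAGE_SIZE)).fetchall()

print(len(fetch_page(0)), len(fetch_page(2)))  # 10 5
```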

Roy Lambert
Page 1 of 2