Icon View Thread

The following is the text of the current message along with any replies.
Messages 1 to 10 of 11 total
Thread TOP clause
Fri, Oct 5 2007 6:56 PMPermanent Link

Abdulaziz Jasser
As I understand EDB does not support TOP clause.  What is the plan for it?
Mon, Oct 8 2007 11:55 AMPermanent Link

Tim Young [Elevate Software]

Elevate Software, Inc.

Avatar

Email timyoung@elevatesoft.com

Abdulaziz,

<< As I understand EDB does not support TOP clause.  What is the plan for
it? >>

We will be supporting a RANGE or LIMIT clause (or something similar) instead
in an upcoming version.  The beauty of it, also, will be that it will be a
dynamic range attribute of a query.  IOW, changing it will not cause the
query to be re-executed.

--
Tim Young
Elevate Software
www.elevatesoft.com

Wed, Oct 10 2007 1:15 AMPermanent Link

Dave Harrison
Tim Young [Elevate Software] wrote:
> Abdulaziz,
>
> << As I understand EDB does not support TOP clause.  What is the plan for
> it? >>
>
> We will be supporting a RANGE or LIMIT clause (or something similar) instead
> in an upcoming version.

You may want to add an offset as in "LIMIT [offset,][limit]" so users
can move through the table a range at a time.

> The beauty of it, also, will be that it will be a
> dynamic range attribute of a query.  IOW, changing it will not cause the
> query to be re-executed.

"Re-executed"? or "Re-prepared"?

Dave
Wed, Oct 10 2007 8:47 AMPermanent Link

Tim Young [Elevate Software]

Elevate Software, Inc.

Avatar

Email timyoung@elevatesoft.com

Dave,

<< "Re-executed"? or "Re-prepared"? >>

Both.   IOW, the range will act on the result set only.

--
Tim Young
Elevate Software
www.elevatesoft.com

Wed, Oct 10 2007 10:34 AMPermanent Link

Dave Harrison
Tim Young [Elevate Software] wrote:
> Dave,
>
> << "Re-executed"? or "Re-prepared"? >>
>
> Both.   IOW, the range will act on the result set only.
>

Tim,
    Wouldn't that mean the result set has already been returned and the
rows are sitting in memory somewhere, before the next "Select ...
Offset,Limit" statement is executed?  For example, let's say on a web
page Im displaying 25 rows at a time. The first time it executes:

select * from table order by colx limit 0,25;

The next time he presses "Next Page" it executes:

select * from table order by colx limit 26,25;

etc..

Now from what I've inferred from your previous post, the second query
won't have to execute and will get the results from the first query's
result set. Now that sounds good in theory, but won't that mean the
first query will take longer? When I use Offset,Limit in MySQL it
returns the results in under 10ms because it stops looking after "Limit"
rows. If your query is pulling in more data than it needs to for the
first Limit clause, isn't it going to take longer?

The only reason I'm bringing this up is on web pages people seldom go
past the first page. So you could be adding precious ms to the first
query needlessly. Is this what's happening here? Or did you find a way
not to incur a performance penalty?

Dave
Thu, Oct 11 2007 1:09 PMPermanent Link

Tim Young [Elevate Software]

Elevate Software, Inc.

Avatar

Email timyoung@elevatesoft.com

Dave,

<< Now from what I've inferred from your previous post, the second query
won't have to execute and will get the results from the first query's result
set. Now that sounds good in theory, but won't that mean the first query
will take longer? When I use Offset,Limit in MySQL it returns the results in
under 10ms because it stops looking after "Limit"
rows. If your query is pulling in more data than it needs to for the first
Limit clause, isn't it going to take longer? >>

Yes, it will, but only if the result set is insensitive and is generating a
*lot* of rows.  Take an ordered result set, for example.  Unless there is an
index available, even MySQL has to order the rows *first* before grabbing
the X number of rows.  Therefore, each pagination operation results in the
entire result set being ordered *prior* to the rows being selected.  The
only other alternative is to cache the query and resume execution of it for
every change in the LIMIT clause, and that effectively is what EDB will be
doing except that it will execute the entire query to start with.

<< The only reason I'm bringing this up is on web pages people seldom go
past the first page. So you could be adding precious ms to the first query
needlessly. >>

It's either that or re-execute the query for every pagination operation, and
that can take much longer than executing the entire query to begin with, as
I've shown above.  Furthermore, I don't think that what you're saying
regarding looking at the first page is anything but anecdotal unless you can
provide statistics that show otherwise. Smiley

--
Tim Young
Elevate Software
www.elevatesoft.com

Mon, Oct 15 2007 12:05 PMPermanent Link

Dave Harrison
Tim Young [Elevate Software] wrote:
> Dave,
>
> << Now from what I've inferred from your previous post, the second query
> won't have to execute and will get the results from the first query's result
> set. Now that sounds good in theory, but won't that mean the first query
> will take longer? When I use Offset,Limit in MySQL it returns the results in
> under 10ms because it stops looking after "Limit"
> rows. If your query is pulling in more data than it needs to for the first
> Limit clause, isn't it going to take longer? >>
>
> Yes, it will, but only if the result set is insensitive and is generating a
> *lot* of rows.  Take an ordered result set, for example.  Unless there is an
> index available, even MySQL has to order the rows *first* before grabbing
> the X number of rows.

> Therefore, each pagination operation results in the
> entire result set being ordered *prior* to the rows being selected.  The
> only other alternative is to cache the query and resume execution of it for
> every change in the LIMIT clause, and that effectively is what EDB will be
> doing except that it will execute the entire query to start with.
>
> << The only reason I'm bringing this up is on web pages people seldom go
> past the first page. So you could be adding precious ms to the first query
> needlessly. >>
>
> It's either that or re-execute the query for every pagination operation, and
> that can take much longer than executing the entire query to begin with, as
> I've shown above.  Furthermore, I don't think that what you're saying
> regarding looking at the first page is anything but anecdotal unless you can
> provide statistics that show otherwise. Smiley
>

If you can pull it off and have it return the top 20 results from a
million row table in just ms, like a MySQL table with an index then
great. The problem I see is lets say you have a web page that returns
search results like Google, and it returns 150,000 results to the user.
He is likely only interested in the first page (25 rows) or the second
page. With MySQL it fetches only the first 25 rows so it is quite quick.
An Offset on the next query will fetch the next 25 rows. Granted your
method of doing it will be faster *if* the user views all the results.
But with web pages that is unlikely to occur.

Also won't a large number of threads (100) executing similar queries
(different Where clause or sort but still using an index) on a large
table going to be slower than just fetching 25 rows at a time?

If you've done your own benchmarking and it's fast enough, then great,
you can ignore my rants. SmileI just thought I should point out some of my
concerns.

Dave
Mon, Oct 15 2007 2:58 PMPermanent Link

"Jose Eduardo Helminsky"
Dave

I don´t know Google engine mechanism but I can bet they only execute the
search query once, cache it and use it again for each page shown. If the
query is left opened without paging for some period (60 seconds for example)
then the server close the query. I have a web server that do this and it
works fine, users can browse throw the records in a web page and the server
use the same query just moving start pointer.

I don´t know exactly what you want but for paging records the clause TOP is
not a solution because it will execute the query for each page you want.

Eduardo

Mon, Oct 15 2007 4:41 PMPermanent Link

Tim Young [Elevate Software]

Elevate Software, Inc.

Avatar

Email timyoung@elevatesoft.com

Dave,

<< If you can pull it off and have it return the top 20 results from a
million row table in just ms, like a MySQL table with an index then great.
>>

No, it will not generate 1 million records just to view the first 20 unless
it is a static result set (insensitive), at least initially.  Progressive
result set population is in the cards, but not initially.  The solution to
this issue is to simply use a sensitive result set instead.

--
Tim Young
Elevate Software
www.elevatesoft.com

Mon, Oct 15 2007 5:25 PMPermanent Link

Dave Harrison
Jose Eduardo Helminsky wrote:
> Dave
>
> I don´t know Google engine mechanism but I can bet they only execute the
> search query once, cache it and use it again for each page shown. If the
> query is left opened without paging for some period (60 seconds for example)
> then the server close the query.

I doubt it works this way simply because Google has close to 100 million
searches per day, and therefore would have to have close to 70,000
queries open and remain open until the cache expires which is a minute
by your estimate. They would also have to create a session for each
browser window and I don't think they are doing that. (There is no
indication there is a session # in the URL or html code). Doing so would
put a strain on their resources.

Their queries would quickly back up and they would consume a lot of
memory. Your idea will work fine with 10 concurrent users but I don't
think it will work for a database that is doing thousands of queries per
second. Google must be using connection pooling so the query gets
executed as quickly as possible and is released for others to use. When
the user clicks Next, the query is executed again with an offset.

If you look at their URL you see there is an offset:

For the second page:
http://www.google.ca/search?q=dbisam&hl=en&start=10&sa=N

For the 3rd page:
http://www.google.ca/search?q=dbisam&hl=en&start=20&sa=N

So this would equate to doing a "Select ... offset 20,10" etc. where
"10" is the number of rows per page. Now if they did use sessions and
keep the query open for 60 seconds, then you couldn't bookmark page 10
because when you come back the session would be expired (like Tamaracka).

> I have a web server that do this and it
> works fine, users can browse throw the records in a web page and the server
> use the same query just moving start pointer.
>
> I don´t know exactly what you want but for paging records the clause TOP is
> not a solution because it will execute the query for each page you want.

Yup. It's better this way to get rid of the resources that the query has
open because that query could execute a few hundred more queries by the
time the user decides to press the Next button. The way to build a fast
webserver is to dumb it down. Smile

Dave
Page 1 of 2Next Page »
Jump to Page:  1 2
Image