Icon View Thread

The following is the text of the current message along with any replies.
Messages 1 to 10 of 16 total
Thread grouping questions (retesting case with 1.4b4)
Sun, Jul 1 2007 6:33 AMPermanent Link

"Harry de Boer"
EDB 1.04b4 (retesting grouping case with 1.4b2: testfiles then already send
to binaries)
----------------------------------------------------------------------------
-----------

SELECT id_medewerker, id_campagne, aanvang, einde,
CAST((einde - aanvang) HOUR TO MINUTE AS VARCHAR(20)) as bt
FROM registraties
where id_medewerker = '20020'
and id_campagne = 'PAUZE'

results in a total time (bt) of 5 hours and 16 minutes (when I add the bt
column values).

SELECT id_medewerker, id_campagne,
CAST((max(einde) - min(aanvang)) HOUR TO MINUTE AS VARCHAR(20)) as bt
FROM registraties
group by id_campagne, id_medewerker
having id_medewerker = '20020' and id_campagne='PAUZE'

or

SELECT id_medewerker, id_campagne,
CAST((max(einde) - min(aanvang)) HOUR TO MINUTE AS VARCHAR(20)) as bt
FROM registraties
where id_medewerker = '20020' and id_campagne='PAUZE'
group by id_campagne, id_medewerker

results in a total time (bt) of 5 hours and zero minutes (incorrect)

Questions:
A: is this a bug or am I doing something wrong?
B: if I am wrong: what should be the correct statement?
C: the result of 'having' and 'where' clauses are the same. I thought there
should be a difference (reading the help). Could you explain it some more?
D: why (if the results indeed should be the same) is there an enormous speed
advantage in using the 'where' clause?

Regards, Harry

Mon, Jul 2 2007 12:18 PMPermanent Link

Tim Young [Elevate Software]

Elevate Software, Inc.

Avatar

Email timyoung@elevatesoft.com

Harry,

Run this to see what the min/max values are:

SELECT id_medewerker, id_campagne,
min(aanvang) as bt1,
max(einde) as bt2
FROM registraties
group by id_campagne, id_medewerker
having id_medewerker = '20020' and id_campagne='PAUZE'

<< results in a total time (bt) of 5 hours and 16 minutes (when I add the bt
column values). >>

Yes, but that's not what you're doing in the other queries - you're using a
min/max subtraction.  What you want is this:

SELECT id_medewerker, id_campagne,
CAST(SUM((einde - aanvang) HOUR TO MINUTE) AS VARCHAR(20)) as bt
FROM registraties
group by id_campagne, id_medewerker
having id_medewerker = '20020' and id_campagne='PAUZE'

But, I also believe the correct time is 4:16, or at least it is with the
data that you gave me originally.

--
Tim Young
Elevate Software
www.elevatesoft.com

Mon, Jul 2 2007 6:12 PMPermanent Link

"Ole Willy Tuv"
Tim,

<< Run this to see what the min/max values are:

SELECT id_medewerker, id_campagne,
min(aanvang) as bt1,
max(einde) as bt2
FROM registraties
group by id_campagne, id_medewerker
having id_medewerker = '20020' and id_campagne='PAUZE' >>

Filtering before grouping, i.e. using the WHERE clause instead of the HAVING
clause, is significantly faster with EDB:

select
 id_medewerker,
 id_campagne,
 min(aanvang) as bt1,
 max(einde) as bt2
from registraties
where id_medewerker = '20020' and id_campagne='PAUZE'
group by id_campagne, id_medewerker

Ole Willy Tuv

Mon, Jul 2 2007 6:49 PMPermanent Link

Tim Young [Elevate Software]

Elevate Software, Inc.

Avatar

Email timyoung@elevatesoft.com

Ole,

<< Filtering before grouping, i.e. using the WHERE clause instead of the
HAVING clause, is significantly faster with EDB: >>

The WHERE clause basically causes only a few rows to be grouped, as opposed
to the whole table with the HAVING.

--
Tim Young
Elevate Software
www.elevatesoft.com

Mon, Jul 2 2007 7:08 PMPermanent Link

"Ole Willy Tuv"
Tim,

<< The WHERE clause basically causes only a few rows to be grouped, as
opposed to the whole table with the HAVING. >>

Yes, I know. The suggestion to filter rows in the WHERE clause (whenever
possible) was an optimization tip Smile

Ole Willy Tuv

Tue, Jul 3 2007 4:05 AMPermanent Link

"Harry de Boer"
Tim,

I see now: the statement I used was indeed incorrect (that's why I also
asked: am I doing it the right way? - well, got my answer nowSmile You're
right about the 4 hour 16 minutes. That's the correct result. Thanks.

One thing I don't quite get is the difference between using  'where' and
'having'.  You wrote "The WHERE clause basically causes only a few rows to
be grouped, as opposed to the whole table with the HAVING." The condition is
the same however (id_campagne and id_medewerker must have a certain value).
So, if a 'where' is much faster and the result is the same as using 'having'
I don't see a point in using 'having' at all. Or are there circumstances
where the result is different?

Regards, Harry

"Tim Young [Elevate Software]" <timyoung@elevatesoft.com> schreef in bericht
news:444BCE01-6EC1-4FD1-B593-F4AB8F038328@news.elevatesoft.com...
> Harry,
>
> Run this to see what the min/max values are:
>
> SELECT id_medewerker, id_campagne,
> min(aanvang) as bt1,
> max(einde) as bt2
> FROM registraties
> group by id_campagne, id_medewerker
> having id_medewerker = '20020' and id_campagne='PAUZE'
>
> << results in a total time (bt) of 5 hours and 16 minutes (when I add the
bt
> column values). >>
>
> Yes, but that's not what you're doing in the other queries - you're using
a
> min/max subtraction.  What you want is this:
>
>  SELECT id_medewerker, id_campagne,
>  CAST(SUM((einde - aanvang) HOUR TO MINUTE) AS VARCHAR(20)) as bt
>  FROM registraties
>  group by id_campagne, id_medewerker
>  having id_medewerker = '20020' and id_campagne='PAUZE'
>
> But, I also believe the correct time is 4:16, or at least it is with the
> data that you gave me originally.
>
> --
> Tim Young
> Elevate Software
> www.elevatesoft.com
>
>

Tue, Jul 3 2007 5:39 AMPermanent Link

"Harry de Boer"
Furthermore:

I created a view 'BT' with:

SELECT id_medewerker, id_campagne,
CAST(SUM((einde - aanvang) HOUR TO MINUTE) AS VARCHAR(20)) as bt
FROM registraties
group by id_campagne, id_medewerker

and then do a select * from bt where id_medewerker = '2002' and id_campagne
= 'PAUZE'

It is a lot slower then executing a query with a where clause (I guess
because the filtering is happening after the whole lot is grouped (like the
having). Do I understand correctly that in this case a use of  a view isn't
the wisest thing to do?

Regards, harry


"Tim Young [Elevate Software]" <timyoung@elevatesoft.com> schreef in bericht
news:444BCE01-6EC1-4FD1-B593-F4AB8F038328@news.elevatesoft.com...
> Harry,
>
> Run this to see what the min/max values are:
>
> SELECT id_medewerker, id_campagne,
> min(aanvang) as bt1,
> max(einde) as bt2
> FROM registraties
> group by id_campagne, id_medewerker
> having id_medewerker = '20020' and id_campagne='PAUZE'
>
> << results in a total time (bt) of 5 hours and 16 minutes (when I add the
bt
> column values). >>
>
> Yes, but that's not what you're doing in the other queries - you're using
a
> min/max subtraction.  What you want is this:
>
>  SELECT id_medewerker, id_campagne,
>  CAST(SUM((einde - aanvang) HOUR TO MINUTE) AS VARCHAR(20)) as bt
>  FROM registraties
>  group by id_campagne, id_medewerker
>  having id_medewerker = '20020' and id_campagne='PAUZE'
>
> But, I also believe the correct time is 4:16, or at least it is with the
> data that you gave me originally.
>
> --
> Tim Young
> Elevate Software
> www.elevatesoft.com
>
>

Tue, Jul 3 2007 5:36 PMPermanent Link

Tim Young [Elevate Software]

Elevate Software, Inc.

Avatar

Email timyoung@elevatesoft.com

Ole,

<< Yes, I know. The suggestion to filter rows in the WHERE clause (whenever
possible) was an optimization tip Smile>>

Sorry, Mondays stink big time, so I'm usually a little rushed. Smiley I will
put this on the list of to-do items.

--
Tim Young
Elevate Software
www.elevatesoft.com

Tue, Jul 3 2007 5:37 PMPermanent Link

Tim Young [Elevate Software]

Elevate Software, Inc.

Avatar

Email timyoung@elevatesoft.com

Harry,

<< One thing I don't quite get is the difference between using  'where' and
'having'.  You wrote "The WHERE clause basically causes only a few rows to
be grouped, as opposed to the whole table with the HAVING." The condition is
the same however (id_campagne and id_medewerker must have a certain value).
So, if a 'where' is much faster and the result is the same as using 'having'
I don't see a point in using 'having' at all. Or are there circumstances
where the result is different? >>

In this case, no, but you can have HAVING clauses that reference aggregate
columns and expressions that you couldn't reference in a WHERE clause.

--
Tim Young
Elevate Software
www.elevatesoft.com

Tue, Jul 3 2007 5:40 PMPermanent Link

Tim Young [Elevate Software]

Elevate Software, Inc.

Avatar

Email timyoung@elevatesoft.com

Harry,

<< It is a lot slower then executing a query with a where clause (I guess
because the filtering is happening after the whole lot is grouped (like the
having). Do I understand correctly that in this case a use of  a view isn't
the wisest thing to do? >>

Well, views are the same as any other query, so there's nothing inherently
worse about them.  It just so happens, however, that in this case that the
straight query with the WHERE clause is faster.   Hopefully at some point in
EDB's lifetime I will be able to add the ability to "push" such WHERE
expressions down into the view itself so as to realize the benefits of
having the WHERE clause limit the number of rows being grouped.  However,
that is pretty complicated and won't be around for some time.

--
Tim Young
Elevate Software
www.elevatesoft.com

Page 1 of 2Next Page »
Jump to Page:  1 2
Image