Icon View Thread

The following is the text of the current message along with any replies.
Messages 1 to 10 of 11 total
Thread Double counting in SQL
Sun, May 9 2010 11:03 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

How can this double count?

SELECT
_PNO,
COALESCE(_Result,'-') AS _Result,
COALESCE(_Analysis,'-') AS _Analysis,
COUNT(*) AS _Count,
CAST(100* COUNT(*) /  T._Total AS NUMERIC(0,1)) AS _Percentage
FROM CallStats
JOIN
(
 SELECT _Result, COUNT(*) AS _Total
 FROM CallStats
 WHERE
 _fkProjects = :ProjectID
 AND
 _Date BETWEEN :StartDate AND :EndDate
 GROUP BY _PNO,COALESCE(_Result,'-')
) T ON COALESCE(T._Result,'-') = COALESCE(_Result,'-')
WHERE
_fkProjects = :ProjectID
and _pno = 'P'
AND
_Date BETWEEN :StartDate AND :EndDate
GROUP BY _PNO,COALESCE(_Result,'-'),COALESCE(_Analysis,'-')


If I add up the _Count column I get 26.

This

select count(*) from callstats where _pno = 'P' and _fkprojects = 94

gives 24. SO somehow my counting is wrong. Any suggestions?



Roy Lambert
Sun, May 9 2010 11:18 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

Its me - its the JOIN. I now have to figure out how to just use the derived table as a column.

Roy Lambert
Mon, May 10 2010 6:38 AMPermanent Link

John Hay

Roy,

It looks like a derived table is needed.  I'm not sure if this WAD or a bug
but the first example gives "incorrect" results and the second is correct.

SELECT Field1,Field2,Count(*) AS Cnt,
Count(*)/(SELECT Count(*) FROM Table1 t1 WHERE t1.Field1=Table1.Field1) AS
DivCnt
FROM Table1
GROUP BY Field1,Field2


SELECT Field1,Field2,Cnt,Cnt/Cnt1 AS DivCnt FROM
(
SELECT Field1,Field2,Count(*) AS Cnt,
(SELECT Count(*) FROM Table1 t1 WHERE t1.Field1=Table1.Field1) AS Cnt1
FROM Table1)
GROUP BY Field1,Field2
) t


Your query would be something like (maybe Smiley

SELECT
_PNO,_Result, _Analysis, _Count,CAST(100*_ Count / _ResultCount AS
NUMERIC(0,1)) AS _Percentage
FROM
(
SELECT
_PNO,
COALESCE(_Result,'-') AS _Result,
COALESCE(_Analysis,'-') AS _Analysis,
COUNT(*) AS _Count,
(
SELECT _Result, COUNT(*) AS _Total
FROM CallStats T
WHERE
_fkProjects = :ProjectID
AND
_Date BETWEEN :StartDate AND :EndDate
AND
COALESCE(T._Result,'-') = COALESCE(CallStats._Result,'-')
GROUP BY _PNO,COALESCE(_Result,'-')
) AS _ResultCount,
FROM CallStats
WHERE
_fkProjects = :ProjectID
and _pno = 'P'
AND
_Date BETWEEN :StartDate AND :EndDate
GROUP BY _PNO,COALESCE(_Result,'-'),COALESCE(_Analysis,'-') ) t1

John

Mon, May 10 2010 7:04 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

John


What I ended up with (and it works <vbg>) is

SELECT
_PNO,
COALESCE(_Result,'-') AS _Result,
COALESCE(_Analysis,'-') AS _Analysis,
COUNT(*) AS _Count,
CAST(100 * COUNT(*) / T._Total AS NUMERIC(0,1)) AS _Percentage
FROM CallStats
JOIN
(
 SELECT COALESCE(_Result,'-') AS _Result, COUNT(*) AS _Total
 FROM CallStats
 WHERE
 _fkProjects = :ProjectID
 AND
 _PNO = :PNO
 AND
 _Date BETWEEN :StartDate AND :EndDate
 GROUP BY _PNO,COALESCE(_Result,'-')
) T ON COALESCE(T._Result,'-') = COALESCE(_Result,'-')
WHERE
_fkProjects = :ProjectID
AND
_PNO = :PNO
AND
_Date BETWEEN :StartDate AND :EndDate
GROUP BY _PNO,COALESCE(_Result,'-'),COALESCE(_Analysis,'-')



Roy Lambert
Mon, May 10 2010 1:29 PMPermanent Link

Tim Young [Elevate Software]

Elevate Software, Inc.

Avatar

Email timyoung@elevatesoft.com

John,

<< It looks like a derived table is needed.  I'm not sure if this WAD or a
bug but the first example gives "incorrect" results and the second is
correct. >>

I'll have to check this out with some data before I can comment in detail.
At first glance it seems like a WAD, because I think that you're wanting the
SELECT COUNT(*) sub-query to be grouped, and there's nothing that would
cause it be necessarily grouped.  IOW, it's a very ambiguous query in terms
of what EDB should do with it. Smiley

--
Tim Young
Elevate Software
www.elevatesoft.com
Tue, May 11 2010 3:45 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

Tim

>I'll have to check this out with some data before I can comment in detail.
>At first glance it seems like a WAD, because I think that you're wanting the
>SELECT COUNT(*) sub-query to be grouped, and there's nothing that would
>cause it be necessarily grouped. IOW, it's a very ambiguous query in terms
>of what EDB should do with it. Smiley

I thought the GROUP BY in the sub-query would do that. Essentially what I want is

a1, sum(a)
a2, sum(a)
a3, sum(a)
b1, sum(b)
b2, sum(b)
b3, sum(b)

so I can would out percentages.

The one I posted works, but as I emailed you I have to close and unprepare the query each loop.

Roy Lambert
Tue, May 11 2010 4:05 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

Tim


The data is the same as the stuff I emailed you.

Roy Lambert
Tue, May 11 2010 5:21 AMPermanent Link

John Hay

Tim
> At first glance it seems like a WAD, because I think that you're wanting
the
> SELECT COUNT(*) sub-query to be grouped, and there's nothing that would
> cause it be necessarily grouped.  IOW, it's a very ambiguous query in
terms
> of what EDB should do with it. Smiley

I don't think it is ambiguous.  It evaulates the subquery correctly if it
not part of a further calculation eg

SELECT Field1,Field2,Count(*) AS Cnt,
(SELECT Count(*) FROM Table1 t1 WHERE t1.Field1=Table1.Field1),   <--- this
is right
Count(*)/(SELECT Count(*) FROM Table1 t1 WHERE t1.Field1=Table1.Field1) AS
DivCnt   <--- this is wrong
FROM Table1
GROUP BY Field1,Field2

John

Tue, May 11 2010 3:06 PMPermanent Link

Tim Young [Elevate Software]

Elevate Software, Inc.

Avatar

Email timyoung@elevatesoft.com

John,

<< I don't think it is ambiguous.  It evaulates the subquery correctly if it
not part of a further calculation eg >>

Yes, but the context of a SELECT expression changes when you combine it with
an aggregate expression in the parent query.  It then becomes an expression
that isn't being evaluated on a row-by-row basis, but rather one that is
evaluated along with the aggregate expression.

Granted, I'm talking about how EDB works here, not at a higher level.

Give me a couple of days to look at this and I'll get back to you on this.
Right now I've got a 4-5 day backlog of support issues, and they're piling
up quick, so these types of issues aren't particularly urgent right now.

--
Tim Young
Elevate Software
www.elevatesoft.com
Wed, May 12 2010 6:03 AMPermanent Link

John Hay

Tim

> Yes, but the context of a SELECT expression changes when you combine it
with
> an aggregate expression in the parent query.  It then becomes an
expression
> that isn't being evaluated on a row-by-row basis, but rather one that is
> evaluated along with the aggregate expression.

Ahh.  Now I see where the results are coming from - I couldn't work it
before.
If this means that correlated sub queries are executed on a row by row basis
for aggregate outer queries then I guess there is some scope for optmization
(geeky interest rather than a request <bg>).

> Give me a couple of days to look at this and I'll get back to you on this.
> Right now I've got a 4-5 day backlog of support issues, and they're piling
> up quick, so these types of issues aren't particularly urgent right now.

No problem.  There are work arounds and I guess that kind of query isn't
used that much.

John

Page 1 of 2Next Page »
Jump to Page:  1 2
Image