Icon View Thread

The following is the text of the current message along with any replies.
Messages 1 to 10 of 11 total
Thread Speed up wanted
Thu, Jun 21 2012 1:42 PMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

Anyone suggest a way to improve the speed on this - it currently takes 1341.78 seconds


SELECT
SUBSTR(_emSender,POS('@',_emSender)+1,Length(_emSender)) AS Domain,
COUNT(*) AS InSpam,
(
SELECT COUNT(*)
FROM Contacts C1 WHERE
C1._HomeEMail IS NOT NULL
AND
SUBSTR(SpamStore._emSender,POS('@',SpamStore._emSender)+1,Length(SpamStore._emSender))
=
SUBSTR(C1._HomeEMail,POS('@',C1._HomeEMail)+1,Length(C1._HomeEMail))
) AS HomeEMail,
(
SELECT COUNT(*)
FROM Career C2 WHERE
C2._EMail IS NOT NULL
AND
SUBSTR(SpamStore._emSender,POS('@',SpamStore._emSender)+1,Length(SpamStore._emSender))
=
SUBSTR(C2._EMail,POS('@',C2._EMail)+1,Length(C2._EMail))
) AS WorkEMail,
SUBSTR(_emSender,POS('@',_emSender)+1,Length(_emSender)),
(
SELECT COUNT(*)
FROM EMails C3 WHERE
C3._emSender IS NOT NULL
AND
SUBSTR(SpamStore._emSender,POS('@',SpamStore._emSender)+1,Length(SpamStore._emSender))
=
SUBSTR(C3._emSender,POS('@',C3._emSender)+1,Length(C3._emSender))
)
+             
(
SELECT COUNT(*)
FROM emArchive C4 WHERE
C4._emSender IS NOT NULL
AND
SUBSTR(SpamStore._emSender,POS('@',SpamStore._emSender)+1,Length(SpamStore._emSender))
=
SUBSTR(C4._emSender,POS('@',C4._emSender)+1,Length(C4._emSender))
)
AS EMails
FROM SpamStore
WHERE _emSender IS NOT NULL
GROUP BY SUBSTR(_emSender,POS('@',_emSender)+1,Length(_emSender))
HAVING COUNT(*) > 20



Roy Lambert
Fri, Jun 22 2012 4:12 AMPermanent Link

John Hay


Roy

How long does the following take to run?

SELECT
SUBSTR(_emSender,POS('@',_emSender)+1,Length(_emSender)) AS Domain,
COUNT(*) AS InSpam,
FROM SpamStore
WHERE _emSender IS NOT NULL
GROUP BY SUBSTR(_emSender,POS('@',_emSender)+1,Length(_emSender))
HAVING COUNT(*) > 20

John

Fri, Jun 22 2012 4:50 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

John


0.218 secs

Roy Lambert
Fri, Jun 22 2012 7:08 AMPermanent Link

John Hay

Roy

Not absolutely sure if I have a typo, but what about trying to use a derived table for Spamstore like

SELECT
Domain,InSpam
(
SELECT COUNT(*)
FROM Contacts C1 WHERE
C1._HomeEMail IS NOT NULL
AND
SS.Domain
=
SUBSTR(C1._HomeEMail,POS('@',C1._HomeEMail)+1,Length(C1._HomeEMail))
) AS HomeEMail,
(
SELECT COUNT(*)
FROM Career C2 WHERE
C2._EMail IS NOT NULL
AND
SS.Domain
=
SUBSTR(C2._EMail,POS('@',C2._EMail)+1,Length(C2._EMail))
) AS WorkEMail,
SUBSTR(_emSender,POS('@',_emSender)+1,Length(_emSender)),
(
SELECT COUNT(*)
FROM EMails C3 WHERE
C3._emSender IS NOT NULL
AND
SS.Domain
=
SUBSTR(C3._emSender,POS('@',C3._emSender)+1,Length(C3._emSender))
)
+
(
SELECT COUNT(*)
FROM emArchive C4 WHERE
C4._emSender IS NOT NULL
AND
SS.Domain
=
SUBSTR(C4._emSender,POS('@',C4._emSender)+1,Length(C4._emSender))
)
AS EMails
FROM
(SELECT SUBSTR(_emSender,POS('@',_emSender)+1,Length(_emSender)) AS Domain,COUNT(*) AS InSpam
FROM SpamStore
WHERE _emSender IS NOT NULL
GROUP BY SUBSTR(_emSender,POS('@',_emSender)+1,Length(_emSender))
HAVING COUNT(*) > 20) SS

John

Fri, Jun 22 2012 9:06 AMPermanent Link

John Hay

Roy

If the having clause does not exclude many records then you could try joins eg


SELECT
SUBSTR(_emSender,POS('@',_emSender)+1,Length(_emSender)) Domain,
COUNT(*)  InSpam,
COALESCE(HomeEMail,0) HomeEmail,
COALESCE(WorkEMail,0) WorkEmail,
COALESCE(EMails,0)+COALESCE(EMails1,0) Emails
FROM SpamStore SS
LEFT OUTER JOIN
(
SELECT SUBSTR(_HomeEMail,POS('@',C1._HomeEMail)+1,Length(C1._HomeEMail)) HomeDom,COUNT(*) AS HomeEMail
FROM Contacts WHERE
C1._HomeEMail IS NOT NULL
GROUP BY HomeDom
) T1
ON SUBSTR(SS._emSender,POS('@',_emSender)+1,Length(_emSender)) =T1.HomeDom
LEFT OUTER JOIN
(
SELECT SUBSTR(C2._EMail,POS('@',C2._EMail)+1,Length(C2._EMail)) WorkDom,COUNT(*) AS WorkEmail
FROM Career C2 WHERE
C2._EMail IS NOT NULL
GROUP BY WorkDom
) T2
ON SUBSTR(SS._emSender,POS('@',_emSender)+1,Length(_emSender)) =T2.WorkDom
LEFT OUTER JOIN
(
SELECT SUBSTR(C3._emSender,POS('@',C3._emSender)+1,Length(C3._emSender)) EmailDom,COUNT(*) AS Emails
FROM EMails C3 WHERE
C3._emSender IS NOT NULL
GROUP BY EmailDom
) T3
ON SUBSTR(SS._emSender,POS('@',_emSender)+1,Length(_emSender)) =T3.EmailDom
LEFT OUTER JOIN
(
SELECT SUBSTR(C4._emSender,POS('@',C4._emSender)+1,Length(C4._emSender)) EmailDom1,COUNT(*) AS Emails1
FROM emArchive C4 WHERE
C4._emSender IS NOT NULL
GROUP BY EmailDom1
) T4
ON SUBSTR(SS._emSender,POS('@',_emSender)+1,Length(_emSender)) =T4.EmailDom1

WHERE _emSender IS NOT NULL
GROUP BY Domain
HAVING COUNT(*) > 20

John

Fri, Jun 22 2012 9:56 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

John

I've tried JOINs and they were slower. I think its the row scanning forced by the SUBSTRs. I tried your sub table idea but couldn't get it to work. I'm now trying a number of temporary tables. If it works I'll tie the lot together in a VIEW.


Roy Lambert
Fri, Jun 22 2012 10:47 AMPermanent Link

John Hay

Roy

> I've tried JOINs and they were slower. I think its the row scanning forced by the SUBSTRs. I tried your sub table idea
but couldn't get it to work. I'm now trying a number of temporary tables. If it works I'll tie the lot together in a
VIEW.

Did you try it the way I coded it?

I found that if the target of a join was a substr it ran very slow (ie I cancelled the query)

ON SUBSTR(SS._emSender,POS('@',_emSender)+1,Length(_emSender))
=SUBSTR(C3._emSender,POS('@',C3._emSender)+1,Length(C3._emSender))

If it was not it ran in 2 to 3 seconds

ON SUBSTR(SS._emSender,POS('@',_emSender)+1,Length(_emSender)) =T2.WorkDom

I think the subtable wasn't working because there is a stray SUBSTR(_emSender,POS('@',_emSender)+1,Length(_emSender)),
after WorkEmail.  If the having clause did not reduce the number of rows it wouldn't help anyway.

John

Fri, Jun 22 2012 11:10 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

John


From 22 minutes to sub second SmileyEven over the LAN using wireless (802.11G) still only 24 seconds

SCRIPT
BEGIN

BEGIN
EXECUTE IMMEDIATE 'DROP TABLE BadStuff';
EXCEPTION
-- Couldn't drop it so either not there or in use
END;

BEGIN
EXECUTE IMMEDIATE 'CREATE TEMPORARY TABLE BadStuff AS
 SELECT COUNT(*) AS BadUns, BadDomain
 FROM
 (
 SELECT
 SUBSTR(_emSender,POS(''@'',_emSender)+1,Length(_emSender)) AS BadDomain
 FROM SpamStore
 UNION ALL
 SELECT
 SUBSTR(_Blacklisted,POS(''@'',_Blacklisted)+1,Length(_Blacklisted))  AS BadDomain
 FROM emBlacklist
 ) BadStuff
 WHERE
  BadDomain IS NOT NULL
  AND
  BadDomain <> ''nlh.co.uk''
 GROUP BY BadDomain
 WITH DATA';
EXCEPTION
-- Can't create it so probably there and in use - at least hope so
END;

BEGIN
EXECUTE IMMEDIATE 'DROP TABLE HomeEMails';
EXCEPTION
-- Couldn't drop it so either not there or in use
END;

BEGIN
EXECUTE IMMEDIATE 'CREATE TEMPORARY TABLE HomeEMails AS
 SELECT
 COUNT(*) AS HomeEMail,
 SUBSTR(_HomeEMail,POS(''@'',_HomeEMail)+1,Length(_HomeEMail)) AS Domain
 FROM Contacts
 WHERE
  _HomeEMail IS NOT NULL
  AND
  _HomeEMail NOT LIKE ''%nlh.co.uk''
 GROUP BY Domain
 WITH DATA';
EXCEPTION
-- Can't create it so probably there and in use - at least hope so
END;
  
BEGIN
EXECUTE IMMEDIATE 'DROP TABLE WorkEMails';
EXCEPTION
-- Couldn't drop it so either not there or in use
END;

BEGIN
EXECUTE IMMEDIATE 'CREATE TEMPORARY TABLE WorkEMails AS
 SELECT
 COUNT(*) AS WorksEMail,
 SUBSTR(_EMail,POS(''@'',_EMail)+1,Length(_EMail)) AS Domain
 FROM Career
 WHERE
  _EMail IS NOT NULL
  AND
  _EMail NOT LIKE ''%nlh.co.uk''
 GROUP BY Domain
 WITH DATA';
EXCEPTION
-- Can't create it so probably there and in use - at least hope so
END;

BEGIN
EXECUTE IMMEDIATE 'DROP TABLE OtherEMails';
EXCEPTION
-- Couldn't drop it so either not there or in use
END;

BEGIN
EXECUTE IMMEDIATE 'CREATE TEMPORARY TABLE OtherEMails AS
 SELECT COUNT(*) AS OtherEMails, Domain
 FROM
 (
 SELECT
 SUBSTR(_emSender,POS(''@'',_emSender)+1,Length(_emSender)) AS Domain
 FROM EMails
 UNION ALL
 SELECT
 SUBSTR(_emSender,POS(''@'',_emSender)+1,Length(_emSender))  AS Domain
 FROM emArchive
 ) GoodStuff
 WHERE
  Domain IS NOT NULL
  AND
  Domain <> ''nlh.co.uk''
 GROUP BY Domain
 WITH DATA';
EXCEPTION
-- Can't create it so probably there and in use - at least hope so
END;

BEGIN
EXECUTE IMMEDIATE 'DROP TABLE DomainAnalysis';
EXCEPTION
-- Couldn't drop it so either not there or in use
END;

BEGIN
EXECUTE IMMEDIATE 'CREATE TEMPORARY TABLE DomainAnalysis AS
 SELECT
 BadDomain,
 COALESCE(BadUns,0) AS BadUns,
 COALESCE(HomeEmail,0) AS Home,
 COALESCE(WorksEmail,0)AS Work,
 COALESCE(OtherEmails,0)AS Other
 FROM BadStuff
 LEFT OUTER JOIN HomeEMails H ON H.Domain = BadDomain
 LEFT OUTER JOIN WorkEMails W ON W.Domain = BadDomain
 LEFT OUTER JOIN OtherEMails O ON O.Domain = BadDomain
 WITH DATA';
EXCEPTION
-- Can't create it so probably there and in use - at least hope so
END;

END



followed by
SELECT * FROM DomainAnalysis
WHERE BadUns > 20
ORDER BY BadUns DESC

and I have the same display

I've now split it up into several views. At last I've found a use for views - yippee

Roy Lambert
Fri, Jun 22 2012 11:16 AMPermanent Link

Roy Lambert

NLH Associates

Team Elevate Team Elevate

John


I'll have another go with your sub-table concept but I'm well chuffed with what I've achieved so far.

Roy Lambert
Fri, Jun 22 2012 11:26 AMPermanent Link

John Hay

Roy

The derived table joins should in theory be very similar to what you have done in your script.  Do you know if the
"Group By" index is kept in the temporary tables and then used in the final join?

The nice thing about your script is it very obvious what is being done, and will much easier to maintain/extend if
required.

John

Page 1 of 2Next Page »
Jump to Page:  1 2
Image