Login ProductsSalesSupportDownloadsAbout |
Home » Technical Support » ElevateDB Technical Support » Support Forums » ElevateDB SQL » View Thread |
Messages 1 to 10 of 10 total |
Sensitive dataset with subselect selecting |
Thu, Mar 3 2011 9:06 AM | Permanent Link |
Roy Lambert NLH Associates Team Elevate | I apologise for the horrible title
Background Part of my filesharing app is an email subsystem. I move emails from live to an archive table after they pass a user selected age. The list of emails for each mailbox is created when a mailbox is clicked in the tree and displayed using a dbgrid (based on Mike Skolnik's). Speed is good even over the wireless parts of my LAN. The query I use is SELECT IF(_Private,$hgPrivate,-1) AS _xPrivacy, IF(_aList IS NOT NULL, $hgAttachment,-1) AS _xAttachment, IF(_fkContacts IS NOT NULL AND _fkContacts > 0, IF(_InOutInd = $MailIn,$hgTfRMailIn,$hgTfRMailOut), IF(_InOutInd = $MailIn,$hgMailIn,$hgMailOut)) AS _xMailDir, CAST(RCF(_emDestination,_Destination,', ') AS VARCHAR(110)) AS _xDestination, CAST(RCF(_emSender,_Sender,', ') AS VARCHAR(110)) AS _xSender, CAST(RCF(_Comments,_Subject,' § ') AS VARCHAR(210)) AS _xSubject, DTF(_Timestamp) AS _xTimestamp, BytesToKbMb(_Size) AS _xSize, (SELECT _Monitored FROM MailBoxes WHERE _BoxNo = :Mailbox) AND NOT EXISTS (SELECT * FROM emReadStatus WHERE _fkEMails = _MsgNo and emReadStatus._fkStaff = $StaffID) AS _xUnRead, EXISTS (SELECT _Blacklisted FROM emBlacklist WHERE _Blacklisted = _emSender) AS _xBlacklisted, * FROM $source WHERE _fkMailBoxes = :MailBox $UsePrivacy AND (_fkStaff IS NULL OR _fkStaff = $StaffID OR _Private IS NULL OR _Private = FALSE) ORDER BY _TimeStamp DESC All the $... bits are replaced in SQL.Text when the form is opened. EG $source will become either EMails or emArchive I also have the ability to set a filter. Since most of the emails will have an attached document (a cv) and generally its the content of these attachments that's being searched, and they are not indexed the filters can take 20-40 mins to run and are generated in a background thread which writes out filter ID and email ID to a FilterResults table. To select the filtered emails the tree is loaded with available filters instead of mailboxes and when one of those is clicked a slightly modified version of the above query is run. The difference is that the WHERE clause becomes _MsgNo IN (SELECT _fkID FROM FilterResults WHERE _fkFilters = :emSearch) This means that I get a sensitive result set based on emails or emArchive. It has two advantages 1. all of the operations that can be carried out both for "normal" email view and filtered view are the same 2. I have an extra button to allow a row to be deleted from the FilterResults table - leaves the email alone but removes from the filtered results. The Problem Because of the SELECT in the WHERE clause the query takes time to run. That's not really a problem but if the button to remove an email from the filtered results is clicked the query is reevaluated and that is the problem. I can live with it taking upto 60 seconds from the specific filter being selected to the display being ready but not upto 60 seconds each time someone clicks remove this one from the filter results. I can't use a join for two reasons 1. it would produce an insensitive result set and I wouldn't be able to remove a email from the displayed results 2. because its having to drag all the data across and write out a table its a lot (think minutes) slower I'm trying to avoid a separate form or a form with lots of "if its filtered do this otherwise do that"s but I think that's where I'll end up unless someone has a brilliant idea. I shall now sit back and wait for the brilliant ideas to pour in! Roy Lambert Roy Lambert |
Thu, Mar 3 2011 10:42 AM | Permanent Link |
John Hay | Roy
I am not sure if it would help but could you have a table (memory?) with hidden ids in it? The delete burtton adds the _msgno to the hidden table and the where clause checks wether _msgno is in the hidden table. John |
Fri, Mar 4 2011 4:39 AM | Permanent Link |
Roy Lambert NLH Associates Team Elevate | John
>I am not sure if it would help but could you have a table (memory?) with hidden ids in it? The delete burtton adds the >_msgno to the hidden table and the where clause checks wether _msgno is in the hidden table. Interesting idea John. That might let me use a JOIN, combine that with creating a master-detail relationship to the Emails/emArchive table so I only have to move the visible in the grid data to the canned dataset and it might just work. Oh what a fun weekend I'm going to have. Roy Lambert |
Fri, Mar 4 2011 10:39 AM | Permanent Link |
Tim Young [Elevate Software] Elevate Software, Inc. timyoung@elevatesoft.com | Roy,
<< I also have the ability to set a filter. Since most of the emails will have an attached document (a cv) and generally its the content of these attachments that's being searched, and they are not indexed the filters can take 20-40 mins to run and are generated in a background thread which writes out filter ID and email ID to a FilterResults table. >> Is there any particular reason that you're not using the text indexing for this ? -- Tim Young Elevate Software www.elevatesoft.com |
Fri, Mar 4 2011 11:30 AM | Permanent Link |
Roy Lambert NLH Associates Team Elevate | Tim
>Is there any particular reason that you're not using the text indexing for >this ? We're talking email attachments. EMails of interest are transferred to ELN which is the table that relates to contacts the ones left in emails are the unsolicitated (non-spam) stuff. The reasons are 1. Disk space 2. Time required for indexing vs frequency of searching 3. When I first set things up I wasn't sure how to handle the attachments which could have a .pdf a .doc and a .xls all in the attachment in terms of the filter type column No. 2 is probably the most important. I would probably want to create an intermediate file to download emails into prior to storing in the real table otherwise it would just slow things down to much. Roy Lambert |
Sun, Mar 6 2011 9:46 AM | Permanent Link |
Roy Lambert NLH Associates Team Elevate | John
I have things working nice and fast now. Its more complex than your idea. Just for info what I do is: 1. SELECT LIST(CAST(_fkID AS VARCHAR)) AS FList FROM FilterResults WHERE _fkFilters = :filter 2. Use the result of the above to create the query on emails. I have to "rewrite" the query each time a different filter is selected because the WHERE clause is WHERE _MsgNo IN (...) and you can't pass the in-list as a parameter 3. If I delete one of the emails or remove from the filter add _MsgNo to a stringlist and set query.filter := NOT _MsgNo IN invisible.commatext Totally different to your suggestion but without it I would never have thought of the approach so thanks. Roy Lambert |
Mon, Mar 7 2011 10:28 AM | Permanent Link |
Tim Young [Elevate Software] Elevate Software, Inc. timyoung@elevatesoft.com | Roy,
<< We're talking email attachments. EMails of interest are transferred to ELN which is the table that relates to contacts the ones left in emails are the unsolicitated (non-spam) stuff. The reasons are 1. Disk space 2. Time required for indexing vs frequency of searching 3. When I first set things up I wasn't sure how to handle the attachments which could have a .pdf a .doc and a .xls all in the attachment in terms of the filter type column >> Well, remember that the filter type is just a "hint" to your text filtering module. If the text that you're filtering is in MIME format, then it will already have content types that indicate how the various content is encoded. IOW, you could just set the filter type column to "MIME". << No. 2 is probably the most important. I would probably want to create an intermediate file to download emails into prior to storing in the real table otherwise it would just slow things down to much. ?? >> As long as emails are being stored once, and then only retrieved, the only real catch is getting the block size okay. We store all of our emails in a DBISAM table with a BLOB block size of 512 bytes, and it would work equally well for ElevateDB. We literally have GBs of BLOBs stored in them - all inbound and outbound emails with attachments. -- Tim Young Elevate Software www.elevatesoft.com |
Mon, Mar 7 2011 11:50 AM | Permanent Link |
Roy Lambert NLH Associates Team Elevate | Tim
>Well, remember that the filter type is just a "hint" to your text filtering >module. If the text that you're filtering is in MIME format, then it will >already have content types that indicate how the various content is encoded. >IOW, you could just set the filter type column to "MIME". I know that now but this was all set up when I was converting from ElevateDB to DBISAM and full text indexing was very (and I do mean very) confusing. ><< No. 2 is probably the most important. I would probably want to create an >intermediate file to download emails into prior to storing in the real table >otherwise it would just slow things down to much. ?? >> > >As long as emails are being stored once, and then only retrieved, the only >real catch is getting the block size okay. We store all of our emails in a >DBISAM table with a BLOB block size of 512 bytes, and it would work equally >well for ElevateDB. We literally have GBs of BLOBs stored in them - all >inbound and outbound emails with attachments. I disagree, in terms of receiving emails in, checking for spam and storing to disk adding in full text indexing for attachments (I already have it in place for the message body) would substantially increase the time taken for each email. I know ElevateDB is a lot faster than DBISAM for FTI but it would impose a significant overhead. However, a wild idea has just been sparked off - an extra column which on receipt of the email is null (or X or something) have an extra thread which checks for these rows and alters the column to M and use that column to determine wether to index attachments or not by having a text filter which simply returns and empty string. What do you think? Roy Lambert |
Mon, Mar 7 2011 12:00 PM | Permanent Link |
Roy Lambert NLH Associates Team Elevate | Woops
ElevateDB to DBISAM should read DBISAM to ElevateDB Roy Lambert |
Sun, Mar 13 2011 5:43 PM | Permanent Link |
Tim Young [Elevate Software] Elevate Software, Inc. timyoung@elevatesoft.com | Roy,
<< However, a wild idea has just been sparked off - an extra column which on receipt of the email is null (or X or something) have an extra thread which checks for these rows and alters the column to M and use that column to determine wether to index attachments or not by having a text filter which simply returns and empty string. >> Sure, that would work great. Having the filter return an empty string would effectively make it invisible to the text indexing, and any modifications that turned it back on would simply cause the text indexing to pick up the new text like it was just added by the application or user. -- Tim Young Elevate Software www.elevatesoft.com |
This web page was last updated on Tuesday, April 30, 2024 at 03:55 PM | Privacy PolicySite Map © 2024 Elevate Software, Inc. All Rights Reserved Questions or comments ? E-mail us at info@elevatesoft.com |