Messages 1 to 10 of 31 total
Thread #8963 BLOB block buffers corrupt
Wed, May 11 2011 5:41 AM

Dave Sellers

Hi

I know this has been covered before, I've read many of the previous
posts but...

I'm using 4.29 build 2 and have been since soon after it was released.  

We used to get a lot of instances of this problem and v4.29 cured them
as I recall and they've been rare since then - occasional instances
could usually be traced to a reboot or network issue.  However, over
the last month or so they have come back to haunt us and are driving us
crazy.  We have literally hundreds of databases and each month we add
between 1.5M and 2M rows to Full Text Indexed tables - some of the
larger ones now have 8GB+ .BLB files.  When they get corrupted we have
found that repopulating from scratch is quicker than a repair so we do
that but it can take 2-3 days to repopulate.

I have blamed antivirus interference during population.
I have blamed backups running during population.
I have blamed other instances of our software opening the database
during population (these are all single user, not C/S apps)

The application that populates the tables typically accesses the
database files across the LAN and I'm beginning to think that this is
perhaps at the root of our problem.

The inserts are processed in transactions of 100 records; I think I
read in a previous post that Commit calls FlushBuffers.
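The batching described could look roughly like this; a hedged sketch only, with table, field and variable names purely illustrative (in DBISAM 4 a transaction Commit flushes buffers by default, which would be the FlushBuffers behaviour mentioned):

```pascal
// Sketch: insert BLOBs in batches of 100, committing each batch.
// Database1/Table1/Files are hypothetical names, not from the real app.
const
  BatchSize = 100;
var
  i: Integer;
begin
  Database1.StartTransaction;
  try
    for i := 0 to Files.Count - 1 do begin
      Table1.Append;
      TBlobField(Table1.FieldByName('Body')).LoadFromFile(Files[i]);
      Table1.Post;
      if (i + 1) mod BatchSize = 0 then begin
        Database1.Commit; // flushes buffers, then start the next batch
        Database1.StartTransaction;
      end;
    end;
    Database1.Commit; // commit any final partial batch
  except
    Database1.Rollback;
    raise;
  end;
end;
```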

Is network access a bad idea for such intensive operations? Can anyone
give any suggestions for other possible causes?

Any help gratefully received; this is driving my customer nuts and
they, in turn, are driving me nuts.

Regards
Dave
Wed, May 11 2011 5:43 AM

Dave Sellers

I should have added that, according to the IT dept, antivirus and
backups do not interfere with the database files.
Wed, May 11 2011 6:19 AM

Roy Lambert

NLH Associates

Team Elevate

Dave


I moved to ElevateDB and stopped using DBISAM around 4.25, so I'm out of date. I do have a couple of questions:

1. Can you produce a test case where you can reliably reproduce the problem?
2. Are you using threads? If so, is everything in the thread isolated properly?

A couple of comments:

8GB is a lot of data to be shovelling around over a LAN. C/S may well be better/faster, and it's possible to build single-user C/S apps.

If you think there's a chance of other users opening the app then build in something to stop them: use semaphore flags, write/delete a lock file, or open the tables exclusively.
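The "open the tables exclusively" option is the simplest of the three; a minimal sketch, assuming a standard TDBISAMTable (component name illustrative):

```pascal
// Sketch: a DBISAM table opened with Exclusive = True cannot be
// shared - if another instance already has it open, Open raises an
// exception instead of silently sharing the files.
Table1.Exclusive := True;
try
  Table1.Open;
except
  on E: Exception do
    ShowMessage('Table is in use by another instance: ' + E.Message);
end;
```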

Question 1 above is the really important one. Without a sample, Tim (or anyone else) can only guess at the problem.

Roy Lambert [Team Elevate]
Wed, May 11 2011 7:28 AM

Dave Sellers

Hi Roy

Sadly a test case has proved impossible to organize thus far.

My feeling is also that there is simply too much LAN traffic to be
reliable.  In the absence of any better ideas what I propose is:

Rename the database folder to ensure exclusive access.
Copy it to a local drive (it's a gigabit LAN so the time hit isn't an issue)
Process the local copy.
If no errors, copy it back across the LAN and delete the renamed
version.

I suspect the performance increase of working on a local drive may well
outweigh the copy overheads, but even if it doesn't, I get the double
bonus of a backup copy should the update hit block buffer corruption or
other errors.  A few minutes lost is nothing compared to the
disruption of corrupted files!
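The four steps above could be sketched along these lines; all paths and names are hypothetical, and the copy-back/cleanup is only outlined:

```pascal
// Sketch of the rename / copy-local / process / copy-back workflow.
// MoveFile renames the remote folder so no other instance can open it;
// the renamed original doubles as the backup if processing fails.
uses Windows, SysUtils;

procedure ProcessLocally(const RemoteDir, LocalDir: string);
var
  Staging: string;
  SR: TSearchRec;
begin
  Staging := RemoteDir + '.inuse';
  if not MoveFile(PChar(RemoteDir), PChar(Staging)) then
    RaiseLastOSError; // rename failed - possibly already in use
  ForceDirectories(LocalDir);
  if FindFirst(Staging + '\*.*', faAnyFile, SR) = 0 then
  try
    repeat
      if (SR.Attr and faDirectory) = 0 then
        CopyFile(PChar(Staging + '\' + SR.Name),
                 PChar(LocalDir + '\' + SR.Name), False);
    until FindNext(SR) <> 0;
  finally
    FindClose(SR);
  end;
  // ... populate the local copy here ...
  // On success: copy the files back to RemoteDir, then delete Staging.
  // On failure: Staging is the untouched backup.
end;
```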

Thanks for your quick response.

Regards
Dave
Wed, May 11 2011 8:21 AM

Roy Lambert

NLH Associates

Team Elevate

Dave

>Sadly a test case has proved impossible to organize thus far.

I know the feeling. Years ago I had something similar (don't remember what version but I think 3.xx) where it was a problem caused by background threads. Neither I nor Tim could eyeball a problem with my code and it took months before I could generate a test case. Tim fixed it fast then.

>My feeling is also that there is simply too much LAN traffic to be
>reliable. In the absence of any better ideas what I propose is:

It could be, with all the buffers involved; it could even be that with 8GB blobs and (probably) bigger index files, DBISAM is stretched too far. I'd suggest emailing Tim directly and asking his opinion.

I have had problems with wireless connections but never with wired.

>Rename the database folder to ensure exclusive access.
>Copy it to local drive (it's a gigabit LAN so time hit not an issue)
>Process the local copy.
>If no errors, copy it back across the LAN and delete the renamed
>version.

If it does bomb at least you've eliminated the LAN as a culprit :-)

You didn't say if you're using threads or not, and I do have two other questions.

3. You're using full text indices - are you using a custom function or the built-in one? If the latter, you could get a considerable improvement by rolling your own.

4. Do you know when (roughly) it bombs? Is it, as you seem to be indicating, when you're adding data - can the users edit/add data?

Roy Lambert [Team Elevate]
Wed, May 11 2011 8:34 AM

Dave Sellers

>
> It could be with all the buffers involved, it could even be that 8GB
> blobs and (probably) bigger index files DBISAM is stretched too far.
> I'd suggest emailing Tim directly and asking his opinion.

If it doesn't crash, it works fine, provided you have a TDBISAMEngine
component with LargeFileSupport set to True.
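For anyone following along, the engine setting has to be applied while the engine is inactive; a minimal sketch (component name illustrative, and the DBISAM unit name may differ between versions):

```pascal
// Sketch: enable large (>4GB) table file support on the DBISAM engine.
// The property must be set before the engine is activated.
procedure ConfigureEngine;
begin
  DBISAMEngine1.Active := False;
  DBISAMEngine1.LargeFileSupport := True; // allow .BLB files beyond 4GB
  DBISAMEngine1.Active := True;
end;
```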

>
> I have had problems with wireless connections but never with wired.

Same here but this is a very busy network, literally millions of files
created/edited/copied around almost on a daily basis.

>
> If it does bomb at least you've eliminated the LAN as a culprit :-)

That's true.  I'm still far from convinced that their IT people have
actually ring-fenced these files from AV and/or backups.

> You didn't say if you're using threads or not, and I do have two
> other questions.

Sorry, single threaded.

>
> 3. you're using full text indices - are you using a custom function
> or the built in one. If the latter you could get a considerable
> improvement by rolling your own.

Well, I've had a Rubicon solution waiting to replace the FTI for over a
year, but can they get round to testing it.....  To answer the question
though (with another question): it's all standard, which part of it
would you hand-roll?

>
> 4. do you know when (roughly) it bombs? Is it, as you seem to be
> indicating when you're adding data - can the users edit / add data?

The process deletes a bunch of rows (SQL) and then adds a bunch in a
loop: Edit, Field.Value := x, Field.LoadFromFile(afile), Post.

Dave
Wed, May 11 2011 8:53 AM

Dave Sellers

> The process deletes a bunch of rows (SQL) and then adds a bunch in a
> loop: Edit, Field.Value := x, Field.LoadFromFile(afile), Post.

Should have added that it is during this process that the corruption is
encountered.

Dave
Wed, May 11 2011 9:01 AM

Roy Lambert

NLH Associates

Team Elevate

Dave


>Well I've had a rubicon solution waiting to replace the FTI for over a
>year but can they get round to testing it..... To answer the question
>though (with another question), it's all standard, which part of it
>would you hand roll?

It depends on the content that you're indexing. Whilst you can set a dirty great exclude list for words that aren't wanted, it may be better to combine that with a few other parameters. It can be amazing how much the index file shrinks. There's a bit of a trade-off between file size, improved index performance and processing time increase. I had MinWordLength set to 3 and MaxWordLength set to 30. It's what I use in my homebrew mail/news client and it's very effective.

>> 4. do you know when (roughly) it bombs? Is it, as you seem to be
>> indicating when you're adding data - can the users edit / add data?
>
>The process deletes a bunch of rows (SQL) and then adds a bunch in a
>loop: Edit, Field.Value := x, Field.LoadFromFile(afile), Post.

From that, you're mixing SQL and table operations - no reason why it shouldn't work, but we're dealing with computers here, so if the table is open when you're doing the SQL, try closing it and reopening it after the SQL has completed, maybe flushing buffers in between (closing and reopening the engine should do it).

A thought that's going through my mind is that DBISAM reuses record space, and this includes blobs. The free space table (or whatever Tim uses) could be getting stuffed. I don't know what DBISAM uses or how, but if it's being written asynchronously with the table data it would be very possible for it to get out of sync between the various buffers.

Roy Lambert [Team Elevate]





procedure Tmnd.ftiMassage(Sender: TObject; const TableName, FieldName: string; var TextToIndex: string);
var
Cntr: integer;
ThisUn: string;
Skip: boolean;
WorkStr: string;
sl: TStringList;
slCnt: integer;
MinWordLength: integer;
MaxWordLength: integer;
ftiWebStuff: boolean;
ftiHelp: TDBISAMTable;
ftiSession: TDBISAMSession;
URLandEmail: string;
const
Delims = [' ', ',', '.', ';', ':', '!', '"', '?', '(', ')', '/', '\', '>', '<', '[', ']', '}', '{'];
Alphas = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', '_', '-'];
function RemoveHTML(FormattedText: string): string;
var
 CharsToProcess: integer;
 CurrentChar: integer;
 TestChar: Char;
 function HTMLTag: string;
 var
  Tag: string;
  CRCheck: string;
  EqPos: integer;
  procedure FindClosingTag(WhatTag: string);
  var
   TagLength: integer;
   function NoTagMatch: boolean;
   var
    Cntr: integer;
   begin
    Result := False;
    for Cntr := 1 to TagLength do begin
     if WhatTag[Cntr] <> UpperCase(FormattedText[CurrentChar + Cntr]) then begin
      Result := True;
      Break;
     end;
    end;
   end;
  begin
   TagLength := Length(WhatTag);
   while (CurrentChar < CharsToProcess - TagLength) and (NoTagMatch) do inc(CurrentChar);
   CurrentChar := CurrentChar + TagLength + 1;
  end;
 begin
  Tag := '';
  while (FormattedText[CurrentChar] <> '>') and (CurrentChar <= CharsToProcess) do begin
   Tag := Tag + FormattedText[CurrentChar];
   inc(CurrentChar);
  end;
  Tag := Tag + FormattedText[CurrentChar]; //This should get the >
  Tag := UpperCase(Tag);
  if Tag = '<HEAD>' then begin
   FindClosingTag('</HEAD>');
  end else if Tag = '<XML>' then begin
   FindClosingTag('</XML>');
  end else if Tag = '<TITLE>' then begin
// We need to dump everything until the closing tag
   FindClosingTag('</TITLE>');
   Result := '';
  end else if Copy(Tag, 1, 6) = '<STYLE' then begin
   FindClosingTag('</STYLE>');
// We need to dump everything until the closing tag - especially css stuff
   Result := '';
  end else if Tag = '<BR>' then begin
   Result := #13#10;
  end else if Copy(Tag, 1, 2) = '<P' then begin
   Result := #13#10;
  end else if Tag = '</DIV>' then begin
   if CurrentChar < CharsToProcess then begin
    if (FormattedText[CurrentChar - 6] <> '>') and (FormattedText[CurrentChar - 10] <> '<') then Result := #13#10 else begin
     CRCheck := FormattedText[CurrentChar - 10] +
      FormattedText[CurrentChar - 9] +
      FormattedText[CurrentChar - 8] +
      FormattedText[CurrentChar - 7] +
      FormattedText[CurrentChar - 6];
     if UpperCase(CRCheck) <> '<DIV>' then Result := #13#10;
    end
   end else Result := '';
  end else if (Copy(Tag, 1, 3) = '</H') and (Tag[4] in ['0'..'9']) then begin
   Result := #13#10;
  end else Result := '';
 end;
 function SpecialChar: string;
 var
  HTMLChar: string;
 begin
  HTMLChar := '';
  while (FormattedText[CurrentChar] <> ';') and (CurrentChar <= CharsToProcess) do begin
   HTMLChar := HTMLChar + FormattedText[CurrentChar];
   inc(CurrentChar);
  end;
  HTMLChar := LowerCase(HTMLChar + FormattedText[CurrentChar]); //This should get the ;
  Result := '';
 end;
begin
 if 0 <> Pos('<html', LowerCase(FormattedText)) then begin
  Result := '';
  CharsToProcess := Length(FormattedText);
  CurrentChar := 1;
  while CurrentChar <= CharsToProcess do begin
   TestChar := FormattedText[CurrentChar];
   case TestChar of
    #0..#9, #11, #12, #14..#31: {do nothing};
    '<': Result := Result + HTMLTag;
    '&': Result := Result + SpecialChar;
   else Result := Result + TestChar;
   end;
   inc(CurrentChar);
  end;
 end else Result := FormattedText;
end;
function LineIsUUEncoded: boolean;
var
 uuCntr: integer;
const
 TableUU = '`!"#$%&''()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_';
begin
 Result := False;
 if (Length(WorkStr) > MinWordLength) and (((((Length(WorkStr) - 1) * 3) / 4) = Pos(WorkStr[1], TableUU) - 1) and (WorkStr[1] < 'a')) then begin
// Only if it hits here is there a possibility that its UUEncoded, but we need to check to make sure
  Result := True;
  for uuCntr := 1 to Length(WorkStr) do begin
   if 0 = Pos(WorkStr[uuCntr], TableUU) then begin
    Result := False;
    Break;
   end;
  end;
 end;
end;
function GetWebStuff: string;
var
 wsl: TStringList;
 wCntr: integer;
 lCntr: integer;
 Line: string;
 DelimPos: integer;
const
 wDelims = [' ', ',', ';', ':', '!', '"', '?', '(', ')', '/', '/', '>', '<', '[', ']', '}', '{'];
 function SeekEnd: string;
 var
  Chop: integer;
 begin
  Chop := lCntr + 1;
  while Chop <= Length(Line) do begin
   if (Line[Chop] in wDelims) then Break else inc(Chop);
  end;
  Result := Copy(Line, DelimPos, Chop - DelimPos) + ' ';
  Delete(Line, DelimPos, Chop - DelimPos);
  lCntr := DelimPos;
 end;
begin
{
Rules
1. If the line is less than 10 characters just forget it
2. Only interested in URLs starting www.
3. I'm only interested in the base URL ie if any / then chop before it
4. I'm only interested in one line at a time
}
 wsl := TStringList.Create;
 wsl.Text := TextToIndex;
 TextToIndex := '';
 Result := '';
 for wCntr := 0 to wsl.Count - 1 do begin
  DelimPos := 1;
  if Length(wsl[wCntr]) > 10 then begin
   Line := wsl[wCntr];
   lCntr := 1;
   while lCntr <= Length(Line) do begin
    if Line[lCntr] in wDelims then DelimPos := lCntr + 1;
    if Line[lCntr] = '@' then Result := Result + SeekEnd;
    if LowerCase(Copy(Line, lCntr, 4)) = 'www.' then Result := Result + SeekEnd;
    inc(lCntr);
   end;
   if Line <> '' then TextToIndex := TextToIndex + #13#10 + Line;
  end else TextToIndex := TextToIndex + #13#10 + wsl[wCntr];
 end;
 wsl.Free;
end;
procedure AddWordToBeIndexed;
begin
 if ThisUn = '' then Exit;
 if ThisUn[Length(ThisUn)] = '-' then Delete(ThisUn, Length(ThisUn), 1);
 if (Length(ThisUn) > MinWordLength) and (Length(ThisUn) <= MaxWordLength) then TextToIndex := TextToIndex + ThisUn + ' ';
 ThisUn := '';
end;
begin
if (TextToIndex = '') or (FieldName = '_Flags') then Exit;
if FieldName = '_Headers' then begin
 TextToIndex := GetWebStuff;
 Exit;
end;
ftiSession := MakeDBISAMSession;
ftiHelp := MakeDBISAMTable('ftiHelper', 'Memory', ftiSession);
ftiHelp.Open;
MinWordLength := ftiHelp.FieldByName('_MinWordLength').AsInteger;
MaxWordLength := ftiHelp.FieldByName('_MaxWordLength').AsInteger;
ftiWebStuff := ftiHelp.FieldByName('_ftiWebStuff').AsBoolean;
ftiHelp.Close;
ftiHelp.Free;
ftiSession.Free;
sl := TStringList.Create;
URLandEmail := GetWebStuff;
if FieldName = '_Message' then sl.Text := RemoveHTML(TextToIndex) else sl.Text := TextToIndex;
TextToIndex := '';
for slCnt := 0 to sl.Count - 1 do begin
 WorkStr := sl[slCnt];
 Skip := False;
 if LineIsUUEncoded then Break; // assumption is the rest of the message is UU stuff
 if Length(WorkStr) > MinWordLength then begin
  for Cntr := 1 to length(WorkStr) do begin
   if (not Skip) and (WorkStr[Cntr] in Alphas) then begin
    ThisUn := ThisUn + WorkStr[Cntr];
    Skip := false;
    if (Cntr = length(WorkStr)) or (WorkStr[Cntr + 1] in Delims) then AddWordToBeIndexed;
   end else begin
    if (Cntr = length(WorkStr)) or (WorkStr[Cntr + 1] in Delims) then AddWordToBeIndexed else begin
     Skip := true;
     ThisUn := '';
    end;
   end;
   if Skip then Skip := (Cntr < Length(WorkStr)) and (WorkStr[Cntr + 1] in Delims);
  end;
 end;
end;
if ftiWebStuff and (URLandEmail <> '') then TextToIndex := TextToIndex + URLandEmail;
if TextToIndex <> '' then Delete(TextToIndex, Length(TextToIndex), 1); // get rid of the trailing space
sl.Free;
end;
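For context, a handler with the signature above is normally attached to the engine's text-index filtering event so it runs whenever a full-text-indexed field is (re)indexed; a wiring sketch, assuming a DBISAM 4 style OnTextIndexFilter event and with the component name illustrative:

```pascal
// Sketch: attach the ftiMassage method above as the engine's text
// index filter. The engine must be inactive when the event is assigned.
procedure Tmnd.FormCreate(Sender: TObject);
begin
  DBISAMEngine1.Active := False;
  DBISAMEngine1.OnTextIndexFilter := ftiMassage;
  DBISAMEngine1.Active := True;
end;
```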
Wed, May 11 2011 9:23 PM

Gregory Sebastian

Hi Dave,
The database is huge. Have you checked that it's on an NTFS partition and
not FAT32? The maximum file size on a FAT32 partition is 4GB, so an
8GB .BLB file would exceed it.

I would try to just permanently put the database on a local hard drive,
since it's a single-user app. If you really must have it over the LAN, I
would seriously consider converting it to C/S as Roy already suggested. I
think either of these two changes alone should significantly improve
robustness and save you a lot of pain.

> The process deletes a bunch of rows (SQL) and then adds a bunch in a
> loop: Edit, Field.Value := x, Field.LoadFromFile(afile), Post.

Is this block (deletes, edit, Post) done within a single transaction? I
see no harm in it being a single transaction, provided the app doesn't
open a dialog to get any user input during an open transaction. If it
doesn't affect your database integrity, perhaps you can try splitting the
transactions, i.e. deletes in one transaction and the table loop/edit in
another transaction. Also, just in case you missed it, have you checked
that you disabled controls (DisableControls) while iterating the table?
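Both suggestions together might be sketched like this; component, table and field names are all hypothetical:

```pascal
// Sketch: run the SQL delete in its own transaction, then do the
// table-level load in a second one, with data-aware controls disabled
// while iterating so they don't repaint on every Post.
Database1.StartTransaction;
try
  Query1.SQL.Text := 'DELETE FROM docs WHERE batch = :b';
  Query1.ParamByName('b').AsInteger := BatchNo;
  Query1.ExecSQL;
  Database1.Commit;
except
  Database1.Rollback;
  raise;
end;

Table1.DisableControls;
try
  Database1.StartTransaction;
  try
    for i := 0 to Files.Count - 1 do begin
      Table1.Append;
      TBlobField(Table1.FieldByName('Body')).LoadFromFile(Files[i]);
      Table1.Post;
    end;
    Database1.Commit;
  except
    Database1.Rollback;
    raise;
  end;
finally
  Table1.EnableControls;
end;
```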

Regards
Greg

Thu, May 12 2011 6:53 AM

Dave Sellers

Roy, Gregory

Thanks for both your suggestions.  

They have me chasing my tail on another matter now; I will get back to
this in the next day or two.

Regards
Dave