Thread: AV in background thread
Mon, Jun 2 2014 4:56 AM

Roy Lambert

NLH Associates

Team Elevate

I'm still trying to identify the cause of an intermittent AV in a thread (and, to save people asking, yes, I've tried MadExcept). My current attempt is to move all the code into the foreground temporarily until I can sort it, and run it from within the IDE (D2006) in the hope that when it crashes it will do so in a manner that will allow me to sort it out.

It's been running like that for 8 or so days now. I've had two exceptions, both of which originate from the following chain:

function TBackgroundEMail.SystemIsolated: boolean;
var
OpLocks: TAppLocks;
begin
OpLocks := TAppLocks.Create(Engine.FindSession(iParams.IsolateSessionName));
Result := OpLocks.IsLocked(lckIsolated);
FreeAndNil(OpLocks);
end;

function TAppLocks.IsLocked(const CheckLock: string): boolean;
begin
if not lckTable.Active then lckTable.Open else lckTable.Refresh;
if lckTable.Locate('_ID', CheckLock, [loCaseInsensitive]) then begin
 Result := ObtainLock(CheckLock);
 if Result then ReleaseLock(CheckLock);
 Result := not Result;
end else Result := False;
end;

function TAppLocks.ObtainLock(const RowToLock: string): boolean;
begin
if not lckTable.Active then lckTable.Open else lckTable.Refresh;
if lckTable.Locate('_ID', RowToLock, [loCaseInsensitive]) then begin
 try
  lckTable.LockCurrentRecord;
  Result := True;
 except
  Result := False;
 end;
end else begin
 lckTable.Insert;
 lckTable.FieldByName('_ID').AsString := RowToLock;
 try
  lckTable.Post;
  lckTable.LockCurrentRecord;
  Result := True;
 except
  Result := False;
 end;
end;
end;

procedure TEDBDataSet.LockCurrentRecord;
begin
  CheckBrowseMode;
  UpdateCursorPos;
  try
     FHandle.LockRow(GetRowObjectFromBuffer(TRecordBuffer(ActiveBuffer)));
  except
     on E: Exception do
        begin
        if (not (E is EEDBAbortException)) then
           raise EEDBError.Create(E)
        else
           SysUtils.Abort;
        end;
  end;
  GetCalcFields(ActiveBuffer);
  DataEvent(deRecordChange,0);
end;

My question is: is there a way the exception raised in TEDBDataSet.LockCurrentRecord could manage to avoid my exception handling and cause an AV?

The bit I'm particularly wondering about is OpLocks := TAppLocks.Create(Engine.FindSession(iParams.IsolateSessionName)); but if that raised, it should be trapped in the calling routine:

procedure TBackgroundEMail.Execute;
var
CrashCount: integer;
wResult: TWaitResult;
begin
CrashCount := 0;
PostMessage(iParams.ReportBack, EMailServiceStarted, 0, 0);
if iParams.Archiving <> '' then begin
 iExtrasTime := StrToTimeDef(iArchiveControl[4], Now);
 iExtrasDate := Date;
end;
while not Terminated do begin
 try
  if not SystemIsolated then begin
   iCrashPoint := 0;
   PostMessage(iParams.ReportBack, EMailServiceStarted, 0, 0);
   SendQueuedEmails;
   PostMessage(iParams.ReportBack, EMailServiceStarted, 0, 0);
   if not SystemIsolated then FetchAnyEmails;
   PostMessage(iParams.ReportBack, EMailServiceStarted, 0, 0);
   if (not SystemIsolated) and (iParams.Archiving <> '') then ArchiveData;
   iCrashPoint := 0;
   if not Terminated then begin
    PostMessage(iParams.ReportBack, EMailServiceInactive, 0, 0);
    if (not Terminated) and Assigned(iSignal) then begin
     wResult := iSignal.WaitFor(iParams.LoopTime);
     if wResult <> wrTimeout then begin
      ReleaseService;
      Terminate;
     end;
    end;
   end;
  end else begin
   ReleaseService;
   if not (Terminated) and Assigned(iSignal) then begin
    wResult := iSignal.WaitFor(iParams.LoopTime);
    if wResult = wrTimeout then begin
     if not SystemIsolated then InitialiseService;
    end else begin
     Terminate;
    end;
   end;
  end;
 except
  inc(CrashCount);
  ReleaseService;
  if CrashCount <= 2 then begin
   InitialiseService;
  end else begin
   PostMessage(iParams.ReportBack, EMailServiceCrash, iCrashPoint, 0);
   Terminate;
  end;
 end;
end;
PostMessage(iParams.ReportBack, bgndClosed, 0, 0);
end;


Roy Lambert
Mon, Jun 2 2014 7:18 AM

Matthew Jones

Roy Lambert wrote:

> I'm still trying to identify the cause of an intermittent  AV in a
> thread (and to save people asking yes I've tried MadExcept). My

What is the purpose of the SystemIsolated? It appears to tell you that
at some moment it was not locked, but the thread could switch away and
then be locked and your code continues regardless. You probably need a
critical section or other semaphore. But perhaps I don't understand the
locking you are using.

Anyway, why didn't madExcept catch it? In my thread Execute function, I
wrap it all in a try/except, and when an except happens I call
         HandleException(etNormal, errInfo);
so that madExcept catches the stack. I have it set to auto-continue.
This should allow you to work it out.

--

Matthew Jones
Mon, Jun 2 2014 8:56 AM

Roy Lambert

NLH Associates

Team Elevate

Matthew

>> I'm still trying to identify the cause of an intermittent AV in a
>> thread (and to save people asking yes I've tried MadExcept). My
>
>What is the purpose of the SystemIsolated? It appears to tell you that
>at some moment it was not locked, but the thread could switch away and
>then be locked and your code continues regardless. You probably need a
>critical section or other semaphore. But perhaps I don't understand the
>locking you are using.

SystemIsolated is using TAppLocks which is
(*
Based on AppUserCount which is based on UserCount / TAccessControl
as developed by Millennium Software, LLC, http://www.1000years.com
*)

Essentially it uses Lock/Unlock CurrentRecord to maintain a set of statuses which are automatically unlocked when the program using them closes or crashes. Two of these status flags are important here. One says the system is doing something that requires exclusive access (SystemIsolated), in which case the email subsystem should simply not try anything. The other is the "EMail is running" flag that's set in the bits that send/fetch emails; if that's set it should stop the exclusive actions from being initiated.

The reason I do it this way (thanks Terry) is that the users don't have to do anything other than restart things in the event of a crash. In the older versions of the program I used to write out small text files to disk and they had to be deleted if things crashed.

>Anyway, why didn't madExcept catch it?

That is an excellent question :) Unfortunately neither I nor Mathias (i.e. the author) can answer that categorically. This is what he told me over on the Embarcadero groups:

-----------------------------------------------------------------------------------------------
> exception message  : Access violation at address 202833D8.
> Read of address 202833D8.

Whenever an AV looks like this (same address twice) it means
that the crashing thread jumped to a code location which is
not allocated. So the big question is: Why did it jump there?
Could have many reasons. Some examples:

1) Maybe you called a function variable which had a random value.
2) Maybe you called a method of an object which was already freed.
3) Maybe the call stack got corrupted.
4) Maybe a DLL was freed which was loaded at 2028xxxx, and now the
main thread tried to execute code from that DLL (e.g. because the
DLL created a window or something), but the DLL was already freed
in the meanwhile.

Unfortunately the crash report doesn't tell us more than the
above, which doesn't really help you too much, I guess...  ;-/
-----------------------------------------------------------------------------------------------

The biggest problem is that it's intermittent (it may not happen for several days) and could be related to specific emails that are received, and I have no idea where the AV is originating. Any test takes days to see if I've finally squashed it - the longest between AVs so far is 10 days.

At one point I had every (and I mean every) function/procedure wrapped in a try..except block, with the except part set to write a message to a text file. Unfortunately, when the AV finally occurred, no text file was written :(

Roy Lambert
Mon, Jun 2 2014 10:11 AM

Matthew Jones

Most odd! Do you have a way to feed dummy data in? I once had a bug
that would occur after about 20 hours, and by accelerating everything
we got it to fail within 10 mins every time. (Turned out to be a bug in
the memory manager of Mono, which fortunately got fixed just before the
team got fired by Novell!). Anyway, I think I'd be looking outside the
box here. Turn on all memory checking in case it is a memory overwrite.
Use the FastMem DLL that does all the trimmings. Watch for warnings etc.

Good luck!

--

Matthew Jones
Mon, Jun 2 2014 12:57 PM

Roy Lambert

NLH Associates

Team Elevate

Matthew

>Most odd! Do you have a way to feed dummy data in? I once had a bug
>that would occur after about 20 hours, and by accelerating everything
>we got it to fail within 10 mins every time. (Turned out to be a bug in
>the memory manager of Mono, which fortunately got fixed just before the
>team got fired by Novell!).

My theory is that if I knew enough to do that I could solve the &^%^%$£ thing :)

>Anyway, I think I'd be looking outside the
>box here. Turn on all memory checking in case it is a memory overwrite.
>Use the FastMem DLL that does all the trimmings. Watch for warnings etc.

It already is. This is even worse than when I had to track down a problem in DBISAM with threads - there I had an idea; it just took ages to build a test case.

>Good luck!

Thank you. I am thinking of booking myself into the nearest looney bin - seems an easier option.

Roy Lambert

Mon, Jun 2 2014 3:54 PM

Raul

Team Elevate

On 6/2/2014 4:56 AM, Roy Lambert wrote:
> I'm still trying to identify the cause of an intermittent  AV in a thread (and to save people asking yes I've tried MadExcept). My current attempt is to move all the code into the foreground temporarily until I can sort it, run from within the IDE (D2006) in the hope that when it crashes it will do so in a manner that will allow me to sort it out.
> Its been running like that for 8 or so days now. I've had two exceptions both of which originate from the following chain

I'd consider adding some entry/exit log entries for each of the
functions and listing state for relevant objects. If this is something
that happens occasionally you'd have a good trace of which function
resulted in the AV and what the various objects/variables were. I've
found this to be a lot more useful than stack traces. Just make it a
compile conditional so you can turn it on easily whenever needed.
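
The entry/exit logging described above might look roughly like the sketch below. It is only an illustration: the TRACE_CALLS symbol, the Trace procedure, the trace.log file name and the stand-in IsLocked body are all assumptions, not code from the application under discussion.

```pascal
program TraceSketch;

{$APPTYPE CONSOLE}
{$DEFINE TRACE_CALLS} // hypothetical symbol; remove this line to compile tracing out

uses
  SysUtils;

// Appends one timestamped line to trace.log (hypothetical file name).
// The file is opened and closed on every call so the line is on disk
// before anything later can crash. Not thread-safe as written; calls
// from several threads would need a lock around the file access.
procedure Trace(const Msg: string);
{$IFDEF TRACE_CALLS}
var
  F: TextFile;
begin
  AssignFile(F, 'trace.log');
  if FileExists('trace.log') then Append(F) else Rewrite(F);
  try
    Writeln(F, FormatDateTime('yyyy-mm-dd hh:nn:ss.zzz', Now), ' ', Msg);
  finally
    CloseFile(F);
  end;
end;
{$ELSE}
begin
  // Tracing disabled: nothing to do.
end;
{$ENDIF}

// Stand-in for a real function; only the logging pattern matters here.
function IsLocked(const CheckLock: string): Boolean;
begin
  Trace('IsLocked enter, CheckLock=' + CheckLock);
  try
    Result := False; // placeholder for the real work
    Trace('IsLocked exit, Result=' + BoolToStr(Result, True));
  except
    on E: Exception do
    begin
      // Record what was raised, then let the caller's handling proceed.
      Trace('IsLocked raised ' + E.ClassName + ': ' + E.Message);
      raise;
    end;
  end;
end;

begin
  IsLocked('Isolated');
end.
```

If the process dies without an exit or raised line for some function, the last enter line in the file points at where it was at the time.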


> procedure TEDBDataSet.LockCurrentRecord;
> My question is: is there a way the exception raised in TEDBDataSet.LockCurrentRecord could manage to avoid my exception handling and cause an AV?

Assuming ObtainLock is the only place you call it, no: it should handle
them all, since both calls there to LockCurrentRecord are wrapped in
try/except.



> The bit I'm particularly wondering about is OpLocks := TAppLocks.Create(Engine.FindSession(iParams.IsolateSessionName)); but if so this should be trapped in the calling routine

You can try trapping the creation but generally you check and make sure
it got created with something like :

OpLocks := TAppLocks.Create(...
if assigned(OpLocks) then
begin
  Result := OpLocks.IsLocked(lckIsolated);
  FreeAndNil(OpLocks);
end;
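
As a related variant, a try..finally around the use of the object would also free it if IsLocked itself raised, since in Delphi a failing constructor raises rather than returning nil. The sketch below is an assumption-laden illustration: TAppLocks here is a hypothetical stand-in with a parameterless constructor, and the 'Isolated' literal stands in for lckIsolated.

```pascal
program LockCheckSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  // Hypothetical stand-in for the TAppLocks class discussed in this thread.
  // The real class takes a session and works against a lock table.
  TAppLocks = class
  public
    function IsLocked(const CheckLock: string): Boolean;
  end;

function TAppLocks.IsLocked(const CheckLock: string): Boolean;
begin
  // Placeholder: the real method locates the row and tries to lock it.
  Result := False;
end;

function SystemIsolated: Boolean;
var
  OpLocks: TAppLocks;
begin
  // If Create fails it raises, so execution never reaches the try block
  // with an unassigned reference.
  OpLocks := TAppLocks.Create;
  try
    Result := OpLocks.IsLocked('Isolated'); // 'Isolated' stands in for lckIsolated
  finally
    OpLocks.Free; // runs whether or not IsLocked raised
  end;
end;

begin
  Writeln('Isolated: ', SystemIsolated);
end.
```

Any exception still propagates to the caller, so a try/except in the calling routine would see it as before.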

Raul
Tue, Jun 3 2014 3:19 AM

Roy Lambert

NLH Associates

Team Elevate

Raul


>I'd consider throwing some entry/exit log entries for each of the
>functions and listing state for relevant objects. If this is something
>that happens occasionally you'd have a good trace of which function
>resulted in the AV and what the various objects/variables were. I've
>found this to be a lot more useful than stack traces. Just make it a
>compile conditional so you can turn it on easily whenever needed.

Ultimately I think I'll have to go down that route. I'm trying to avoid it because of the amount of data and the amount of work. I've been trying to narrow it down a bit. I've decided it's not my text filter or word generator, since I turned off full text indexing and still have the error. It might be any of the libraries used, which will make it a bit difficult :(

Ah well.

Roy Lambert
Tue, Jun 3 2014 9:55 AM

Raul

Team Elevate

On 6/3/2014 3:19 AM, Roy Lambert wrote:
> Ultimately I think I'll have to go down that route. I'm trying to avoid it because of the amount of data and the amount of work. I've been trying to narrow it down a bit. I've decided it's not my text filter or word generator, since I turned off full text indexing and still have the error. It might be any of the libraries used, which will make it a bit difficult :(
>

Roy,

This is pure logging mechanics, but I found it useful to redirect these
kinds of log entries into the Windows debug log using OutputDebugString. My
core logging function has an ifdef to send these entries there.

I like it because it's just those entries instead of the entire app log. When
not running in Delphi (where you'd see these in the event log) I run the
SysInternals DebugView to capture the log
(http://technet.microsoft.com/en-us/sysinternals/bb896647).
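
A core logging routine with that kind of ifdef could be sketched as follows. The LOG_TO_DEBUGGER symbol and the LogLine name are hypothetical, and the Writeln branch merely stands in for whatever the normal application log is; only OutputDebugString and GetCurrentThreadId are real Windows API calls.

```pascal
program DebugLogSketch;

{$APPTYPE CONSOLE}
{$DEFINE LOG_TO_DEBUGGER} // hypothetical symbol; remove to use the normal log path

uses
  Windows, SysUtils;

// Sends one line either to the Windows debug output or to the normal log.
procedure LogLine(const Msg: string);
begin
{$IFDEF LOG_TO_DEBUGGER}
  // Prefix with the thread id so entries from a background thread can be
  // told apart from main-thread entries in the captured output.
  OutputDebugString(PChar('[' + IntToStr(GetCurrentThreadId) + '] ' + Msg));
{$ELSE}
  Writeln(Msg); // stand-in for the application's own log
{$ENDIF}
end;

begin
  LogLine('SystemIsolated enter');
  LogLine('SystemIsolated exit');
end.
```

With the symbol defined, the lines show up in the IDE event log while debugging, or in DebugView when the program runs on its own.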

Raul