Opened 17 years ago

Closed 14 years ago

#170 closed defect (wontfix)

Windows structured exceptions produce RuntimeError with some MSVC versions

Reported by: Sidnei da Silva
Owned by: Olly Betts
Priority: normal
Milestone:
Component: Xapian-bindings
Version: SVN trunk
Severity: normal
Keywords:
Cc: Richard Boulton, Mark Hammond
Blocked By:
Blocking:
Operating System: Microsoft Windows

Description (last modified by Olly Betts)

We are having an issue on a testing machine. We are running stress tests on it, and Xapian eventually raises a RuntimeError, "unknown Xapian error".

This is on Windows, using the Xapian Python bindings.

Richard mentioned that exception.i has a catch(...) that catches all the unknown exceptions.

He also mentioned that this might have something to do with Windows Structured Exceptions, and that Mark had investigated this previously so he thought it had been fixed.

Change History (13)

comment:1 by Richard Boulton, 17 years ago

Cc: richard@… added
Owner: changed from Richard Boulton to Charlie Hull

Assigning to Charlie, since he knows much more about Windows than I do (and adding myself to CC).

comment:2 by Richard Boulton, 17 years ago

I believe the problem is that some as-yet-unknown Windows-specific error is occurring in the Xapian code. This error is not a subclass of Xapian::Error (ie, not a Xapian error) and is also not a subclass of std::exception (ie, not a standard C++ exception). As far as I know, this leaves only one possible class of errors: those raised by the Windows-specific API calls.

As I understand it, these errors are not C++ errors, but Windows SEH (structured exception handling) errors.

Therefore, there is no type which can be specified in a catch clause which will catch the error. However, Windows implements C++ errors using the SEH system, which has the side effect that a "catch (...)" statement will catch any unhandled SEH exception as well as any unhandled C++ exception. The downside is that there is no way to get useful information about the error.

Xapian's Python bindings have a "catch(...)" clause to avoid crashing if an unknown error occurs, and in this situation a Python exceptions.RuntimeError("unknown error in Xapian") exception is generated. (By contrast, a xapian.RuntimeError() would indicate an unhandled Xapian error, and hence a bug in the exception handling code, which had failed to catch the exact class of Xapian error.)
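To make the ordering of handlers described above concrete, here is a minimal sketch. It is not the actual code from exception.i: FakeXapianError and classify() are hypothetical names invented for illustration, and the returned strings merely mimic the messages quoted in this ticket.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a Xapian error class (not the real Xapian::Error).
struct FakeXapianError {
    std::string msg;
};

// Run fn and report which handler caught whatever it threw.
// Handlers are tried in order, so the most specific types come first.
template <typename Fn>
std::string classify(Fn fn) {
    try {
        fn();
        return "ok";
    } catch (const FakeXapianError& e) {
        return "xapian: " + e.msg;
    } catch (const std::exception& e) {
        return std::string("std: ") + e.what();
    } catch (...) {
        // No type information is available here, so the message is generic.
        return "unknown error in Xapian";
    }
}
```

Anything that is neither of the first two types, such as `throw 42;`, falls through to the last handler and yields only the generic message, which is the symptom reported here.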

Sidnei confirmed on IRC that the error is an exceptions.RuntimeError.

comment:3 by Richard Boulton, 17 years ago

Looking at articles, it seems that to catch the error in a way which allows us to see its contents, we first need to install a structured error handler which converts the Windows exception to a C++ exception (ie, one with a type which can be caught), and then modify the list of catches in the Python bindings to catch this. Then, we can display the contents of the error.
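A rough sketch of that approach, assuming MSVC's `_set_se_translator` from `<eh.h>`. Microsoft documents that the translator only takes effect when building with /EHa (rather than the -EHsc mentioned elsewhere in this ticket) and that it is installed per thread. SehError, translate_seh and install_seh_translator are hypothetical names, not part of Xapian; on other compilers the install function is a no-op.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

#ifdef _MSC_VER
#include <eh.h>  // _set_se_translator
#endif

// Hypothetical C++ exception type carrying a structured exception code.
class SehError : public std::runtime_error {
  public:
    explicit SehError(unsigned int code)
        : std::runtime_error(describe(code)), code_(code) {}

    unsigned int code() const { return code_; }

    // Build a readable message from the raw exception code.
    static std::string describe(unsigned int code) {
        std::ostringstream os;
        os << "structured exception 0x" << std::hex << code;
        // 0xC0000005 is EXCEPTION_ACCESS_VIOLATION in the Windows headers.
        if (code == 0xC0000005u) os << " (access violation)";
        return os.str();
    }

  private:
    unsigned int code_;
};

#ifdef _MSC_VER
// Invoked by the runtime for a structured exception; rethrows it as a
// typed C++ exception so that a catch clause can name it.
static void translate_seh(unsigned int code, struct _EXCEPTION_POINTERS*) {
    throw SehError(code);
}
#endif

// Install the translator for the calling thread.
// Returns false on compilers where no translator is available.
inline bool install_seh_translator() {
#ifdef _MSC_VER
    _set_se_translator(translate_seh);
    return true;
#else
    return false;
#endif
}
```

With something like this in place, a `catch (const SehError& e)` placed before the `catch (...)` could report `e.what()` instead of the generic message.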

Once we've done this, we should be able to see what's going wrong, and then have a go at actually fixing the error.

However, I don't have the skills necessary to do this: Mark / Charlie - do you follow my explanation enough to go from here? (Or see that it's wrong in some way?)

comment:4 by Charlie Hull, 17 years ago

I think I understand the problem. One thing to check would be that we're using the correct compiler flags - we use -EHsc : http://msdn2.microsoft.com/en-us/library/1deeycx5(VS.80).aspx

I *think* this is right.

If so, the next step is to find all the places we call Windows API functions (I'd guess there's a lot in Mark's remotedb stuff and a few in the Flint/Quartz database code) and wrap them with something that catches the exceptions we're looking for. I can do the latter but I think Mark is best placed to do the former?

comment:5 by Mark Hammond, 17 years ago

I think that making the change will allow us to see the underlying error is an "access violation" - which is useful, but probably not as useful as it sounds - the end result will probably be simply replacing "unknown error" with "access violation", but will not actually get us any closer to diagnosis of the issue.

comment:6 by Richard Boulton, 17 years ago

If we can get an SEH handler working, can we also get it to add a backtrace to the exception message? That _would_ get us closer to working out what happened.

There seem to be several snippets of code available on the web which generate backtraces for SEH errors, but I don't know enough to know if any of them really work, or which is appropriate.

comment:7 by Richard Boulton, 17 years ago

Summary: changed from "RuntimeError maskes underlying exception" to "RuntimeError masks underlying exception"

comment:8 by Mark Hammond, 17 years ago

I've determined experimentally that, given an "access violation", VS2003 and VS2005 behave differently with respect to the 'catch(...)' clause. When building with 2005, a failing Python test will cause a 'hard' crash (ie, create a dump file or offer to start the debugger), whereas identical code built with 2003 will cause a "RuntimeError: unknown error in Xapian" Python exception. I can't find the 2003 behaviour documented; the 2005 behaviour matches the documentation (ie, explicit code is required to turn an SEH exception into a C++ exception).

Note that it might not end up being the compiler version per se - Python is built with 2003, so this behaviour may be explained by a mismatch in the versions, rather than the version itself.

I'm still trying to work out the best way to get diagnostic info if this happens in the field. codeguru.com has some LGPL code that may be useful, but IMO, if we can convince the default Windows handler (ie, DrWatson) to give us a stack trace, we are better off...

comment:9 by Mark Hammond, 17 years ago

Just for posterity: The following test.cpp:

    #include <cstring>
    #include <iostream>
    using namespace std;

    int main() {
        try {
            // Write through a null pointer to provoke an access violation.
            strcpy(0, "boom");
        } catch (...) {
            cout << "in exception handler" << endl;
        }
    }

On VC6 (12.00.8804) and 7 (13.10.6030), 'in exception handler' will be printed.

On VC8 (14.00.50727.762) a crash dialog is presented - ie, the null pointer write is not caught by that handler. All were compiled with /EHsc as the only option.

If nothing else, this means the behaviour of Xapian will depend on the compiler, which isn't ideal. My recommendation would be to remove the (...) handler, so that (a) all compilers behave the same and (b) diagnostic info is available when it does happen.

comment:10 by Olly Betts, 17 years ago

Owner: changed from Charlie Hull to Olly Betts
rep_platform: changed from Macintosh to PC
Summary: changed from "RuntimeError masks underlying exception" to "Windows structured exceptions produce RuntimeError with some MSVC versions"

Stealing this bug, and updating the summary...

It seems a little odd to be comparing the behaviour of code which invokes undefined behaviour! I guess MSVC must define the behaviour for cases like this, as it's permitted to do.

I'm not really convinced that removing the default catch is a good fix. Perhaps that would actually be better, but I feel that's something to consider on its own merits - changing behaviour on all platforms to bring consistency between MSVC versions seems wrong.

comment:11 by Mark Hammond, 17 years ago

Note that the intention of removing the handler is not simply to make things consistent among MSVC versions, but to offer a way of debugging any crashes that do occur. We can continue to create a hacked build should that be necessary in the future, but the fact we need to make a custom build to debug crashes seems like a fairly fundamental problem - obviously that is a matter of opinion though...

comment:12 by Olly Betts, 17 years ago

Operating System: Microsoft Windows
Status: changed from new to assigned

Removing the default catch is really just a workaround though, and it changes behaviour for all platforms. It seems to me the ideal fix would be to get older MSVC versions to treat SEH exceptions as newer versions do (though perhaps that's simply not possible).

Incidentally, I've found verification that the behaviour change you've observed is intentional:

http://blogs.msdn.com/dcook/archive/2007/03/28/exceptional-wisdom.aspx

comment:14 by Olly Betts, 14 years ago

Description: modified (diff)
Resolution: wontfix
Status: changed from assigned to closed

MSVC 8.0 and newer don't catch SEH exceptions in catch (...). Since the key issue here was that Xapian catching SEH exceptions makes debugging them harder, this ticket is only relevant for compiler versions prior to MSVC 8.0, which was released in November 2005 (over 4 years ago).

And it seems that Microsoft no longer support the C++ part of VS 2003, which was the version before 8.0:

http://msdn.microsoft.com/en-us/vstudio/aa948854.aspx

So it seems time has made this issue less relevant. Our informal policy is that we don't go out of our way to support platforms and compilers which aren't supported "upstream", and since nobody is working on this issue, I'm closing it as "wontfix".
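For illustration only, the compiler-version boundary described in this comment could be expressed at compile time roughly as follows. This is a sketch, not something Xapian does: the macro and function names are hypothetical, and it relies on `_MSC_VER` being 1400 for MSVC 8.0 (1310 for VS 2003, 1200 for VC6).

```cpp
#include <string>

// MSVC 8.0 reports _MSC_VER == 1400. Per this ticket, older versions let
// catch (...) see structured exceptions even when built with /EHsc.
#if defined(_MSC_VER) && _MSC_VER < 1400
#define CATCH_ALL_MAY_SEE_SEH 1
#else
#define CATCH_ALL_MAY_SEE_SEH 0
#endif

// True if a catch (...) might have swallowed a structured exception.
inline bool catch_all_may_see_seh() { return CATCH_ALL_MAY_SEE_SEH != 0; }

// Message for the catch-all handler, hinting at SEH only where relevant.
inline std::string unknown_error_message() {
    std::string msg = "unknown error in Xapian";
    if (catch_all_may_see_seh())
        msg += " (possibly a Windows structured exception)";
    return msg;
}
```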
