Opened 17 years ago

Closed 17 years ago

Last modified 16 years ago

#137 closed defect (released)

python bindings do not release the GIL

Reported by: Mark Hammond Owned by: Olly Betts
Priority: normal Milestone: 1.0.0
Component: Xapian-bindings Version: SVN trunk
Severity: normal Keywords:
Cc: Olly Betts, Richard Boulton, Charlie Hull, Sidnei da Silva Blocked By:
Blocking: #118 Operating System: All

Description (last modified by Richard Boulton)

Python's "global interpreter lock" (GIL) should be released each time any "system" operation is performed. Failure to do so will mean that other Python threads are not able to run while the operation takes place. This is particularly a problem when using a remote database, but should be done for all calls into xapian which may cause IO of any kind.

The easiest place to enable this is in the %exception definition, although it means the GIL is released for *all* calls into xapian - which should be fine. I'm attaching a patch.

Attachments (3)

bug-137.patch (726 bytes ) - added by Mark Hammond 17 years ago.
patch (but it turns out generate-python-exceptions.in is changed, not except.i)
patch8.patch (752 bytes ) - added by Charlie Hull 17 years ago.
Fixup bindings build file for Windows to add -threads
pythreads.patch (1.6 KB ) - added by Olly Betts 17 years ago.
Patch xapian_wrap.cc to use pythreads

Download all attachments as: .zip

Change History (34)

by Mark Hammond, 17 years ago

Attachment: bug-137.patch added

patch (but it turns out generate-python-exceptions.in is changed, not except.i)

comment:1 by Richard Boulton, 17 years ago

Owner: changed from Olly Betts to Richard Boulton

comment:2 by Richard Boulton, 17 years ago

I'll take a look at the change in the code which this patch generates, but I'm not convinced yet that applying this patch is the right approach: many calls to xapian (eg, RSet::add_document()) are very short, and I'm not sure that the cost of releasing the thread locking for them is worthwhile. It certainly should be dropped for potentially long-running calls though. Just applying it for Database::flush(), Database::add_document() and Enquire::get_mset() would probably cover 90% of the cases it's needed for.

Actually, I'm a bit surprised, because I thought I'd checked that the thread lock was dropped by this kind of call, and I thought it was. Looking at the generated code, though, I can see that it isn't being.

comment:3 by Richard Boulton, 17 years ago

Hmm - this patch causes smoketest to segfault on my machine. Will investigate further.

comment:4 by Richard Boulton, 17 years ago

Ah - it looks like a problem with match deciders written in Python: this makes sense, because the lock won't be being re-acquired before the Python callback is called. It's slightly odd that this is causing a problem in a single-threaded testcase, but I've been able to get a backtrace yet so that I can see exactly where the problem is happening.

comment:5 by Richard Boulton, 17 years ago

Blocking: 120 added

This should be fixed in the 1.0 series, but doesn't need to block 1.0 (since it doesn't change API or ABI).

comment:6 by Richard Boulton, 17 years ago

Okay, what it looks like is needed is for the match deciders (and any other callbacks) to be passed the saved PyThreadState pointer somehow, and for them to call PyEval_RestoreThread(save) before calling back to Python, and to set the saved pointer back to the return value of PyEval_SaveThread() after the call back to Python returns.

The icky bit is that I don't see exactly how we can pass the saved threadstate pointer around with the swig director stuff.

comment:7 by Richard Boulton, 17 years ago

Cc: charlie@… added

Ah - it looks like SWIG now has support for all this automatically, by adding the "-threads" option to the SWIG command. I've done this for the unix build system and it seems to work here; hopefully this will resolve the issue.

I'm CCing Charlie because I'd like him to fix up the windows makefiles and check that the python testsuite still passes for him. Charlie: can you mark this bug as resolved once that's done?

Mark: I'd be grateful if you could cast an eye over the differences in the generated code with this option on, as opposed to with it turned off. You know more about python threading than most people (half the pages I turned up when checking on this were emails written by you...), so having your opinion on the code generated by swig would be good. In particular, I'm most concerned about the code generated for directors: eg, the "_wrap_MatchDecider_call" method.

It looks right to me, though.

comment:8 by Charlie Hull, 17 years ago

As SWIG now requires various Win32 thread stuff, we need to put in a #include "windows.h" somewhere in the generated .cc file (or even safewindows.h) - I don't know how to make this happen, but I can continue testing once someone has done so.

by Charlie Hull, 17 years ago

Attachment: patch8.patch added

Fixup bindings build file for Windows to add -threads

comment:9 by Richard Boulton, 17 years ago

Cc: olly@… added

According to Charlie, compilation passes if he inserts #include "windows.h" near the top of his xapian_wrap.cc file.

I can't see how to persuade swig to do this. Perhaps it should be added by default in the bit of swig-generated code which defines a "simple thread abstraction for pthreads on win32" (as described by a code comment says).

I can hack up a patch for Swig if that's the appropriate solution. Olly - does this make sense to you?

comment:10 by Richard Boulton, 17 years ago

attachments.isobsolete: 01

(From update of attachment 69) Applied (though it actually did something else because the bit which added -threads was already applied)

comment:11 by Olly Betts, 17 years ago

I'd say it's a SWIG bug - SWIG shouldn't automatically insert code which requires the user to pull in particular headers. I suspect this works OK for C (prototypes just get invented by the compiler and are probably good enough for the code to work) which is why this hasn't been noticed before.

The problem I see is that windows.h is rather namespace polluting, so this should probably be done after ensuring that NOMINMAX and WIN32_LEAN_AND_MEAN are defined (like safewindows.h does) to try to minimise clashes with code users are trying to wrap.

Including winbase.h directly would be better if that's a kosher thing to do.

Or perhaps it would actually be a lesser evil to just provide the 2 function prototypes and one typedef which actually seem to be used if that's feasible? They're in the mingw headers, which are public domain.

Does this 'swig -threads' change mean we're going to need to probe for pthread stuff on UNIX, or does Python already pull in the required libraries?

comment:12 by Richard Boulton, 17 years ago

I suspect that providing the function prototypes would be the neatest way to go; I don't imagine they're subject to change.

I wouldn't have expected the -threads change to mean we're going to need to probe for pthread stuff on UNIX; it should be able to use Python's own threading interfaces (which are stubbed if Python is built without threading). But the SWIG implementation seems to use pthread stuff directly for protecting access to Director classes, so I suppose we do need to probe for pthread stuff: at least, since we're doing "#include <pthread.h>", we need to check that that file is present on the system. I shouldn't expect we need to link with libpthread, though.

I think that's arguably another bug in SWIG, really: I don't see why it can't use Python's built-in threading, as exposed in "pythread.h". Perhaps pythread.h is not present in some old versions of python which SWIG is aiming to support?

comment:13 by Olly Betts, 17 years ago

You could look at the swig SVN history to see how the code came about. Or you could try asking on swig-dev, though Marcelo was the most active on python and seems to have stopped working on SWIG. Other people do know about python though, and SVN would reveal who added the code.

If the pythread stuff is new, it wouldn't be hard to use it only for newer python versions I suspect, as it's hidden behind macros already. We can easily test python 2.2 (ixion has it) and for Xapian's purposes a patch for SWIG using pythread.h conditional on python >= 2.2 (with a comment noting that we haven't access to test with older versions) would be OK I think.

Actually, I wouldn't be suprised to find SWIG didn't work out of the box with Python 2.1 - when Xapian was still supporting it, I had to send in patches to fix support at least twice.

comment:14 by Mark Hammond, 17 years ago

hrm - this appears painful. By enabling swig threading, there seems to be 2 side-effects:

1) A locking mechanism is used to protect access to some variables etc - apparently only for a "director", which apparently is used for callbacks into Python from C++.

2) It enables the management of the lock for Python calling C++.

It appears the code for (1) is what is causing our problem. It also appears that (2) will only work for Python 2.3 and later - it uses a new "GILState" API introduced in 2.3. It probably is possible to avoid passing '-thread' to SWIG, and manually managing the lock - my last patch did only (2), but adding (1) need not be that hard, especially if we are happy to live with the same "2.3 or later only" limitation SWIG itself has.

FWIW, a #include of windows.h after a #define of WIN32_LEAN_AND_MEAN gets things building again. An appropriate place to add this patch appears to be director.swg in the Lib/python directory in swig.

comment:15 by Olly Betts, 17 years ago

Python 2.2 is the cut-off for the minimum Python version we support because it's the oldest version with the features we currently need.

If there are justifications for requiring at least Python 2.3, that's not necessarily a problem. You probably have more idea that I do what Python versions are in common use, but the last Python 2.2 release seems to be 2.2.3 on May 30, 2003, so it looks like upstream support ended nearly 4 years ago.

Also Debian etch has Python 2.3 as the minimum version.

Although when I suggested we drop 2.1 support 6 months ago, Philip Neustrom said "Dropping support for 2.3 or 2.2 would be a mistake right now, however" but didn't elaborate as to why he thought that:

http://thread.gmane.org/gmane.comp.search.xapian.general/3389/focus=3394

2.3.6 was released less than 6 months ago, so I think making the minimum version 2.4 would be much harder to justify.

Python directors are indeed for callbacks - we use them to allow C++ classes to be subclassed in Python, and virtual methods to be routed correctly (e.g. MatchDecider is passed to Xapian and called from C++ during the match, and we want the Python method to be used).

My preference from what I've heard so far would be to use the SWIG feature rather than trying to reinvent a similar wheel which we'd then have to maintain.

We can easily fix problems with the SWIG feature if we have decent patches, as

I have commit access to the SWIG SVN repo!

comment:16 by Olly Betts, 17 years ago

Looking at the code SWIG generates, it seems to just no-op the GIL handling macros for Python < 2.3, so we could just note in the documentation that Python threading won't be so effective when using Xapian with Python 2.2.

comment:17 by Mark Hammond, 17 years ago

Yes - I didn't look that closely at the NOP macros - I just assumed they may fail but I think you are correct.

The following patch to swig seems to do the trick (against swigwin-1.3.31):

--- director.swg.old 2006-09-25 22:02:32.000000000 +1000 +++ director.swg 2007-04-24 13:41:21.719537200 +1000 @@ -343,6 +343,8 @@

#ifdef THREAD # define PTHREAD

# if defined(_WIN32)
defined(WIN32)

+# define WIN32_LEAN_AND_MEAN +# include "windows.h"

# define pthread_mutex_lock EnterCriticalSection # define pthread_mutex_unlock LeaveCriticalSection

comment:18 by Olly Betts, 17 years ago

mark says winbase.h doesn't work, so that's not an option

by Olly Betts, 17 years ago

Attachment: pythreads.patch added

Patch xapian_wrap.cc to use pythreads

comment:19 by Mark Hammond, 17 years ago

yeah, that works for me, and visually looks good too - nice! It would be great if you can paste in the final patch from director.swg when you can.

comment:20 by Olly Betts, 17 years ago

Owner: changed from Richard Boulton to Olly Betts

Richard suggested this approach (so presumably is happy with it), you seem to like it, and it looks good to me, so let's go with it.

Turns out that the same patch applies successfully to director.swg! Just apply with:

patch Lib/python/director.swg < tmp.patch

The swig testsuite for python isn't happy, but it doesn't immediately look related so perhaps it was broken already. I'm in the middle of working on something else so I can't dig deeper right now. But if you can try the patch against director.swg when you get a chance that would be useful.

Re-assigning to myself, since getting this sorted and applied to SWIG is probably mostly down to me now...

comment:21 by Olly Betts, 17 years ago

Status: newassigned

comment:22 by Olly Betts, 17 years ago

OK, simply cleaning out my swig build tree and rebuilding fixed the issue (I think it was due to mismatching python versions), and the SWIG testsuite results are unchanged by the patch.

comment:23 by Mark Hammond, 17 years ago

patch applies to director.swg here and everything appears to build and work fine.

comment:24 by Olly Betts, 17 years ago

Blocking: 118 added; 120 removed

I've put a patch in the SWIG patch tracker for comments:

http://sourceforge.net/tracker/index.php?func=detail&aid=1710341&group_id=1645&atid=301645

If none of the SWIG developers who regularly work on SWIG/Python have commented negatively by the time we're almost ready to roll 1.0.0, I'll make sure it's committed and then use a SWIG snapshot with it in for the Xapian release. So I'm marking this bug as a 1.0.0 blocker so I don't forget!

comment:25 by Richard Boulton, 17 years ago

I've been testing with this patch for the last week or so, it all seems to work fine here.

I wonder if we should host a snapshot of SWIG somewhere on the Xapian website for developers to download, when we make the 1.0.0 release. (Assuming that there isn't a new SWIG release with all the fixes we need by that point.) Requiring developers to install certain packages is one thing, but when we start requiring them to check out dependencies from SVN I think we're putting up too high a barrier to contributors.

Any idea when a SWIG 1.3.32 release is likely to happen?

comment:26 by Olly Betts, 17 years ago

Sorry, no idea about SWIG releases.

I've no real problems with hosting a SWIG tarball, though SWIG doesn't support "make dist" so it would require special effort to work out how to do it.

Not sure I agree about having to check out SWIG from SVN to work on the bindings being a problem. It's marginally harder than downloading a tarball, but we're talking about people who already have to check out Xapian from SVN. And using a SWIG SVN tree means they can just use "svn up" each time they need a newer SWIG for Xapian, rather than having to download a whole new tarball.

Hmm, what we could do is add an svn:externals property to Xapian SVN which pulls in the currently recommended SWIG revision automatically. Then "svn up" on the Xapian tree would just pull that in, and the requirement would be nicely versioned along with xapian-bindings code! But let's look at this AFTER 1.0.0.

comment:27 by Richard Boulton, 17 years ago

Cc: richard@… added

I like the idea about using SVN:externals - I've made a separate bug (#147) to track this idea, and targetted it for the 1.0 release series.

comment:28 by Mark Hammond, 17 years ago

Assuming we expect a new version of SWIG in a 'reasonable' timeframe, I'd recommend just letting this issue sit for a while and see what happens. IIUC, we expect most people to use a "bootstrapped" version anyway, and the extra complexity and support costs this nicety would add probably is better spent elsewhere. If someone pops up on xapian-dev with this problem, we can deal with it on a case by case basis.

FWIW, I'm currently "stuck" on SWIG with this message:

|checking for c:\utils\swigwin-1.3.31\swig.exe... |c:\utils\swigwin-1.3.31\swig.exe -fakeversion 1.3.32 |configure: error: SWIG >= 1.3.32 (SVN rev 9651 or later) required (you have ) |configure failed

Note that '-fakeversion 1.3.32 -version' does work when I execute it manually, and note the blank version it says I have. It's possible is failing to read the version from stdout, resulting in an empty $v in the script. That is as much as I currently know.

comment:29 by Olly Betts, 17 years ago

Resolution: fixed
Status: assignedclosed

I've committed the patch to SWIG SVN and xapian snapshots are now being generated with a new version of SWIG. Releases are built by the same script, so this bug can now be closed.

comment:30 by Olly Betts, 17 years ago

Operating System: All
Resolution: fixedreleased

Fixed in 1.0.0 release.

comment:32 by Richard Boulton, 16 years ago

Description: modified (diff)
Milestone: 1.0.0
Note: See TracTickets for help on using tickets.