Opened 11 years ago

Closed 11 years ago

#341 closed defect (fixed)

Python bindings should keep a reference to user-subclassable proxies when registered with consumers/users

Reported by: James Aylett Owned by: Richard Boulton
Priority: normal Milestone: 1.0.11
Component: Xapian-bindings Version: SVN trunk
Severity: normal Keywords:
Cc: Blocked By:
Blocking: Operating System: All

Description (last modified by Olly Betts)

Due to #186 and the way the bindings work, if (for instance) you create a ValueRangeProcessor proxy object and register it with the QueryParser proxy object, then throw away your reference to the ValueRangeProcessor proxy, python is left with no references to it, and so will garbage collect it. This then causes a segfault when the underlying Xapian QueryParser object tries to go ahead and use the ValueRangeProcessor object.

A proposed interim solution until this can be solved more cleanly is to stash a reference to the VRP proxy on the QueryParser proxy. Something like:

real_add_valuerangeprocessor = QueryParser.add_valuerangeprocessor            
def intercept_add_valuerangeprocessor(self, vrproc):                          
    self._vrps.append(vrproc)                                                 
    real_add_valuerangeprocessor(self, vrproc)                                
QueryParser._vrps = []                                                        
QueryParser.add_valuerangeprocessor = intercept_add_valuerangeprocessor       

in python/extra.i should do it. Note that this is only one situation where we want to be able to make things easier for python users; this should be considered in every situation where something is user-subclassable, and then documented very clearly.

Change History (12)

comment:1 by Richard Boulton, 11 years ago

Milestone: 1.1.1
Owner: changed from Olly Betts to Richard Boulton
Status: newassigned

Marking for 1.1.1 - it would be useful to do this soon, if we're going to bother, but it doesn't need to block 1.1.0, which is actually getting quite plausibly close now!

(Once fixed, it's probably a reasonable candidate for backporting to 1.0.x, too.)

comment:2 by Richard Boulton, 11 years ago

Oh - also, note that this isn't needed with PostingSources which implement clone() now - this is handled by calling clone() in query construction, to obtain a PostingSource which is owned by the query.

comment:3 by Olly Betts, 11 years ago

Description: modified (diff)

If this works, I think we should bother - moving to boost::shared_ptr is probably 2.0 territory and this repeatedly trips up users.

James: Did you check this works and fixes the problem?

comment:4 by James Aylett, 11 years ago

I hadn't tested it, but I have now and in doing so I've noticed a similar hack being used for TermFreqSource in DocSim (this is from a Richard branch, I assume; it's not in trunk). I've conformed my code to match that (which was much better style anyway):

# When we set a ValueRangeProcessor into the QueryParser, keep a python
# reference so it won't be deleted. This hack can probably be removed once
# xapian bug #186 is fixed.
__queryparser_add_valuerangeprocessor_orig = QueryParser.add_valuerangeprocessor
def _queryparser_add_valuerangeprocessor(self, vrproc):
    self._vrps.append(vrproc)
    return __queryparser_add_valuerangeprocessor_orig(self, vrproc)
_queryparser_add_valuerangeprocessor.__doc__ = __queryparser_add_valuerangeprocessor_orig.__doc__
QueryParser._vrps = []
QueryParser.add_valuerangeprocessor = _queryparser_add_valuerangeprocessor
del _queryparser_add_valuerangeprocessor

comment:5 by Richard Boulton, 11 years ago

That patch uses a shared _vrps for all QueryParser objects. We really want it to be a per-QueryParser object. I put together a patch which does this, and a regression test, and applied it to trunk in r12073. This can sensibly be backported to 1.0.x, I think.

comment:6 by Richard Boulton, 11 years ago

I can't think of any other places in the API where it would be useful to do this, other than for PostingSources being passed into queries. that's a much more complex job to fix, though doable, because queries can be combined with each other - you need to modify all the query constructors to also copy the property across from all the children, and I'm not sure you can do this without making a big Query proxy class - this would be quite invasive, and inappropriate to commit at the present time.

It's not such an issue for PostingSources anyway, because all the built-in ones support clone. It's not possible to implement clone() for ones implemented in python, though, so it's still worth fixing if we can.

comment:7 by Olly Betts, 11 years ago

There are other places:

  • QueryParser::set_stopper()
  • TermGenerator::set_stopper()
  • Enquire::set_sort_by_key() and friends

(those are the ones Search::Xapian currently deals with).

comment:8 by Olly Betts, 11 years ago

Actually, for Query it would probably be better for each Python Query object to just keep a reference to a list of the Query objects used to build it.

comment:9 by Olly Betts, 11 years ago

Milestone: 1.1.11.0.11

PostingSource isn't in 1.0.x, and the others should be easy to fix (we only need to track the last set object in each case), so I think we should try to do this for 1.0.11.

comment:10 by Richard Boulton, 11 years ago

Fixed in trunk (other than for PostingSource) in r12138. This patch is good for backport to 1.0.x, but also includes some deprecations warnings, which shouldn't be backported to 1.0.x since the deprecation is only marked as starting at 1.1.0

comment:11 by Richard Boulton, 11 years ago

As of r12143, trunk also keeps track of posting sources, so I declare this ticket fixed, other than the backport to 1.0.11.

comment:12 by Olly Betts, 11 years ago

Resolution: fixed
Status: assignedclosed

Backported in for 1.0.11 in r12146.

Note: See TracTickets for help on using tickets.