Opened 11 years ago

Closed 11 years ago

#341 closed defect (fixed)

Python bindings should keep a reference to user-subclassable proxies when registered with consumers/users

Reported by: james Owned by: richard
Priority: normal Milestone: 1.0.11
Component: Xapian-bindings Version: SVN trunk
Severity: normal Keywords:
Cc: Blocked By:
Blocking: Operating System: All

Description (last modified by olly)

Due to #186 and the way the bindings work, if (for instance) you create a ValueRangeProcessor proxy object and register it with the QueryParser proxy object, then throw away your reference to the ValueRangeProcessor proxy, python is left with no references to it, and so will garbage collect it. This then causes a segfault when the underlying Xapian QueryParser object tries to go ahead and use the ValueRangeProcessor object.

A proposed interim solution until this can be solved more cleanly is to stash a reference to the VRP proxy on the QueryParser proxy. Something like:

real_add_valuerangeprocessor = QueryParser.add_valuerangeprocessor            
def intercept_add_valuerangeprocessor(self, vrproc):                          
    real_add_valuerangeprocessor(self, vrproc)                                
QueryParser._vrps = []                                                        
QueryParser.add_valuerangeprocessor = intercept_add_valuerangeprocessor       

in python/extra.i should do it. Note that this is only one situation where we want to be able to make things easier for python users; this should be considered in every situation where something is user-subclassable, and then documented very clearly.

Change History (12)

comment:1 Changed 11 years ago by richard

  • Milestone set to 1.1.1
  • Owner changed from olly to richard
  • Status changed from new to assigned

Marking for 1.1.1 - it would be useful to do this soon, if we're going to bother, but it doesn't need to block 1.1.0, which is actually getting quite plausibly close now!

(Once fixed, it's probably a reasonable candidate for backporting to 1.0.x, too.)

comment:2 Changed 11 years ago by richard

Oh - also, note that this isn't needed with PostingSources? which implement clone() now - this is handled by calling clone() in query construction, to obtain a PostingSource? which is owned by the query.

comment:3 Changed 11 years ago by olly

  • Description modified (diff)

If this works, I think we should bother - moving to boost::shared_ptr is probably 2.0 territory and this repeatedly trips up users.

James: Did you check this works and fixes the problem?

comment:4 Changed 11 years ago by james

I hadn't tested it, but I have now and in doing so I've noticed a similar hack being used for TermFreqSource? in DocSim? (this is from a Richard branch, I assume; it's not in trunk). I've conformed my code to match that (which was much better style anyway):

# When we set a ValueRangeProcessor into the QueryParser, keep a python
# reference so it won't be deleted. This hack can probably be removed once
# xapian bug #186 is fixed.
__queryparser_add_valuerangeprocessor_orig = QueryParser.add_valuerangeprocessor
def _queryparser_add_valuerangeprocessor(self, vrproc):
    return __queryparser_add_valuerangeprocessor_orig(self, vrproc)
_queryparser_add_valuerangeprocessor.__doc__ = __queryparser_add_valuerangeprocessor_orig.__doc__
QueryParser._vrps = []
QueryParser.add_valuerangeprocessor = _queryparser_add_valuerangeprocessor
del _queryparser_add_valuerangeprocessor

comment:5 Changed 11 years ago by richard

That patch uses a shared _vrps for all QueryParser? objects. We really want it to be a per-QueryParser? object. I put together a patch which does this, and a regression test, and applied it to trunk in r12073. This can sensibly be backported to 1.0.x, I think.

comment:6 Changed 11 years ago by richard

I can't think of any other places in the API where it would be useful to do this, other than for PostingSources? being passed into queries. that's a much more complex job to fix, though doable, because queries can be combined with each other - you need to modify all the query constructors to also copy the property across from all the children, and I'm not sure you can do this without making a big Query proxy class - this would be quite invasive, and inappropriate to commit at the present time.

It's not such an issue for PostingSources? anyway, because all the built-in ones support clone. It's not possible to implement clone() for ones implemented in python, though, so it's still worth fixing if we can.

comment:7 Changed 11 years ago by olly

There are other places:

  • QueryParser::set_stopper()
  • TermGenerator::set_stopper()
  • Enquire::set_sort_by_key() and friends

(those are the ones Search::Xapian currently deals with).

comment:8 Changed 11 years ago by olly

Actually, for Query it would probably be better for each Python Query object to just keep a reference to a list of the Query objects used to build it.

comment:9 Changed 11 years ago by olly

  • Milestone changed from 1.1.1 to 1.0.11

PostingSource isn't in 1.0.x, and the others should be easy to fix (we only need to track the last set object in each case), so I think we should try to do this for 1.0.11.

comment:10 Changed 11 years ago by richard

Fixed in trunk (other than for PostingSource?) in r12138. This patch is good for backport to 1.0.x, but also includes some deprecations warnings, which shouldn't be backported to 1.0.x since the deprecation is only marked as starting at 1.1.0

comment:11 Changed 11 years ago by richard

As of r12143, trunk also keeps track of posting sources, so I declare this ticket fixed, other than the backport to 1.0.11.

comment:12 Changed 11 years ago by olly

  • Resolution set to fixed
  • Status changed from assigned to closed

Backported in for 1.0.11 in r12146.

Note: See TracTickets for help on using tickets.