Opened 17 years ago
Closed 9 years ago
#186 closed defect (fixed)
User subclassable classes should be reference counted
Reported by: | Richard Boulton | Owned by: | Olly Betts |
---|---|---|---|
Priority: | normal | Milestone: | 1.3.5 |
Component: | Library API | Version: | git master |
Severity: | normal | Keywords: | |
Cc: | Blocked By: | ||
Blocking: | Operating System: | All |
Description (last modified by )
MatchDeciders should be subclasses of RefCntBase, so that they can be reference counted. This would allow classes like MultipleMatchDecider to keep a reference count to the match deciders in it, avoiding the risk of the deciders used in it being freed before the match decider is used.
This will involve and API and ABI change, so will have to wait for 1.1.0
Attachments (3)
Change History (42)
comment:1 by , 17 years ago
Blocking: | 160 added |
---|
comment:2 by , 17 years ago
Status: | new → assigned |
---|
comment:3 by , 17 years ago
I don't think this will work (otherwise we'd already be doing it!)
If Xapian::MatchDecider
just subclasses RefCntBase
then it can be reference
counted internally, but not externally. That just won't work. For example, the
user might not allocate the MatchDecider
explicitly:
Xapian::MatchDecider foo; enq.set_matchdecider(&foo);
Or it could be allocated using new[]
.
Even if we restrict the user to using new
, then the external pointer won't be
reference counted. How do we know when the last pointer to the object has gone
when there may be an unknown number of external pointers to it?
So Xapian::MatchDecider
would have to be like most other API classes and contain
a RefCntPtr<...>
member pointing to an internal class. But then the user can't
subclass usefully - sub-classing MatchDecider
would add member variables to the
pointer, not the object.
I agree a way to reference count MatchDecider
and other similar classes would be
nice, but the RefCntBase
approach doesn't seem to work.
comment:4 by , 17 years ago
Blocking: | 182 added |
---|
comment:5 by , 17 years ago
Cc: | added |
---|
comment:6 by , 17 years ago
Hmm - after reading your comments, the only way I can see that it could work
would be to require the user to keep all references to the MatchDecider
object
in the form of RefCntPtr<...>
wrappers. This is probably far too restrictive,
and error-prone. So, unless anyone has any bright ideas, I suspect we'll have
to drop this suggestion (and bug #182, as a result).
comment:7 by , 17 years ago
Owner: | changed from | to
---|---|
Status: | assigned → new |
I don't know how to do this, so I'm not working on it, so assigning it to nobody.
comment:10 by , 17 years ago
attachments.isobsolete: | 0 → 1 |
---|
comment:11 by , 17 years ago
This seems quite a fundamental change, so I'd quite like to have a good look at the patch before we apply it, but I won't be able to do so until next week.
comment:12 by , 17 years ago
I've made a branch ("stemrefcnt") which contains an updated version of the patch. I've also added a "StemWithNoStemList" class on this branch, which is a stemmer with a list of exceptions which aren't stemmed.
I think the stemmer implementation in xapian-core on this branch is suitable for merging to HEAD. However, stemrefcnt/xapian-bindings needs updating accordingly. This isn't trivial, but I think some of the recent features added to swig may be sufficient to allow it to be wrapped.
comment:13 by , 17 years ago
attachments.isobsolete: | 0 → 1 |
---|---|
Operating System: | → All |
comment:16 by , 17 years ago
Description: | modified (diff) |
---|---|
Milestone: | → 1.1 |
comment:17 by , 17 years ago
Blocking: | 160 removed |
---|
comment:19 by , 16 years ago
On reflection, I think the only way to make MatchDeciders, ValueRangeProcessors, and other user-defined subclasses reference counted is to expose the reference counting mechanism fully. Without this, I don't think there's any hope of making the bindings work correctly with the reference counting.
SWIG has support for reference counting (and other smart pointers) with some of its backends - but I'm not convinced it has support for all the backends which we require.
My favoured long-term solution would be to move xapian (at least, the publically exposed parts of xapian) over to using Boost::shared_ptr, and other smart pointers. This would make it easier to use xapian correctly from other Boost using C++ projects, and avoid us having to maintain our own smart pointer implementations, and make it easier to integrate with the SWIG reference counting support.
However, doing this is probably too major a change to be considered for 1.1, so I suggest postponing until a fictional future 2.0 release.
comment:20 by , 16 years ago
Milestone: | 1.1.0 → 2.0.0 |
---|
Yes, I think the boost implementations are probably the way to go, but indeed this is a big incompatible API change, so our release numbering means 2.0.0 is the time for that.
comment:21 by , 16 years ago
Summary: | MatchDeciders should be reference counted → User subclassable classes should be reference counted |
---|
Just to note that the symptoms of the lack of C++ reference counting for the classes where this is currently visible (which doesn't include MatchDecider) is fixed for Python (#341), and other bindings could probably do something similar.
I'm going to retitle actually.
comment:22 by , 15 years ago
Blocking: | 182 removed |
---|
Removing "blocking", since removing reciprocal "blocked-by" from #182 didn't seem to do so automatically.
comment:23 by , 12 years ago
Cc: | added; removed |
---|---|
Description: | modified (diff) |
Milestone: | 2.0.0 → 1.3.x |
Owner: | changed from | to
I've had an idea about this - if we provide ref() and unref() methods, we can make this work for the bindings without having to expose how the ref counts are handled internally, and provide a nice simple API to the mechanism required.
And we can also provide a "release()" method (maybe there's a better name?) which allows C++ users to hand ownership of the object to the class it is set on (see the new testcase in the patch I'm about to attach).
The really nice thing is that if you don't call these methods, things work as they do already - i.e. this is API compatible with existing user code (but not ABI compatible). So it's 1.3.x material.
comment:24 by , 12 years ago
Cc: | removed |
---|
I replaced the patch with one which also removes the current workaround for Perl too.
comment:25 by , 12 years ago
Milestone: | 1.3.x → 1.3.2 |
---|---|
Status: | new → assigned |
There are some issues still with the bindings, shown by the python tests.
Having poked it for a while, I think we need to "disown" (in SWIG terms) the wrapped subclassed objects when we pass them to C++, so that the C++ half holds a reference to the Python (or other language) half, and then sort out a few kinks so everything works smoothly with that done.
Marking this for 1.3.2.
comment:28 by , 10 years ago
So some experimental hacking later, I have a hand-modified version which works for Python and VRPs.
To make this work we need to call vrproc.__disown__()
just after we call the C++ add_valuerangeprocessor()
method, and we need to unref()
the C++ vrp object then too (otherwise both the C++ and Python objects will end up with one reference and neither will get deleted), but we need to only do this if the object hasn't already been disowned.
This is going to need a bit of work to get SWIG to generate code which makes this happen, but at least we now have a plan.
comment:29 by , 10 years ago
[5d10895593a6e277544640092b0e526d26ecb6be] adds optional reference counting of ValueRangeProcessor
objects to the C++ API, and [07d1ffcd10d8bbe6111d2e35a2f99486f0147b08] updates the bindings to ignore the new release()
method and suppress a new warning from SWIG due to the C++ change.
Other subclassable functors also need doing, and then the bindings need modifying to make use of this. I don't see how to do that at all cleanly without extending SWIG to support this, so this part isn't feasible for 1.3.3 (at least not without delaying 1.3.3 unreasonably) so updating the milestone.
comment:30 by , 10 years ago
Milestone: | 1.3.3 → 1.3.4 |
---|
comment:32 by , 9 years ago
Version: | SVN trunk → git master |
---|
I think that we should implement this for the other functor classes for C++ before 1.4.0, as that is easy to do.
However we shouldn't try to make use of this in the bindings, as that's more involved (and seems to need changes to SWIG, or more cunning hackery than I've so far managed to come up with). Several of the bindings (Python, PHP, Perl) already have separate tracking code so there will be no observable behaviour change, while for the other languages, this means that cases which failed before now work, so it's reasonable as a change to make mid-1.4.x - the tracking code for Python and PHPwasn't added at the start of release series after all.
comment:33 by , 9 years ago
Actually, it's only some functor classes that this is useful for - ones which just get passed to a method and used by it don't need it, only cases where the functor is stored by the object for later use.
Classes to do:
ErrorHandler
(? might just get removed/replaced)ExpandDecider
(? lingering references exist viaExpandDeciderAnd
)LatLongMetric
(? is this class just too much generality?)KeyMaker
PostingSource
(? has clone(), but bindings need to handle specially)Stopper
Weight
(? not wrappable for bindings it seems)
comment:35 by , 9 years ago
Weight
isn't relevant here - the weight object passed to Enquire::set_weighting_scheme()
doesn't need to live on after the call (internally that object is cloned, and the clone used further).
So updated list of classes still to deal with:
ErrorHandler
(? might just get removed/replaced)ExpandDecider
(? lingering references exist viaExpandDeciderAnd
)LatLongMetric
(? is this class just too much generality?)KeyMaker
PostingSource
(? has clone(), but bindings need to handle specially)
comment:36 by , 9 years ago
KeyMaker
done in [c90a93c55f5d7f49c8bee47a9b9473a3e46d7c05/git].
Updated list of classes still to deal with:
ErrorHandler
(#3 suggests keeping this class but changing how it is used - adding ref counting now means we can make that change in 1.4.x)ExpandDecider
(? lingering references exist viaExpandDeciderAnd
)LatLongMetric
(? is this class just too much generality?)PostingSource
(? has clone(), but bindings need to handle specially)
comment:37 by , 9 years ago
ErrorHandler
now is optionally referenced counted in [08f9f7fa2684d8171dc954aa6c92866b3946956f/git]. Nothing in the library yet uses this, but it's there for the future.
Updated list of classes still to deal with:
ExpandDecider
(? lingering references exist viaExpandDeciderAnd
)LatLongMetric
(? is this class just too much generality?)PostingSource
(? has clone(), but bindings need to handle specially)
comment:38 by , 9 years ago
ExpandDecider
done in [6a76ceed1546e1f298b23f2a593eb66e7adb8d25/git].
Remaining classes to deal with:
LatLongMetric
(? is this class just too much generality?)PostingSource
(? has clone(), but bindings need to handle specially)
comment:39 by , 9 years ago
LatLongMetric
objects are currently cloned by the objects they are set on, so don't need reference counting as things stand.
I'm not entirely sure this cloning is a good idea - Weight
objects support clone()
, but they have to be cloned for each weighted subquery; PostingSource
objects support clone()
, but they need to be cloned for each subdatabase. It seems LatLongMetric
only supports clone()
as a way to avoid holding a pointer/reference to an external LatLongMetric
object, and making it optionally reference counted is more consistent and probably makes it a little simpler to implement your own metric.
comment:40 by , 9 years ago
OK, so the reason why LatLongMetric
has clone()
is so that LatLongPostingSource::clone()
can in turn clone the metric. Optionally reference counting the metric would provide the same ability with less object copying and better consistency, though this only happens once per sub database per query run, so really isn't a hot spot.
If we change that, it would make sense to change to passing a pointer instead of a reference. The bindings would also need an updating to keep a reference to the currently set metric.
Overall, I'm inclined to leave this as-is for 1.4.0. I think we may want to review the geospatial API at some future date anyway (and I've opened #712 for that), but more real world use would be useful to inform that.
So that leaves just PostingSource
to consider...
comment:41 by , 9 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
PostingSource
done in [0660a9a99bcd09ff137e75e4581729a2beb0a5c2/git]. I've opened #714 to cover the work still needed on the bindings, so this ticket can be closed.
Marking as blocking 1.1.0 so we don't forget.