Opened 17 years ago
Last modified 19 months ago
#199 assigned enhancement
Merge MultiValueCountMatchSpy
Reported by: | Richard Boulton | Owned by: | Richard Boulton |
---|---|---|---|
Priority: | normal | Milestone: | 2.0.0 |
Component: | Library API | Version: | git master |
Severity: | normal | Keywords: | |
Cc: | Blocked By: | #198 | |
Blocking: | Operating System: | All |
Description (last modified by )
matchspy.h on the matchspy branch contains a MultiValueCountMatchSpy implementation, which allows a document to have multiple values in a single slot. This is definitely a useful feature when implemented faceting systems, but the current implementation is a bit unpleasant: it uses a pair of public classes to serialise and unserialise lists of values to be held in the slots.
A nicer approach would be to make values capable of storing multiple values. (See bug #198).
Attachments (4)
Change History (31)
comment:1 by , 17 years ago
Blocking: | 154 added |
---|
comment:2 by , 17 years ago
Blocked By: | 198 added |
---|
comment:3 by , 17 years ago
Status: | new → assigned |
---|
comment:4 by , 17 years ago
Blocking: | 200 added; 154 removed |
---|---|
Summary: | Tidy up matchspy.h before release → Tidy up matchspy.h for next release |
comment:6 by , 17 years ago
Operating System: | → All |
---|---|
Summary: | Tidy up matchspy.h for next release → Tidy up matchspy.h for inclusion in a release |
Reword summary
comment:8 by , 16 years ago
Description: | modified (diff) |
---|---|
Milestone: | → 1.1 |
comment:9 by , 16 years ago
Blocking: | 160 removed |
---|
comment:10 by , 16 years ago
Owner: | changed from | to
---|---|
Status: | assigned → new |
comment:11 by , 16 years ago
Milestone: | 1.1.0 → 1.1.1 |
---|
Bumping milestone to 1.1.1 as there are unresolved issues and this isn't an incompatible change.
comment:13 by , 15 years ago
Priority: | normal → high |
---|
by , 15 years ago
Attachment: | matchspy.patch added |
---|
Patch for this - merge of most of matchspy branch
comment:14 by , 15 years ago
Status: | new → assigned |
---|
Attached matchspy.patch. This patch merges most of the changes on the matchspy branch to trunk, with the exception of the code which deals with a way to support multiple values in a single slot (eg StringListSerialiser, and MultiValueMatchSpy).
It also addresses ticket #183.
I havn't yet run coverage tests on this patch, but the code is in pretty good shape, I think (I did do coverage tests on some earlier versions of the code in the matchspy branch). I'll report back here when I've done some coverage tests.
comment:15 by , 15 years ago
A few comments:
- As I mentioned on IRC, MultiMatchSpy can probably just have a reference to the vector of MatchSpy objects in Xapian::Enquire rather than making a copy.
- If that works, then we can just make the MultiMatchSpy object an auto variable (since it's just a wrapper around a reference member variable) which avoids having to allocate it on the heap and use AutoPtr to manage its cleanup.
- The "new" matchspy is applied after matchdecider, but the old one was applied before. Without thinking about it much, I think this is probably saner, but I thought I should point it out in case it's a mistake.
- This adds some "naked" structs to the API, which isn't something we currently have, and tends to tie down the implementation rather a lot, so I think we need to carefully consider doing this. Classes with private members readable via getters inlined from the API headers would generate equally good code, not restrict future changes as much, and may be easier to wrap nicely with SWIG.
I noticed a few typos and minor tweaks I don't think are at all controversial, so I've just committed changes for those as that's less work that trying to describe them...
Not completely finished looking through, but it's nearing bedtime so I'll take a further look later.
comment:16 by , 15 years ago
Applying the new matchspy after the match decider was deliberate, since it should only really be applied to the documents which get into the mset. However, I don't think it's very important since we plan to deprecate matchdecider soon.
I've applied patches for all the other comments, since I agreed with them!
by , 15 years ago
Attachment: | matchspy2.patch added |
---|
Updated patch (with all files included this time)
comment:17 by , 15 years ago
Matchspy branch mostly merged to trunk. The outstanding parts are: StringListSerialiser (and unserialiser), and MultiValueCountMatchSpy. These are waiting on a resolution to ticket #198.
by , 15 years ago
Attachment: | matchspy_changes_13246_13247.patch added |
---|
Remaining diffs on the branch
comment:18 by , 15 years ago
Priority: | high → low |
---|
The remaining changes are API-compatible additions, so updating the priority to "low".
comment:19 by , 15 years ago
Summary: | Tidy up matchspy.h for inclusion in a release → Merge MultiValueCountMatchSpy |
---|---|
Type: | defect → enhancement |
Retitling to something more appropriate (I figure the MatchSpy subclass is the interesting one - whether it uses StringListSerialiser or multi-values or something else is largely an implementation detail).
Also marking as "enhancement".
comment:21 by , 15 years ago
Description: | modified (diff) |
---|
Update summary to reflect the current state: all the other matchspies which were mentioned are now gone.
comment:22 by , 15 years ago
Description: | modified (diff) |
---|
comment:23 by , 15 years ago
Priority: | low → normal |
---|
I've been looking at ticket #198, and I'm unconvinced it's something which we're going to be able to support without an excessive amount of work, and not until 1.3.0 at the earliest (and even then, not for the chert backend). In the interests of making it possible in the reasonably near future to do facet calculations with multiple values in a slot in a single document, I think the StringListSerialiser concept is the way to go. It can probably be tidied up quite a bit, though.
On thought is that if we're adding this, we should probably add support for using values serialised with StringListSerialiser for sorting (ie, add a KeyMaker which understands the serialised format, and just picks the first string). We might be able to find nicer names, and a nicer API for the classes, too.
Side note: this ticket was marked as priority:low. I think that the lack of this will be a frequent complaint when users start trying to use the faceting support in the 1.2.0 release, so it should at least be a priority:normal ticket.
comment:24 by , 15 years ago
The "low" priority was from prioritising bugs to address (or not) before 1.2.0. Generally you can't read much into our bug priorities anyway, at least historically.
This certainly isn't 1.2.0 material, and right now I'd rather not be too distracted by discussion of issues which aren't relevant for that goal, so let's just see how the faceting support is actually received rather than trying to guess what user reaction might be.
comment:25 by , 13 years ago
Hi,
I need to store tags for documents in Xapian. A document can have multiple tags, so I'm looking for something like MultiValueCountMatchSpy to do faceting search. (IMHO, this is a frequent need.)
I'm already using ValueCountMatchSpy for single-valued fields, this is working nicely and I'm an happy xapian user :)
I hope to see this feature in the next release.
Thanks. Laurent
comment:27 by , 9 years ago
Milestone: | 1.3.x → 1.4.x |
---|
As noted in #198:
Nobody's working on this, and while there's interest in it, I don't think we have a clear design. It's also possible to use multi-valued facets already, you just need to unpack them in your own matchspy class. So I think it isn't worth holding up 1.4.0 for.
I think the same applies here
comment:28 by , 19 months ago
Milestone: | 1.4.x → 2.0.0 |
---|---|
Version: | SVN trunk → git master |
Olly's backed the matchspy stuff out of SVN HEAD for now, so we've got until the 1.0.4 release to tidy this up. Changing to block #200 instead of #154.