Opened 14 years ago
Closed 14 years ago
#503 closed enhancement (fixed)
Add Python PostingSource example from Xappy to docs
Reported by: | Joost Cassee | Owned by: | Olly Betts |
---|---|---|---|
Priority: | normal | Milestone: | 1.2.6 |
Component: | Xapian-bindings (Python) | Version: | 1.2.2 |
Severity: | normal | Keywords: | |
Cc: | | Blocked By: | |
Blocking: | | Operating System: | All |
Description
The Xappy source code contains a perfect example of a weight-only (non-filtering) PostingSource written in Python. This would be a good addition to the postingsource docs. I have slightly edited the original.

    class ExternalWeightPostingSource(xapian.PostingSource):
        """
        A Xapian posting source returning weights from an external source.
        """
        def __init__(self, db, wtsource):
            xapian.PostingSource.__init__(self)
            self.db = db
            self.wtsource = wtsource

        def init(self, db):
            self.alldocs = db.postlist('')

        def get_termfreq_min(self):
            return 0

        def get_termfreq_est(self):
            return self.db.get_doccount()

        def get_termfreq_max(self):
            return self.db.get_doccount()

        def next(self, minweight):
            try:
                self.current = self.alldocs.next()
            except StopIteration:
                self.current = None

        def skip_to(self, docid, minweight):
            try:
                self.current = self.alldocs.skip_to(docid)
            except StopIteration:
                self.current = None

        def at_end(self):
            return self.current is None

        def get_docid(self):
            return self.current.docid

        def get_maxweight(self):
            return self.wtsource.get_maxweight()

        def get_weight(self):
            doc = self.db.get_document(self.current.docid)
            return self.wtsource.get_weight(doc)

Change History (10)
comment:1 by , 14 years ago
Component: | Other → Xapian-bindings (Python) |
---|---|
Description: | modified (diff) |
Milestone: | → 1.2.3 |
Owner: | changed from | to
Version: | → 1.2.2 |
comment:2 by , 14 years ago
By the way, please add a note to the Python documentation that the database reference passed into PostingSource.init(db) (not __init__()) by Xapian should not be stored as a class attribute. Xapian will remove the underlying C++ object after leaving the method, and the Python application will segfault if you try to use it later on.
By the way2: it would be nice if Trac users could edit their own ticket description; there is still one typo in there...
follow-up: 5 comment:3 by , 14 years ago
Hmm, I'm sure I wrote a response to comment:2 already. I guess I must have previewed it but failed to actually submit it or something.
Charlie says it's fine for future relicensing, so the Lemur (C) isn't an issue.
Not being able to store the passed database sounds like a bug in the Python wrappers to me.
And I think users should now be able to edit ticket descriptions (I didn't realise they couldn't - thanks for pointing that out).
comment:5 by , 14 years ago
Replying to olly:
Not being able to store the passed database sounds like a bug in the Python wrappers to me.
I cannot reproduce this problem in version 1.2.3.
comment:6 by , 14 years ago
Description: | modified (diff) |
---|
I've made some further edits - fixing a typo in the wiki markup, removing the reset() method, renaming xapdb to db, and removing the ProcessedDocument reference (which I think must be a xappyism).
Richard said on IRC he'd like to have this actually tested (I think he means dynamically) so that the docs don't have an incorrect example.
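A dynamic check could start from something like the sketch below. It is only a sketch: it assumes the matcher's basic call order (init(), then next(), at_end(), get_docid(), get_weight()), and it drives a hypothetical pure-Python stand-in, FakeSource, instead of a real xapian.Database so that it runs without the bindings. The names FakeSource and check_source are made up for illustration. A real test for the docs would still need to run the example through an actual query.

```python
class FakeSource:
    """Hypothetical stand-in that follows the PostingSource method names."""

    def __init__(self, weights):
        self.weights = weights  # {docid: weight}

    def init(self, db):
        # db is ignored here; a real source would open its postlist from it.
        self.pending = iter(sorted(self.weights))
        self.current = None

    def next(self, minweight):
        # Advance to the next docid, or None once the list is exhausted.
        self.current = next(self.pending, None)

    def at_end(self):
        return self.current is None

    def get_docid(self):
        return self.current

    def get_weight(self):
        return self.weights[self.current]

    def get_maxweight(self):
        return max(self.weights.values()) if self.weights else 0.0


def check_source(source, db=None):
    """Walk a source to the end and sanity-check what it returns."""
    source.init(db)
    seen = []
    last_docid = 0
    source.next(0.0)
    while not source.at_end():
        docid = source.get_docid()
        weight = source.get_weight()
        # Docids must come back in increasing order.
        assert docid > last_docid
        # No weight may exceed the advertised upper bound.
        assert 0 <= weight <= source.get_maxweight()
        seen.append((docid, weight))
        last_docid = docid
        source.next(0.0)
    return seen


postings = check_source(FakeSource({3: 1.0, 1: 0.25}))
```

The two assertions inside the loop are the part most likely to catch a broken example: docids out of order, or a get_maxweight() that is lower than a weight actually returned.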
comment:7 by , 14 years ago
Milestone: | 1.2.4 → 1.2.5 |
---|
Had a report on IRC that this example crashes, so we should definitely at least check it works before adding it to the docs:
| eugene_beast> well, python process aborts if i'm copying the example from #503 and trying to search for something
Not worth delaying 1.2.4 further for this, so bumping milestone.
comment:9 by , 14 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
Works for me if I fill in a suitable class for wtsource:

    class WeightSource:
        def __init__(self):
            pass

        def get_maxweight(self):
            return 1234.

        def get_weight(self, doc):
            return doc.get_docid()

I wonder if eugene_beast failed to supply a suitable class there. Anyway, the example does work.
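For anyone wondering what a less trivial wtsource might look like, here is a minimal sketch. All the example needs from it is get_maxweight() and get_weight(doc). DictWeightSource and FakeDocument are hypothetical names invented for this illustration: the first looks weights up in a plain dict keyed by docid, and the second is a stand-in for xapian.Document so the snippet runs without the bindings.

```python
class DictWeightSource:
    """Hypothetical weight source backed by a {docid: weight} mapping."""

    def __init__(self, weights, default=0.0):
        self.weights = dict(weights)
        self.default = default

    def get_maxweight(self):
        # Must be an upper bound on anything get_weight() can return,
        # so the default is included alongside the stored weights.
        return max([self.default] + list(self.weights.values()))

    def get_weight(self, doc):
        # Documents without an entry fall back to the default weight.
        return self.weights.get(doc.get_docid(), self.default)


class FakeDocument:
    """Stand-in for xapian.Document, used only to exercise the class above."""

    def __init__(self, docid):
        self._docid = docid

    def get_docid(self):
        return self._docid


wtsource = DictWeightSource({1: 2.5, 2: 0.5})
known = wtsource.get_weight(FakeDocument(2))
unknown = wtsource.get_weight(FakeDocument(7))
```

With the real bindings, an instance like this would be passed as the wtsource argument of ExternalWeightPostingSource in place of the WeightSource class above.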
I'm going to try to slot this in now.
comment:10 by , 14 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Added to postingsource.rst in r15507. It isn't automatically tested or anything like that, but I have at least manually checked it before adding it.
Richard seemed keen to have it automatically tested, which seems a really nice idea, but more than I have time to do right now, so I've opened #547 for that.
Marking for 1.2.3, though that's pending on us being OK to relicense this in the future. Richard, who wrote this? The (C) headers on the file list you and Lemur (which shouldn't be a problem, though we should explicitly check) and Pablo Hoffman who I don't think I know.
We should kill reset() from it if it really is for backward compatibility - compatibility with 1.1.x isn't interesting at this point, and going forward a clean example is more important.
Probably also better to rename xapdb to just db for the new context.