Opened 9 years ago
Last modified 20 months ago
#687 assigned defect
64-bit docids in the bindings
Reported by: | Olly Betts | Owned by: | Olly Betts |
---|---|---|---|
Priority: | normal | Milestone: | 2.0.0 |
Component: | Xapian-bindings | Version: | git master |
Severity: | normal | Keywords: | |
Cc: | | Blocked By: | |
Blocking: | | Operating System: | All |
Description
You can now use 64-bit docid (and termcount), so marking #385 fixed seems appropriate.
But we ought to add testcases to check that this works for each of the bindings - it's not something that will necessarily work automatically.
I wrote a simple testcase for PHP (patch attached), which passes with 64 bit docid (promising) but fails with 32 bit docid (as the C++ version does). I don't see any easy way to determine the type widths from the bindings as things are though, which makes adding tests problematic.
The failure mode isn't good either - you just get the docid quietly wrapping. If we fix that for C++ so that it throws an exception instead, then the testcases could at least check that it either works or gives that exception. Or perhaps we expose the information about type widths through the bindings.
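To illustrate the failure mode described above, here is a minimal sketch of silent wrapping versus a checked conversion. The names docid32, narrow_unchecked and narrow_checked are hypothetical and are not part of the Xapian API; the choice of std::range_error is also just an assumption for illustration.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical stand-in for the docid type in a 32-bit docid build.
using docid32 = std::uint32_t;

// Silent narrowing: the value is reduced modulo 2^32, so a large docid
// quietly turns into a small, valid-looking one.
inline docid32 narrow_unchecked(std::uint64_t did) {
    return static_cast<docid32>(did);
}

// Checked narrowing: report the problem instead of wrapping.
inline docid32 narrow_checked(std::uint64_t did) {
    if (did > std::numeric_limits<docid32>::max())
        throw std::range_error("docid does not fit in 32 bits");
    return static_cast<docid32>(did);
}
```

A test in the bindings could then accept either outcome: the large docid round-trips unchanged (64-bit build), or the checked path raises an error (32-bit build). What it should never accept is a different docid coming back.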
Attachments (1)
Change History (6)
by , 9 years ago
Attachment: | php-64bit-docid-test.patch added |
---|
comment:1 by , 9 years ago
I think we need some way to determine the largest possible docid.
There are actually two such values: what the type supports, and what the backend supports. Currently these are the same by default, but if 64-bit docid is enabled, the current backends don't support that directly.
So I'm not sure if this should be a static value (perhaps for the bindings only, as in C++ it's just numeric_limits<Xapian::docid>::max()), or a method such as Database::get_max_docid() which reports the largest docid which can be used for this object (and would take multi-dbs into account). Or perhaps both.
For the bindings, there's potentially a third limit, if the type Xapian::docid is mapped to is narrower than the C++ type (e.g. if the language has no 64-bit type).
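The three limits discussed in this comment combine as a simple minimum. The sketch below assumes a 64-bit docid type. The names docid, type_max and effective_max_docid are hypothetical, and Database::get_max_docid() is only a proposal in this ticket, so nothing here reflects an existing API.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for Xapian::docid in a 64-bit docid build.
using docid = std::uint64_t;

// Limit 1: the largest value the C++ type itself can hold.
constexpr docid type_max = std::numeric_limits<docid>::max();

// The largest usable docid is the smallest of:
//   - what the C++ type supports (type_max),
//   - what the backend supports (backend_max),
//   - what the binding language's integer type supports (binding_max).
constexpr docid effective_max_docid(docid backend_max, docid binding_max) {
    return std::min(type_max, std::min(backend_max, binding_max));
}
```

For example, a backend limited to 32-bit docids used from a language with only a signed 32-bit integer type would give an effective limit of 0x7FFFFFFF.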
comment:2 by , 9 years ago
Status: | new → assigned |
---|
The glass backend can now handle 64-bit document ids.
This probably isn't actually a 1.4.0 blocker, as any constants or methods which would be added would be API additions and would not break the ABI, though it would be good to document what the status of 64-bit docid support is for each language.
comment:3 by , 9 years ago
Milestone: | 1.3.5 → 1.4.x |
---|
On further reflection, I think we punt on this for 1.4.0 - 64 bit docids aren't the default, so you're choosing an ABI-incompatible build to start with.
It's also going to be significant work to check the consequences for all the languages we have bindings for. E.g. it looks like PHP's integer type can be 32 or 64 bits depending on the platform (I guess it maps to C long), and large values quietly turn into a floating point value (which presumably means that where PHP has a 32-bit integer type, integer values are effectively precisely representable up to where C double stops being able to represent consecutive integers). This all needs careful research (or existing in-depth knowledge), and careful construction of test cases.
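The point where double stops representing consecutive integers can be checked directly. This sketch assumes IEEE 754 binary64 doubles (53-bit significand), which the static_assert verifies. The names max_exact_in_double and survives_double are hypothetical helpers, and the sketch says nothing about how PHP itself performs the conversion.

```cpp
#include <cstdint>
#include <limits>

// Assumption: double is IEEE 754 binary64 with a 53-bit significand.
static_assert(std::numeric_limits<double>::digits == 53,
              "this sketch assumes IEEE 754 binary64 doubles");

// Every integer in [0, 2^53] is exactly representable as a double;
// 2^53 + 1 is the first one that is not.
constexpr std::uint64_t max_exact_in_double = std::uint64_t{1} << 53;

// True if a docid survives a round trip through double unchanged.
inline bool survives_double(std::uint64_t did) {
    const double d = static_cast<double>(did);
    // Values near 2^64 can round up to exactly 2^64, which cannot be
    // converted back to a 64-bit integer, so treat that as a failure.
    if (d >= 18446744073709551616.0) return false;
    return static_cast<std::uint64_t>(d) == did;
}
```

A binding test could use a docid just above 2^53 (rather than just above 2^32) to detect a language that quietly falls back to floating point.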
comment:4 by , 5 years ago
Component: | Library API → Xapian-bindings |
---|---|
Summary: | 64 bit docid follow-on → 64-bit docids in the bindings |
Version: | SVN trunk → git master |
comment:5 by , 20 months ago
Milestone: | 1.4.x → 2.0.0 |
---|
Patch to add testcase for PHP