Ticket #284 (assigned defect)
occasional DatabaseModifiedErrors
| Reported by: | mrks | Owned by: | olly |
|---|---|---|---|
| Priority: | normal | Milestone: | 1.0.10 |
| Component: | Backend-Flint | Version: | 1.0.7 |
| Severity: | normal | Keywords: | |
| Cc: | mrks@… | Blocked By: | |
| Operating System: | Linux | Blocking: | |
Description
I use xapian-core 1.0.7 with the corresponding Perl bindings. I run a 1 writer/N reader setup, and I call reopen() on the database handle before each query. Nevertheless, I occasionally get DatabaseModifiedErrors.
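For context, the usual reader-side handling of this error is to catch it, call reopen() again and retry the query. Below is a minimal sketch of that pattern, not the code I actually run. The `DatabaseModifiedError` struct is a stand-in so the snippet compiles without xapian-core, and the helper name `run_with_reopen` and the `max_attempts` parameter are made up for illustration.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>

// Stand-in for Xapian::DatabaseModifiedError so this sketch builds without
// xapian-core; real code would catch the Xapian exception instead.
struct DatabaseModifiedError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical helper: reopen the reader, run the query, and if the revision
// being read was overwritten in the meantime, reopen and try again.
// Returns the number of attempts used; rethrows once max_attempts is reached.
inline int run_with_reopen(const std::function<void()>& reopen,
                           const std::function<void()>& query,
                           int max_attempts = 3) {
    assert(max_attempts > 0);
    for (int attempt = 1; ; ++attempt) {
        reopen();  // pick up the latest committed revision
        try {
            query();
            return attempt;
        } catch (const DatabaseModifiedError&) {
            if (attempt == max_attempts) throw;  // give up, let the caller decide
        }
    }
}
```

With the real library, `reopen` would wrap the reopen() call on the database handle and `query` would wrap the search itself. This only hides the symptom; it does not explain why the error shows up right after a flush.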
This is what I found out so far:
* The errors occur after explicitly flushing my most heavily used index. They occur less often if I do a sleep(1) after each explicit flush() before applying new changes (without flushing) to the index, and they have never occurred so far with a sleep(4). This is my workaround.
* I already set XAPIAN_FLUSH_THRESHOLD to a large value (100000).
* I patched the xapian-core library to log all calls to FlintDatabase::set_revision_number() and the throw points of the DatabaseModifiedErrors, which showed that the exception is thrown in FlintTable::set_overwritten().
* I patched again to find the caller and found that set_overwritten() is called by FlintTable::block_to_cursor(), which I patched once more to expose the condition:
    if (REVISION(p) > REVISION(C_[j + 1].p)) {
        fprintf(stderr, "set_overwritten: from block_to_cursor() %d > %d\n", REVISION(p), REVISION(C_[j + 1].p));
        set_overwritten();
        return;
    }
and it turned out:
    set_overwritten: from block_to_cursor() 10194 > 10192
    terminate called after throwing an instance of 'Xapian::DatabaseModifiedError'
    (...)
    set_overwritten: from block_to_cursor() 10195 > 10193
    terminate called after throwing an instance of 'Xapian::DatabaseModifiedError'
    set_overwritten: from block_to_cursor() 10195 > 10193
    terminate called after throwing an instance of 'Xapian::DatabaseModifiedError'
    (...)
    set_overwritten: from block_to_cursor 10199 > 10197
    terminate called after throwing an instance of 'Xapian::DatabaseModifiedError'
    set_overwritten: from block_to_cursor 10199 > 10197
    terminate called after throwing an instance of 'Xapian::DatabaseModifiedError'
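One thing these log lines have in common: the revision of the block just read is exactly two higher than that of the block one level up in the cursor (`C_[j + 1].p`) in every case shown. Here is a small sketch for checking this over a longer log. The helper name `revision_gap` is made up for illustration, and it only understands the debug format printed by my patch above.

```cpp
#include <cctype>
#include <cstdio>
#include <string>

// Parse one "set_overwritten: from block_to_cursor..." debug line and return
// REVISION(p) - REVISION(C_[j + 1].p) as printed by the patch.
// Returns -1 if the line is not in that format.
inline long revision_gap(const std::string& line) {
    const std::string marker = "block_to_cursor";
    const std::string::size_type pos = line.find(marker);
    if (pos == std::string::npos) return -1;

    // Skip "()" and whitespace up to the first revision number.
    std::string::size_type i = pos + marker.size();
    while (i < line.size() &&
           !std::isdigit(static_cast<unsigned char>(line[i]))) {
        ++i;
    }

    long block_rev = 0;
    long parent_rev = 0;
    if (std::sscanf(line.c_str() + i, "%ld > %ld", &block_rev, &parent_rev) != 2) {
        return -1;
    }
    return block_rev - parent_rev;
}
```

Applied to the lines above, the gap comes out as 2 each time (10194 vs 10192, 10195 vs 10193, 10199 vs 10197).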
I originally tested this with xapian-1.0.6, but it also occurs in 1.0.7.
I run Xapian on Ubuntu Linux 8.04 (Hardy) with a 2.6.24-19-server kernel and an ext3 filesystem. The machine is an IBM x3650 with 40 GB RAM and a ServeRAID-8k controller running RAID 10 over 6 SAS disks.
My most heavily used index (the one that throws the exceptions) contains about 850,000 documents, needs 11 GB of disk space, gets 5-15 updates per second, and about 20-25 search hits per second. I flush() this index every 10 minutes (which takes about 60-100 seconds + 4 seconds workaround delay ;-)
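As a sanity check on the XAPIAN_FLUSH_THRESHOLD setting: at these update rates, far fewer changes pile up between two explicit flushes than the threshold of 100000, so implicit flushes should not be happening in between. The arithmetic as a tiny sketch, assuming each update counts as one change towards the threshold (the `Range` struct and `pending_changes` name are made up for illustration):

```cpp
#include <cassert>

// Lower and upper bound on a count.
struct Range {
    long low;
    long high;
};

// Number of changes accumulated between two explicit flushes, given the
// update rate bounds (per second) and the flush interval in seconds.
inline Range pending_changes(long min_rate, long max_rate, long interval_seconds) {
    assert(min_rate >= 0 && min_rate <= max_rate && interval_seconds >= 0);
    return Range{min_rate * interval_seconds, max_rate * interval_seconds};
}
```

For 5-15 updates per second and a 600 second interval this gives 3000 to 9000 pending changes, well below 100000.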
