Opened 14 years ago

Closed 14 years ago

#486 closed defect (fixed)

add_document() followed by failing replace_document() leaves partial changes

Reported by: Olly Betts Owned by: Olly Betts
Priority: normal Milestone: 1.0.21
Component: Backend-Chert Version: 1.2.0
Severity: normal Keywords:
Cc: Carl Worth Blocked By:
Blocking: Operating System: All

Description

If (without a commit) you do:

Xapian::Document doc;
doc.add_term("foo");
Xapian::docid did = db.add_document(doc);
doc.add_term("abc");
doc.add_term(string(1000, 'm'));
doc.add_term("xyz");
db.replace_document(did, doc);

then replace_document() throws an exception for the long term, but adds "abc" first, so the implicit flush from db's dtor will write a document with part of the changes from the updated document to the database.

(Reported by Carl Worth as affecting notmuch, though I gave him a workaround).

Change History (5)

comment:1 by Richard Boulton, 14 years ago

One solution would be for modification methods like add_document() and replace_document() to call cancel() if an error occurred, to rollback any changes since the last flush. This would stop the database getting into an inconsistent state on error, but might roll back more changes than really desired.

Alternatively, this could be done after a basic validation of the document (ie, checking that the term lengths aren't too long), so a full roll-back only happened if an error occurred after that validation (if validation failed, the current modification operation would be stopped before any changes were made, but earlier modifications would not be rolled back).

comment:2 by Olly Betts, 14 years ago

I don't much like the idea of having to process the terms twice, though I guess we only need to do so in the case when the document already has a pending update, which should be an unusual case in general. But in some situations, it might be very common (if you get a constant stream of document updates and the same document is often updated multiple times between commits - I suspect wiki changes tend to cluster like this for example).

In the case that triggered this ticket, notmuch wants the docid before it really adds the document (I'm not quite sure why) so it was using code like the example. I suggested Carl gets the next docid using db.get_lastdocid() + 1 and then calls replace_document() to add it.

Always rolling back all pending changes doesn't sound great either. Rolling back changes if we are modifying a document which was added or modified since the last commit is avoids that, but at the price of less consistent behaviour.

comment:3 by Olly Betts, 14 years ago

Actually, pre-checking the terms doesn't really fix the wider issue - although "term too long" is how it failed in this case, any failure (e.g. out of memory) while updating the cached posting changes suffers from the same issue...

comment:4 by Olly Betts, 14 years ago

Milestone: 1.2.11.0.21
Status: newassigned

I was able to reproduce an issue when starting with an empty db (r14704 on trunk, needs backport) but for the general case we already call cancel in all cases of failure to add/replace/delete a document as far as I can see. Carl doesn't have a reproducer, so I think we'll have to assume the notmuch user hit the issue I found.

comment:5 by Olly Betts, 14 years ago

Resolution: fixed
Status: assignedclosed

Backported for 1.0.21 in r14707.

Note: See TracTickets for help on using tickets.