#675 closed defect (fixed)
Exception in [Chert/Glass]Cursor::find_entry()
Reported by: | German M. Bravo | Owned by: | Olly Betts |
---|---|---|---|
Priority: | high | Milestone: | 1.2.21 |
Component: | Backend-Chert | Version: | |
Severity: | normal | Keywords: | |
Cc: | Blocked By: | ||
Blocking: | Operating System: | All |
Description
Glass and Chert dies if I do the following steps:
- open a NEW empty database for Writing
- open the same database for Reading
- read from the Reading database, for each document get the value of a slot
- index documents to the Writing database
- commit the Writing database
- reopen the Reading database
- read from the Reading database, for each document get the value of a slot
- repeat all until it crashes
This seems very related to issue #654 to me, although it's not entirely the same, there the remote backend dies due an exception while accessing the cursors during find_entry().
Attachments (3)
Change History (13)
comment:1 by , 10 years ago
by , 10 years ago
Attachment: | the_error.cpp added |
---|
comment:2 by , 10 years ago
ChertTable::open()
closes the table, which frees C[j].p
and maybe (when it's lazy) do_open_to_read()
doesn't create memory for those, yet returns true. Then later, when ChertTable::find
is called, it uses C_[j].p
but its memory doesn't exist as it has already been freed.
Maybe ChertTable::do_open_to_read
should always initialize the C
cursor when returning true? (or even maybe always?)... also, the same might apply to ChertTable::do_open_to_write
as well.
comment:3 by , 10 years ago
I believe this is the same bug: https://bugs.kde.org/show_bug.cgi?id=341990
comment:4 by , 10 years ago
The cursor received in ChertTable::find()
, C_
, is a different object than that for which memory is allocated during ChertTable::do_open_to_read
, so it remains with the old freed pointers during ChertTable::close
. In other words, ChertTable::C
is updated with new memory allocated, but ChertValueManager::cursor
(the one calling ChertCursor::find_entry()
-> ChertTable::find()
) isn't so ChertValueManager::reset()
needs a cursor.reset(0);
(since the cursor will no longer be valid after the reopen) and value_manager.reset();
should be called *after* opening the tables in ChertDatabase::open_tables_consistent()
.
comment:6 by , 10 years ago
Component: | Other → Backend-Chert |
---|---|
Status: | new → assigned |
Thanks for the patch, but I'm not quite sold on the logic behind the change.
ChertTable::cursor_get()
returns a new cursor, not the table's built-in cursor, so I'm not seeing how we end up with a stale cursor here, since the cursor which ChertTable::close()
releases the memory for shouldn't be the cursor which ChertValueManager
has (though valgrind's output does seem consistent with that being what's happening).
I'll have a poke and see if I can either convince myself your patch is right, or if not, work out what's actually going on.
Did you already agree to the patch licensing requirements?
http://trac.xapian.org/browser/git/xapian-core/HACKING#L1376
comment:7 by , 10 years ago
Milestone: | → 1.3.3 |
---|
Aha, so the new ChertCursor
copies the pointer to the root block from the cursor in the table, so that's why this the freeing of the blocks in built-in cursor in ChertTable::close()
breaks the external cursor which the ChertValueManager
holds.
In glass, we reference count the cursor blocks with copy-on-write, so this shouldn't be an issue - the root block will be valid until all references are gone.
I think the other problem (which affects both chert and glass) is when the table gains levels over a reopen - then the cursor in the ChertValueManager
doesn't have enough levels and we end up iterating off the end of it. I've not yet confirmed that's what's happening, but assuming it is the patch does indeed look correct.
(Mostly orthogonally, it would be nice to reduce the rather needless churn of memory allocations that currently happens when reopen()
finds a new revision...)
comment:8 by , 10 years ago
Yes, I agree with the licensing of the patches. Is there anything else besides saying I do that I have to do?... I'd actually really love the whole Xapian was MIT licensed.
Regarding Glass, the problem is also there in that backend as well (for some reason). I'm not sure how the GlassValueManager::cursor
gets those pointers, but the memory is being released by the GlassTable::close()
and later is tried to be used. In glass it might be the reference counter isn't being increased for some reason when cursor is copied/assigned, however simply adding that cursor.reset(0)
to the GlassValueManager::reset()
doesn't seem to fix the issue as it does in Chert... and thinking about it, even if it did keep the blocks alive (if the reference counting there was working), would we still want to use the old blocks after the reopen? I mean, the close()
during the reopen resets the cursors for the tables, I don't think we'd still would want to use a stale cursor for the value manager... but I might be understanding cursors an blocks incorrectly.
comment:9 by , 10 years ago
Milestone: | 1.3.3 → 1.2.21 |
---|
Before we actually use the data in the cursor, we check B->cursor_version != version
and if it isn't, we call rebuild()
which updates the cursor to match the table.
The table's cursor_version
gets incremented each time add()
or del()
is called if at least one cursor has been created since the last call (the reason for this condition is to avoid churning the counter unnecessarily and having it wrap round).
Looking over the code involved, I see two bugs here though - we don't set cursor_created_since_last_modification
in the cursor rebuild()
method, but that flag is meant to really indicate if any cursor might exist with the current cursor_version
, and after rebuild()
a cursor now does.
Also, we don't consider incrementing cursor_version
after cancel()
or reopen()
(the latter being very relevant here).
Fixing those, your testcase passes cleanly. I've written a testcase based on it and committed the fixes + the testcase in [1900eca6d325e9185b9f9f5c75a4a4a545616d7f/git]. Testing I found that the fixes for reopen()
and rebuild()
are both required to fix the testcase.
I think we're now good after those changes - although with chert the cursor still has a reference to a freed root block, the cursor will now get rebuilt before being used, so that doesn't actually matter as that dud reference will never get used.
If this doesn't fix the problem in you full system, please comment with details.
Marking for backporting for 1.2.21.
comment:10 by , 10 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Backported for brass and chert in [826d1a19cc356e7bf66c1681626e70af32967447].
Applied equivalent fixes for flint in [d784290ce015958474f965817f7a41f1483c3e03].
I've attached source code that triggers the problem.