#824 closed defect (fixed)
Out-of-bounds array access on table open if root info level is corrupt
Reported by: | group13 | Owned by: | Olly Betts |
---|---|---|---|
Priority: | normal | Milestone: | 1.4.25 |
Component: | Backend-Glass | Version: | 1.4.24 |
Severity: | normal | Keywords: | |
Cc: | Blocked By: | ||
Blocking: | Operating System: | All |
Description
In the Glass backend, the RootInfo level field comes from disk and is used before it is checked, which means that e.g. this snippet...
void GlassTable::basic_open(const RootInfo * root_info, glass_revision_number_t rev) { ... level = root_info->get_level(); ... for (int j = 0; j <= level; ++j) { C[j].init(block_size); }
...is UB if the level in the file exceeds BTREE_CURSOR_LEVELS for whatever reason (perhaps file corruption on disk). If so, even if the table is otherwise unused, this runs past the end of the C array here and again when the GlassTable is destroyed. In practice, this results in a segmentation fault/crash; because it happens so early, it cannot be recovered through xapian-check.
We would instead expect that a Xapian::DatabaseCorruptError or similar would be thrown in this case.
We've confirmed that this happens with 1.4.24, on Linux, using the Glass backend in single-file mode. The Chert backend may be affected as well.
This issue was found by running the AFL++ fuzzer to generate corrupted single-file Glass database files.
Attachments (2)
Change History (8)
comment:1 by , 11 months ago
Milestone: | → 1.4.25 |
---|---|
Resolution: | → fixed |
Status: | new → closed |
comment:2 by , 11 months ago
Thanks for the quick response!
Is it expected that the patched code sets the level member first and then checks the value? When trying to confirm the fix with our test input, it looks like the following now happens:
- DatabaseCorruptError is now thrown as expected
- As part of unwinding, the GlassTable is destroyed
- As part of GlassTable's destructor, the close method is called
- As part of GlassTable::close, a loop similar to the one above runs through all cursors to destroy them
- This loop uses the impossible level value from the file to know how many levels there are, and we hit the same UB/segfault.
This may be easier to reproduce with a test file that is wildly out of range rather than just out of range.
comment:3 by , 11 months ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
This may be easier to reproduce with a test file that is wildly out of range rather than just out of range.
FWIW my logic for picking the first "just wrong" value was to catch if the check added to the code was at the wrong threshold.
Your reasoning looks correct, but I'm surprised CI didn't catch this as it runs a build (and the testsuite) using ubsan. I can see valgrind not spotted a small overrun as there are members after this array in the class and valgrind works on already compiled code so likely only sees the allocated block, but ubsan should be able to add checks based on the actual members.
It'd be helpful to provide your test input when reporting these sort of bugs...
comment:4 by , 11 months ago
It'd be helpful to provide your test input when reporting these sort of bugs...
Of course. I've added one of the crashing inputs, as well as the starting seed we used.
The test input is naturally "more corrupted than necessary" for this issue. Please let us know if you would like to see a cleaned-up/minimized version as well (so one that is only corrupt in the way mentioned above).
comment:5 by , 11 months ago
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
Of course. I've added one of the crashing inputs, as well as the starting seed we used.
Thanks, fixed in 44aaa6fdfdccb4f2708da0605fa6efcc327d7908 - I've moved the check to where the value is read from disk and decoded (which I probably should have done to start with, but it requires moving where we define the constant to check against). Backported for 1.4.25 as 61249b067d1477cc099df342ada2f96b24822f42.
I've also fixed a similar issue with lack of vetting of the blocksize which is read from the same file in the same method - that fix is 218e5c4b591c7ec5a17b0d78c6bf0a72876f6176, backported as 7e5c9a4963e4c1701b64f1fcf81a0c2c089874a6.
The test input is naturally "more corrupted than necessary" for this issue. Please let us know if you would like to see a cleaned-up/minimized version as well (so one that is only corrupt in the way mentioned above).
In this case just your reproducer was enough - it allowed me to verify the fix actually addressed the problem(s) it triggered.
comment:6 by , 11 months ago
Confirmed, we now get an "Impossibly deep Btree" message as expected. Thanks again!
Fixed in git master by 86b5e2ce03427c28f11c17bfc9bcc05d2a9f9966 and dd768623e8bda0582685b8943891a585ae13b611.
Backported for 1.4.25 as b92a101363a78041c0fb0b2a376cb81e54207512 and 0209e8dd7e535f08ff0fdec720ee9a92a5c47a4c; also fixed for chert backend by 9c6a7a3a61098d5a22041138dd36a2c909b10b73.