#823 closed defect (fixed)

Unit test ioblock1 failure

Reported by: someplaceguy Owned by: Olly Betts
Priority: normal Milestone: 1.4.25
Component: Test Suite Version: 1.4.24
Severity: normal Keywords:
Cc: Blocked By:
Blocking: Operating System: Linux

Description

The ioblock1 unit test is failing for me on Xapian 1.4.24:

$ ./runtest ./unittest -v ioblock1
Running test: ioblock1... FAILED
unittest.cc:945: (statbuf.st_blocks) >= (BLOCK_SIZE / 512 * 2)
Evaluates to: 1 >= 4

The issue seems to be that the unit test expects the filesystem to allocate at least 2048 bytes (i.e. 4 * 512B) for this file.

However, I'm running ZFS with compression enabled (which is a very common configuration for ZFS), but for this file, ZFS only allocates 512 bytes worth of data:

$ hexdump -C .unittest_ioutils1
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000400  78 78 78 78 78 78 78 78  78 78 78 78 78 78 78 78  |xxxxxxxxxxxxxxxx|
*
00000800  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00003000  78 78 78 78 78 78 78 78  78 78 78 78 78 78 78 78  |xxxxxxxxxxxxxxxx|
*
00003400

$ stat .unittest_ioutils1
  File: .unittest_ioutils1
  Size: 13312           Blocks: 1          IO Block: 13312  regular file
Device: 0,30    Inode: 4497130     Links: 1
Access: (0644/-rw-r--r--)  Uid: (30001/ nixbld1)   Gid: (30000/  nixbld)
Access: 2023-12-19 20:37:53.120683138 +0000
Modify: 2023-12-19 20:44:54.815831585 +0000
Change: 2023-12-19 20:44:54.815831585 +0000
 Birth: 2023-12-19 20:37:53.120683138 +0000

Attachments (2)

xapian-1.4-ioblock-fix.patch (1.8 KB ) - added by Olly Betts 11 months ago.
candidate patch
xapian-1.4-ioblock-SEEK_HOLE-fix.patch (1.1 KB ) - added by Olly Betts 10 months ago.
Use SEEK_HOLE

Download all attachments as: .zip

Change History (14)

comment:1 by Olly Betts, 11 months ago

Milestone: 1.4.25

Ah, so the block count is of compressed blocks, not of the blocks of data that are the contents of the file.

This check could be weakened to >= 1 since the data can't compress to nothing, but I think we could easily write pseudo-random data which shouldn't compress significantly and thus allow the existing check to work in your case.

Such compression makes the following check for whether the current FS supports sparse files or not trickier, at least if there's an FS that supports compression like this but doesn't support (or allows turning off support for) sparse files - the blocks without data written are all-zero so they'd compress really well which looks a lot like sparse files. However I think it doesn't matter for what the testcase actually needs to know - it wants to test writing a block over 4GB into a file works correctly, and the concern is avoiding creating an enormous temporary file - sparse files allow that, but so should FS-level compression. Just tweaking the comments and messages is probably enough there.

by Olly Betts, 11 months ago

candidate patch

comment:2 by Olly Betts, 11 months ago

Please can you try the attached patch on your compressed ZFS setup.

comment:3 by someplaceguy, 11 months ago

Please can you try the attached patch on your compressed ZFS setup.

I can confirm that the test now writes random data to the file, but unfortunately the test still fails:

$ rm .unittest_ioutils1; ./runtest ./unittest -v ioblock1; stat .unittest_ioutils1; sleep 5; echo; stat .unittest_ioutils1

Running test: ioblock1... FAILED
unittest.cc:955: (statbuf.st_blocks) >= (BLOCK_SIZE / 512 * 2)
Evaluates to: 1 >= 4


  File: .unittest_ioutils1
  Size: 13312           Blocks: 1          IO Block: 13312  regular file
Device: 0,30    Inode: 947166      Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2023-12-20 16:36:35.801828622 +0000
Modify: 2023-12-20 16:36:35.810828709 +0000
Change: 2023-12-20 16:36:35.810828709 +0000
 Birth: 2023-12-20 16:36:35.801828622 +0000

  File: .unittest_ioutils1
  Size: 13312           Blocks: 9          IO Block: 13312  regular file
Device: 0,30    Inode: 947166      Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2023-12-20 16:36:35.801828622 +0000
Modify: 2023-12-20 16:36:35.810828709 +0000
Change: 2023-12-20 16:36:35.810828709 +0000
 Birth: 2023-12-20 16:36:35.801828622 +0000

The issue seems to be that st_blocks is still 1 when the unit test fails. However, as you can see, after waiting a few seconds, st_blocks becomes 9.

At first, I suspected that the unit test wasn't calling fsync() or fdatasync(), but in fact, it is calling fdatasync():

$ strace ./runtest ./unittest -v ioblock1

(...)

openat(AT_FDCWD, ".unittest_ioutils1", O_RDWR|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 4
pwrite64(4, "_\313\363.\273\307\231\230'r'\31\16u\335U\231$\265\246\5\16\370\270\325\3606\0.\376.\236"..., 1024, 1024) = 1024
pread64(4, "_\313\363.\273\307\231\230'r'\31\16u\335U\231$\265\246\5\16\370\270\325\3606\0.\376.\236"..., 1024, 1024) = 1024
fdatasync(4)                            = 0
pwrite64(4, "\356\340\335\215\v*\6i`\307\317z\374\374&`\230\277ad\370\324\327\221\326\20w\tj\"E\3"..., 1024, 12288) = 1024
pread64(4, "\356\340\335\215\v*\6i`\307\317z\374\374&`\230\277ad\370\324\327\221\326\20w\tj\"E\3"..., 1024, 12288) = 1024
fdatasync(4)                            = 0
newfstatat(4, "", {st_mode=S_IFREG|0644, st_size=13312, ...}, AT_EMPTY_PATH) = 0
futex(0x7f929e9ac070, FUTEX_WAKE_PRIVATE, 2147483647) = 0
write(1, " \33[1m\33[31mFAILED\33[0m\n", 21 FAILED
) = 21
write(1, "unittest.cc:955: (statbuf.st_blo"..., 85unittest.cc:955: (statbuf.st_blocks) >= (BLOCK_SIZE / 512 * 2)
Evaluates to: 1 >= 4

) = 85

I've looked to see if this could be a bug in ZFS and I found this ZFS issue:

https://github.com/openzfs/zfs/issues/13991

Quoting from a comment there from a ZFS developer:

(...) fsync() makes no difference here, because it doesn't commit the transaction, but rather just ensures the change is on the journal (ZIL) so it can be replayed in a crash.

I don't know if this can be really called a bug. POSIX and the various manpages all seem say "number of blocks allocated", which I can argue is technically correct (ZFS doesn't allocate anything until it actually comes time to write things).

(...) So my inclination is to say "not a bug". But, if you've got a program that reasonably relies on st_blocks being instantly accurate, or a better source of how this should be required to work, please reply with details.

comment:4 by someplaceguy, 11 months ago

Unfortunately the accuracy/timeliness of st_blocks seems to be a bit of a complicated issue in ZFS.

Perhaps the best course of action would be to perform 2 tests for sparsity:

1) The first test would follow the suggestion of the ZFS developer to use lseek(SEEK_DATA/SEEK_HOLE) to figure out whether the file has holes or not (i.e. indicating that it is sparse). This would work for ZFS and all other filesystems that implement this interface.

2) However, some filesystems might not implement this interface and instead they might claim that the entire file is a non-hole even though they do support sparse files. As a fallback, a second test would be performed, using st_blocks like Xapian currently does.

This should probably be robust enough to work in almost all cases.

comment:5 by Olly Betts, 11 months ago

SEEK_HOLE seems a good approach - it is supported by several platforms already and approved for the next POSIX revision (though it does look like a dummy implementation will conform).

Maybe it's better to just rely on SEEK_HOLE and skip the test if not supported (or if it says "no holes") - ZFS demonstrates the reported block count can under-report so I'm not seeing how we can build a test on the block count for "big sparse file won't eat a lot of diskspace" without the risk of it incorrectly deciding it won't, and we want to avoid testcases being flaky or using GBs of temporary files.

by Olly Betts, 10 months ago

Use SEEK_HOLE

comment:6 by Olly Betts, 10 months ago

Status: newassigned

Please test with your ZFS setup and xapian-1.4-ioblock-SEEK_HOLE-fix.patch - the ioblock1 testcase should now pass (not fail or skip), assuming you're on a platform with working SEEK_HOLE support.

comment:7 by someplaceguy, 10 months ago

Unfortunately, if I apply the latest patch, it results in the test being skipped.

I enabled verbosity and added this debug patch:

  • tests/unittest.cc

    diff --git a/tests/unittest.cc b/tests/unittest.cc
    index 8d5e2bb..837ef7f 100644
    a b try {  
    950951        SKIP_TEST("Skipping rest of testcase - SEEK_HOLE failed");
    951952    }
    952953    if (hole >= statbuf.st_size) {
     954        printf("hole: %jd, st_size: %jd\n", (intmax_t) hole, (intmax_t) statbuf.st_size);
    953955        close(fd);
    954956        io_unlink(tmp_file);
    955957        SKIP_TEST("Skipping rest of testcase - sparse files not supported");

Here's the result:

Running test: ioblock1...hole: 13312, st_size: 13312
 SKIPPED
unittest.cc:957: Skipping rest of testcase - sparse files not supported

Note that in ZFS, by default blocks are logically 128 KiB in size, although they can be compressed down to the block device's sector size (4096 bytes or 512 bytes, usually), or even 0 bytes (i.e. a hole) if the block contains only zeros.

So I believe what's happening is that the entire contents of the file are being stored in a single block, and that's why no hole can be seen.

That said, note that users can configure ZFS filesystems to have a block size of (up to) 16 MiB currently. Furthermore, it's even possible to increase this limit by simply changing a kernel module parameter.

Last edited 10 months ago by someplaceguy (previous) (diff)

comment:8 by Olly Betts, 10 months ago

Does it work if you increase the size created at this step from 12K to (say) 132K?

That said, note that users can configure ZFS filesystems to have a block size of (up to) 16 MiB currently. Furthermore, it's even possible to increase this limit by simply changing a kernel module parameter.

I don't think we want to try to create a large file just to more reliably test if we can create an even larger file without risking filling the disk. It's not a problem for this test to skip in some situations where it could actually fully run, but ideally it should fully run in as many cases as possible without potentially creating large files. Also running in default configurations is especially helpful as that'll help find problems which are specific to a particular platform or FS.

We should probably adjust the message to something like "couldn't detect support for sparse files" though.

comment:9 by someplaceguy, 10 months ago

Does it work if you increase the size created at this step from 12K to (say) 132K?

Sorry, just to be clear, could you be more specific? I mean, how would you like me to change the code, exactly?

comment:10 by Olly Betts, 10 months ago

Change both occurrences of 12 in these lines to 132 (BLOCK_SIZE is 1024 here):

    io_write_block(fd, buf.data(), BLOCK_SIZE, 12);
    out.resize(BLOCK_SIZE);
    io_read_block(fd, &out[0], BLOCK_SIZE, 12);
    TEST(buf == out);
Last edited 10 months ago by Olly Betts (previous) (diff)

comment:11 by someplaceguy, 10 months ago

That doesn't seem to work, the test is still being skipped:

Running test: ioblock1...hole: 136192, st_size: 136192
 SKIPPED
unittest.cc:957: Skipping rest of testcase - sparse files not supported

I think it's not working because the test is doing two writes:

  • One write to offsets 1024..2047, corresponding to the first 128K ZFS block (which covers offsets 0..131071)
  • Another write to offsets 135168..136191, corresponding to the second 128K ZFS block (which covers offsets 131072..262143).

Since the test is writing both to the first and to the second ZFS blocks, there is no hole in-between.

However, if I change 132 to a bigger offset, like 270 (for example), then it should work fine because the first write would still be done to the first ZFS block, while the second write would be done to the *third* ZFS block. This would leave a hole in the middle (i.e. the second ZFS block).

As expected, after changing 132 to 270 the test passes:

Running test: ioblock1... ok
PASS: unittest

comment:12 by Olly Betts, 10 months ago

Resolution: fixed
Status: assignedclosed

Oh yes, sorry I missed that.

I've committed essentially your adjusted version except with both blocks moved up to leave 128K of potential sparseness at the start of the file (so as to minimise the disk space needed if sparse files aren't actually supported) as 8a05a114382916fd0ca141533e8c0e87da652280 and backported for 1.4.25 as b47cb5f846a527051c6d3c83c39d1c34b0a75a91.

Thanks for all the testing.

Note: See TracTickets for help on using tickets.