Opened 5 months ago

Last modified 5 months ago

#832 new task

Compiling xapian with meson. Strange error in tests.

Reported by: mgautier Owned by: Olly Betts
Priority: normal Milestone:
Component: Build system Version:
Severity: normal Keywords:
Cc: Blocked By:
Blocking: Operating System: All

Description (last modified by mgautier)

Hi,

I am currently trying to compile xapian with meson build system. Once done, I plan to either provide a patch here (if you are interested) or publish the build system on https://mesonbuild.com/Wrapdb-projects.html

You can find the current status of xapian with meson here : https://github.com/openzim/xapian-meson

Here few status information (for context):

  • We are interested on glass backend only so I concentrate on this only for now.
  • Works only creating static library. Got undefined reference else (both linux and windows)
  • Some changes are independents of meson and you probably want them anyway (you may find them on the github repository)

But for my issue, even if it seems to work pretty nicely on Linux (static lib) but on Windows, tests are broken.

Running test_api, I have some strange behavior. Some tests (qpmemoryleak1 and newfreelistblock1) fails sometime (with FAIL: DatabaseError: Commit failed (Permission denied)) and sometime not. (with multi_glass or glass backend).

On top of that, the number of skip tests are not constant.

On libzim side (https://github.com/openzim/libzim) (which use xapian build on windows). Our test (xapian creation) always crashes (segfault) when compacting the backend database with flag Xapian::Compactor::FULLER. It never fails without this flag, whatever xapian unittest are failing or not.

I am a bit stuck here. I hope that some of you would be able to help me.

---

For more context, we are already building xapian using autotools on windows on appveyor CI (https://github.com/kiwix/kiwix-build/blob/main/appveyor/build_xapian.sh). But we are in the process of homogenize our build system and I never succeed to build correctly again xapian using autotools locally. (spaces in path are a nightmare)

Thanks

Change History (8)

comment:1 by Olly Betts, 5 months ago

I'd been wondering about looking at switching to meson at some point so I'm open to the idea, but at this point in the release cycle I think it would need to be targetted at the release series after next. We're getting close to starting the next release series (down to a handful of open tickets) and it's already been much too long since we started 1.4.x. I would still consider merging additional changes at this point, but changing the entire build system seems too much (and there isn't a working patch yet).

I really can't think of a reason for the failures you're seeing, but I'm afraid I just don't have the spare cycles to investigate at the moment.

I don't think I've seen any of those problems with the current build system. The only currently sometimes flaky tests are some of the timed ones (which you can set export AUTOMATED_TESTING=1 to skip). We have sometimes hit platform-specific flakiness, but I think we now have fixed or worked around all known cases of that.

For more context, we are already building xapian using autotools on windows on appveyor CI

We also have builds in our CI:

comment:2 by Olly Betts, 5 months ago

Works only creating static library. Got undefined reference else (both linux and windows)

It struck me that this is rather suspicious - there shouldn't be any undefined references (and there aren't with the current build system) so this suggests to me that you aren't putting the correct set of files into the library (or you're misbuilding them in some way that means symbols are missing, perhaps due to symbol visibility).

I'd suggest working on solving this as it may be the cause of the other problems and should be easier to debug (first compare the list of sources in the library with the current build system and yours; if the same, check the compilation options used match).

Also even if this is not the root of the problem, a shared library build would be a prerequisite for switching build systems so it needs resolving.

it would need to be targetted at the release series after next

BTW I should probably note that I'm expecting the release series after next to follow much sooner than 1.4.0 to now (or 1.2.0 to 1.4.0). The key thing to do for it is to complete the relicensing of libxapian.

comment:3 by mgautier, 5 months ago

Description: modified (diff)

I'd been wondering about looking at switching to meson at some point so I'm open to the idea, but at this point in the release cycle I think it would need to be targetted at the release series after next

Sure, I was more looking for help as I'm a bit stuck here. My work is not yet ready for integration now.

It struck me that this is rather suspicious - there shouldn't be any undefined references (and there aren't with the current build system) so this suggests to me that you aren't putting the correct set of files into the library (or you're misbuilding them in some way that means symbols are missing, perhaps due to symbol visibility).

Thanks, I will search on this direction.

Aside of meson, I think the two commits https://github.com/openzim/xapian-meson/commit/4b302ec704f7b80860a0cf62b5c446341ace8e8b and https://github.com/openzim/xapian-meson/commit/88a068e42ec655853e1fb072295a8d9722c3b64b are commits you want to integrate. Can you directly apply them or you prefer I do separated PR(s) ?

comment:4 by mgautier, 5 months ago

Description: modified (diff)

comment:5 by mgautier, 5 months ago

It struck me that this is rather suspicious - there shouldn't be any undefined references (and there aren't with the current build system) so this suggests to me that you aren't putting the correct set of files into the library (or you're misbuilding them in some way that means symbols are missing, perhaps due to symbol visibility).

Thanks, I will search on this direction

Well, it seems I was wrong with undefined references (and I would prefer being right as it doesn't seems simpler).

Dll need us to mark public (export) symbol with __declspec(dllexport) but you never seems to do it. You have the XAPIAN_VISIBILITY_DEFAULT defined in visibility.h but it is only used for gcc whit symbol visibility support. With autotools on windows it seems that libtool does the job for us in m4/libtool.m4 around line 4010. It seems to create a def file we can pass to compiler and avoid the __declspec.

But meson doesn't do it for us. Strangely, I had the same issue on libzim and open a issue on meson at the time : https://github.com/mesonbuild/meson/issues/7838

Anyway, I have patched visibility.h to define XAPIAN_VISIBILITY_DEFAULT to __deplcspec(dllexport) on msvc. But new problems coming. I have a lot of warning because (private) data member classes are not exported (which seems to be only a warning and we can ignore it https://stackoverflow.com/questions/2132747/warning-c4251-when-building-a-dll-that-exports-a-class-containing-an-atlcstrin#comment66255284_4563701)

But the real problem is that cl complains that intrusive_ptr (~intrusive_ptr) uses undefined type Xapian::Compactor::Internal. This is strange as we actually use ~intrusive_ptr in Xapian::~Compactor in compactor.cc when Internal is actually defined, so it should be ok (at least it is with gcc).

But cl complain as we use ~intrusive_ptr in compactor.h line 185. I suspect some strange rule about symbol definition in dll but I haven't found documentation confirming this. I don't know why it is ok with libtool build. Maybe using a .def file do not trigger the same things.

Anyway, using delete on incomplete type is UB (https://en.cppreference.com/w/cpp/language/delete) so I hope the windows&libtool version correctly call destructor from compactor.cc and not silently generate UB.

---

Sorry if I spam you. I think I will use this issue to share my progress. Hope it is ok for you, I will find somewhere else if not.

comment:6 by Olly Betts, 5 months ago

Aside of meson, I think the two commits ​https://github.com/openzim/xapian-meson/commit/4b302ec704f7b80860a0cf62b5c446341ace8e8b

This isn't the right fix. The condition path && remote is not true for any of the current test harness backends, so your patch effectively just disables these two testcases. The key point here is that remote in the condition means that the test harness managed backend is a remote one, not that the remote backend is enabled in the library. We actually want to run these testcases only for non-remote test harness backends, for which we generate a database and put its path in a stub file to access using xapian-progsrv, then try to check the database (which isn't implemented but we want to at least check it fails with an appropriate exception). Probably the best way to address potentially not having the remote backend is to check for FeatureUnavailableError being thrown instead if XAPIAN_HAS_REMOTE_BACKEND is not defined. I've pushed a change to do that, which should fix this for you.

Also these two testcases are pretty much the same testcase so one should just be removed. I've done that too.

and ​https://github.com/openzim/xapian-meson/commit/88a068e42ec655853e1fb072295a8d9722c3b64b are commits you want to integrate.

This one looks good (aside from whitespace) - applied.

Can you directly apply them or you prefer I do separated PR(s) ?

If you have further fixes PRs are definitely preferred (and against master unless the fix is only relevant for 1.4.x like the second one here). You seem to have imported just xapian-core into a fresh git repo, so I can't just cherry-pick commits across to the xapian repo, and generating and applying patches is fiddly because you seem to have generated files committed and be using different paths. PRs also get CI tested automatically.

comment:7 by Olly Betts, 5 months ago

Yes, libtool has a nice feature that avoids the need for all that tedious annotating of headers with __declspec nonsense with MSVC - see the "Windows DLLs" section of the libtool manual:

https://www.gnu.org/software/libtool/manual/html_node/Windows-DLLs.html

If meson would require us to add and maintain all that I'm much less keen on it. I have better things to do with my time.

I'll take a look at the destructor issue, but not today.

comment:8 by Olly Betts, 5 months ago

The destructor of Compactor looks OK to me - as you say the destructor is defined within the library where Compactor::Internal is visible, so I don't see any undefined behaviour.

Anyway, I have patched visibility.h to define XAPIAN_VISIBILITY_DEFAULT to __deplcspec(dllexport) on msvc.

I've seen suggestions to use the same set of macros for both like this out on the web, but I don't think the semantics of GCC's symbol visibility and MSVC's __declspec for DLLs are identical so I'm dubious it'll just work.

Note: See TracTickets for help on using tickets.