Opened 5 weeks ago

Last modified 3 weeks ago

#833 new defect

Xapian 1.4.25 crashes when performing concurrent multi-threaded queries

Reported by: myx
Owned by: Olly Betts
Priority: highest
Milestone:
Component: Backend-Glass
Version: 1.4.25
Severity: blocker
Keywords:
Cc:
Blocked By:
Blocking:
Operating System: All

Description

Hello!

I built the Xapian 1.4.25 library for Windows using MSYS2 and MSVC 2017 and wrote a test program for it. During testing, I found that the program crashes easily when running multithreaded queries against Xapian (guarded by read-write locks).

In the test code, I use the TestRead class for multithreaded querying. When I construct the Xapian::Query object in advance from the query conditions and store it in a member variable in the TestRead constructor, the program crashes when that object is used in the worker thread. However, if I create a temporary Xapian::Query object inside the worker thread from the same query conditions, the program runs normally. (The two variants can be toggled with the macro USE_GLOBAL_XAPIAN_QUERY_OBJECT.) I suspect there might be an issue with how intrusive_ptr manages the reference count.
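As a rough illustration of the second variant (building the query inside the worker thread), the sketch below hands each worker only the plain query conditions and constructs the query object there, so no query object is shared between threads. `FakeQuery` and `run_workers` are hypothetical names invented for this illustration; they stand in for Xapian::Query and the TestRead worker and are not taken from the attached test code or from Xapian's API.

```cpp
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a query object; this is NOT Xapian::Query.
struct FakeQuery {
    std::string description;
    explicit FakeQuery(const std::string& terms)
        : description("Query(" + terms + ")") {}
};

// Each worker receives its own copy of the plain query conditions (a string)
// and builds a thread-local query from it, so no query object crosses threads.
std::vector<std::string> run_workers(const std::string& terms, int n_threads) {
    std::vector<std::string> results(n_threads);
    std::vector<std::thread> workers;
    for (int t = 0; t < n_threads; ++t) {
        // `terms` is captured by value: every thread owns its copy.
        workers.emplace_back([&results, terms, t] {
            FakeQuery q(terms);          // constructed inside the worker
            results[t] = q.description;  // each thread writes only its own slot
        });
    }
    for (auto& w : workers) w.join();
    return results;
}
```

Calling `run_workers("apple", 3)` under these assumptions gives three independent results, one per thread, without any shared query state.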

The same test does not crash with Xapian 1.2.25.

Initially, I suspected a problem with my Windows build of Xapian, so I repeated the test on a Linux (SUSE 12.5) system. There, 1.2.25 worked fine, but 1.4.25 still crashed, and the crash behavior was the same.

You can find the specific code in the attached files, which include a test project for MSVC 2017, the Xapian database files, the compiled library files, and some screenshots of the stack trace at the time of the crash.

Please take some time out of your busy schedule to look into what might be causing this issue and if there are any solutions. Thank you very much!

If you have any questions, please feel free to contact me. I look forward to your reply. Thank you once again!

Attachments (3)

testxapian.zip (14.2 KB) - added by myx 5 weeks ago.
test code
crash.docx (254.0 KB) - added by myx 5 weeks ago.
testxapian_minimal.zip (7.4 KB) - added by myx 3 weeks ago.


Change History (6)

by myx, 5 weeks ago

Attachment: testxapian.zip added

test code

by myx, 5 weeks ago

Attachment: crash.docx added

comment:1 by myx, 5 weeks ago

There is a size limit for attachments. May I send it via email? (I will need an email address.)

comment:2 by Olly Betts, 5 weeks ago

The internal implementation of Xapian::Query changed a lot between 1.2.x and 1.4.x, and notably it doesn't use reference counted internals in 1.2.x but does in 1.4.x.

(We also switched from a home-grown reference counted pointer to one based on Boost's intrusive_ptr, but I'd expect Boost's implementation to be less buggy than our old one as it will have seen a lot more use.)

I tried looking at your code, but it's too complex for me to easily follow what's going on.

Are you sure you are locking between threads sufficiently? Not doing so would explain problems. For example, you can't have two different threads modifying the reference count on a Xapian object concurrently. Similarly, care is needed if you want to pass ownership of an object between threads.
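A minimal sketch of the kind of locking this implies, assuming a non-atomic intrusive reference count. `Payload`, `Handle`, and `copy_under_lock` are hypothetical stand-ins written for illustration; they are not Xapian's actual classes. Every copy and destruction of the shared handle happens while one exclusive mutex is held, so the count is never modified by two threads at once.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical object with a NON-atomic intrusive reference count.
struct Payload {
    std::size_t refs = 0;
};

// Hypothetical handle: copying or destroying it modifies the shared count.
class Handle {
    Payload* p_;
public:
    explicit Handle(Payload* p) : p_(p) { ++p_->refs; }
    Handle(const Handle& o) : p_(o.p_) { ++p_->refs; }
    Handle& operator=(const Handle&) = delete;
    ~Handle() {
        assert(p_->refs > 0);
        --p_->refs;
    }
    std::size_t use_count() const { return p_->refs; }
};

// Workers repeatedly copy the shared handle. The lock is taken before the
// copy and released after the copy is destroyed (reverse declaration order).
std::size_t copy_under_lock(int n_threads, int n_copies) {
    Payload payload;
    Handle shared(&payload);
    std::mutex m;
    std::vector<std::thread> workers;
    for (int t = 0; t < n_threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < n_copies; ++i) {
                std::lock_guard<std::mutex> lock(m);
                Handle local(shared);  // increments the count under the lock
            }                          // decrement happens before unlock
        });
    }
    for (auto& w : workers) w.join();
    return shared.use_count();  // only `shared` remains once workers finish
}
```

Note that a read-write lock held in shared (read) mode would not serialize these copies, since several readers can hold it at the same time; in this sketch the handle copy counts as a modification and therefore takes an exclusive lock.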

Otherwise I'd suggest reducing it to a minimal testcase that shows the problem. That may make it obvious what's going on, and if not it'll be more feasible for me to investigate.

Not sure what the file you couldn't attach is, but if it's too large to attach it's unlikely to be useful to me - I really need a small, self-contained reproducer to look at.


comment:3 by myx, 3 weeks ago

Thank you very much for your reply!

The crash I described occurs with concurrent reads alone; it does not require any concurrent read/write activity. The multithreaded reads are also protected by locks.

Following your suggestion, I have adjusted the test code structure and simplified it to make the problem clearer.

In the test code, when the macro USE_GLOBAL_XAPIAN_QUERY_OBJECT is defined, the problem can be reproduced. You can take a look at the relevant code, and if you want to debug, you can compile and run the test program to try to reproduce the problem.

Please let me know if there is any new progress, thanks!

by myx, 3 weeks ago

Attachment: testxapian_minimal.zip added