Opened 6 years ago

Last modified 21 months ago

#769 new enhancement

Provide more options for lock retrying

Reported by: bremner
Owned by: Olly Betts
Priority: normal
Milestone:
Component: Other
Version:
Severity: normal
Keywords:
Cc:
Blocked By:
Blocking:
Operating System: All

Description

Currently we have the choice of either immediately throwing an exception when we fail to acquire the write lock, or potentially blocking forever with DB_RETRY_LOCK. It would be nice to have a configurable amount of time or (not quite as nice) a configurable number of retries before giving up.

Change History (3)

comment:1 by Olly Betts, 6 years ago

On Unix-like platforms, DB_RETRY_LOCK maps to a blocking request for a lock in fcntl() and so there isn't a "number of retries" there. A major motivation for adding this flag was to allow taking advantage of OS-level "block waiting for lock" where possible.

fcntl() doesn't directly support a timeout on a blocking lock.

One option is to repeatedly poll for the lock until the timeout expires, but that seems crude.
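For illustration, the polling approach might look roughly like the sketch below. This is not Xapian code: `lock_with_polling` and its `interval` parameter are hypothetical names, and the sketch assumes a whole-file write lock taken with `fcntl(F_SETLK)` on an already open descriptor.

```cpp
#include <cerrno>
#include <chrono>
#include <thread>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper (not part of Xapian): poll for a whole-file write lock
// until it is acquired or `timeout` has elapsed.  Returns true on success.
bool lock_with_polling(int fd,
                       std::chrono::milliseconds timeout,
                       std::chrono::milliseconds interval =
                           std::chrono::milliseconds(50)) {
    using clock = std::chrono::steady_clock;
    const auto deadline = clock::now() + timeout;
    for (;;) {
        struct flock fl{};
        fl.l_type = F_WRLCK;
        fl.l_whence = SEEK_SET;   // l_start = 0 and l_len = 0: the whole file
        if (fcntl(fd, F_SETLK, &fl) == 0)
            return true;          // got the lock without blocking
        if (errno != EACCES && errno != EAGAIN)
            return false;         // a real error, not just "lock is held"
        const auto now = clock::now();
        if (now >= deadline)
            return false;         // timed out
        // Sleep for the polling interval, but never past the deadline.
        clock::duration nap = interval;
        if (deadline - now < nap)
            nap = deadline - now;
        std::this_thread::sleep_for(nap);
    }
}
```

The crudeness is visible here: the lock can become free just after a failed attempt and we still sleep for up to `interval` before noticing, and waiters get no fairness guarantee from the OS.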

Another is to try a blocking lock but interrupt it (with alarm(), or with a timer like the one in matcher/matchtimeout.h but using SIGEV_SIGNAL). The problem is that this signal is visible to the application.
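A minimal sketch of the alarm() variant, to make the drawback concrete. The names `lock_with_alarm`, `LockResult`, `on_alarm` and `alarm_fired` are made up for this example, and it assumes the application is not itself using SIGALRM, which is exactly the assumption a library cannot safely make.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>

enum class LockResult { Acquired, TimedOut, Failed };

// Set by the handler so we can tell our alarm apart from other signals.
static volatile sig_atomic_t alarm_fired = 0;

extern "C" void on_alarm(int) { alarm_fired = 1; }

// Hypothetical helper: blocking whole-file write lock, interrupted by
// SIGALRM after `seconds`.
LockResult lock_with_alarm(int fd, unsigned seconds) {
    if (seconds == 0)
        return LockResult::Failed;   // alarm(0) cancels rather than arms

    struct sigaction sa{};
    struct sigaction old_sa{};
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;                 // no SA_RESTART: fcntl() fails with EINTR
    if (sigaction(SIGALRM, &sa, &old_sa) != 0)
        return LockResult::Failed;

    struct flock fl{};
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;          // whole file

    alarm_fired = 0;
    alarm(seconds);                  // clobbers any alarm the application set
    int r;
    int saved_errno;
    do {
        r = fcntl(fd, F_SETLKW, &fl);
        saved_errno = errno;
        // Retry if some unrelated signal interrupted us.
    } while (r == -1 && saved_errno == EINTR && !alarm_fired);
    alarm(0);
    sigaction(SIGALRM, &old_sa, nullptr);

    if (r == 0)
        return LockResult::Acquired;
    return (saved_errno == EINTR && alarm_fired) ? LockResult::TimedOut
                                                 : LockResult::Failed;
}
```

Besides replacing the application's SIGALRM handler and any pending alarm for the duration of the call, this has a small race: if the signal is delivered just before fcntl() starts blocking, nothing interrupts the wait.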

It might be possible to call fcntl() from a thread and, if the timer expires, terminate that thread from a second thread launched via SIGEV_THREAD, or something along those lines.

When a timeout is specified, we could first try a non-blocking request for the lock, so that at least the common case of an uncontended lock is handled without extra overhead.

comment:2 by Olly Betts, 6 years ago

The other problem with timer_create() is that it seems to have somewhat limited portability in practice - so far we've found that OpenBSD and GNU Hurd just have dummy implementations which always fail with ENOSYS, on AIX it always seems to fail with EAGAIN, and on NetBSD the calls appear to work but the timer doesn't actually seem to trigger after the timeout has passed.

If we're using threads, we could perhaps launch one to perform the fcntl() call and a second which just does sleep() (or a finer granularity variant where supported) and then kills the first thread.
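As a rough sketch of that two-thread idea using pthreads: one thread blocks in fcntl(F_SETLKW), and a sleeper thread cancels it when the timeout expires. All names here (`Shared`, `lock_worker`, `sleeper`, `lock_with_sleeper_thread`) are hypothetical, and the sketch relies on fcntl() with F_SETLKW being a POSIX cancellation point and on fcntl() locks belonging to the process rather than to the thread that took them.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <pthread.h>
#include <time.h>
#include <unistd.h>

struct Shared {
    int fd;
    pthread_mutex_t mu;
    bool done;          // set by the worker once fcntl() has returned
    int err;            // 0 on success, otherwise errno from fcntl()
    pthread_t worker;
    timespec delay;     // how long the sleeper waits before cancelling
};

extern "C" void* lock_worker(void* arg) {
    Shared* s = static_cast<Shared*>(arg);
    struct flock fl{};
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;                 // whole file
    int r = fcntl(s->fd, F_SETLKW, &fl);    // cancellation point
    int err = (r == 0) ? 0 : errno;
    pthread_mutex_lock(&s->mu);
    s->done = true;
    s->err = err;
    pthread_mutex_unlock(&s->mu);
    return nullptr;
}

extern "C" void* sleeper(void* arg) {
    Shared* s = static_cast<Shared*>(arg);
    timespec rem = s->delay;
    // nanosleep() is a cancellation point; resume if a signal interrupts it.
    while (nanosleep(&rem, &rem) == -1 && errno == EINTR) {}
    pthread_mutex_lock(&s->mu);
    if (!s->done)                           // worker still waiting: kill it
        pthread_cancel(s->worker);
    pthread_mutex_unlock(&s->mu);
    return nullptr;
}

// Hypothetical helper: returns true if the lock was acquired in time.
bool lock_with_sleeper_thread(int fd, unsigned timeout_ms) {
    Shared s{};
    s.fd = fd;
    s.delay.tv_sec = timeout_ms / 1000;
    s.delay.tv_nsec = (timeout_ms % 1000) * 1000000L;
    pthread_mutex_init(&s.mu, nullptr);

    bool ok = false;
    if (pthread_create(&s.worker, nullptr, lock_worker, &s) == 0) {
        pthread_t sleeper_tid;
        bool have_sleeper =
            pthread_create(&sleeper_tid, nullptr, sleeper, &s) == 0;
        if (!have_sleeper)
            pthread_cancel(s.worker);       // can't enforce a timeout: give up
        pthread_join(s.worker, nullptr);
        if (have_sleeper) {
            pthread_cancel(sleeper_tid);    // no-op if it already finished
            pthread_join(sleeper_tid, nullptr);
        }
        ok = s.done && s.err == 0;
    }
    pthread_mutex_destroy(&s.mu);

    if (!ok) {
        // Defensive: release the lock in case it was granted just as the
        // worker was cancelled.  Assumes we held no earlier lock on fd.
        struct flock fl{};
        fl.l_type = F_UNLCK;
        fl.l_whence = SEEK_SET;
        fcntl(fd, F_SETLK, &fl);
    }
    return ok;
}
```

The mutex is there so the sleeper only cancels a worker that has not yet reported a result, which avoids calling pthread_cancel() on a thread that has already been joined. The cost is two thread creations per contended lock attempt, which is one reason to try a non-blocking request first.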

A "sleeper thread" could probably also be used to implement the match timeout on platforms where timer_create() isn't usable (or maybe everywhere even).

comment:3 by Olly Betts, 21 months ago

> A "sleeper thread" could probably also be used to implement the match timeout on platforms where timer_create() isn't usable (or maybe everywhere even).

I had a look at implementing that, and came up with a working prototype using pthreads. See #770.

I also worked out why our timer_create() based timeout doesn't work on NetBSD - it doesn't implement SIGEV_THREAD.

I think we want to avoid relying on timer_create() any further. It's not supported at all on some platforms, and SIGEV_THREAD isn't supported on others. SIGEV_SIGNAL is probably a bit more widely supported, but the signal is more visible to the application.

I think it's better to create a "timeout" thread which uses sleep() (or some variant with better granularity) and then kills the thread blocked on the lock. That may need C++20 if we want to use C++ threads (otherwise we need to deal with pthreads, which needs special options on some platforms, and also handle platforms without pthreads). I'll try to get the thread-based implementation of Enquire::set_time_limit() merged first.
