Opened 10 years ago

Last modified 6 months ago

#665 assigned defect

Avoid C library functions not guaranteed to be thread-safe

Reported by: Olly Betts Owned by: Olly Betts
Priority: normal Milestone: 2.0.0
Component: Library API Version: git master
Severity: normal Keywords:
Cc: Blocked By:
Blocking: Operating System: All

Description

Currently we use environmental variables to control a few things (default backend for a newly created DB, flush threshold, replication stuff, cjk ngram, debug logging) but getenv() isn't necessarily thread safe.

Adding a mutex which we lock while calling it wouldn't help as it isn't Xapian potentially calling setenv()/putenv().

In general we'll be unlucky to hit problems, as we don't call getenv() very often, but that's not very reassuring.

I don't see a good fix, aside from not reading environmental variables. Debug logging is a special case, but aside from that I think we should probably try to phase them out - being able to change the default backend without changing the code is kind of nice, but otherwise these things would arguably be better specified in the code.

Change History (25)

comment:1 by Olly Betts, 10 years ago

Summary: getenv() not guaranteed to be thread-safeAvoid C library functions not guaranteed to be thread-safe

I think we should broaden this to cover any library functions which aren't necessarily thread-safe.

According to the Linux man pages, there are potential issues with atoi() (and related functions like atol(), strtoul(), etc which we don't currently use in the library code):

These functions can be safely used in multithreaded applications, as long as setlocale(3) is not called to change the locale during their execution.

I can't see anything in POSIX about this potential issue, so not sure if Linux is failing to conform here. It's not likely to bite in practice, but perhaps we should aim to avoid these functions in the library code.

We use inet_ntoa() (which isn't thread-safe as it has a static return buffer) in one place, to report where the connection is from in net/tcpserver.cc. That'll change when IPv6 support is added.

I'm thinking we should see if we can compile a list of library functions to avoid, and blacklist them somehow (perhaps a header which does #define inet_ntoa DO_NOT_USE_inet_ntoa which gets included in all library code sources).

comment:2 by Olly Betts, 10 years ago

GCC supports "poisoning" identifiers, e.g. #pragma GCC poison inet_ntoa:

https://gcc.gnu.org/onlinedocs/gcc-4.9.0/cpp/Pragmas.html

Less portable, but perhaps less likely to cause issues - if a bad function is used as a method name or a variable, the pragma will at worst prevent compilation, whereas the #define approach could, for example, break the ABI of the built library.

comment:3 by Olly Betts, 10 years ago

Functions which POSIX says are not required to be thread-safe which we're actually using in the library code in current master:

  • getenv() - used to read things like XAPIAN_FLUSH_THRESHOLD.
  • inet_ntoa() - used in the remote server code - while that's in the library, it's only actually used in binaries we build currently.
  • readdir() - used in replication code to delete the old replica - again, in the library but only used in binaries we build

And strerror() is used, but only if neither strerror_r() nor sys_errlist are available.

There's also at least atoi() and strtod() which the Linux man pages note aren't thread-safe if another thread calls setlocale() while they are executing.

comment:4 by Olly Betts, 10 years ago

[ada3bd5f12a444d778d57c0c75cccacc077d5051] updates master to use inet_ntop() instead of inet_ntoa().

comment:5 by Olly Betts, 10 years ago

POSIX requires that readdir() be thread-safe if the DIR* pointer is only used in a single thread, which is how we always use it:

The returned pointer, and pointers within the structure, might be invalidated or the structure or the storage areas might be overwritten by a subsequent call to readdir() on the same directory stream. They shall not be affected by a call to readdir() on a different directory stream.

(my emphasis)

http://pubs.opengroup.org/onlinepubs/9699919799/functions/readdir.html

comment:6 by Olly Betts, 9 years ago

Milestone: 1.3.x1.4.x

So the list is now:

  • getenv() - used to read things like XAPIAN_FLUSH_THRESHOLD.
  • strerror() (but only if there's no thread-safe alternative).
  • atoi() - quite a few uses.
  • strtod() - only use is in NumberValueRangeProcessor::operator() - also means that behaviour depends on the locale, which is probably not always desirable.

We should have a section in the developer docs about this too.

None of this directly affects the ABI, though eliminating uses of getenv() might need some API additions. I don't think this should block 1.4.0 though.

comment:7 by Olly Betts, 5 years ago

[60bd2fb4e2a559c214f3569d1f5f64f85ba1766f] replaced 17 uses of atoi(). That leaves just 6 in libxapian.

comment:8 by Olly Betts, 5 years ago

c31aa9b56eb858c9acce410d5465e91007d997b4 removes 2 more uses of atoi() from libxapian.

comment:9 by Olly Betts, 5 years ago

Current status:

  • getenv() - used to read things like XAPIAN_FLUSH_THRESHOLD.
  • strerror() (but only if there's no thread-safe alternative).
  • atoi() - api/valuerangeproc.cc (DateRangeProcessor), backends/databasehelpers.h (parsing port number in remote stub lines),
  • strtod() - weight/weightinternal.h parsing double parameters, api/valuerangeproc.cc NumberRangeProcessor::operator() - also means that behaviour depends on the locale, which is probably not always desirable.
  • strtol() - weight/lmweight.cc

And as noted before, we should have a section in the developer docs about this too. Also maybe poison some of these to stop them being reintroduced (at least strtol() seems to have appeared since I last checked). Not sure how easy that is to do just for the library, and we don't need to worry about these in the tools, examples, or testsuite.

comment:10 by James Aylett, 5 years ago

Can we do the poisoning in config.h so it only applies during library compilation? We could then put bless blocks around any uses we want to whitelist, such as in examples and tools. (Clang seems to support GCC poison pragmas, but looks like it doesn't handle blessing. However we could enable this for GCC only, because we really just want CI to fail somewhere. Doesn't have to fail everywhere.)

comment:11 by Olly Betts, 5 years ago

I hadn't considered we could unpoison things - that might work.

Another option that might work is to have a #ifdef XAPIAN_NO_POISON or similar around them in config.h, and just define that before #include <config.h> where we want to whitelist.

I agree we only need to catch this in CI, but it's more helpful to people submitting patches and using clang to have this flagged during early development rather than only after they've submitted a patch.

comment:12 by Olly Betts, 5 years ago

Actually it seems we can't do this in <config.h> because that should be included before any other headers, and:

$ cat poison.cc 
#pragma GCC poison atoi
#include <stdlib.h>
$ g++ -Wall -W -O2 -c poison.cc 
In file included from /usr/include/c++/9/cstdlib:75,
                 from /usr/include/c++/9/stdlib.h:36,
                 from poison.cc:2:
/usr/include/stdlib.h:104:12: error: attempt to use poisoned "atoi"
  104 | extern int atoi (const char *__nptr)
      |            ^
In file included from /usr/include/c++/9/cstdlib:75,
                 from /usr/include/c++/9/stdlib.h:36,
                 from poison.cc:2:
/usr/include/stdlib.h:361:8: error: attempt to use poisoned "atoi"
  361 | __NTH (atoi (const char *__nptr))
      |        ^
In file included from /usr/include/c++/9/stdlib.h:36,
                 from poison.cc:2:
/usr/include/c++/9/cstdlib:91:8: error: attempt to use poisoned "atoi"
   91 | #undef atoi
      |        ^
/usr/include/c++/9/cstdlib:141:11: error: attempt to use poisoned "atoi"
  141 |   using ::atoi;
      |           ^
In file included from poison.cc:2:
/usr/include/c++/9/stdlib.h:56:12: error: attempt to use poisoned "atoi"
   56 | using std::atoi;
      |            ^

So we'd have to poison after we include headers which might reference the identifier - we can't poison atoi and then #include <stdlib.h>, nor any header which might indirectly #include <stdlib.h>.

Perhaps this is better checked by xapian-check-patch - we can easily control which directories that applies a check to.

comment:13 by Olly Betts, 5 years ago

Does GCC actually support a bless pragma? With GCC 9.2.1 I get:

poison.cc:2: warning: ignoring #pragma GCC bless [-Wunknown-pragmas]
    2 | #pragma GCC bless atoi

The only references I found seem to be copies of a blog post wishing it existed.

comment:14 by James Aylett, 5 years ago

Yeah, I didn't read the blog post closely enough in the first place. Having something in xapian-check-patch seems a good plan, and at least the ones we are avoiding at the moment are symbols that won't be used elsewhere (except possibly in comments, which we could cater for).

comment:15 by James Aylett, 5 years ago

We could also define the symbols to something else and get a link error which would be un-poisonable, but it's not going to be as clear an error as we can manage in xapian-check-patch.

comment:16 by Olly Betts, 5 years ago

xapian-check-patch already strips comment content so that shouldn't be an issue.

I think the #define inet_ntoa DO_NOT_USE_inet_ntoa approach would (like the poison pragma) require something like #include "poison.h" to be included after any other header in every source which is part of the library. But that's easy to fail to do (and easy for someone to merge that include into a block of others thinking it's out of place or add an include after it which might work on one platform but fail on another depending which system headers implicitly include others), so we'd really need xapian-check-patch to check it's suitably included where it should be. Probably simpler to just check for a list of bad functions in xapian-check-patch.

Another benefit of checking from a script is we can be a little more nuanced - e.g. strerror() is used in the library, but only as a fallback option if we don't have strerror_r() or sys_errlist or similar - with a script we can allow the use in common/errno_to_string.cc.

Also the poison pragma is based only on identifiers, so e.g. would complain about a class with an atoi() method. I'm not sure if any of the identifiers we would want to poison would be likely method names though.

comment:17 by Olly Betts, 5 years ago

I've now eliminated all uses of atoi() and strtol() in the library itself.

Current status:

  • getenv() - used to read things like XAPIAN_FLUSH_THRESHOLD.
  • strerror() (but only if there's no thread-safe alternative).
  • strtod() - weight/weightinternal.h parsing double parameters, api/valuerangeproc.cc NumberRangeProcessor::operator() - also means that behaviour depends on the locale, which is probably not always desirable. C++17 provides std::from_chars() which we could use if available. Do we want to provide a way to allow , as decimal point though?

Also to do:

  • Add section to the developer docs about this
  • Add checks to xapian-check-patch to prevent more calls to problematic functions from being introduced into the library code

comment:18 by Olly Betts, 5 years ago

Milestone: 1.4.x1.5.0
Status: newassigned
Version: 1.3.2git master

I think we should aim to address this for 1.5.0 (or 1.5.x at least), at least for mainstream platforms (for example, the requirements might be an alternative to strerror() and a C++17 compiler).

comment:19 by Olly Betts, 5 years ago

https://en.cppreference.com/w/cpp/compiler_support seems to suggest that no compiler supports std::from_chars() for FP types yet, and looking at /usr/include/c++/9/charconv from GCC 9 seems to confirm that, so that's no help.

We could write our own or find an existing implementation, but I think it's quite subtle to get completely right.

comment:20 by Olly Betts, 4 years ago

That page now suggests that GCC11 and a recent MSVC version support std::from_chars() for floating point types (search for "Elementary string conversions" to find the relevant entry) so probably using this if available is the best option overall, and will get better with time.

comment:21 by Olly Betts, 13 months ago

Priority: normalhighest

OK, let's use std::from_chars() where it supports double and postpone the rest. With a suitable platform and toolchain that means only getenv() is a concern (and it seems perhaps it's fine on many platforms, it's just that POSIX allows it to return its value in a static buffer; or possibly concurrent setenv() might be problematic too?)

comment:22 by Olly Betts, 8 months ago

The Linux man pages seem to have been rewritten to be much clearer about the nuances of thread safety since we started on this (or maybe I failed to fully take in what they were saying before now).

The key point to grasp is that functions that modify the environment are not safe to call in a multi-threaded program, but that means that in a multi-threaded program they should not be called so the environment can be considered as effectively constant and functions which only read it (like getenv()) are therefore safe.

It's not great that in our situation this logic means relying on the code of the application using Xapian to not modify the environment if it uses threading, but we have to rely on it not to invoke undefined behaviour in general.

I think this means we can probably keep using getenv() but that we should only use it in the thread that we're called in. We don't currently create separate threads of our own, but if/when we start to we should avoid calling getenv() in those threads as the application code may be single-threaded and thus call setenv().

It's still useful to eliminate calls to strtod() as that suffers from changing behaviour if the locale is changed. I have a WIP branch to do that.

comment:23 by Olly Betts, 8 months ago

Checking POSIX, that makes fewer guarantees than the Linux man pages https://pubs.opengroup.org/onlinepubs/9699919799/functions/getenv.html :

The returned string pointer might be invalidated or the string content might be overwritten by a subsequent call to getenv() [...]

So if the application is threaded and calls getenv() on another thread concurrently with us calling getenv(), the string returned to us could be invalidated or overwritten.

comment:24 by Olly Betts, 8 months ago

Milestone: 1.5.02.0.0
Priority: highestnormal

Pushed 6380dda1bdc3b61cb6fb6153765d3ffcd262c2a2 which uses std::from_chars() to parse string to double when available.

The main thing left to do is eliminate uses of getenv() from library code which I think is a longer term goal. Current uses:

  • XAPIAN_MAX_CHANGESETS and XAPIAN_FLUSH_THRESHOLD in the glass backend. Probably these can be left alone and they'll just go away when glass gets removed. We can implement a different approach for honey when we implement update and replication.
  • XAPIAN_CJK_NGRAM (already deprecated in 1.4.11)
  • XAPIAN_DEBUG_LOG and XAPIAN_DEBUG_FLAGS in debug logging code. These only get checked the first time logging occurs so seem less problematic, and I struggle to think of a better way to specify these.

So I don't think there's anything to do for these right now.

Also to do:

  • Add notes to the thread safety docs about this, including which functions may get used which are not guaranteed to be thread-safe (e.g. strerror()), how to check if they are used (config.h?), etc.
  • Add section to the developer docs about what functions to avoid using and why
  • Add checks to xapian-check-patch to help prevent calls to known problematic functions from being introduced into the library code

comment:25 by Olly Betts, 6 months ago

It occurred to me we could perform thread-safe lookup of environment variables by using fork() and looking up the variable in the child process, then sending the value back to the parent (if it's a small integer, it could be via the exit code of the child process, otherwise probably via a pipe).

This clearly has more overhead, but would be fine for e.g. a one-time check for XAPIAN_DEBUG_LOG and XAPIAN_DEBUG_FLAGS to initialise debug logging.

Note: See TracTickets for help on using tickets.