#546 closed defect (fixed)
xapian-replicate reads from a socket without using timeouts
Reported by: | nkvoll | Owned by: | Olly Betts |
---|---|---|---|
Priority: | normal | Milestone: | 1.2.21 |
Component: | Replication | Version: | |
Severity: | normal | Keywords: | |
Cc: | Blocked By: | ||
Blocking: | Operating System: | All |
Description
I'm using xapian-replicate to replicate an index across two different availability zones in EC2. In one of the zones, replication stops semi-regularly (a couple of times every day) and I have to restart it manually.
The result of strace and a gdb backtrace is shown here: https://gist.github.com/4925a7eaeece552ccff5
As rboulton pointed out on IRC, reading from the socket without a timeout does not necessarily work well on dodgy networks.
Maybe a timeout could be implemented (configurable by some command-line switch) that would cause a reconnection if the timeout was hit?
Attachments (1)
Change History (16)
comment:1 by , 14 years ago
comment:2 by , 14 years ago
In order to clarify a little:
I told iptables to drop incoming packets to the port the xapian-replicate client used, then I stopped the xapian-replicate-server and removed the iptables rule (iptables --flush).
At that point, the server sees no connection to the client, but the client still waits in read(), ever hopeful, for the server to respond. Without any timeout it will probably never leave that state.
comment:3 by , 14 years ago
Paste of external content (please just include such information in the ticket unless it's really enormous, otherwise if the external content goes away context is lost):
# ps aux | grep xapian
user 1074 0.1 0.3 26532 2112 ? S May16 4:54 xapian-replicate -v -h [...]
# strace -p 1074
Process 1074 attached - interrupt to quit
read(3, ...
# gdb --pid 1074
GNU gdb (GDB) 7.2-ubuntu
Attaching to process 1074
Reading symbols from /usr/local/bin/xapian-replicate...done.
Reading symbols from /usr/local/lib/libxapian.so.22...done.
Loaded symbols for /usr/local/lib/libxapian.so.22
0x00007f04914db710 in read () from /lib/libc.so.6
(gdb) bt
#0 0x00007f04914db710 in read () from /lib/libc.so.6
#1 0x00007f04924b8135 in read (this=0x15472f0, min_len=61, end_time=<value optimized out>) at /usr/include/bits/unistd.h:45
#2 RemoteConnection::read_at_least (this=0x15472f0, min_len=61, end_time=<value optimized out>) at net/remoteconnection.cc:130
#3 0x00007f04924b9b7c in RemoteConnection::get_message_chunk (this=0x3, result=..., at_least=140735456284160, end_time=0) at net/remoteconnection.cc:572
#4 0x00007f0492419c4e in ChertDatabaseReplicator::process_changeset_chunk_blocks (this=<value optimized out>, tablename=<value optimized out>, buf=..., conn=..., end_time=<value optimized out>, changes_fd=-1) at backends/chert/chert_databasereplicator.cc:205
#5 0x00007f049241af60 in ChertDatabaseReplicator::apply_changeset_from_conn (this=<value optimized out>, conn=..., end_time=<value optimized out>, valid=<value optimized out>) at backends/chert/chert_databasereplicator.cc:353
#6 0x00007f04923b6596 in Xapian::DatabaseReplica::Internal::apply_next_changeset (this=0x1543f50, info=0x7fff86e12790, reader_close_time=30) at api/replication.cc:583
#7 0x00007f04923b6d7b in Xapian::DatabaseReplica::apply_next_changeset (this=<value optimized out>, info=0x7fff86e10e00, reader_close_time=4096) at api/replication.cc:280
#8 0x00007f04924c2e43 in ReplicateTcpClient::update_from_master (this=<value optimized out>, path=<value optimized out>, masterdb=..., info=..., reader_close_time=<value optimized out>) at net/replicatetcpclient.cc:60
#9 0x0000000000401a93 in main (argc=<value optimized out>, argv=0x7fff86e12b08) at bin/xapian-replicate.cc:146
(gdb)
comment:4 by , 14 years ago
Component: | Other → Replication |
---|
by , 13 years ago
Attachment: | read-timeout-1.diff added |
---|
patch to add timeout support to xapian-replicate
comment:5 by , 13 years ago
I added a patch (written against 1.2.5, but should apply cleanly to trunk/master) that makes it possible to add a timeout for the reads during the replication from the command line.
comment:6 by , 13 years ago
Milestone: | → 1.2.7 |
---|
There's already lots marked to do for 1.2.6, so marking this for 1.2.7.
As I wondered on IRC, perhaps we can set socket-level timeouts to achieve this - that would allow the timeout just to be set once when we open the socket, rather than having to keep looking up the current time.
comment:7 by , 13 years ago
I noticed when replicating a large index for the first time that the timeout in the patch above also applies to the replication as a whole, so it's probably been added in a few too many places.
Setting the timeout on the socket is probably a good idea, but unfortunately I'm not confident enough in C++ to do that.
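To illustrate the distinction drawn here between a timeout on each read and a timeout on the replication as a whole, the following is a minimal sketch and not code from Xapian or from the attached patch. It uses poll(2) rather than socket options, and the function name `read_with_idle_timeout` and its parameters are made up for this example. The timer restarts on every wait for data, so a long transfer that keeps making progress never trips it.

```cpp
#include <poll.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>
#include <cstddef>

// Hypothetical helper (not part of Xapian): read up to `len` bytes from `fd`,
// giving up only if no data arrives for `idle_ms` milliseconds.
// The idle timer restarts for every wait, so the total transfer time is
// unbounded as long as data keeps arriving.
// Returns the number of bytes read (less than `len` if EOF was hit),
// or -1 with errno set (ETIMEDOUT on idle timeout).
ssize_t read_with_idle_timeout(int fd, char* buf, size_t len, int idle_ms)
{
    assert(buf != nullptr);
    size_t got = 0;
    while (got < len) {
        struct pollfd pfd = {fd, POLLIN, 0};
        int ready = poll(&pfd, 1, idle_ms);
        if (ready < 0) {
            if (errno == EINTR) continue;  // interrupted by a signal: wait again
            return -1;
        }
        if (ready == 0) {                  // nothing arrived within idle_ms
            errno = ETIMEDOUT;
            return -1;
        }
        ssize_t n = read(fd, buf + got, len - got);
        if (n < 0) {
            if (errno == EINTR || errno == EAGAIN) continue;
            return -1;
        }
        if (n == 0) break;                 // peer closed the connection
        got += static_cast<size_t>(n);
    }
    return static_cast<ssize_t>(got);
}
```

A caller that wants the reconnect behaviour suggested in the description could treat a return of -1 with errno set to ETIMEDOUT as the signal to drop the connection and reconnect.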
comment:8 by , 13 years ago
Milestone: | 1.2.7 → 1.3.0 |
---|
Now we've branched off 1.2, this should get addressed in trunk (and then backported if appropriate) so updating milestone.
comment:9 by , 13 years ago
Milestone: | 1.3.0 → 1.3.x |
---|
comment:10 by , 12 years ago
Milestone: | 1.3.x → 1.3.2 |
---|
Marking to consider soonish.
The patch as-is seems too problematic if it adds a timeout on the replication as a whole.
Details of the socket timeout options (from man 7 socket):
SO_RCVTIMEO and SO_SNDTIMEO
Specify the receiving or sending timeouts until reporting an error. The argument is a struct timeval. If an input or output function blocks for this period of time, and data has been sent or received, the return value of that function will be the amount of data transferred; if no data has been transferred and the timeout has been reached then -1 is returned with errno set to EAGAIN or EWOULDBLOCK just as if the socket was specified to be nonblocking. If the timeout is set to zero (the default) then the operation will never timeout. Timeouts only have effect for system calls that perform socket I/O (e.g., read(2), recvmsg(2), send(2), sendmsg(2)); timeouts have no effect for select(2), poll(2), epoll_wait(2), and so on.
Though I've read a few (perhaps dated) warnings that they aren't implemented everywhere.
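As a rough sketch of how the options quoted from the man page can be used (this is not the code that was eventually committed to Xapian; the helper names `set_socket_timeouts` and `read_timed_out` are invented for this example), the timeout is set once on the open socket, and a later read() that sees no data fails with EAGAIN or EWOULDBLOCK instead of blocking forever:

```cpp
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>

// Hypothetical helper: apply the same timeout (in seconds) to both
// receiving and sending on an already-open socket.
// Returns true if the platform accepted both options. A timeout of 0
// means "never time out", matching the default described in socket(7).
bool set_socket_timeouts(int fd, double seconds)
{
    assert(seconds >= 0.0);
    struct timeval tv;
    tv.tv_sec = static_cast<time_t>(seconds);
    tv.tv_usec = static_cast<suseconds_t>((seconds - tv.tv_sec) * 1e6);
    return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) == 0 &&
           setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv)) == 0;
}

// Hypothetical helper: true if a read()/recv() result means the socket
// timeout expired before any data was transferred. Call it immediately
// after the failed call, before errno can be overwritten.
bool read_timed_out(ssize_t result)
{
    return result < 0 && (errno == EAGAIN || errno == EWOULDBLOCK);
}
```

On platforms that accept the option but do not honour it, `set_socket_timeouts` would still return true, so a successful return does not prove the timeout is enforced.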
comment:11 by , 11 years ago
Milestone: | 1.3.2 → 1.3.3 |
---|
comment:12 by , 10 years ago
Status: | new → assigned |
---|
POSIX says "Note that not all implementations allow this option to be set":
http://pubs.opengroup.org/onlinepubs/009695399/functions/setsockopt.html
Checking online copies of man pages, the SO_RCVTIMEO and SO_SNDTIMEO settings are supported on:
- Linux: http://linux.die.net/man/7/socket
- FreeBSD: http://www.freebsd.org/cgi/man.cgi?query=setsockopt&sektion=2
- NetBSD: http://netbsd.gw.com/cgi-bin/man-cgi?setsockopt+2+NetBSD-current
- OpenBSD: http://www.openssh.com/cgi-bin/man.cgi/OpenBSD-current/man2/setsockopt.2?query=setsockopt&sec=2
- OS X: https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man2/setsockopt.2.html
- Windows: https://msdn.microsoft.com/en-us/library/windows/desktop/ms740476%28v=vs.85%29.aspx
- HP Tru64: http://h50146.www5.hp.com/products/software/oe/tru64unix/manual/v51a_ref/HTML/MAN/MAN2/0183____.HTM
The Solaris man page doesn't mention these settings, though they are defined in sys/socket.h. However, it seems Solaris doesn't actually support them: https://issues.apache.org/jira/browse/THRIFT-1371
And AIX allows them to be set, but doesn't use them: http://www-01.ibm.com/support/knowledgecenter/ssw_aix_71/com.ibm.aix.commtrf2/getsockopt.htm?lang=en
So they are supported by at least the common open operating systems, and some of the proprietary ones. I'm inclined to say we just make use of them, and accept a clean patch if someone cares enough to fix this for other platforms (as noted in the comments above, the attached patch makes the whole replication time out if it takes too long, which isn't what we want at all). Having working timeouts on most platforms beats not having them anywhere.
comment:13 by , 10 years ago
We could also enable SO_KEEPALIVE, which means the connection won't hang around forever if the other end goes away, though it could take a couple of hours to notice: http://www.unixguide.net/network/socketfaq/2.8.shtml
It is at least specified by POSIX: http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_10_16
If we enable that and use socket timeouts where supported, then the original issue of having to manually restart it would be gone, though you might end up with a slave lagging until the keepalive packet is next sent.
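For reference, enabling keepalive is a single setsockopt() call. The sketch below is illustrative only and is not taken from the Xapian source; the helper names `enable_keepalive` and `keepalive_enabled` are invented for this example.

```cpp
#include <sys/socket.h>
#include <cassert>

// Hypothetical helper: ask the kernel to send periodic keepalive probes
// on an idle connection, so a vanished peer is eventually detected.
// Returns true if the option was accepted.
bool enable_keepalive(int fd)
{
    assert(fd >= 0);
    int on = 1;
    return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) == 0;
}

// Hypothetical helper: report whether SO_KEEPALIVE is currently set.
bool keepalive_enabled(int fd)
{
    assert(fd >= 0);
    int value = 0;
    socklen_t len = sizeof(value);
    if (getsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &value, &len) != 0)
        return false;
    return value != 0;
}
```

How long it takes for a dead peer to be noticed depends on the system's keepalive settings, which is why this complements the socket timeouts rather than replacing them.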
comment:14 by , 10 years ago
Milestone: | 1.3.3 → 1.2.21 |
---|
[1500c38b4f9720a8973e58ce76ea10beb3203fd2] adds an option to xapian-replicate to allow specifying a socket-level timeout, and also turns on SO_KEEPALIVE for that socket.
I'm not clear on how to repeat the test with iptables. It says I need to find the port with lsof -i, but is that the port on the replication server or client?
And I guess I need to run that while a replication is happening - did you just use a suitably large database so that replication ran for long enough that you could look up the port and run the iptables command, or was there some trick to this?
It would be nice to backport this to 1.2.x, so marking for 1.2.21, at least for now. I'd really like to be able to test this first though.
comment:15 by , 10 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Backported in [9b77d851031ee5db0d210beba407988beae78570]. The lack of testing that this actually addresses the reported situation is still a concern, but the new timeout defaults to 0, so this shouldn't break current usage of 1.2.x.
As per rboulton's request, I tried setting the end_time in apply_changeset_from_conn to RealTime::now() + 1.0, and to 1.0, both of which had no effect.
In order to see if the timeout worked, I started the replication, then told iptables to block the incoming data:
after which strace and gdb showed the process in a similar state. Opening the port in iptables again did not change the state of the replication client, which remained stuck in the read.