Opened 13 years ago

Closed 9 years ago

Last modified 9 years ago

#546 closed defect (fixed)

xapian-replicate reads from a socket without using timeouts

Reported by: nkvoll
Owned by: Olly Betts
Priority: normal
Milestone: 1.2.21
Component: Replication
Version:
Severity: normal
Keywords:
Cc:
Blocked By:
Blocking:
Operating System: All

Description

I'm using xapian-replicate to replicate an index in two different availability zones in EC2. In one of the zones, the replication stops semi-regularly (a couple of times every day) and I have to restart it manually.

The result of strace and a gdb backtrace is shown here: https://gist.github.com/4925a7eaeece552ccff5

As rboulton pointed out on IRC, reading from the socket without a timeout does not necessarily work well on dodgy networks.

Maybe a timeout could be implemented (configurable by some command-line switch) that would cause a reconnection if the timeout was hit?

Attachments (1)

read-timeout-1.diff (9.7 KB) - added by nkvoll 13 years ago.
patch to add timeout support to xapian-replicate


Change History (16)

comment:1 by nkvoll, 13 years ago

As per rboulton's request, I tried setting the end_time in apply_changeset_from_conn to RealTime::now() + 1.0 and to 1.0; neither had any effect.

To see whether the timeout worked, I started the replication, then told iptables to block the incoming data:

iptables -I INPUT -m tcp -p tcp -j DROP --dport [port-number-from-lsof-i]

after which strace and gdb showed the process in a similar state. Opening the port in iptables again did not change the replication client's state: it remained stuck in the read.

comment:2 by nkvoll, 13 years ago

In order to clarify a little:

I told iptables to drop incoming packets to the port the xapian-replicate client used, then I stopped the xapian-replicate-server and removed the iptables rule (iptables --flush).

At that point, the server sees no connection to the client, but the client still waits in read(), ever hopeful, waiting for the server to respond -- without any timeout it will probably never leave that state.

comment:3 by Olly Betts, 13 years ago

Paste of external content (please just include such information in the ticket unless it's really enormous, otherwise if the external content goes away context is lost):

# ps aux | grep xapian
user     1074  0.1  0.3  26532  2112 ?        S    May16   4:54 xapian-replicate -v -h [...]

# strace -p 1074
Process 1074 attached - interrupt to quit
read(3,

...

# gdb --pid 1074
GNU gdb (GDB) 7.2-ubuntu
Attaching to process 1074
Reading symbols from /usr/local/bin/xapian-replicate...done.
Reading symbols from /usr/local/lib/libxapian.so.22...done.
Loaded symbols for /usr/local/lib/libxapian.so.22
0x00007f04914db710 in read () from /lib/libc.so.6
(gdb) bt
#0  0x00007f04914db710 in read () from /lib/libc.so.6
#1  0x00007f04924b8135 in read (this=0x15472f0, min_len=61, end_time=<value optimized out>) at /usr/include/bits/unistd.h:45
#2  RemoteConnection::read_at_least (this=0x15472f0, min_len=61, end_time=<value optimized out>) at net/remoteconnection.cc:130
#3  0x00007f04924b9b7c in RemoteConnection::get_message_chunk (this=0x3, result=..., at_least=140735456284160, end_time=0) at net/remoteconnection.cc:572
#4  0x00007f0492419c4e in ChertDatabaseReplicator::process_changeset_chunk_blocks (this=<value optimized out>, tablename=<value optimized out>, buf=..., conn=..., end_time=<value optimized out>, changes_fd=-1) at backends/chert/chert_databasereplicator.cc:205
#5  0x00007f049241af60 in ChertDatabaseReplicator::apply_changeset_from_conn (this=<value optimized out>, conn=..., end_time=<value optimized out>, valid=<value optimized out>) at backends/chert/chert_databasereplicator.cc:353
#6  0x00007f04923b6596 in Xapian::DatabaseReplica::Internal::apply_next_changeset (this=0x1543f50, info=0x7fff86e12790, reader_close_time=30) at api/replication.cc:583
#7  0x00007f04923b6d7b in Xapian::DatabaseReplica::apply_next_changeset (this=<value optimized out>, info=0x7fff86e10e00, reader_close_time=4096) at api/replication.cc:280
#8  0x00007f04924c2e43 in ReplicateTcpClient::update_from_master (this=<value optimized out>, path=<value optimized out>, masterdb=..., info=..., reader_close_time=<value optimized out>) at net/replicatetcpclient.cc:60
#9  0x0000000000401a93 in main (argc=<value optimized out>, argv=0x7fff86e12b08) at bin/xapian-replicate.cc:146
(gdb)

comment:4 by Olly Betts, 13 years ago

Component: Other → Replication

by nkvoll, 13 years ago

Attachment: read-timeout-1.diff added

patch to add timeout support to xapian-replicate

comment:5 by nkvoll, 13 years ago

I added a patch (written against 1.2.5, but should apply cleanly to trunk/master) that makes it possible to add a timeout for the reads during the replication from the command line.

comment:6 by Olly Betts, 13 years ago

Milestone: 1.2.7

There's already lots marked to do for 1.2.6, so marking this for 1.2.7.

As I wondered on IRC, perhaps we can set socket-level timeouts to achieve this - that would allow the timeout just to be set once when we open the socket, rather than having to keep looking up the current time.

comment:7 by nkvoll, 13 years ago

I noticed when replicating a large index for the first time that the timeout in the patch above will also set a timeout on the replication as a whole, so it's probably added to a few too many places.

Setting the timeout on the socket is probably a good idea, but unfortunately I'm not confident enough in C++ to do that.

comment:8 by Olly Betts, 13 years ago

Milestone: 1.2.7 → 1.3.0

Now we've branched off 1.2, this should get addressed in trunk (and then backported if appropriate) so updating milestone.

comment:9 by Olly Betts, 12 years ago

Milestone: 1.3.0 → 1.3.x

comment:10 by Olly Betts, 11 years ago

Milestone: 1.3.x → 1.3.2

Marking to consider soonish.

The patch as-is seems too problematic if it adds a timeout on the replication as a whole.

Details of the socket timeout options (from man 7 socket):

       SO_RCVTIMEO and SO_SNDTIMEO
              Specify the receiving or sending  timeouts  until  reporting  an
              error.  The argument is a struct timeval.  If an input or output
              function blocks for this period of time, and data has been  sent
              or  received,  the  return  value  of  that function will be the
              amount of data transferred; if no data has been transferred  and
              the  timeout has been reached then -1 is returned with errno set
              to EAGAIN or EWOULDBLOCK just as if the socket was specified  to
              be  nonblocking.   If  the  timeout is set to zero (the default)
              then the operation  will  never  timeout.   Timeouts  only  have
              effect  for system calls that perform socket I/O (e.g., read(2),
              recvmsg(2), send(2), sendmsg(2)); timeouts have  no  effect  for
              select(2), poll(2), epoll_wait(2), and so on.

Though I've read a few (perhaps dated) warnings that they aren't implemented everywhere.

comment:11 by Olly Betts, 11 years ago

Milestone: 1.3.2 → 1.3.3

comment:12 by Olly Betts, 9 years ago

Status: new → assigned

POSIX says "Note that not all implementations allow this option to be set":

http://pubs.opengroup.org/onlinepubs/009695399/functions/setsockopt.html

Checking on-line copies of man pages, settings SO_RCVTIMEO and SO_SNDTIMEO are supported on:

The Solaris man page doesn't mention these settings, though they are defined in sys/socket.h. However, it seems Solaris doesn't actually support them: https://issues.apache.org/jira/browse/THRIFT-1371

And AIX allows them to be set, but doesn't use them: http://www-01.ibm.com/support/knowledgecenter/ssw_aix_71/com.ibm.aix.commtrf2/getsockopt.htm?lang=en

So they are supported by at least the common open operating systems, and some of the proprietary ones. I'm inclined to say we just make use of them, and accept a clean patch if someone cares enough to fix this for other platforms (as noted in the comments above, the attached patch makes the whole replication time out if it takes too long, which isn't what we want at all). Having working timeouts on most platforms beats not having them anywhere.

comment:13 by Olly Betts, 9 years ago

We could also enable SO_KEEPALIVE which means the connection won't hang around forever if the other end goes away, though it could take a couple of hours to notice: http://www.unixguide.net/network/socketfaq/2.8.shtml

It is at least specified by POSIX: http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_10_16

If we enable that and use socket timeouts where supported, then the original issue of having to manually restart it would be gone, though you might end up with a slave lagging until the keepalive packet is next sent.

comment:14 by Olly Betts, 9 years ago

Milestone: 1.3.3 → 1.2.21

[1500c38b4f9720a8973e58ce76ea10beb3203fd2] adds an option to xapian-replicate to allow specifying a socket level timeout, and also turns on SO_KEEPALIVE for that socket.

I'm not clear on how to repeat the test with iptables. It says I need to find the port with lsof -i, but is that the port on the replication server or client?

And I guess I need to run that while a replication is happening - did you just use a suitably large database that replication ran for long enough that you could look up the port and run the iptables command, or was there some trick to this?

It would be nice to backport this to 1.2.x, so marking for 1.2.21, at least for now. I'd really like to be able to test this first though.

comment:15 by Olly Betts, 9 years ago

Resolution: fixed
Status: assigned → closed

Backported in [9b77d851031ee5db0d210beba407988beae78570]. The lack of testing that this actually addresses the reported situation is still a concern, but the new timeout defaults to 0, so this shouldn't break current usage of 1.2.x.

Last edited 9 years ago by Olly Betts