Opened 5 years ago

Closed 5 years ago

#783 closed defect (fixed)

Remote protocol breaks

Reported by: German M. Bravo Owned by: Olly Betts
Priority: normal Milestone: 1.4.12
Component: Backend-Remote Version: 1.4.11
Severity: major Keywords:
Cc: Blocked By:
Blocking: Operating System: All

Description

Every command in the remote database must wait for a reply from the server, every single one of them. Although some commands may look harmless at first sight, every command could eventually throw an exception. If the remote database doesn't wait for a reply, the exception could end up as a reply to some other command, making all further commands out of sync.

A real world example folows:

Remote database issues MSG_REPLACEDOCUMENT in line 1 doesn't wait for a response so some other thread or something els issues a new command (MSG_REOPEN, line 2) but for some reason, this time MSG_REPLACEDOCUMENT does return an exception!. The MSG_REPOEN in line 2 receives the exception and assumes it's was it's own exception, so [my] code retries a MSG_REOPEN (line 4); this time it gets a REPLY_DONE, actually the REPLY_DONE which was intended for the first MSG_REOPEN (in line 2)... a new command is issued MSG_POSTLIST (line 6) and this one receives REPLY_DONE, which was intended for the second MSG_REOPEN, hence the error in line 8... hell is unleashed now.

 1  << MSG_REPLACEDOCUMENT (18)
 2  << MSG_REOPEN (12)
 3  >> REPLY_EXCEPTION (1)
 4  << MSG_REOPEN (12)
 5  >> REPLY_DONE (2)
 6  << MSG_POSTLIST (11)
 7  >> REPLY_DONE (2)
 8     ERROR: Expecting reply type REPLY_POSTLISTSTART (14) or REPLY_POSTLISTSTART (14), got REPLY_DONE (2)
 9  << MSG_REOPEN (12)
10  >> REPLY_POSTLISTSTART (14)
11     ERROR: Expecting reply type REPLY_UPDATE (0) or REPLY_DONE (2), got REPLY_POSTLISTSTART (14)
12  << MSG_REOPEN (12)
13  >> REPLY_POSTLISTITEM (15)
14     ERROR: Expecting reply type REPLY_UPDATE (0) or REPLY_DONE (2), got REPLY_POSTLISTITEM (15)
15  << MSG_REOPEN (12)
16  >> REPLY_DONE (2)
17  << MSG_POSTLIST (11)
18  >> REPLY_DONE (2)
19     ERROR: Expecting reply type REPLY_POSTLISTSTART (14) or REPLY_POSTLISTSTART (14), got REPLY_DONE (2)
20  << MSG_REOPEN (12)
21  >> REPLY_DONE (2)
22  << MSG_POSTLIST (11)
23  >> REPLY_DONE (2)

Ordering the appropriate sequence of messages, really bad things could happen.

Change History (7)

comment:1 by Olly Betts, 5 years ago

It looks to me like these are all of the problematic cases:

  • MSG_DELETEDOCUMENTTERM
  • MSG_REPLACEDOCUMENT
  • MSG_REPLACEDOCUMENTTERM
  • MSG_CANCEL
  • MSG_SETMETADATA
  • MSG_ADDSPELLING

(There's also MSG_SHUTDOWN but that just expects the connection to be closed in response.)

comment:2 by Olly Betts, 5 years ago

We can regression test some of these easily, e.g.:

    Xapian::Document doc;
    doc.add_term(string(300, 'x'));
    TEST_EXCEPTION(Xapian::InvalidArgumentError,
                   db.replace_document(1, doc));

And similarly for MSG_REPLACEDOCUMENTTERM.

I don't see any easy way to trigger an exception during automated testing for the others though.

comment:3 by Olly Betts, 5 years ago

MSG_REPLACEDOCUMENTTERM is actually OK (I misread, and it returns REPLY_ADDDOCUMENT).

comment:4 by German M. Bravo, 5 years ago

Yes, those are exactly the five cases I found and fixed by retiring REPLY_DONE. Waiting for a reply message did fix many issues I was experiencing. Particularly receiving unexpected exceptions in weird places.

comment:5 by Olly Betts, 5 years ago

If you'd already fixed it, why didn't you attach a patch?

Anyway, fixed in git master in [4de4248be2937defc7cbfb76c3727714c8eabde3].

It'd be good to address for 1.4.x but ideally without a major protocol version bump.

comment:6 by Olly Betts, 5 years ago

Milestone: 1.4.12
Priority: highestnormal
Severity: blockermajor
Status: newassigned
Version: 1.4.11

comment:7 by Olly Betts, 5 years ago

Resolution: fixed
Status: assignedclosed

Backported for 1.4.12 in [9165baabe0b834837f4a42302165bef655442363]. The backport only requires a minor protocol version bump (so existing clients will work with the new server).

Note: See TracTickets for help on using tickets.