Opened 16 years ago

Closed 16 years ago

Last modified 15 years ago

#242 closed defect (fixed)

replicate1 fails on windows

Reported by: Charlie Hull Owned by: New Bugs
Priority: normal Milestone: 1.1.0
Component: Backend-Remote Version: SVN trunk
Severity: normal Keywords:
Cc: Blocked By:
Blocking: Operating System: Microsoft Windows

Description (last modified by Olly Betts)

In remoteconnection.cc we currently use a pair of file descriptors, which may refer to a socket, stdin/stdout or a file. However we also need in the Windows case to use ReadFile/WriteFile which require a HANDLE. We use _get_osfhandle to attempt to translate a file descriptor into a HANDLE, however this is not recommended by Microsoft and may cause problems (we have already found that file pointers for files - not sockets or stdin/stdout - are not advanced automatically, meaning we have to manually use an OVERLAPPED structure to keep file pointers correct). Perhaps remoteconnection.cc should open/close files internally, so we could use Windows HANDLES when we need them, or fd's when we don't. Ref: http://support.microsoft.com/kb/99173

Attachments (1)

patch7.patch (3.8 KB ) - added by Charlie Hull 16 years ago.
Patch to make replicate1 test pass on Windows

Download all attachments as: .zip

Change History (6)

comment:1 by Olly Betts, 16 years ago

AIUI, we use the overlapped stuff to implement a non-blocking read with a timeout, not to "keep file pointers correct" (in fact, the only reason we dabble with HANDLE and ReadFile is so we can implement a non-blocking read with a timeout). Where in the code do you think we're fixing up file pointers?

The current code appears to work. I'm not terribly sympathetic to spending time and effort working around deficiencies in MS APIs which we've already worked around.

by Charlie Hull, 16 years ago

Attachment: patch7.patch added

Patch to make replicate1 test pass on Windows

comment:2 by Olly Betts, 16 years ago

Summary: Mixing Windows API calls may cause problemsreplicate1 fails on windows

Setting the summary to something helpful

comment:3 by Richard Boulton, 16 years ago

Operating System: Microsoft Windows
Resolution: fixed
Status: newclosed

I've now applied this patch (with various tweaks), so (if I've done it correctly), we've now worked around the known problems resulting from use of mixed file handling techniques.

However, I'm aware that the replication stuff exposes (for the first time in the Xapian API) a file descriptor. Therefore, when this becomes part of the supported API, we're going to be stuck with using these workarounds. I'm not sure whether there is a cleaner solution, but if there is, we should find it and ensure that the API allows it to be used, before releasing 1.1.

comment:4 by Olly Betts, 16 years ago

Just for the record, note that the MSDN docs for WriteFile() actually document that the offset in the OVERLAPPED structure isn't updated. So that's not a bug.

Exposing an fd doesn't seem totally ideal if it can be cleanly avoided, but it's certainly possible to be too abstract about such things. I'll have a look and see if anything occurs to me.

We could sidestep using _get_osfhandle() if we implemented this using read()/write() and some sort of alarm which interrupted the read after the timeout. I'm not sure if there's a suitable API for that. One thought is to start a new thread which sleeps for the timeout and then sends a signal, but the thread start overhead is probably excessive for the job. Keeping a special thread about for the job is probably possible.

comment:6 by Olly Betts, 15 years ago

Description: modified (diff)
Milestone: 1.1.0
Version: 1.0.5SVN trunk

This bug claims "Version: 1.0.5", but that makes no sense as database replication (and so the replicate1 testcase) isn't in the 1.0 branch. So I'm changing that to SVN trunk for the record. Shout if I'm confused there.

Also, this fix will be in 1.1.0.

Note: See TracTickets for help on using tickets.