Opened 14 years ago

Closed 14 years ago

#487 closed defect (fixed)

ordecay1 fails on i386 architecture

Reported by: Richard Boulton
Owned by: Olly Betts
Priority: normal
Milestone: 1.0.21
Component: Build system
Version: SVN trunk
Severity: normal
Keywords:
Cc:
Blocked By:
Blocking:
Operating System: All

Description

Currently, test ordecay1 fails on an i386 box as follows:

Running test: ordecay1... FAILED
docids differ at item 22 in range: 44 != 31

If I recompile with CXXFLAGS="-mfpmath=sse -msse", the test passes, so I'm pretty sure this is triggered (or at least exacerbated) by i386 excess precision. It also seems that the matcher is a bit faster with this option (I measured 6% faster, but that was in a situation which is probably close to optimal for showing the difference).
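
To illustrate how excess precision could change an ordering, here is a minimal C++ sketch. It is not Xapian's matcher code: add_contribution and tied_after_tiny_contribution are hypothetical names, with double standing in for values rounded to 64 bits (as with SSE maths) and long double standing in for values kept at the wider x87 register precision. It assumes a platform where long double is wider than double.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical stand-in for a weight calculation; not Xapian's matcher code.
// T = double models a result rounded to 64 bits;
// T = long double models a result left at extended precision.
template <typename T>
T add_contribution(T weight, T contribution) {
    // The volatile store forces the sum out of any wider register, so the
    // returned value really has the precision of T.
    volatile T rounded = weight + contribution;
    return rounded;
}

// True if adding a 2^-60 contribution to a weight of 1 leaves it tied with 1.
template <typename T>
bool tied_after_tiny_contribution() {
    const T base = 1;
    const T tiny = std::ldexp(T(1), -60);
    return add_contribution(base, tiny) == base;
}
```

Where long double has a 64-bit mantissa, tied_after_tiny_contribution<double>() should be true while tied_after_tiny_contribution<long double>() should be false. So whether two documents tie, and hence which one ends up at a given rank, can depend on where the rounding happens.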

Configure should probably add the appropriate options to CXXFLAGS on i386 architectures by default. The Debian packages will then need to build a separate library for actual 486 processors, so there may need to be a configure flag to override this.

This is certainly a problem with trunk, and will also be a problem with the 1.0 branch if ordecay1 has been backported there.

Change History (4)

comment:1 by Olly Betts, 14 years ago

Component: Other → Build system

Should be fixed on trunk by r14686, but I've not tested yet.

We're actually now using -mfpmath=sse -msse2 -mtune=generic -march=pentium4, which assumes at least a Pentium 4. SSE2 added double precision FP instructions, and we use double a lot. The last two options mean that we'll generate code which will work on a Pentium 4, but is optimised to run fast on modern CPUs.
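
One way to check what a given set of flags does is to look at FLT_EVAL_METHOD from <cfloat>, which reports how the compiler evaluates floating point expressions. A small sketch follows; the function name is made up for illustration, and the expectation that GCC reports 0 with SSE2 maths and 2 with plain x87 maths is an assumption worth verifying on the target compiler.

```cpp
#include <cassert>
#include <cfloat>

// Hypothetical helper, named for illustration only.
// FLT_EVAL_METHOD == 0 means float and double expressions are evaluated in
// their own type, i.e. with no excess precision in intermediate results.
constexpr bool evaluates_in_declared_type() {
    return FLT_EVAL_METHOD == 0;
}
```

Compiling this with and without the new flags and checking the result of evaluates_in_declared_type() would show whether the build is still exposed to excess precision.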

comment:2 by Olly Betts, 14 years ago

Status: new → assigned

In trunk r14687, the new flags now go in AM_CXXFLAGS, so we don't clobber user-specified CXXFLAGS or the default of -O2 -g.

comment:3 by Olly Betts, 14 years ago

Milestone: 1.2.1 → 1.0.21

-march=pentium4 doesn't seem to give a measurable speed-up (from Richard's tests) and it carries a small risk of introducing instructions which don't work on some obscure CPU which implements SSE2, so I've removed that in r14692.

So this is now sorted in trunk. Marking for backport to 1.0.21.

comment:4 by Olly Betts, 14 years ago

Resolution: fixed
Status: assigned → closed

Backported for 1.0.21 in r14693.
