#764 closed defect (fixed)

qp_scale1 intermittently fails on 64-core POWER9 workstation

Reported by:	A. Wilcox	Owned by:	Olly Betts
Priority:	normal	Milestone:	1.4.15
Component:	Test Suite	Version:	1.4.6
Severity:	normal	Keywords:
Cc:		Blocked By:
Blocking:		Operating System:	Linux

Description

Five runs on my 3.9 GHz, 64-core POWER9 workstation yield two passes, and various failures on the other three runs:

awilcox on gwyn [pts/2 Mon 16 3:12] tests: ./runtest ./apitest -v qp_scale1
Running tests with backend "none"...
Running tests with backend "inmemory"...
Running tests with backend "glass"...
Running test: qp_scale1... ok
./apitest backend glass: All 1 tests passed.
Running tests with backend "singlefile_glass"...
Running tests with backend "multi_glass"...
Running test: qp_scale1... ok
./apitest backend multi_glass: All 1 tests passed.
Running tests with backend "remoteprog_glass"...
Running tests with backend "remotetcp_glass"...
Running tests with backend "chert"...
Running test: qp_scale1... ok
./apitest backend chert: All 1 tests passed.
Running tests with backend "multi_chert"...
Running test: qp_scale1... ok
./apitest backend multi_chert: All 1 tests passed.
Running tests with backend "remoteprog_chert"...
Running tests with backend "remotetcp_chert"...
./apitest total: All 4 tests passed.
awilcox on gwyn [pts/2 Mon 16 3:12] tests: ./runtest ./apitest -v qp_scale1
Running tests with backend "none"...
Running tests with backend "inmemory"...
Running tests with backend "glass"...
Running test: qp_scale1... ok
./apitest backend glass: All 1 tests passed.
Running tests with backend "singlefile_glass"...
Running tests with backend "multi_glass"...
Running test: qp_scale1... ok
./apitest backend multi_glass: All 1 tests passed.
Running tests with backend "remoteprog_glass"...
Running tests with backend "remotetcp_glass"...
Running tests with backend "chert"...
Running test: qp_scale1... ok
./apitest backend chert: All 1 tests passed.
Running tests with backend "multi_chert"...
Running test: qp_scale1... FAILED
small=0.011864s, large=0.01s
small=0.019998s, large=0.049996s
api_queryparser.cc:2639: (time2) < (time1 * 2.15)
Evaluates to: 0.049996 < 0.0429957


./apitest backend multi_chert: 0 tests passed, 1 failed.
Running tests with backend "remoteprog_chert"...
Running tests with backend "remotetcp_chert"...
./apitest total: 3 tests passed, 1 failed.
awilcox on gwyn [pts/2 Mon 16 3:12] tests: ./runtest ./apitest -v qp_scale1
Running tests with backend "none"...
Running tests with backend "inmemory"...
Running tests with backend "glass"...
Running test: qp_scale1... ok
./apitest backend glass: All 1 tests passed.
Running tests with backend "singlefile_glass"...
Running tests with backend "multi_glass"...
Running test: qp_scale1... FAILED
small=0.003992s, large=0.019981s
api_queryparser.cc:2639: (time2) < (time1 * 2.15)
Evaluates to: 0.019981 < 0.0085828


./apitest backend multi_glass: 0 tests passed, 1 failed.
Running tests with backend "remoteprog_glass"...
Running tests with backend "remotetcp_glass"...
Running tests with backend "chert"...
Running test: qp_scale1... ok
./apitest backend chert: All 1 tests passed.
Running tests with backend "multi_chert"...
Running test: qp_scale1... FAILED
small=0.00814s, large=0.029998s
api_queryparser.cc:2639: (time2) < (time1 * 2.15)
Evaluates to: 0.029998 < 0.017501


./apitest backend multi_chert: 0 tests passed, 1 failed.
Running tests with backend "remoteprog_chert"...
Running tests with backend "remotetcp_chert"...
./apitest total: 2 tests passed, 2 failed.
awilcox on gwyn [pts/2 Mon 16 3:12] tests: ./runtest ./apitest -v qp_scale1
Running tests with backend "none"...
Running tests with backend "inmemory"...
Running tests with backend "glass"...
Running test: qp_scale1... ok
./apitest backend glass: All 1 tests passed.
Running tests with backend "singlefile_glass"...
Running tests with backend "multi_glass"...
Running test: qp_scale1... FAILED
small=0.007257s, large=0.019999s
api_queryparser.cc:2639: (time2) < (time1 * 2.15)
Evaluates to: 0.019999 < 0.0156026


./apitest backend multi_glass: 0 tests passed, 1 failed.
Running tests with backend "remoteprog_glass"...
Running tests with backend "remotetcp_glass"...
Running tests with backend "chert"...
Running test: qp_scale1... ok
./apitest backend chert: All 1 tests passed.
Running tests with backend "multi_chert"...
Running test: qp_scale1... ok
./apitest backend multi_chert: All 1 tests passed.
Running tests with backend "remoteprog_chert"...
Running tests with backend "remotetcp_chert"...
./apitest total: 3 tests passed, 1 failed.
awilcox on gwyn [pts/2 Mon 16 3:12] tests: ./runtest ./apitest -v qp_scale1
Running tests with backend "none"...
Running tests with backend "inmemory"...
Running tests with backend "glass"...
Running test: qp_scale1... ok
./apitest backend glass: All 1 tests passed.
Running tests with backend "singlefile_glass"...
Running tests with backend "multi_glass"...
Running test: qp_scale1... ok
./apitest backend multi_glass: All 1 tests passed.
Running tests with backend "remoteprog_glass"...
Running tests with backend "remotetcp_glass"...
Running tests with backend "chert"...
Running test: qp_scale1... ok
./apitest backend chert: All 1 tests passed.
Running tests with backend "multi_chert"...
Running test: qp_scale1... ok
./apitest backend multi_chert: All 1 tests passed.
Running tests with backend "remoteprog_chert"...
Running tests with backend "remotetcp_chert"...
./apitest total: All 4 tests passed.

I'm not sure how best to handle this. It seems that our package builder crunches through this test so fast that kernel scheduler deltas throw it off.
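For reference, the check that fails in the logs above compares the two timings against a fixed ratio. Below is a minimal sketch of that comparison; the helper names `scales_ok` and `report` are hypothetical and this is not the actual test source, only a restatement of the logged expression `(time2) < (time1 * 2.15)`.

```cpp
#include <iostream>

// Hypothetical helper mirroring the logged assertion
// (time2) < (time1 * 2.15); this is not the actual test source.
bool scales_ok(double time1, double time2, double max_ratio = 2.15) {
    return time2 < time1 * max_ratio;
}

// Print the limit and verdict for one (small, large) pair of timings.
void report(double small_time, double large_time) {
    std::cout << "large=" << large_time
              << " limit=" << small_time * 2.15
              << (scales_ok(small_time, large_time) ? " ok" : " FAILED")
              << '\n';
}
```

Calling `report(0.007257, 0.019999)` evaluates the same comparison as the fourth run: the limit is about 0.0156 s, so the large case misses it by roughly 4 ms. With margins that small, a brief scheduling delay could plausibly be enough to flip the result, though the logs alone do not confirm the cause.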

Change History (4)

comment:1 by A. Wilcox, 7 years ago

Summary:	qp_scale1 intermittantly fails on 64-core POWER9 workstation → qp_scale1 intermittently fails on 64-core POWER9 workstation

comment:2 by Olly Betts, 7 years ago

I'm not sure why you're seeing this test be quite so flaky, but such timed tests of scaling are hard to make 100% robust in the face of uneven loads, etc. If you're running the testsuite as part of an automated build, you really want to set the environment variable AUTOMATED_TESTING=1, which will cause all such timed tests to be skipped.
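As an illustration of the idea, a test harness can consult the environment before running a timed test and report it as skipped. The sketch below is a generic pattern, not the Xapian harness code: `automated_testing`, `Outcome`, and `run_timed_test` are hypothetical names, and treating any non-empty value as "set" is an assumption.

```cpp
#include <cstdlib>

// Returns true when AUTOMATED_TESTING is set in the environment.
// Assumption: any non-empty value counts; the real harness may differ.
bool automated_testing() {
    const char* value = std::getenv("AUTOMATED_TESTING");
    return value != nullptr && *value != '\0';
}

enum class Outcome { Passed, Failed, Skipped };

// Run a timing-sensitive test body unless automated testing is requested.
// `body` is any callable returning true on success.
template <class Body>
Outcome run_timed_test(Body body) {
    if (automated_testing()) return Outcome::Skipped;
    return body() ? Outcome::Passed : Outcome::Failed;
}
```

A skipped test is reported separately from a pass, so a package build does not fail on timing noise and does not claim a result it never measured.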

comment:3 by Olly Betts, 6 years ago

I've made some changes which may improve this, though for me it only fails very rarely, so I can't be sure. These should be in 1.4.15 (once it is released):

626bb0058fdcee9a30e667ebde1ae4ca66e8ff39 (cherry picked from commit c30ec9a6ca9f1a91e0d7c160cabf280c3b1c5299)
399d143dd6a32913c81f782e968ba0640474232d (cherry picked from commit f8383842183b6af604f7f066a9538754d9b263af)
15a48e06de22a8a23a6a2bf26d84e1329fcd595d (cherry picked from commit d94be50772672c7bfbe120a0af7d10feb331ea42)
1e80e20039cd2841a2d40079e06aba21d8aa99a7 (cherry picked from commit a171ad53225e7aaa079b0aef80790f2a24e08c44)

Feedback on whether these changes help would be useful.

If this testcase remains a bit flaky, perhaps we should change it to test how the current "large" case scales as the number of repeated copies of the input query string grows, which is more like how our other scaling testcases work. Currently it compares parsing one copy N*M times with parsing N copies M times.
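The two measurement strategies described here can be sketched as follows. This is a standalone illustration under stated assumptions, not the code in api_queryparser.cc: `Parser` stands in for whatever parses a query string, and `time_parses`, `repeat`, `current_approach`, `proposed_approach`, and the `growth` factor are hypothetical names.

```cpp
#include <chrono>
#include <functional>
#include <string>

// Stand-in for the parser under test (hypothetical).
using Parser = std::function<void(const std::string&)>;

// Seconds taken to call parse(input) `reps` times.
double time_parses(const Parser& parse, const std::string& input,
                   unsigned reps) {
    auto start = std::chrono::steady_clock::now();
    for (unsigned i = 0; i != reps; ++i) parse(input);
    std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}

// Build a string holding n space-separated copies of s.
std::string repeat(const std::string& s, unsigned n) {
    std::string out;
    for (unsigned i = 0; i != n; ++i) {
        if (i != 0) out += ' ';
        out += s;
    }
    return out;
}

struct Timings { double base; double scaled; };

// Current comparison: one copy parsed n*m times vs n copies parsed m times.
Timings current_approach(const Parser& parse, const std::string& query,
                         unsigned n, unsigned m) {
    return { time_parses(parse, query, n * m),
             time_parses(parse, repeat(query, n), m) };
}

// Suggested alternative: time the multi-copy case at n copies and again
// at n*growth copies, then check how the time grows with input size.
Timings proposed_approach(const Parser& parse, const std::string& query,
                          unsigned n, unsigned m, unsigned growth) {
    return { time_parses(parse, repeat(query, n), m),
             time_parses(parse, repeat(query, n * growth), m) };
}
```

In the current comparison the two sides do similar total work but through very different call patterns, so fixed per-call overhead affects them differently. In the alternative, both sides use the same call pattern and only the input length changes.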

comment:4 by Olly Betts, 3 years ago

Milestone:	→ 1.4.15
Resolution:	→ fixed
Status:	new → closed

No feedback for more than 2 years, so I'm assuming the changes I made have addressed this and am closing the ticket. If you're still seeing this, please reopen.

PackagingXapian documents the recommendation to set AUTOMATED_TESTING in package builds.
