Opened 7 years ago

Closed 6 years ago

#757 closed defect (fixed)

Testcase qp_scale1 fails consistently with --enable-log

Reported by: Guruprasad Hegde Owned by: Olly Betts
Priority: low Milestone: 1.4.6
Component: Test Suite Version:
Severity: normal Keywords:
Cc: Blocked By:
Blocking: Operating System: All

Description

Of the four tests for scaling, the third one failed always. Increasing factor of 2.15 slightly may work, as the difference is very less.

I found a ticket #365 which is fixed long back.

Test run log on my system:

[gp@gp tests]$ ./runtest ./apitest -v qp_scale1
Running test './apitest -v qp_scale1' under eatmydata and valgrind
Running tests with backend "honey"...
Running tests with backend "none"...
Running tests with backend "inmemory"...
Running tests with backend "glass"...
Running test: qp_scale1... FAILED
small=0.822055s, large=0.694385s
small=2.97991s, large=3.67972s
small=4.84824s, large=10.6422s
/home/gp/gsoc/xapian/xapian-core/tests/api_queryparser.cc:2683: (time2) < (time1 * 2.15)
Evaluates to: 10.6422 < 10.4237


./apitest backend glass: 0 tests passed, 1 failed.
Running tests with backend "singlefile_glass"...
Running tests with backend "multi_glass"...
Running test: qp_scale1... FAILED
small=0.794611s, large=0.679327s
small=3.06748s, large=4.02026s
small=4.93212s, large=11.0806s
/home/gp/gsoc/xapian/xapian-core/tests/api_queryparser.cc:2683: (time2) < (time1 * 2.15)
Evaluates to: 11.0806 < 10.6041


./apitest backend multi_glass: 0 tests passed, 1 failed.
Running tests with backend "remoteprog_glass"...
Running tests with backend "remotetcp_glass"...
./apitest total: 0 tests passed, 2 failed.

Change History (8)

comment:1 by Guruprasad Hegde, 7 years ago

How the value of 2.15 decided? Or it could be that any change in synonym related code introduced additional time complexity?

comment:2 by Olly Betts, 7 years ago

It's a somewhat empirical choice (see #308). In the current version of this testcase, "small" times a loop parsing the same query string q n times, and "large" concatenates n/5 copies of q together and parses that 5 times.

We're trying to check that long query strings don't cause scaling problems, but the test is not ideal as we incur the overheads of parsing a lot more times for the "large" tests. Ideally the test would be constructed such that "small" and "large" would give equal times, and we'd just have a margin to allow for random variations due to other loads on the machine, etc. That's how most most of the scaling tests we have work (see tests/harness/scalability.h which defines O_N_LOG_N etc thresholds which are each the geometric mean between the scaling behaviours we want them to discriminate between - for 10 times the work, O(n) -> 10, O(log(n)) -> 20, so the threshold is set based on sqrt(10*20) which is about 14.1). But this testcase is older that the scalability.h mechanisms.

It'd be better to try to adjust this testcase to be more self-balancing than just nudge the factor further.

But I wonder if there's actually a problem here - are you using ./configure --enable-log? I noticed one of the scaling tests failing with that enabled recently, but haven't had a chance to investigate.

The logging certainly can cause issues - e.g. it can turn a method which takes a Query object from O(1) to O(n) by calling Query::get_description() when it logs the parameter value.

Last edited 6 years ago by Olly Betts (previous) (diff)

comment:3 by Guruprasad Hegde, 7 years ago

are you using ./configure --enable-log?

Yes. You are right, it is because of the log. I tested with log disabled, test passes.

comment:4 by Olly Betts, 6 years ago

Cc: Olly Betts removed
Component: OtherTest Suite
Milestone: 1.4.6
Status: newassigned
Summary: Testcase qp_scale1 fails consistentlyTestcase qp_scale1 fails consistently with --enable-log

The logging certainly can cause issues - e.g. it can turn a method which takes a Query object from O(1) to O(n) by calling Query::get_description() when it logs the parameter value.

This is unfortunate, but seems hard to avoid as creating a query object's description is inherently O(n) for a description which fully describes the object.

I think we probably need to disable scaling tests when --enable-log is used (not ideal but better than them failing) or somehow make them "soft fail" (not ideal either).

We already have an AUTOMATED_TESTING check which skips timed tests like these (because it's really unhelpful for an automated build to fail its tests due to external factors such as abnormal load from other processes on the machine), so enable-log could just hook into that.

Marking for 1.4.6.

comment:5 by Olly Betts, 6 years ago

comment:6 by Olly Betts, 6 years ago

Status: assignedclosed

Backported to RELEASE/1.4 branch in a86bc0a98c15a0fee28e351f8fad7bea3e8eae50.

comment:7 by Olly Betts, 6 years ago

Status: closedreopened

comment:8 by Olly Betts, 6 years ago

Resolution: fixed
Status: reopenedclosed

Reopening then closing with resolution "fixed" (instead of being closed without a resolution).

Note: See TracTickets for help on using tickets.