Opened 7 years ago
Closed 6 years ago
#757 closed defect (fixed)
Testcase qp_scale1 fails consistently with --enable-log
Reported by: | Guruprasad Hegde | Owned by: | Olly Betts |
---|---|---|---|
Priority: | low | Milestone: | 1.4.6 |
Component: | Test Suite | Version: | |
Severity: | normal | Keywords: | |
Cc: | | Blocked By: | |
Blocking: | | Operating System: | All |
Description
Of the four scaling checks in this testcase, the third one fails every time. Increasing the factor of 2.15 slightly might be enough, as the difference is very small.
I found ticket #365, which was fixed long ago.
Test run log on my system:
[gp@gp tests]$ ./runtest ./apitest -v qp_scale1
Running test './apitest -v qp_scale1' under eatmydata and valgrind
Running tests with backend "honey"...
Running tests with backend "none"...
Running tests with backend "inmemory"...
Running tests with backend "glass"...
Running test: qp_scale1... FAILED
small=0.822055s, large=0.694385s
small=2.97991s, large=3.67972s
small=4.84824s, large=10.6422s
/home/gp/gsoc/xapian/xapian-core/tests/api_queryparser.cc:2683: (time2) < (time1 * 2.15)
Evaluates to: 10.6422 < 10.4237
./apitest backend glass: 0 tests passed, 1 failed.
Running tests with backend "singlefile_glass"...
Running tests with backend "multi_glass"...
Running test: qp_scale1... FAILED
small=0.794611s, large=0.679327s
small=3.06748s, large=4.02026s
small=4.93212s, large=11.0806s
/home/gp/gsoc/xapian/xapian-core/tests/api_queryparser.cc:2683: (time2) < (time1 * 2.15)
Evaluates to: 11.0806 < 10.6041
./apitest backend multi_glass: 0 tests passed, 1 failed.
Running tests with backend "remoteprog_glass"...
Running tests with backend "remotetcp_glass"...
./apitest total: 0 tests passed, 2 failed.
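For context, here is a rough sketch of the shape of this kind of timing comparison - illustrative only, with made-up helper names rather than the actual qp_scale1 code in api_queryparser.cc:

```cpp
#include <chrono>
#include <functional>
#include <iostream>

// Illustrative sketch only: these helpers are made up and are not the
// actual qp_scale1 code from api_queryparser.cc.
static double time_it(const std::function<void()>& work) {
    auto start = std::chrono::steady_clock::now();
    work();
    std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}

bool check_scaling(const std::function<void()>& small_work,
                   const std::function<void()>& large_work,
                   double factor)
{
    double time1 = time_it(small_work);   // e.g. parse q a total of n times
    double time2 = time_it(large_work);   // e.g. parse n/5 copies of q, 5 times
    std::cout << "small=" << time1 << "s, large=" << time2 << "s\n";
    // The assertion that fails in the log above has this shape:
    return time2 < time1 * factor;        // factor is 2.15 in this testcase
}
```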
Change History (8)
comment:1 by , 7 years ago
How was the value of 2.15 decided? Or could it be that a change in the synonym-related code introduced additional time complexity?
comment:2 by , 7 years ago
It's a somewhat empirical choice (see #308). In the current version of this testcase, "small" times a loop parsing the same query string `q` a total of `n` times, and "large" concatenates `n/5` copies of `q` together and parses that 5 times.

We're trying to check that long query strings don't cause scaling problems, but the test is not ideal as we incur the overheads of parsing a lot more times for the "large" tests. Ideally the test would be constructed such that "small" and "large" would give equal times, and we'd just have a margin to allow for random variations due to other loads on the machine, etc. That's how most of the scaling tests we have work (see `tests/harness/scalability.h`, which defines `O_N_LOG_N` etc. thresholds, each of which is the geometric mean between the scaling behaviours we want it to discriminate between - for 10 times the work, `O(n)` -> 10, `O(log(n))` -> 20, so the threshold is set based on `sqrt(10*20)`, which is about 14.1). But this testcase is older than the `scalability.h` mechanisms.
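To make the arithmetic concrete, here is a minimal sketch of that geometric-mean idea - the ratios come from the comment above, but the constant names are made up and this is not the actual contents of `tests/harness/scalability.h`:

```cpp
#include <cmath>

// Illustrative only: ratios taken from the description above, names invented.
const double GOOD_RATIO = 10.0;  // ratio expected for 10x the work under the acceptable behaviour
const double BAD_RATIO  = 20.0;  // ratio expected under the behaviour we want to rule out
const double THRESHOLD  = std::sqrt(GOOD_RATIO * BAD_RATIO);  // ~14.14

// A scaling check built on this would then require something like:
//     time_large < time_small * THRESHOLD
```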
It'd be better to try to adjust this testcase to be more self-balancing than just nudge the factor further.
But I wonder if there's actually a problem here - are you using `./configure --enable-log`? I noticed one of the scaling tests failing with that enabled recently, but haven't had a chance to investigate.
The logging certainly can cause issues - e.g. it can turn a method which takes a `Query` object from `O(1)` to `O(n)` by calling `Query::get_description()` when it logs the parameter value.
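A minimal sketch of how that can happen - this mimics the effect described, and is not Xapian's actual `Query` class or debug-logging macros:

```cpp
#include <cstddef>
#include <iostream>
#include <string>

// Illustrative stand-in for a query object with O(n) worth of state.
struct Query {
    std::string text;
    std::string get_description() const {  // walks the whole query: O(n)
        return "Query(" + text + ")";
    }
};

#ifdef ENABLE_LOG
// With logging enabled, the parameter gets described on entry to the method.
# define LOG_ARG(Q) (std::cerr << "called with " << (Q).get_description() << '\n')
#else
# define LOG_ARG(Q) ((void)0)
#endif

std::size_t note_query(const Query& q) {
    LOG_ARG(q);            // O(n) with logging enabled, O(1) without
    return q.text.size();  // the method's own work is O(1)
}
```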
comment:3 by , 7 years ago
> are you using `./configure --enable-log`?
Yes. You are right, it is because of the logging: with logging disabled, the test passes.
comment:4 by , 7 years ago
Cc: | removed |
---|---|
Component: | Other → Test Suite |
Milestone: | → 1.4.6 |
Status: | new → assigned |
Summary: | Testcase qp_scale1 fails consistently → Testcase qp_scale1 fails consistently with --enable-log |
> The logging certainly can cause issues - e.g. it can turn a method which takes a `Query` object from `O(1)` to `O(n)` by calling `Query::get_description()` when it logs the parameter value.
This is unfortunate, but seems hard to avoid as creating a query object's description is inherently `O(n)` for a description which fully describes the object.
I think we probably need to disable scaling tests when `--enable-log` is used (not ideal, but better than them failing) or somehow make them "soft fail" (not ideal either).
We already have an `AUTOMATED_TESTING` check which skips timed tests like these (because it's really unhelpful for an automated build to fail its tests due to external factors such as abnormal load from other processes on the machine), so enable-log could just hook into that.
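A rough sketch of the kind of hook being suggested - both the `XAPIAN_DEBUG_LOG` macro check and the helper name are assumptions here, not the real harness code:

```cpp
#include <cstdlib>

// Hypothetical helper: decide whether timing-sensitive scaling tests
// should be skipped for this run/build.
inline bool skip_timed_tests() {
    // Existing idea: skip under automated builds, where load from other
    // jobs makes timings unreliable.
    if (std::getenv("AUTOMATED_TESTING")) return true;
#ifdef XAPIAN_DEBUG_LOG
    // Suggested addition: also skip when the build was configured with
    // --enable-log, since logging overhead changes what is being measured.
    return true;
#else
    return false;
#endif
}
```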
Marking for 1.4.6.
comment:6 by , 7 years ago
Status: | assigned → closed |
---|---|
Backported to RELEASE/1.4 branch in a86bc0a98c15a0fee28e351f8fad7bea3e8eae50.
comment:7 by , 6 years ago
Status: | closed → reopened |
---|---|
comment:8 by , 6 years ago
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
Reopening then closing with resolution "fixed" (instead of being closed without a resolution).