Opened 6 years ago
Last modified 19 months ago
#767 new enhancement
Make testcases using different backends run in parallel
Reported by: | Guruprasad Hegde | Owned by: | Guruprasad Hegde |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | Test Suite | Version: | git master |
Severity: | normal | Keywords: | |
Cc: | Olly Betts, Gaurav Arora | Blocked By: | |
Blocking: | Operating System: | All |
Description
All test cases are run sequentially. Test cases which use different backends could be run in parallel, which would reduce the total elapsed time significantly.
This ticket tracks initial planning and work done to speed up the test suite.
Plan: We have decided to fork a child process for each backend and run the test cases for that backend in the child process. We must devise a way to report progress from multiple processes running test cases in parallel. Olly suggested using a pipe to communicate the progress to the parent process, which performs the display job.
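A minimal sketch of the proposed scheme (`run_parallel()`, the backend names and the single-line "progress report" are illustrative stand-ins, not the testsuite's real code): fork one child per backend, have each child write progress lines into a shared pipe, and let the parent read and display them before reaping the children.

```cpp
#include <cstdio>
#include <sys/wait.h>
#include <unistd.h>

// Fork one child per backend; each child reports progress over the pipe.
// The parent displays the lines as they arrive and returns how many it saw.
static int run_parallel() {
    const char* backends[] = { "inmemory", "glass", "honey" };
    int fds[2];
    if (pipe(fds) < 0) return -1;
    int nchildren = 0;
    for (const char* b : backends) {
	pid_t pid = fork();
	if (pid == 0) {
	    close(fds[0]);
	    // Stand-in for do_tests_for_backend(): emit one progress line.
	    dprintf(fds[1], "%s: PASS sometest\n", b);
	    _exit(0);
	}
	if (pid > 0) ++nchildren;
    }
    close(fds[1]);
    // Parent: read interleaved progress lines and do the display job.
    FILE* in = fdopen(fds[0], "r");
    char line[256];
    int count = 0;
    while (fgets(line, sizeof line, in)) {
	fputs(line, stdout);
	++count;
    }
    fclose(in);
    int status;
    while (nchildren-- > 0) wait(&status);
    return count;
}
```

Because each progress report is a single short `write()` to the pipe (below `PIPE_BUF`), lines from different children don't get interleaved mid-line.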
BackendManagerSingleFile, BackendManagerMulti and BackendManagerRemote share the glass database, hence these can't be run in parallel.
Attachments (1)
Change History (11)
comment:1 by , 6 years ago
comment:2 by , 6 years ago
I tried this in testrunner.cc, ignoring the exception handling; the output below is with Valgrind disabled:
```cpp
int children_count = 0;
pid_t pid;
#ifdef XAPIAN_HAS_HONEY_BACKEND
pid = fork();
if (pid == 0) {
    do_tests_for_backend(BackendManagerHoney(datadir));
    exit(0);
}
children_count++;
#endif

pid = fork();
if (pid == 0) {
    do_tests_for_backend(BackendManager(string()));
    exit(0);
}
children_count++;

#ifdef XAPIAN_HAS_INMEMORY_BACKEND
pid = fork();
if (pid == 0) {
    do_tests_for_backend(BackendManagerInMemory(datadir));
    exit(0);
}
children_count++;
#endif

#ifdef XAPIAN_HAS_GLASS_BACKEND
{
    BackendManagerGlass glass_man(datadir);
    do_tests_for_backend(glass_man);
    do_tests_for_backend(BackendManagerSingleFile(datadir, &glass_man));
    do_tests_for_backend(BackendManagerMulti(datadir, &glass_man));
# ifdef XAPIAN_HAS_REMOTE_BACKEND
    do_tests_for_backend(BackendManagerRemoteProg(&glass_man));
    do_tests_for_backend(BackendManagerRemoteTcp(&glass_man));
# endif
}
#endif
int status;
for (int i = children_count; i != 0; --i) {
    wait(&status);
}
```
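One refinement worth considering for the snippet above (this helper is a sketch of mine, not ticket code): the `wait()` loop discards each child's status, so a failing backend can't fail the overall run. Collecting the exit statuses fixes that:

```cpp
#include <sys/wait.h>
#include <unistd.h>

// Reap `children_count` children and report failure if any child did not
// exit cleanly with status 0, so a failing backend fails the whole run.
static int reap_children(int children_count) {
    int exit_code = 0;
    int status;
    for (int i = children_count; i != 0; --i) {
	if (wait(&status) < 0) break;
	if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
	    exit_code = 1;  // remember any failing child
    }
    return exit_code;
}
```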
Run time with parallel:
```
real	1m49.777s
user	0m48.546s
sys	0m31.740s
```
Run time without parallel:
```
real	2m2.414s
user	0m42.655s
sys	0m30.834s
```
With Valgrind enabled, I got prints like `Leak summary:` and `304 bytes in 1 blocks are possibly lost in loss record 52 of 57` for a few testcases. I guess these prints are from Valgrind.
by , 6 years ago
Attachment: | gp_cpu_info.png added |
---|
comment:3 by , 6 years ago
We need to update the result_so_far variable too, so each child process must send its subtotal.
comment:4 by , 6 years ago
Update about the leak summary errors mentioned in the comment above: all Valgrind errors are reported for the honey backend.
All errors are similar to the one below. I am not sure which testcase this error relates to, since the outputs are mixed.
```
==7621== 304 bytes in 1 blocks are possibly lost in loss record 64 of 70
==7621==    at 0x4C2EEF5: calloc (vg_replace_malloc.c:711)
==7621==    by 0x40112B2: allocate_dtv (in /usr/lib/ld-2.27.so)
==7621==    by 0x4011C3D: _dl_allocate_tls (in /usr/lib/ld-2.27.so)
==7621==    by 0x6B8DBAA: pthread_create@@GLIBC_2.2.5 (in /usr/lib/libpthread-2.27.so)
==7621==    by 0x5673E6F: ??? (in /usr/lib/librt-2.27.so)
==7621==    by 0x6B94A5E: __pthread_once_slow (in /usr/lib/libpthread-2.27.so)
==7621==    by 0x5672CDB: timer_create (in /usr/lib/librt-2.27.so)
==7621==    by 0x53B2B9F: TimeOut (matchtimeout.h:84)
==7621==    by 0x53B2B9F: ProtoMSet (protomset.h:159)
==7621==    by 0x53B2B9F: Matcher::get_local_mset(unsigned int, unsigned int, unsigned int, Xapian::Weight const&, Xapian::MatchDecider const*, Xapian::KeyMaker const*, unsigned int, unsigned int, int, double, double, Xapian::Enquire::docid_order, unsigned int, Xapian::Enquire::Internal::sort_setting, bool, double, std::vector<Xapian::Internal::opt_intrusive_ptr<Xapian::MatchSpy>, std::allocator<Xapian::Internal::opt_intrusive_ptr<Xapian::MatchSpy> > > const&) (matcher.cc:315)
==7621==    by 0x53B4611: Matcher::get_mset(unsigned int, unsigned int, unsigned int, Xapian::Weight::Internal&, Xapian::Weight const&, Xapian::MatchDecider const*, Xapian::KeyMaker const*, unsigned int, unsigned int, int, double, Xapian::Enquire::docid_order, unsigned int, Xapian::Enquire::Internal::sort_setting, bool, double, std::vector<Xapian::Internal::opt_intrusive_ptr<Xapian::MatchSpy>, std::allocator<Xapian::Internal::opt_intrusive_ptr<Xapian::MatchSpy> > > const&) (matcher.cc:440)
==7621==    by 0x528EDA8: Xapian::Enquire::Internal::get_mset(unsigned int, unsigned int, unsigned int, Xapian::RSet const*, Xapian::MatchDecider const*) const (enquire.cc:327)
==7621==    by 0x528F293: Xapian::Enquire::get_mset(unsigned int, unsigned int, unsigned int, Xapian::RSet const*, Xapian::MatchDecider const*) const (enquire.cc:205)
==7621==    by 0x2462CB: test_matchtimelimit1() (api_postingsource.cc:657)
==7621==
==7621== 304 bytes in 1 blocks are possibly lost in loss record 65 of 70
==7621==    at 0x4C2EEF5: calloc (vg_replace_malloc.c:711)
==7621==    by 0x40112B2: allocate_dtv (in /usr/lib/ld-2.27.so)
==7621==    by 0x4011C3D: _dl_allocate_tls (in /usr/lib/ld-2.27.so)
==7621==    by 0x6B8DBAA: pthread_create@@GLIBC_2.2.5 (in /usr/lib/libpthread-2.27.so)
==7621==    by 0x5673D33: ??? (in /usr/lib/librt-2.27.so)
==7621==    by 0x6B8D0BB: start_thread (in /usr/lib/libpthread-2.27.so)
==7621==
==7621== LEAK SUMMARY:
==7621==    definitely lost: 0 bytes in 0 blocks
==7621==    indirectly lost: 0 bytes in 0 blocks
==7621==      possibly lost: 608 bytes in 2 blocks
==7621==    still reachable: 175,526 bytes in 68 blocks
==7621==         suppressed: 0 bytes in 0 blocks
==7621== Reachable blocks (those to which a pointer was found) are not shown.
==7621== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==7621== 304 bytes in 1 blocks are possibly lost in loss record 64 of 70
```
I tried running only the honey backend in a separate process (using -bhoney) and that gives no error. A single process might not trigger this error, but this is just one observation.
comment:5 by , 6 years ago
Regarding the display of output:
All outputs are written to out (an ostream object sharing cout's buffer). Instead of writing to cout's buffer, could we store the result in a buffer owned by out and write to the pipe wherever the endl manipulator is added?
Still, one question is how the parent should print multiple display requests from child processes. Is it possible to allot a set of lines on the console for each backend and update those lines on each request according to the backend type?
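One way the endl hook could look (a sketch under names of my own, not existing testsuite code): a streambuf whose sync(), which std::endl triggers via flush, forwards the buffered text to the pipe's write end in a single write().

```cpp
#include <iostream>
#include <sstream>
#include <string>
#include <unistd.h>

// Buffers output; on sync() (triggered by std::endl or an explicit flush)
// the buffered text is written to the given file descriptor in one write(),
// then the buffer is reset for the next line.
class PipeLineBuf : public std::stringbuf {
    int fd;
  public:
    explicit PipeLineBuf(int fd_) : fd(fd_) {}
    int sync() override {
	const std::string& s = str();
	if (!s.empty()) {
	    if (write(fd, s.data(), s.size()) < 0) return -1;
	    str(std::string());
	}
	return 0;
    }
};
```

A child could then construct `std::ostream out(&buf);` over this buffer, so each `out << ... << endl` becomes one pipe write, which keeps lines from different children intact (for writes below `PIPE_BUF`).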
comment:6 by , 6 years ago
Currently BackendManagerGlass creates a set of databases in the .glass directory by default. Can we make the backend manager create a separate set of databases in a directory given as a parameter? .glass takes 5MB of disk space; is that already a lot? Or does this change not fit well?
We could just use separate cache directories for glass, remotetcp_glass, etc, but not only does that use more disk space, but we have to build the same database several times, and there's more disk cache pressure, both of which work against trying to speed things up.
I think more work would be needed to get valgrind to work properly here. In runtest we tell valgrind not to follow child processes after fork() (and changing that would make things complicated for remote tests). Probably when valgrind is in use the child process needs to exec() valgrind with a command to run apitest on just the backend of interest, with some option to tell it to report output in TAP format.
Perhaps the simplest first step is to parallelise via the makefile - automake's parallel testharness understands TAP format test output and as the name suggests can run tests in parallel.
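For reference, the automake manual's TAP setup looks roughly like this in a Makefile.am (an illustrative fragment; the test program names here are made up, not Xapian's actual build files):

```make
# Each TESTS entry can then be run in parallel via "make check -jN".
TESTS = apitest_glass apitest_honey apitest_inmemory
# Tell the harness the tests emit TAP, using automake's bundled tap-driver.sh.
TEST_LOG_DRIVER = env AM_TAP_AWK='$(AWK)' $(SHELL) \
                  $(top_srcdir)/build-aux/tap-driver.sh
```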
For output display, I think sending output in TAP format is the best approach. The child process can just write to the pipe by hooking up its end of the pipe as fd 1 and then using cout - no need to do anything special at the iostreams level. The parent process will need to handle displaying test results from multiple children in a sensible way - I think just showing each completed test and recording the failed ones to summarise at the end is probably the best approach.
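A sketch of that wiring (the helper name and the sample TAP line are mine): the child dup2()s the pipe's write end onto fd 1, so plain cout output lands in the pipe for the parent to read:

```cpp
#include <iostream>
#include <string>
#include <sys/wait.h>
#include <unistd.h>

// Fork a child whose stdout is the pipe's write end; the parent reads back
// whatever the child printed with plain std::cout and returns it.
static std::string child_via_pipe() {
    int fds[2];
    if (pipe(fds) < 0) return "";
    pid_t pid = fork();
    if (pid == 0) {
	close(fds[0]);
	dup2(fds[1], 1);   // hook the pipe up as fd 1
	close(fds[1]);
	std::cout << "ok 1 - apitest backend glass" << std::endl;
	_exit(0);
    }
    close(fds[1]);
    std::string result;
    char buf[128];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0)
	result.append(buf, n);
    close(fds[0]);
    int status;
    waitpid(pid, &status, 0);
    return result;
}
```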
I'm surprised you don't see more gain from parallelism - is one child running all the tests which involve glass?
comment:7 by , 6 years ago
Perhaps the simplest first step is to parallelise via the makefile - automake's parallel testharness understands TAP format test output and as the name suggests can run tests in parallel.
Ok. I am getting familiar with automake's parallel test harness.
I'm surprised you don't see more gain from parallelism - is one child running all the tests which involve glass?
Yes. All tests which involve glass run in a single process. Tests related to other backends complete very quickly.
comment:8 by , 5 years ago
Component: | Other → Test Suite |
---|---|
Type: | task → enhancement |
Version: | → git master |
comment:9 by , 19 months ago
https://github.com/xapian/xapian/pull/210 has the prototype using automake's parallel test harness.
I tried updating that (and merging honey into the list of glass-based testsuite backends since we compact the glass DB to give the honey one) and the speed up is disappointing (~20% IIRC), I think mostly because the glass-based list is most of the work. I didn't copy the stats off my laptop but I'll try to remember to add them next time I turn it on.
I have come up with a cheap way to schedule though.
If we annotate testcases (could be automatically derived) with the database names they use then we can partition the testcases such that any which use the same DB are in the same partition. Some use more than one, and that will kind of span between DBs and pull them into the same partition, but many use their own DB or share a DB but without such overlaps.
We order these partitions by decreasing expected time to process (could be just by number of testcases as a simple approximation, but we could feedback from actual runtime), then each worker subprocess just gets the next partition from the list to work through when it needs more to do. This simple greedy algorithm should work well as we have a load of small partitions which should help even out the end of the run between workers.
These partitions can take into account testsuite backends, overlap between them (e.g. glass and honey) and which testcases run for each.
If we aren't running in parallel we could even run testcases in the same order as currently by having a suitable alternative partition set for that.
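The greedy scheme above can be simulated with the classic longest-processing-time heuristic (all names here are mine, and a real runner would fork workers to run the testcases rather than just summing costs): sort partitions by decreasing expected cost, then repeatedly hand the next partition to the worker that will be free soonest.

```cpp
#include <algorithm>
#include <string>
#include <vector>

struct Partition {
    std::vector<std::string> testcases;  // testcases sharing the same DBs
    double expected_cost;                // e.g. testcase count, or past runtime
};

// Simulate greedy assignment of partitions to workers; returns the total
// expected load each worker ends up with.
static std::vector<double> schedule(std::vector<Partition> parts, int nworkers) {
    // Largest partitions first, so the tail of the run is lots of small
    // partitions which even out the finish times between workers.
    std::sort(parts.begin(), parts.end(),
	      [](const Partition& a, const Partition& b) {
		  return a.expected_cost > b.expected_cost;
	      });
    std::vector<double> load(nworkers, 0.0);
    for (const Partition& p : parts) {
	// The worker which becomes free first takes the next partition.
	*std::min_element(load.begin(), load.end()) += p.expected_cost;
    }
    return load;
}
```

Feeding back actual runtimes from previous runs as `expected_cost` would refine the balance over time.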
comment:10 by , 19 months ago
Here are the timings - the saving was actually just under 11% for 4-way parallelism. This was on x86 with eatmydata and without valgrind (just so the runs didn't take so long). /proc/cpuinfo reports 8 CPUs, but I think it's really 4 + hyperthreading.
```
time make check -sj4 VALGRIND= AUTOMATED_TESTING=1
real	2m26.917s
user	1m8.072s
sys	0m36.666s

time make check -sj2 VALGRIND= AUTOMATED_TESTING=1
real	2m29.914s
user	1m8.009s
sys	0m36.334s

time make check -s VALGRIND= AUTOMATED_TESTING=1
real	2m44.534s
user	1m8.860s
sys	0m37.316s
```
Also notable is that there's not much speed up from 2 to 4 processes.
User and system time is reassuringly similar across the runs too.