Ticket #156: HACKING

File HACKING, 45.5 KB (added by Richard Boulton, 17 years ago)

Implementation of this suggestion

Instructions for hacking on Xapian
==================================
3
4.. contents:: Table of contents
5
This file aims to help developers get started with working on
Xapian. The documentation contains a section covering various internal
aspects of the library - this can also be found on the Xapian website
<http://www.xapian.org/>.
10
11Extra options to give to configure:
12===================================
13
14Note: Non-developer configure options are described in INSTALL
15
16You will probably want to use some of these if you're going to be developing
17Xapian.
18
--enable-assertions
 This enables compiling of assertion code which will throw
 Xapian::AssertionError if the code detects a violation of
 preconditions or postconditions, or fails other consistency checks.
23
24--enable-assertions=partial
25 This option enables a subset of the assertions enabled by
26 "--enable-assertions", but not the most expensive. The intention is
27 that it should be suitable for use in a real-world system for tracking
28 down problems without imposing too much of an overhead (but note that
29 we haven't yet performed timings to measure the overhead...)
30
31--enable-log
32 This enables compiling into the system of code to generate verbose
33 debugging messages. See "Debugging Messages", below.
34
35--enable-maintainer-mode
36 This tells configure to enable make dependencies for regenerating build
37 system files (such as configure, Makefile.in, and Makefile) and other
38 generated files (such as the stemmers and query parser) when required.
39 These are disabled by default as some make programs try to rebuild them
40 when it's not appropriate (e.g. BSD make doesn't handle VPATH except
41 for implicit rules). If you enable maintainer mode you probably need
42 to use a better make program (GNU make is recommended). You'll also
43 need a non-cross-compiling C compiler for compiling the Lemon parser
44 generator and the Snowball stemming algorithm compiler. The configure
45 script will attempt to locate one, but you can override the
46 autodetection by passing CC_FOR_BUILD on the command line like so:
47
48 ./configure CC_FOR_BUILD=/opt/bin/gcc
49
50--enable-documentation
51 This tells configure to enable make dependencies for regenerating
52 documentation files. By default it uses the same setting as
53 --enable-maintainer-mode.
54
55Debugging Messages
56==================
57
If you configure with --enable-log, lots of places in the code generate
debugging messages to tell us what they're up to - this information can be
very useful for debugging both the Xapian library and code which uses it. But
the quantity of information generated is potentially vast so there's a
mechanism to allow you to select where to store the log and which types of
message you're interested in by setting environment variables. You can:
64
65 * set XAPIAN_DEBUG_LOG to be the path to a file that you would like debugging
66 output to be stored in (to override the default of stderr). The first
67 occurrence of %% in the name will be replaced with the process-id.
68
69 * set XAPIAN_DEBUG_FLAGS to the decimal value of a bitmap indicating the types
70 of debugging message you would like to display (the default value is 0,
71 which disables all debug messages). To turn on message type N, bitwise OR
72 XAPIAN_DEBUG_FLAGS with (1<<N) - e.g. for message type 3, OR with 8. To
73 turn on all types, set XAPIAN_DEBUG_FLAGS to -1 (which is all bits set in
74 two's complement binary representation). Each message gives its numerical
75 type in the debug log output.
76
77These environment variables only have any effect if you ran configure with the
78--enable-log option.
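As a worked example of the bitmap arithmetic described above, the following shell sketch builds a value which enables message types 3 and 5 and exports both variables. The choice of types and the log path are assumptions made purely for illustration:

```shell
# Message types to enable; 3 and 5 are arbitrary illustrative choices.
flags=0
for n in 3 5; do
    flags=$(( flags | (1 << n) ))
done

# XAPIAN_DEBUG_FLAGS takes the decimal value of the bitmap.
XAPIAN_DEBUG_FLAGS=$flags
# Hypothetical log location; stderr is used if this variable is unset.
XAPIAN_DEBUG_LOG=/tmp/xapian-debug.log
export XAPIAN_DEBUG_FLAGS XAPIAN_DEBUG_LOG

echo "XAPIAN_DEBUG_FLAGS=$XAPIAN_DEBUG_FLAGS"
```

Here (1<<3) is 8 and (1<<5) is 32, so the exported value is 40.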
79
80Debugging memory allocations
81============================
82
83The testsuite can make use of valgrind to check for memory leaks and reads
84from uninitialised memory during tests. This restricts the platforms which
85we can catch leaks on (valgrind currently supports x86, x86_64, and powerpc
86Linux reliably, with other ports being investigated). However Xapian contains
87very little platform specific code (and most of what there is is Windows
88specific) so even just testing with valgrind on one platform gives good
89coverage.
90
If you have a new enough version of valgrind installed, it's automatically
detected by configure and used when running the testsuite. The testsuite runs
more slowly under valgrind, so if you wish to disable this auto-detection you
can run configure with::

  ./configure VALGRIND=

Or you can disable use of valgrind during a particular run of "make check"
like so::

  make check VALGRIND=

Or disable it while running a test directly (under sh or bash)::

  VALGRIND= ./runtest ./apitest
106
107Running test programs
108=====================
109
110To run all tests, use "make check". You can also run just the subset of
111tests which exercise the remote, quartz, or flint backend using
112"make check-remote", "make check-quartz", or "make check-flint" respectively.
113These are handy shortcuts when doing development work on a particular backend.
114
The runtest script (in the tests subdirectory) takes care of the details of
running the test programs (including setting up the environment so they work
when srcdir != builddir and handling libtool dynamically linked binaries). To
run a test program by hand (rather than via make) just use::

  ./runtest ./apitest

You can specify options and arguments. Individual test programs optionally
take one or more test names as arguments, and you can also pass "-v" to get
more verbose output from failing tests, e.g.::

  ./runtest ./apitest -v deldoc1

If the number of the test is omitted, all tests with that basename are run,
so to run deldoc1, deldoc2, etc::

  ./runtest ./apitest deldoc

You can also use runtest to run a test program under gdb (or most other tools)::

  ./runtest gdb ./apitest -v deldoc1
  ./runtest valgrind ./apitest -v deldoc1

Some test programs take special arguments - for example, you can restrict
apitest to the flint backend using "-b=flint".
140
There are a few environment variables which the testsuite harness checks for
which you might find useful:
143
144 XAPIAN_TESTSUITE_SIG_DFL:
145 By default, the testsuite harness catches signals and handles them
146 gracefully - the current test is failed, and the testsuite moves onto the
147 next test. If you want to suppress this (some debugging tools may work
148 better if the signal is not caught) set the environment variable
149 XAPIAN_TESTSUITE_SIG_DFL to any value to prevent the testsuite harness
150 from installing its own signal handling.
151
152 XAPIAN_TESTSUITE_OUTPUT:
153 By default, the testsuite harness uses ANSI escape sequences to give
154 colour output if stdout is a tty. You can disable this feature by setting
155 XAPIAN_TESTSUITE_OUTPUT=plain (alternatively, piping the output (e.g.
156 through cat or more) will have the same effect). Auto-detection can be
157 explicitly specified with XAPIAN_TESTSUITE_OUTPUT=auto (or empty). Any
158 other value forces the use of colour. Colour output is always disabled on
159 Microsoft Windows, so XAPIAN_TESTSUITE_OUTPUT has no effect there.
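The XAPIAN_TESTSUITE_OUTPUT rules above can be summarised as a small decision function. This is only a sketch of the behaviour as described in this file, not the harness's own code; the function name colour_mode and its "yes"/"no" tty argument are invented for the example, and the Windows special case is not modelled:

```shell
# Illustrative only: mirrors the rules described above.
# Usage: colour_mode VALUE IS_TTY   (IS_TTY is "yes" or "no")
colour_mode() {
    case "$1" in
        plain)
            echo plain ;;
        ""|auto)
            # Auto-detection: colour only when stdout is a tty.
            if [ "$2" = yes ]; then echo colour; else echo plain; fi ;;
        *)
            # Any other value forces colour.
            echo colour ;;
    esac
}

# Detect whether stdout is a terminal, then report the mode that would apply.
if [ -t 1 ]; then tty=yes; else tty=no; fi
colour_mode "${XAPIAN_TESTSUITE_OUTPUT:-}" "$tty"
```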
160
161Using various debugging, profiling, and leak-finding tools:
162===========================================================
163
164If you're using GCC 3.4 or newer, you can turn on debugging iterators, etc in
165the GNU C++ STL by defining _GLIBCXX_DEBUG:
166
167 ./configure CPPFLAGS=-D_GLIBCXX_DEBUG
168
169For documentation of this option, see:
170http://gcc.gnu.org/onlinedocs/libstdc++/debug.html
171
172Note: all C++ code must be compiled with this defined or you'll get problems -
173Xapian 0.9.7 and later add a suitable check to xapian/version.h to prevent you
174making this mistake.
175
176To use valgrind (http://www.valgrind.org/), no special build options are
177required, but make sure you compile with debugging information (on by default
178for GCC) and the valgrind documentation recommends disabling optimisation (with
179optimisation, line numbers in error messages can be confusing due to code
180inlining, etc):
181
182 ./configure CXXFLAGS='-O0 -g'
183
184To use gdb (http://www.gnu.org/software/gdb/), no special build options are
185required, but make sure you compile with debugging information (on by default
186for GCC). You'll probably find debugging easier if you compile without
187optimisation (with optimisation, line numbers in error messages can be
188confusing due to code inlining, etc, and the values of some variables can't be
189printed because they've been eliminated from the code completely):
190
191 ./configure CXXFLAGS='-O0 -g'
192
193To enable profiling for gprof:
194
195 ./configure CXXFLAGS=-pg LDFLAGS=-pg
196
197To use Purify (a proprietary tool):
198
199 ./configure CXXLD='purify c++' --disable-shared
200
201To use Insure (another proprietary tool):
202
203 ./configure CXX=insure
204
205If you have runes for using other tools, please add them above, or send them
206to us so we can.
207
208Building from SVN:
209==================
210
211If you're building code from SVN, you'll want to configure with:
212
213./configure --enable-maintainer-mode
214
215This will be done for you if you use the top-level bootstrap script and then
216run the top-level configure this produces (see below for more information about
217this). If you don't enable maintainer mode, then rules to rebuild generated
218sources are disabled (and similarly rules to build documentation are only
219enabled by --enable-documentation, or --enable-maintainer-mode without
220--disable-documentation).
221
222The SVN repository does not contain any automatically generated files
223(such as configure, Makefile.in, Lemon generated sources, etc) because
224experience shows it's best to keep these out of version control. This
225means that if you check the sources out of SVN, before you can successfully
226run the normal build process you'll need to have several programs installed
227so these files can be generated. Note that you can avoid needing to have
228these programs installed by using the SVN snapshots available from the
229"Bleeding Edge" page of the Xapian website. These snapshots are bootstrapped
230tarballs much like any release version.
231
At the time of writing, these programs are autoconf, automake, and libtool.
Some older versions of these programs may not work correctly. At the time of
writing, we require the following versions:
235
236 autoconf (GNU Autoconf) 2.59
237 2.57 fixes the annoying chmod warning on FreeBSD. 2.54 is
238 required by automake 1.6.3; autoconf 2.50 is needed for many
239 reasons anyway. automake 1.8.5 needs at least 2.58, and 2.59
240 was released the same day as 2.58 to fix a problem. Currently
241 snapshots and release tarballs are generated with autoconf 2.61
242 but this isn't yet a hard requirement (because the spec file
243 for building RPMs currently runs autoreconf to avoid a libtool
244 bug with setting rpath for /usr/lib64, and such platforms may
245 still only have autoconf 2.59).
246
247 automake (GNU automake) 1.9.5
248 automake 1.4 has problems with "make dist" in a VPATH build,
249 and doesn't support AM_CXXFLAGS.
250
251 automake 1.5 doesn't work with "make check" with Solaris make -
252 the problem is with the rules to build tests/internaltest
253 (perhaps no longer relevant as those rules are simpler now).
254
255 Note that automake 1.6 has a bug which causes it to emit
256 spurious warnings: this is fixed in automake 1.6.1.
257
258 automake 1.7 works too.
259
260 automake 1.8 works too (we required 1.8.5 for ages).
261
262 automake 1.9's NEWS file suggests it will benefit us with
263 smaller Makefile.ins amongst other things. Currently snapshots
264 and release tarballs are generated with automake 1.9.6, but
265 this isn't a hard requirement - automake 1.9.5 is sufficient
266 for bootstrapping sources from SVN (this is the version which
267 debian sarge had, so this shouldn't be an onerous requirement).
268
269 automake 1.10 is now out. It requires autoconf 2.60. I've not
270 yet tested it with Xapian, but I don't expect any problems.
271
272 GNU libtool 1.5.22 + patches
273 libtool 1.5 was the first version to properly support linking
274 C++ libraries, and 1.5.22 is largely 1.5 plus bug fixes and
275 portability enhancements. (Note: nothing actually enforces
276 the requirement for 1.5.22, but this is the version which
277 snapshots and release tarballs are currently bootstrapped
278 with).
279
 We currently use a patched build of libtool 1.5.22. The
 patches are (append /raw to the gmane URLs to get plaintext
 versions):
283
284 A fix for libtool bug on HP-UX:
285
286 http://article.gmane.org/gmane.comp.gnu.libtool.general/7083
287
288 A fix for compiling with -library=stlport4 using Sun C++.
289 This thread discusses the patches:
290
291 http://thread.gmane.org/gmane.comp.gnu.libtool.patches/7041
292
293 The complete patch isn't there in one piece - it was extracted
294 from branch branch-1-5 of libtool's CVS using:
295
296 cvs diff -r1.314.2.160 -r1.314.2.162 libtool.m4
297
298 And 2 fixes for regressions on BSDs introduced in 1.5.22:
299
300 http://article.gmane.org/gmane.comp.gnu.libtool.patches/6862
301 http://article.gmane.org/gmane.comp.gnu.libtool.patches/6861
302
303Please tell us if you find that older or newer versions of any of these
304tools work or fail to work.
305
306We have provided a simple script (bootstrap) to run these programs for you
307on all the xapian modules you've checked out of SVN to produce a source tree
308like that you'd get from unpacking the result of "make dist". bootstrap is
309in SVN in the level above xapian-core, etc. Running bootstrap generates
310a configure script in the top level which allows you to configure xapian-core
311and any other modules you've checked out with one command.
312
The bootstrap script should be run from its source directory (i.e. from the
directory containing it). The configure script generated by it supports
315building in a separate directory to the sources: simply create the directory
316you want to build in, and then run the configure script from inside that
317directory. For example, to build in a directory called "build" (starting in
318the top level source directory)::
319
320 ./bootstrap
321 mkdir build
322 cd build
323 ../configure
324
325When running bootstrap, you may need to add extra macro directories to the path
326searched by aclocal (which is part of automake) - you can do this by specifying
327these in the ACLOCAL_FLAGS environment variable, e.g.::
328
329 ACLOCAL_FLAGS=-I/extra/macro/directory ./bootstrap
330
331There is a good GNU autotools tutorial at
332<http://www-src.lip6.fr/homepages/Alexandre.Duret-Lutz/autotools.html>.
333
334If you are tracking development in SVN, there will sometimes be changes to the
335build system sources which require regeneration of the generated makefiles and
336associated machinery. We aim to make the build system automatically regenerate
337the necessary files, but in the event that a build fails after an update, it
338may be worth re-running the bootstrap script to regenerate the build system
339from scratch, before looking for the cause of the error elsewhere.
340
341If you want to be able to build distribution tarballs (with "make dist") then
342you'll also need some further tools. The build system is designed to fail with
343a suitable message if you lack any of the required tools (the alternative is to
344build a tarball with various bits missing, which is best avoided - better to be
345told to install pdflatex than to upload a tarball with no PDF manual).
346
347These tools are:
348
349doxygen (v1.4.6 is used for snapshots, releases, and the source documentation)
350dot (part of the graphviz package)
351perl 5
352pdflatex (on Debian, tetex-extra is also required for fancyhdr.sty)
353makeindex (usually packaged with TeX)
354help2man
355rst2html (on Debian/Ubuntu, this is provided by the python-docutils package)
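Before attempting "make dist", it can be handy to check which of these tools are on your PATH. The following is a minimal sketch: the helper name missing_tools is invented for the example, and it only tests for the presence of each program, not its version or extras such as fancyhdr.sty:

```shell
# Print each named program that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Tool names taken from the list above.
missing=$(missing_tools doxygen dot perl pdflatex makeindex help2man rst2html)
if [ -n "$missing" ]; then
    # Left unquoted on purpose so the names print on one line.
    echo "Missing tools needed for make dist:" $missing
else
    echo "All tools needed for make dist were found"
fi
```

The build system performs its own checks, so this is just a quick way to find out in advance what needs installing.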
356
357Building from SVN on Windows with MSVC:
358---------------------------------------
359
The Windows build process is maintained in the xapian-maintainer-tools
directory in the Subversion repository. See the win32msvc/README file in that
directory for details of how to build from Subversion.
363
364Use of C++ Features:
365====================
366
367* STL: We decided early on to embrace the C++ STL. Some older compilers
368 don't include full support for this. Often we can work around this, for
369 example:
370
371 * Using om_stringstream instead of stringstream or strstream.
372 * Providing our own auto_ptr implementation (AutoPtr).
373 * Using string::resize(0) instead of string::clear() (for GCC 2.95).
374 * Avoiding use of '#include <limits>' (for GCC 2.95; GCC 3.0+ support it).
375
376 There is now plenty of choice of compilers which provide good conformance to
377 ISO C++, so if working around problems for some compiler proves too hard we
378 should just document the issue and users will either have to upgrade to a
379 more compliant compiler, or use another STL implementation such as STLPort
380 (http://www.stlport.org/).
381
382* RTTI (dynamic_cast<>, typeid, etc): Needing to use RTTI features in the
383 library most likely indicates a design flaw, and you should avoid use
384 of these features. Where necessary, you can use a technique similar to
385 Database::as_networkdatabase() to replace dynamic_cast<>.
386
387* Exceptions: In hindsight, throwing exceptions in the library seems to have
388 been a poor design decision. GCC on Solaris can't cope with exceptions in
389 shared libraries, and we've also had test failures on other platforms which
390 only occur with shared libraries - possibly with a similar cause. Exceptions
391 can also be a pain to handle elegantly in the bindings. We intend to
392 investigate modifying the library to return error codes internally, and then
393 offering the user the choice of exception throwing or error code returning
394 API methods (with the exception being thrown by an inlined wrapper in the
395 externally visible header files). With this in mind, please don't complicate
396 the internal handling of exceptions...
397
398* "using namespace std;" and "using std::XXX;" - it's OK to use these in
399 applications, library code, and internal library headers. But in externally
400 visible headers (such as anything included by "#include <xapian.h>") you MUST
401 use explicit "std::" qualifiers - it's not acceptable to pull anything from
402 namespace std into the namespace of an application which uses Xapian.
403
404* Use C++ style casts (static_cast<>, reinterpret_cast<>, and const_cast<>)
405 in preference to C style casts. The syntax is ugly, but they do make the
406 intent much clearer which is definitely a good thing.
407
408Miscellaneous Portability Issues:
409=================================
410
411Web Resources:
412--------------
413
414The "C++ FAQ Lite" covers many frequently asked C++ questions:
415http://www.parashift.com/c++-faq-lite/
416
417The libstdc++-porting-howto discusses various C++ portability issues:
418http://gcc.gnu.org/onlinedocs/libstdc++/17_intro/porting-howto.html
419
420<fcntl.h>:
421----------
422
423Don't directly '#include <fcntl.h>' - instead '#include "safefcntl.h"'.
424
425The main reason for this is that when using certain compilers on certain
426versions of Solaris, fcntl.h does '#define open open64'. Sadly this breaks C++
427code which has methods called open (as we do). There's a cunning workaround
428for this problem in common/safefcntl.h.
429
Also, safefcntl.h ensures that O_BINARY is defined (to 0 if not required) so
calls to open() and creat() can specify O_BINARY unconditionally for the
benefit of platforms which discriminate between text and binary files.
433
434<windows.h>:
435------------
436
437Don't directly '#include <windows.h>' - instead '#include "safewindows.h"'
438which reduces the bloat of header files included and prevents some of the
439more egregious namespace pollution. It also defines any constants we need
440which might be missing in older versions of the mingw headers.
441
442<winsock2.h>:
443-------------
444
Don't directly '#include <winsock2.h>' - instead '#include "safewinsock2.h"'.
This ensures that safewindows.h is included before <winsock2.h> to avoid
winsock2.h including windows.h without our namespace pollution reducing
workarounds.
449
450<errno.h>:
451----------
452
453Don't directly '#include <errno.h>' - instead '#include "safeerrno.h"' which
454works around a problem with Compaq's C++ compiler.
455
456<sys/select.h>:
457---------------
458
459Don't directly '#include <sys/select.h>' - instead '#include "safesysselect.h"'
460which supports older UNIX platforms which predate POSIX 1003.1-2001 and works
461around a problem on Solaris.
462
463<sys/stat.h>:
464-------------
465
466Don't directly '#include <sys/stat.h>' - instead '#include "safesysstat.h"'
467which under MSVC enables stat to work on files > 2GB, defines the missing
468POSIX macros S_ISDIR and S_ISREG, pulls in <direct.h> for mkdir() (which is
469provided by sys/stat.h under UNIX) and provides a compatibility wrapper for
470mkdir() which takes 2 arguments (so code using mkdir can always just pass
471two arguments).
472
473<unistd.h>:
474-----------
475
476Don't directly '#include <unistd.h>' - instead '#include "safeunistd.h"'
477- MSVC doesn't even HAVE unistd.h!
478
479The various "safe" headers are maintained in xapian-core/common, but also used
480by Omega. Omega pulls in a copy using the svn:externals property which is
481set on xapian-applications/omega. Because of how this feature of SVN works,
482we pull in a read-only copy via HTTP access to the main repository, so you
483have to update it in xapian-core, and if you have ssh write access to the
484repo but no HTTP access, this will fail.
485
486The imported URL has to be absolute, which isn't too branch friendly. To avoid
487problems from this, we specify a particular revision to import, but this does
488mean we need to monitor changes to xapian-core and decide when to update omega.
489The release checklist includes a reminder to check this.
490
491Warning Free Compilation:
492-------------------------
493
494Compiling without warnings on every platform is our goal, though it's not
495always possible to achieve. For example, GCC 2.95 produces a few bogus
496warnings (e.g. about not returning a value from a non-void function),
497and some GCC 3.x compilers produce the occasional bogus warning (e.g.
498warning that a variable may be used uninitialised, despite it being initialised
499at the point of declaration!)
500
If using GCC 3.0 or newer, you should consider configuring with:
502
503./configure CXXFLAGS=-Werror
504
505when doing development work on Xapian. This promotes warnings to errors,
506and should ensure you don't introduce warnings.
507
508If you configure with --enable-maintainer-mode, and are using GCC 4.0 or newer,
509this is done for you automatically. This is intended to be an aid rather than
510a form of automated punishment - it's all too easy to miss a new warning as
511once a file is compiled, you don't see it unless you modify that file or one of
512its dependencies.
513
514With Intel's C++ compiler, --enable-maintainer-mode also enables -Werror.
515If you know the equivalent of -Werror for other compilers, please add a note
516here, or tell us so that we can add a note.
517
518Configure Options
519=================
520
521Especially for a library, compile-time options aren't a good solution for
522how to integrate a new feature. An increasingly large number of users install
523pre-built binary packages rather than building from source, and unless the
524package is capable of being split into modules, the packager has to choose a
525set of compile-time options to use. And they'll tend to choose either the
526standard ones, or perhaps a broader set to try to keep everyone happy. For a
527library, similar issues occur when installing from source as well - the
528sysadmin must choose the options which will keep all users happy.
529
530Another problem with compile-time options is that it's hard to ensure that
531a change doesn't break compilation under some combination of options without
532actually building and running the test-suite on all combinations. The fewer
533compile-time options, the more likely the code will compile with every
534combination of them.
535
536So please think carefully before adding more compile-time options. They're
537probably OK for experimental features (but should go away once a feature is no
538longer experimental). Options to instrument a build for special purposes
539(debug, profiling, etc) are also acceptable. Disabling whole features probably
540isn't (e.g. the --disable-backend-XXX options we already have are dubious,
541though being able to disable the remote backend can be useful when trying to
542get Xapian going on a platform).
543
544Makefile Portability:
545=====================
546
547We don't want to force those building Xapian from the source distribution to
548have to use GNU make. Requiring GNU make for "make dist" isn't such a problem
549but it's probably better to use portable constructs everywhere to avoid
550problems when people move or copy code between targets. If you do make use
551of non-portable constructs where it's OK, add a comment noting the special
552circumstances which justify doing so.
553
554Here's an incomplete list of things to avoid:
555
556* Don't use "$(RM)" - it's defined by GNU make, but using it actually harms
557 portability as other makes don't define it. Use plain "rm" instead.
558
559* Don't use "%" pattern rules - these are GNU make specific. Use an
560 implicit rule (e.g. ".c.o:") if you can. Otherwise, write out each version
561 explicitly.
562
563* Don't use "$<" except in implicit rules. This is an annoying restriction,
564 as using "$<" makes it much easier to make VPATH builds work. But it's only
565 portable in implicit rules. Tips for rewriting - if it's a source file,
566 write it as::
567
568 $(srcdir)/foo.ext
569
570 If it's a generated object file or similar, just write the name as is. The
571 tricky case is a generated file which isn't in SVN but is shipped in the
572 distribution tarball, as such a file could be in either the source or build
573 tree. Use this trick to make sure it's found whichever directory it's in::
574
575 `test -f foo.ext || echo '$(srcdir)/'`foo.ext
576
577* Don't use "exit 0" to make a rule fail. Use "false" instead. BSD make
578 doesn't like "exit 0" in a rule.
579
580* Don't use make conditionals. Automake offers conditionals which may be
581 of use, and these are implemented to work with any make. See the automake
582 manual for details, and a few caveats.
583
584* The list of portable utilities is:
585
586 cat cmp cp diff echo egrep expr false grep install-info
587 ln ls mkdir mv pwd rm rmdir sed sleep sort tar test touch true
588
589 Note that versions of these (GNU versions in particular) support switches
590 which aren't portable - notably, "test -r" isn't portable; neither is
591 "cp -a". And note that "mkdir -p" isn't portable - the semantics vary.
592 See the "Goat Book" for more details and other useful tips:
593
594 http://sources.redhat.com/autobook/
595
596* Don't use "include" - it's not present in BSD make (at least some versions
597 have ".include" instead, but that doesn't really seem to help...) Automake
598 provides a configure-time include, which may provide a replacement for some
599 uses of "include".
600
601* It appears that BSD make only supports VPATH for implicit rules (e.g. ".c.o:")
602 - there's certainly a restriction there which is not present in GNU make.
603 We used to try to work around this, but now we use AM_MAINTAINER_MODE to
604 disable rules which are only needed by those developing Xapian (these were
605 the rules which caused problems). And we recommend those developing Xapian
606 use GNU make to avoid problems.
607
608* Rules with multiple targets can cause problems for parallel builds. These
609 rules are really just a shorthand for multiple rules with the same
610 prerequisites and commands, and it is fine to use them in this way. However,
611 a common temptation is to use them when a single invocation of a command
612 generates multiple output files, by adding each of the output files as a
 target. E.g. if a SWIG language module generates xapian_wrap.cc and
 xapian_wrap.h, it is tempting to add a single rule something like::
615
616 # This rule has a problem
617 xapian_wrap.cc xapian_wrap.h: xapian.i
618 SWIG_commands
619
620 This can result in SWIG_commands being run twice, in parallel. If
621 SWIG_commands generates any temporary files, the two invocations can
622 interfere causing one of them to fail.
623
624 Instead of this rule, one solution is to pick one of the output files as a
625 primary target, and add a dependency for the second output file on the first
626 output file::
627
628 # This rule also has a problem
629 xapian_wrap.h: xapian_wrap.cc
630 xapian_wrap.cc: xapian.i
631 SWIG_commands
632
633 This ensures that make knows that only one invocation of SWIG_commands is
634 necessary, but could result in problems if the invocation of SWIG_commands
635 failed after creating xapian_wrap.cc, but before creating xapian_wrap.h.
636 Instead, we recommend creating an intermediate target::
637
638 # This rule works in most cases
639 xapian_wrap.cc xapian_wrap.h: xapian_wrap.stamp
640 xapian_wrap.stamp: xapian.i
641 SWIG_commands
642 touch $@
643
644 Because the intermediate target is only touched after the commands have
645 executed successfully, subsequent builds will always retry the commands if an
646 error occurs. Note that the intermediate target cannot be a "phony" target
647 because this would result in the commands being re-run for every build.
648
649 However, this rule still has a problem - if the xapian_wrap.cc and
650 xapian_wrap.h files are removed, but the xapian_wrap.stamp file is not, the
651 .cc and .h files will not be regenerated. There is no simple solution to
652 this, but the following is a recipe taken from the automake manual which
653 works. For details of *why* it works, see the section in the automake manual
654 titled "Multiple Outputs"::
655
656 # This rule works even if some of the output files were removed
657 xapian_wrap.cc xapian_wrap.h: xapian_wrap.stamp
658 ## Recover from the removal of $@. A full explanation of these rules is in
659 ## the automake manual under the heading "Multiple Outputs".
660 @if test -f $@; then :; else \
661 trap 'rm -rf xapian_wrap.lock xapian_wrap.stamp' 1 2 13 15; \
662 if mkdir xapian_wrap.lock 2>/dev/null; then \
663 rm -f xapian_wrap.stamp; \
664 $(MAKE) $(AM_MAKEFLAGS) xapian_wrap.stamp; \
665 rmdir xapian_wrap.lock; \
666 else \
667 while test -d xapian_wrap.lock; do sleep 1; done; \
668 test -f xapian_wrap.stamp; exit $$?; \
669 fi; \
670 fi
671 xapian_wrap.stamp: xapian.i
672 SWIG_commands
673 touch $@
674
* This is actually a robustness point, not portability per se. Rules which
  generate files should be careful not to leave a partial file in place if
  there's an error, as it will have a timestamp which leads make to believe it's
  up-to-date. So this is bad::

    foo.cc: script.pl
    	$(PERL) script.pl > foo.cc

  This is better::

    foo.cc: script.pl
    	$(PERL) script.pl > foo.tmp
    	mv foo.tmp foo.cc

  Alternatively, pass the output filename to the script and make sure you
  delete the output on error or a signal (although this approach can leave
  a partial file in place if the power fails). All Makefile.am files and
  scripts in use were checked (and fixed where required) as of 2003-07-10
  (xapian-bindings was not checked).
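The same "write to a temporary file, then rename" idea applies when the generator is a program which is passed the output filename. The following is only a minimal sketch: the helper name ``write_file_atomically`` and the ``.tmp`` suffix are invented for illustration and aren't taken from the Xapian sources.

```cpp
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical helper (not part of Xapian): write `contents` to `path` via a
// temporary file, so that `path` itself never holds a partial result.
// Returns true on success.
bool write_file_atomically(const std::string& path,
                           const std::string& contents) {
    const std::string tmp = path + ".tmp";  // Assumed naming convention.
    {
        std::ofstream out(tmp.c_str(), std::ios::binary | std::ios::trunc);
        if (!out) return false;
        out << contents;
        out.close();
        if (!out) {
            // Writing failed: delete the partial temporary file.
            std::remove(tmp.c_str());
            return false;
        }
    }
    // Only replace the real output once the temporary file is complete.
    // (On POSIX systems rename() replaces an existing file; on some other
    // platforms it fails if the destination already exists.)
    if (std::rename(tmp.c_str(), path.c_str()) != 0) {
        std::remove(tmp.c_str());
        return false;
    }
    return true;
}
```

As with the "mv" recipe, an interrupted run leaves at worst a stale temporary file, which make doesn't treat as an up-to-date target.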

And lastly a style point - using "@" to suppress echoing of commands being
executed removes choice from the user - they may want to see what commands
are being executed. And if they don't want to, many versions of make support
"make -s" to suppress the echoing of commands.

Using @echo on a message sent to stdout or stderr is acceptable (since it
avoids showing the message twice). Otherwise don't use "@" - it makes it
harder to track down problems in the makefiles.

Use of Assert
=============

Use Assert to perform internal consistency checks, and to check for invalid
arguments to functions and methods (e.g. passing a NULL pointer when this isn't
permitted). It should *NOT* be used to check for error conditions such as
file read errors, memory allocation failing, etc (since we want to perform such
checks in non-debug builds too).

File format errors should also not be tested with Assert - we want to catch
a corrupted database or a malformed input file in a non-debug build too.

There are several variants of Assert:

- Assert(P) -- asserts that expression P is true
- AssertEq(a,b) -- asserts that expressions a and b are equal [message reports
  values of a and b, so is more informative than Assert((a)==(b))]
- AssertNe(a,b) -- asserts a and b are not equal
- AssertEqDouble(a,b) -- asserts a and b differ by less than DBL_EPSILON

- AssertParanoid(P) -- a particularly expensive assertion. If you want a build
  with Asserts enabled, but without a great performance overhead, pass
  --enable-assertions=partial to configure: AssertParanoids won't be checked,
  but Asserts will. You can also use AssertEqParanoid and AssertNeParanoid.
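
To illustrate the distinction drawn above between programming errors and runtime errors, here is a small self-contained sketch. ``SKETCH_ASSERT`` is a stand-in defined only for this example (the real Assert macros are internal to the library and throw Xapian::AssertionError), and ``read_record`` and its one-byte length prefix are hypothetical.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Stand-in for Assert(), for illustration only. Like the real thing, it is
// compiled out unless assertions are enabled.
#ifdef SKETCH_ENABLE_ASSERTIONS
# define SKETCH_ASSERT(P) \
    do { if (!(P)) throw std::logic_error("Assertion failed: " #P); } while (0)
#else
# define SKETCH_ASSERT(P) do { } while (0)
#endif

// Hypothetical function: read a record stored as a one-byte length followed
// by that many bytes of data.
std::string read_record(const char* data, std::size_t len) {
    // Passing NULL is a bug in the caller, so an assertion is appropriate.
    SKETCH_ASSERT(data != NULL);

    // A truncated record is a file format error. It must be caught in
    // non-debug builds too, so test it unconditionally and throw.
    if (len == 0)
        throw std::runtime_error("Corrupt record: no length byte");
    const std::size_t n = static_cast<unsigned char>(data[0]);
    if (n > len - 1)
        throw std::runtime_error("Corrupt record: length exceeds data");

    return std::string(data + 1, n);
}
```

The format check stays in place whether or not assertions are compiled in, which is the behaviour this section asks for.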

Marking Features as Deprecated
==============================

In the API headers, a feature (a class, method, function, enum, typedef, etc)
can be marked as deprecated by using the XAPIAN_DEPRECATED() macro. Note that
you can't deprecate a preprocessor macro.

For compilers with a suitable mechanism (currently GCC 3.1 or later, and
MSVC 7.0 or later) this causes compile-time warning messages to be emitted for
any use of the deprecated feature. For compilers without support, the macro
just expands to its argument.

You must add this line to any API header which uses XAPIAN_DEPRECATED()::

    #include <xapian/deprecated.h>

When marking a feature as deprecated, document the deprecation in
docs/deprecation.rst. When actually removing deprecated features, please tidy
up by removing the inclusion of <xapian/deprecated.h> from any file which no
longer marks any features as deprecated.

The XAPIAN_DEPRECATED() macro should wrap the whole declaration except for the
semicolon and any "definition" part, for example::

    XAPIAN_DEPRECATED(int old_function(double arg));

    class Foo {
      public:
        XAPIAN_DEPRECATED(int old_method());

        XAPIAN_DEPRECATED(int old_const_method() const);

        XAPIAN_DEPRECATED(static int old_static_method());

        XAPIAN_DEPRECATED(static const int OLD_CONSTANT) = 42;
    };

To avoid compilation errors with older GCC versions (noted with GCC 3.3.5),
you can't mark a method which is defined inline in a class with
XAPIAN_DEPRECATED (this works with recent GCC versions though)::

    class Foo {
      public:
        // This fails to compile with GCC 3.3.5, so don't do this!
        XAPIAN_DEPRECATED(int old_inline_method()) { return 42; }
    };

Instead rewrite like so::

    class Foo {
      public:
        XAPIAN_DEPRECATED(int old_inline_method());
    };

    inline int Foo::old_inline_method() { return 42; }
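
For readers wondering why the macro has to wrap the declaration but not the semicolon or definition, the following sketch shows one way such a macro could be written. ``SKETCH_DEPRECATED`` is a hypothetical stand-in, and the actual definition in <xapian/deprecated.h> may differ. ``new_function`` is likewise invented for the example.

```cpp
// Hypothetical stand-in for XAPIAN_DEPRECATED(), for illustration only.
#if defined(__GNUC__)
// GCC attaches the attribute after the declarator.
# define SKETCH_DEPRECATED(D) D __attribute__((__deprecated__))
#elif defined(_MSC_VER)
// MSVC puts the marker before the declaration.
# define SKETCH_DEPRECATED(D) __declspec(deprecated) D
#else
// No suitable mechanism, so the macro just expands to its argument.
# define SKETCH_DEPRECATED(D) D
#endif

// The macro wraps the whole declaration except for the semicolon.
SKETCH_DEPRECATED(int old_function(double arg));

// The replacement which callers should move to.
int new_function(double arg) { return static_cast<int>(arg) * 2; }

// The definition is given separately and isn't wrapped by the macro.
int old_function(double arg) { return new_function(arg); }
```

With a macro of this shape, any code calling ``old_function`` gets a compile-time warning on compilers which support it, while the definition itself compiles quietly.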

Submitting Patches:
===================

If you have a patch to fix a problem in Xapian, or to add a new feature,
please send it to us for inclusion. Any major changes should be discussed
on the xapian-devel mailing list first:
<http://www.xapian.org/lists.php>

We find patches in unified diff format easiest to read. If you're using a
SVN checkout just use "svn diff" to generate the diff. If you're working
from a tarball, compare against the original versions of files using
"diff -puN" (-p reports the function name for each chunk).

Please set the width of a tab character in your editor to 8 spaces, and use
Unix line endings (i.e. LF, not CR+LF). Failing to do so will make it much
harder for us to merge in your changes.

We don't currently have a formal coding standards document, but please try
to follow the style of the existing code. In particular:

* Indent C++ code by 4 spaces for each new indentation level, and set your
  editor to tab-fill indentation.

* Put a space before the "(" after control flow constructs like "for", "if",
  "while", etc. Don't put a space before the "(" in function calls. So
  write "if (strlen(p) > 10)" not "if(strlen (p) > 10)".

* If incrementing an iterator, prefer "++i" to "i++" unless you're actually
  making use of the returned value ("++i" may be more efficient for some
  iterators with some compilers).

* Prefer "container.empty()" to "container.size() == 0" (and
  "!container.empty()" to "container.size() != 0" or "container.size() > 0").
  Finding the size of a container may not be a constant time operation for
  all containers (e.g. std::list may not be); also the "empty()" form makes
  the intent of the test more explicit.
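
As an illustration of the style points above, here is a short function written to follow them. The function ``count_long_words`` is made up for this example, and the listing is shown with spaces rather than tab-filled indentation.

```cpp
#include <cstddef>
#include <cstring>
#include <list>
#include <string>

// Hypothetical helper: count the words which are longer than 10 characters.
std::size_t count_long_words(const std::list<std::string>& words) {
    std::size_t count = 0;
    // Prefer empty() to "size() == 0".
    if (words.empty())
        return count;
    std::list<std::string>::const_iterator i;
    // Space before "(" after "for" and "if", but not in function calls.
    // Prefix increment is used since the old value isn't needed.
    for (i = words.begin(); i != words.end(); ++i) {
        if (std::strlen(i->c_str()) > 10)
            ++count;
    }
    return count;
}
```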

We will do our best to give credit where credit is due - if we have used
patches from you, or received helpful reports or advice, we will add your name
to the AUTHORS file (unless you specifically request us not to). If you see we
have forgotten to do this, please draw it to our attention so that we can
address the omission.

Developers with SVN access:
===========================

People who are more seriously involved with the project are likely to
have write access to the SVN repository. This section gives the conventions
for those developers.

1) Make sure that the documentation is updated
----------------------------------------------

 * API classes, methods, functions, and types must be documented by
   documentation comments alongside the declaration in ``include/xapian/*.h``.
   These are collated by doxygen - see doxygen's documentation for details
   of the supported syntax.

 * The documentation comments don't give users a good overview, so we also
   need documentation which gives a good overview of how to achieve particular
   tasks.

 * Internal classes, etc should also be documented by documentation comments
   where they are declared.

2) Make sure the tests are right
--------------------------------

 * If you're adding a feature, also add feature tests for it. These both
   ensure that the feature isn't broken to start with and detect if later
   changes stop it working as intended.
 * If you've fixed a bug, make sure there's a regression test which
   fails on the existing code and succeeds after your changes.
 * If you're adding a new testcase to exhibit an existing bug, and not checking
   a fix in at the same time, mark the testcase as a known failure (by calling
   the macro "KNOWN_FAILURE" somewhere in your testcase), so that the build
   continues to succeed. This allows the automated build systems to continue
   to work, whilst displaying the error to developers. Fixing the bug is then
   a priority - we can't generally make a release while there are known
   failures. Note that failures which are due to valgrind finding memory
   errors are not affected by this macro, because this would cause the
   testsuite to fail for users without valgrind. Also, test failures which are
   only shown by valgrind won't cause problems for the automated builds, which
   don't currently use valgrind.
 * Make sure all existing tests continue to pass.

If you don't know how to write tests using the Xapian test rig, then
ask. It's reasonably simple once you've done it once. There is a brief
introduction to the Xapian test system in docs/tests.html .

3) Make sure the attributions are right
---------------------------------------

 * If necessary, modify the copyright statement at the top of any
   files you've altered. If there is no copyright statement, you may
   add one (there are a couple of Makefile.am's and similar that don't
   have copyright statements; anything that small doesn't really need
   one anyway, so it's a judgement call). If you've added files, they
   should include the GPL boilerplate with your name only.
 * If you're not in there, add yourself to the AUTHORS file.

4) Create a ChangeLog entry and commit
--------------------------------------

 * Add an entry to the ChangeLog file at the top of the module. The
   text of this can be identical to the SVN commit message. The datestamps in
   our ChangeLog entries are as produced by the Unix date utility when invoked
   as::

     date "+%a %b %d %T %Z %Y"

 * Commit to the repository.

Then you can update any patch, bug or feature request items in Bugzilla
to indicate that they've been dealt with.

API Structure Notes
===================

We use reference counted pointers for most API classes. These are implemented
using Xapian::Internal::RefCntPtr, the implementation of which is exposed for
efficiency, and because it's unlikely we'll need to change it frequently, if at
all.

For the reference counted classes, the API class (e.g. Xapian::Enquire) is
really just a wrapper around a reference counted pointer. This points to an
internal class (e.g. Xapian::Enquire::Internal). The reference counted
pointer is a member variable of the API class called internal. Conceptually
this member is private, though it typically isn't declared as private (this
is to avoid littering the external headers with friend declarations for
non-API classes).
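
The structure described above can be sketched as follows. This is a simplified illustration and not the library's code: ``SketchRefCounted``, ``SketchRefPtr`` and ``SketchEnquire`` are invented names, and the real Xapian::Internal::RefCntPtr differs in detail.

```cpp
// Base class holding the reference count (hypothetical).
class SketchRefCounted {
  public:
    mutable unsigned ref_count;
    SketchRefCounted() : ref_count(0) { }
    virtual ~SketchRefCounted() { }
};

// Minimal intrusive reference counted pointer (hypothetical).
template <class T>
class SketchRefPtr {
    T* ptr;
    void release() { if (ptr && --ptr->ref_count == 0) delete ptr; }
  public:
    explicit SketchRefPtr(T* p = 0) : ptr(p) { if (ptr) ++ptr->ref_count; }
    SketchRefPtr(const SketchRefPtr& o) : ptr(o.ptr) {
        if (ptr) ++ptr->ref_count;
    }
    SketchRefPtr& operator=(const SketchRefPtr& o) {
        // Increment first so that self-assignment is safe.
        if (o.ptr) ++o.ptr->ref_count;
        release();
        ptr = o.ptr;
        return *this;
    }
    ~SketchRefPtr() { release(); }
    T* operator->() const { return ptr; }
    T* get() const { return ptr; }
};

// The API class is just a wrapper around a reference counted pointer.
class SketchEnquire {
  public:
    class Internal;
    // Conceptually private, but not declared private so that no friend
    // declarations are needed in the external header.
    SketchRefPtr<Internal> internal;
    SketchEnquire();
    void set_value(int v);
    int get_value() const;
};

// The internal class holds the actual state.
class SketchEnquire::Internal : public SketchRefCounted {
  public:
    int value;
    Internal() : value(0) { }
};

SketchEnquire::SketchEnquire() : internal(new Internal) { }
void SketchEnquire::set_value(int v) { internal->value = v; }
int SketchEnquire::get_value() const { return internal->value; }
```

In this sketch, copying a ``SketchEnquire`` object copies only the pointer, so both copies refer to the same internal object, which is deleted when the last reference goes away.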

There are a few exceptions to the reference counted structure, such as
MSetIterator and ESetIterator which have an exposed implementation. Tests show
this makes a substantial difference to speed (it's ~20% faster) in typical
cases of iterator use.

The postfix operator++ for iterators should be implemented inline in terms
of the prefix form as described by Joe Buck on the gcc mailing list
- excerpt from http://article.gmane.org/gmane.comp.gcc.devel:50201 ::

    class some_iterator {
      public:
        // ...
        some_iterator& operator++();

        some_iterator operator++(int) {
            some_iterator tmp = *this;
            operator++();
            return tmp;
        }
    };

    The compiler is allowed to assume that the copy constructor only does
    a copy, and to optimize away unneeded copy operations. The result
    in this case should be that, for some_iterator above, using the
    postfix operator without using the result should give code equivalent
    to using the prefix operator.

    Now, for [GCC 3.4], you'll find that the dead uses of tmp are only
    completely optimized away if tmp has only one data member that can fit in a
    register. [GCC 4.0 will do] better, and you should find that this style
    comes very close to eliminating any penalty from "incorrect" use of the
    postfix form.

Xapian's PostingIterator, TermIterator, and PositionIterator all have only one
data member which fits in a register.

Handy tips for aiding development
=================================

If you find you are repeatedly changing the API headers (in include/)
during development, then you may become annoyed that the docs/ subdirectory
will rebuild the doxygen documentation every time you run "make" since this
takes a while. You can disable this temporarily (if you're using GNU make),
by creating a file "docs/GNUmakefile" containing these two lines::

    %:
    	@echo "Skipping 'make $@' in docs"

Note that the whitespace at the start of the second line needs to be a
single "tab" character!

Don't forget to remove (or rename) this and check the documentation builds
before committing or generating a patch though!

How to make a release
=====================

This is a (hopefully complete) list of the jobs which need doing:

* Email Fabrice Colin so he can check RPM spec files.

* Check the revision currently specified in the svn:externals property of
  xapian-applications/omega. Unless there's a good reason, we should release
  xapian-core and omega with synchronised versions of the shared files.

* Make sure that any new/changed/removed API methods in xapian-core have been
  wrapped/updated/removed in xapian-bindings.

* Update the lists of deprecated/removed API methods in docs/deprecation.rst

* Update the NEWS files using information from the ChangeLog files

* Update the PLATFORMS file. Don't forget to use reports from the tinderbox:
  http://www.oligarchy.co.uk/tinderbox/xapian/status.html

* Update the version in configure.ac for each module (xapian-core, omega, and
  xapian-bindings), and the library version info in xapian-core's configure.ac

* Move any bugs fixed by this release from "RESOLVED FIXED" -> "CLOSED"
  http://www.xapian.org/cgi-bin/bugzilla/buglist.cgi?bug_status=RESOLVED&resolution=FIXED
  Make sure the submitters are mentioned in the "thanks" list in AUTHORS.

* On ixion, svn tag the source trees for the new revision - use the
  svn-tag-release script, running it with the new version number, for
  example::

    xapian-maintainer-tools/svn-tag-release 0.9.0

  This script also generates tarballs for the new release and copies them
  across to the website.

* Add the new version to the list of versions in Bugzilla:
  http://www.xapian.org/cgi-bin/bugzilla/editversions.cgi?product=Xapian&action=add

* Update the website: version.php in the CVS module www.xapian.org contains the
  latest version and the date it was released.

* Run /u1/olly/xapian-website-update/update_website.sh

* Update the wiki: Create a new page http://wiki.xapian.org/ReleaseNotes/X.Y.Z
  and link it into http://wiki.xapian.org/ReleaseNotes in place of the old
  current release link, which should be moved to the archived section.

* Update the freshmeat entry at:
  http://freshmeat.net/add-release/40427/43070/

* Announce the new version on xapian-discuss

* Have a nice cup of tea!

How to make Debian packages for a new release
=============================================

Debian control files are stored in the "debian" subdirectory of each module
for which packages have been produced (currently xapian-core, xapian-bindings
and xapian-applications/omega). After each release, these should be
updated as follows:

* Update the debian/changelog file, being sure to keep it in the
  standard Debian format (the easiest way is to use the dch utility
  like so: "dch -v 0.9.7-1"). The new version number should be the
  version number of the release followed by "-1" (i.e. a Debian
  patch number of 1). The changelog message should indicate that
  there is a new upstream release, and should mention any significant
  changes in the new release.

* If any patches are being applied when building the Debian package
  (i.e. there is a patch file "debian/patch"), and these patches are
  now incorporated into the release, remove or update the patch file.

* Use xapian-maintainer-tools/debian/svn-tag-debs to tag all the files in the
  debian control directory with the tag "debian-VERSION-1" - e.g. for a new
  release of version 0.9.6, tag with "debian-0.9.6-1".

* Use xapian-maintainer-tools/debian/make-source-packages to make and
  upload new source packages (for stable, unstable, dapper, and edgy).
  The source packages for the distributions other than unstable are done
  as "backports" - if unstable's version is 0.9.6-1, then stable's
  version is 0.9.6-0stable1 (so that it's older than the unstable version,
  which means that upgrading between distributions works). The corresponding
  dapper version would be 0.9.6-1.99dapper (and similarly for edgy).

* Build debs for stable, unstable, dapper, edgy, and feisty. The scripts
  xapian-maintainer-tools/debian/create-chroot and
  xapian-maintainer-tools/debian/build-packages allow building these in
  a series of chroots on a single machine using pbuilder.



.. vim: syntax=