Opened 19 years ago
Last modified 11 months ago
#63 assigned enhancement
Improve visibility annotations for the library
Reported by: | Olly Betts | Owned by: | Olly Betts |
---|---|---|---|
Priority: | low | Milestone: | 2.0.0 |
Component: | Library API | Version: | git master |
Severity: | minor | Keywords: | |
Cc: | Richard Boulton | Blocked By: | |
Blocking: | Operating System: | Linux |
Description (last modified by )
See http://gcc.gnu.org/wiki/Visibility
This should make the shared library much smaller, and a little faster!
Change History (28)
comment:1 by , 19 years ago
Status: | new → assigned |
---|
comment:2 by , 18 years ago
Priority: | high → normal |
---|
comment:3 by , 18 years ago
Blocking: | 118 added |
---|
comment:4 by , 18 years ago
Cc: | added |
---|
comment:5 by , 18 years ago
I've put this off previously as visibility annotations are only useful for GCC >= 4.0. However, such versions are now deployed by many linux distros, so I think the time has come for this, and it's not really something we can retrofit into the 1.0 releases. It shouldn't be a lot of work, but it would be good to do it now so any issues have a chance to be shaken out before release.
So I've just made a first pass. The library builds, as do simple utilities like "delve". Things like "quartzcheck" which hook into quartz don't build, and neither does the test harness, so it needs some more work still.
Currently (on x86_64) the stripped libxapian.so is 8% smaller (unstripped is 2.6% smaller), while the number of external symbols drops from 2851 to 508.
comment:6 by , 18 years ago
Hmm, what a mess!
The test harness uses a few bits of the library internals, which is a bit unclean but we can expose those for now at least.
Doing that, apitest builds and passes, which is a promising sign.
But quartztest, btreetest, and internaltest don't stand a chance without exposing shedloads of library internals.
Looking at quartztest, much of it can either just be removed or turned into API level tests where such tests don't already exist.
Not sure what to do about the rest. Unit tests are a good idea, so deleting them isn't really a good option (btreetest and quartztest are less important now flint is the default, but we ought to at least consider a flint btree-level test).
We could try to just link against the object files which are actually needed, but then we also need utility functions, debug functions, etc and this seems likely to be a maintenance headache.
Passing "-static" when linking these problematic binaries is a fix which works (though it's not ideal either). This only links libxapian statically at least - libstdc++ and the shared library aren't affected. Also, we only need to do this for GCC >= 4.0. If we go this route, I think we should merge all the affected tests into a single binary (or fewer binaries at least).
Any other ideas?
comment:7 by , 18 years ago
Possibly not a helpful comment - but don't the utilities in xapian-core/bin/ also need to link to various internals? It would be nice if we could move sufficient amounts of their functionality into the API, so the utilties could be implemented with just the public API (with the side benefit that things like the python bindings could perform a check of a database without needing to exec a sub-process). However, that's not going to happen in the near future.
The -static approach seems the only way forward to me: if the symbols aren't exported by the library, there's no way that utilities are going to be able to link to it, however clever we are. So, the question is whether it's worth doing given that cost.
This isn't a particularly urgent fix, in my mind; though 8% would be a nice size improvement, I've not come across anywhere recently where size is a problem (though if someone wanted to use Xapian in an embedded system it might become one). The attraction of doing this is more to keep the API neat, and reduce the probability of symbol collisions. Perhaps we should defer this to 1.1, and try and move tests into fewer binaries and work on the utilities in the meantime.
comment:8 by , 18 years ago
The utilities in xapian-core/bin do indeed, but the API we need to expose there is fairly limited, so for now I've just marked those as visible (which isn't actually worse than just using them because they happen to be visible if you include the right header!) As you say, longer term it would be better to make those implementable through the API. In particular, being able to merge databases would be handy when writing an indexer.
Given I've got this far, I'm inclined to do a bit more work and commit this, even if it means marking all the classes and functions used by the testsuite as visible (at least some things will be hidden, which would be a step forward).
In the longer term, a coherent approach to allowing unit testing of things like the serialisation functions would be useful though...
comment:9 by , 18 years ago
As you say, hiding even some bits of the internals is better than nothing at all, and getting the infrastructure in place to remove more when we are able to is also good. I hadn't thought of doing a partial implementation like that.
comment:10 by , 18 years ago
Blocking: | 118 removed |
---|---|
Operating System: | → Linux |
Priority: | normal → low |
Summary: | Add visibility annotations to the library → Improve visibility annotations for the library |
I've just committed changes to implement this.
For now, all the internals required by stuff in xapian-core/bin and xapian-core/tests is simply marked as visible. We can look at removing the need for some of these dependencies (e.g. by adding API methods to allow databases to be merged or checked) and think about a strategy to allow sane unit testing.
I've also not worried about marking internal members of visible classes as non-visible - the gain from that is likely to be fairly small, so that can wait until later too I think.
So I'm retitling this bug, and removing the blockage on 1.0.
comment:12 by , 16 years ago
Description: | modified (diff) |
---|---|
Milestone: | → 1.1.1 |
Flag for reassessing during the 1.1.x series.
comment:14 by , 15 years ago
Milestone: | 1.1.4 → 1.2.0 |
---|
I've removed three symbols which didn't need to be visible at all. All the rest seem to be currently used.
We could improve things further by having a "standalonetest" which does not link to libxapian, and move some of the internaltest testcases there.
But there are some obstacles:
- testsuite.cc currently pulls in backendmanager stuff (which is shouldn't really anyway as this is meant to be a generic test harness).
- testsuite.cc has a catch for Xapian::Error (again, it shouldn't really). Richard's
idea to fix this is a call-back function - we can use the
catch (...)
and rethrow trick that xapian-bindings uses to rethrow the exception in the callback function and process it there.
- Some of the internals tested aren't as standalone as perhaps they ought to be - e.g. mixing generic and Xapian-specific code in the same file, throwing Xapian::Error subclasses from otherwise generic code.
Also, if it is just a case of changing a non-API symbol from visible to hidden, that isn't an ABI breaking change, as no user code should be linked to that symbol. So we can address this in 1.2.x without issues.
So I'm going to bump the milestone now.
comment:15 by , 13 years ago
Milestone: | 1.2.x → 1.3.x |
---|
comment:16 by , 13 years ago
comment:17 by , 13 years ago
And serialise_document() and unserialise_document() (which are now available via the public API) in r16294.
comment:18 by , 13 years ago
In r16316 I've added an API for xapian-check's functionality and modified xapian-check to use that, allowing removal of XAPIAN_VISIBILITY_DEFAULT from a number of functions which aren't in the public API.
comment:19 by , 11 years ago
More progress on eliminating symbols has been made on trunk (these will all debut in 1.3.2):
A while ago, I reworked the 4 version inspection functions to be inline wrappers around a single function in the shared library which returns a pointer to a structure, which eliminates 3 external symbols.
I've also just deprecated 6 of the database factory functions in favour of flags passed to the Database and WritableDatabase constructors, and these deprecated functions are now implemented as inline wrappers which just call those constructors, so that's another 6 external symbols eliminated.
comment:20 by , 9 years ago
Milestone: | 1.3.x → 1.3.5 |
---|
[d56fc51f498b474001744cd76e53c9f6beef6264] makes the symbols for the various overloaded forms of str()
private to the library.
Marking for 1.3.5 to at least check through the remaining symbols which are currently public but shouldn't really need to be.
comment:21 by , 9 years ago
Looking at where XAPIAN_VISIBILITY_DEFAULT
is used outside of `include/xapian/' in xapian-core:
- api/replication.h: Replication was originally an external API, but we weren't entirely happy with it. Perhaps whatever that was ought to be sorted out and these made external API features.
DatabaseMaster
andDatabaseReplica
need to be visible forxapian-replicate
andxapian-replicate-server
.ReplicationInfo
is entirely defined within the header, so probably doesn't need to be marked withXAPIAN_VISIBILITY_DEFAULT
. - backends/chert/:
ChertCursor
andChertTable
used byxapian-inspect
. Thexapian-inspect
tool is really more of a development aid than a user tool, so perhaps it shouldn't be installed, just built in-tree and statically linked to the bits of the backend it needs. - common/stringutils.h: Tables for C locale versions of
isalpha()
, etc,tolower()
andtoupper()
. Perhaps these functions should be available in the API. - net/:
RemoteServer
,RemoteTcpServer
: Used byxapian-progsrv
andxapian-tcpsrv
. Perhaps ought to be available via the API to allow other forms of remote server. - net/:
ReplicateTcpClient
,ReplicateTcpServer
: Used byxapian-replicate
andxapian-replicate-server
. It seems silly we need these as well as the (originally public) API above - I guess it's for the TCP support code. - net/serialise.h:
serialise_error()
andunserialise_error()
- used in testcases intests/internaltest.cc
. Perhaps could be moved totests/unittest.cc
to avoid needing these outside the library. - net/tcpserver.h:
TcpServer
, referenced inbin/xapian-tcpsrv.cc
as a hook for setting aRegistry
(for adding custom weighting schemes, etc) but this is actually the wrong class! (EDIT: Also used via subclasses inbin/xapian-tcpsrv.cc
andbin/xapian-replicate-server.cc
)
comment:23 by , 9 years ago
Moved the tests of serialise_error()
and unserialise_error()
to tests/unittest.cc
and made these symbols hidden in [63208048574983216aac07a101e04a609b4f0848].
Updated list of where XAPIAN_VISIBILITY_DEFAULT
is used outside of `include/xapian/' in xapian-core:
- api/replication.h: Replication was originally an external API, but we weren't entirely happy with it. Perhaps whatever that was ought to be sorted out and these made external API features.
DatabaseMaster
andDatabaseReplica
need to be visible forxapian-replicate
andxapian-replicate-server
. - backends/chert/:
ChertCursor
andChertTable
used byxapian-inspect
. Thexapian-inspect
tool is really more of a development aid than a user tool, so perhaps it shouldn't be installed, just built in-tree and statically linked to the bits of the backend it needs. - common/stringutils.h: Tables for C locale versions of
isalpha()
, etc,tolower()
andtoupper()
. Perhaps these functions should be available in the API. - net/:
RemoteServer
,RemoteTcpServer
: Used byxapian-progsrv
andxapian-tcpsrv
. Perhaps ought to be available via the API to allow other forms of remote server. - net/:
ReplicateTcpClient
,ReplicateTcpServer
: Used byxapian-replicate
andxapian-replicate-server
. It seems silly we need these as well as the (originally public) API above - I guess it's for the TCP support code. - net/tcpserver.h:
TcpServer
used via subclasses inbin/xapian-tcpsrv.cc
andbin/xapian-replicate-server.cc
.
comment:24 by , 9 years ago
In [95a6ea015c939109cce30a4ee328b39f3a0c5216] I have folded the tables for isalpha()
etc into one table, and put it into the constinfo structure. There's not a public API using this yet, but one could be added at any point.
comment:25 by , 9 years ago
Milestone: | 1.3.5 → 1.4.x |
---|
And [4b3e73c87dac176437b985faebc50ebfecad566a] cleans up ChertCursor
and ChertTable
.
Updated list of where XAPIAN_VISIBILITY_DEFAULT
is used outside of `include/xapian/' in xapian-core:
- api/replication.h: Replication was originally an external API, but we weren't entirely happy with it. Perhaps whatever that was ought to be sorted out and these made external API features.
DatabaseMaster
andDatabaseReplica
need to be visible forxapian-replicate
andxapian-replicate-server
. There was some discussion of the idea that replication should be split into creating changesets, transporting changesets and applying changesets, so you could transport the changeset files by other means (replication by sneakernet, etc), but I think that's out of scope for 1.3.x at this point. - net/:
RemoteServer
,RemoteTcpServer
: Used byxapian-progsrv
andxapian-tcpsrv
. Perhaps ought to be available via the API to allow other forms of remote server. - net/:
ReplicateTcpClient
,ReplicateTcpServer
: Used byxapian-replicate
andxapian-replicate-server
. It seems silly we need these as well as the (originally public) API above - I guess it's for the TCP support code. - net/tcpserver.h:
TcpServer
used via subclasses inbin/xapian-tcpsrv.cc
andbin/xapian-replicate-server.cc
.
I think we need to punt on the rest for this development cycle. The only quick change I can see is making them public APIs, which wouldn't actually reduce the symbol count.
comment:26 by , 5 years ago
Version: | SVN trunk → git master |
---|
The current status is unchanged from the comment above.
comment:27 by , 5 years ago
Milestone: | 1.4.x → 1.5.0 |
---|
[c46ddd7eaa2c8bc3730e169255f87f97f4ce6438] moves RemoteTcpServer
from libxapian to xapian-tcpsrv, since it's only used in the latter.
Updated list of where XAPIAN_VISIBILITY_DEFAULT
is used outside of include/xapian/
in xapian-core:
- api/replication.h: Replication was originally an external API, but we weren't entirely happy with it. Perhaps whatever that was ought to be sorted out and these made external API features.
DatabaseMaster
andDatabaseReplica
need to be visible forxapian-replicate
andxapian-replicate-server
. There was some discussion of the idea that replication should be split into creating changesets, transporting changesets and applying changesets, so you could transport the changeset files by other means (replication by sneakernet, etc), but I think that's out of scope for 1.3.x at this point. - net/:
RemoteServer
: Used byxapian-progsrv
andxapian-tcpsrv
. Perhaps ought to be available via the API to allow other forms of remote server. - net/:
ReplicateTcpClient
,ReplicateTcpServer
: Used byxapian-replicate
andxapian-replicate-server
. It seems silly we need these as well as the (originally public) API above - I guess it's for the TCP support code. - net/tcpserver.h:
TcpServer
used via subclasses inbin/xapian-tcpsrv.cc
andbin/xapian-replicate-server.cc
.
This isn't 1.4 material at this point.
comment:28 by , 18 months ago
We can't just move ReplicateTcpClient
and link it only into xapian-replicate
because it uses RemoteConnection
so we'd need to expose that in the library ABI instead. Similarly for ReplicateTcpServer
.
I tried this and then had to move various other things around and add ABI visibility to TcpClient
as well as RemoteConnection
.
In exchange, I removed ABI visibility from ReplicateTcpClient
, ReplicateTcpServer
and TcpServer
.
I need to check if this is actually a win in terms of symbols in the ABI. TcpClient
is just a single method in a namespace (so one symbol) but RemoteConnection
has quite a lot of methods.
Maybe more refactoring could establish a cleaner split point.
comment:29 by , 11 months ago
Milestone: | 1.5.0 → 2.0.0 |
---|
It would be lovely to do this for 1.0. Possibly not important enough to worry about right now, though.