Throughout this document, XXX will stand for the name of your library, Y will stand for the major version and Z will stand for the minor version. See below for an explanation of major and minor versions.
You should avoid changing the major version of your library as much as possible. Breaking backwards compatibility should be a last resort, not a first: doing so slows people's systems down by causing several unsharable copies of the code to be loaded at once, will cause end users pain as they install software only to find that their system is missing a sufficiently old version of the library, and can push application developers away from using your code.
Smart shared library authors avoid breaking backwards compatibility entirely. Tips for how to do this are provided later. It's the right thing to do, you know it in your heart ;)
For example, suppose Fred App needs version 3.2 of libgadgets installed. The linker can check that it's got a major version of 3, but not that its minor version is >= 2. This is a problem because functions are relocated on demand: a user could be using Fred App fine for an hour, then come to save their work and have Fred App crash because at runtime it was matched against a version of libgadgets that was too old. Letting the linker check the minor version as well means the user will get an error at startup if this happens, so at least there's no chance for the user to lose work.
For example, libpng is used for loading PNG files. If an application is linked against libpng.so.3, and also against a widget toolkit that is linked against libpng.so.2, you now have two incompatible versions of libpng loaded at once. Without symbol versioning this is a highly unstable situation: if a function or structure changed in a breaking manner, the widget toolkit will be incorrectly linked against version 3 and the app will most likely crash and burn as a result.
This is in contrast to how it works on Windows, where symbols are only searched for in the direct dependencies of an EXE or DLL. So be careful: what feels intuitively "obvious" will not happen if you don't use symbol versioning.
The version script consists of blocks that look like this:
XXX_1.0 {
    global:
        xxx_some_function;
        xxx_some_other_function;
    local:
        *;
};

XXX_1.1 {
    global:
        xxx_a_new_function;
        xxx_a_new_variable;
} XXX_1.0;
As you can see, the format is simple enough. Each set of symbols is assigned to a particular version by placing them inside a block. The version name after the closing brace is optional and, if present, represents a dependency, i.e. version 1.1 extends version 1.0. Watch out for the semi-colons after each block! You shouldn't ever have multiple major version numbers in a version script; if you break backwards compatibility like that, change the soname!
Make sure you don't miss any symbols out. You can use wildcards if your library already manages its namespace effectively, see the section on namespacing for more details. Any symbols that aren't in a global section will become invisible, and apps won't be able to link against them.
Then, add this to your link line:
-Wl,--version-script,versions.ldscript
which will pass the file
to the linker for processing. You might want to use an autoconf
check to ensure the linker in use supports this option. If it
doesn't, just don't provide it.
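Putting the pieces together, a complete (hypothetical) build might look like this. It also demonstrates that the local: * clause really does hide unversioned symbols:

```shell
cat > versions.ldscript <<'EOF'
XXX_1.0 {
    global:
        xxx_some_function;
    local:
        *;
};
EOF
cat > xxx.c <<'EOF'
int xxx_some_function(void) { return 42; }
int xxx_internal_detail;   /* swallowed by the "local: *" clause */
EOF
gcc -shared -fPIC -o libxxx.so.1.0 xxx.c \
    -Wl,--version-script,versions.ldscript

# Only the versioned symbol appears in the dynamic symbol table.
nm -D --defined-only libxxx.so.1.0
```

Note that xxx_internal_detail, despite not being static, is absent from the output: the version script kept it out of the exported interface.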
FIXME: show how to use this with C++/Java libraries.
Parallel installability doesn't just apply to libraries. Command line programs, file formats, directory layouts and so on are all interfaces which should be able to co-exist in parallel with older versions. Havoc says:
Parallel install should happen according to a very specific rule: when an app asks for an interface by name, it should always get something with the same interface and semantics it expected.
void xxx_some_func();

#if XXX_MINOR_VERSION >= 2
void xxx_a_new_func();
#endif

/* in the new version, we changed this macro to work in a
   different way, so compiling a program against the new
   headers could actually change its dependencies if not
   for these header guards. */
#if XXX_MINOR_VERSION >= 3
# define XXX_THE_ANSWER (xxx_consult_deep_thought())
#else
# define XXX_THE_ANSWER 42
#endif

enum xxx_modes
{
    XXX_MODE_A,
    XXX_MODE_B,
#if XXX_MINOR_VERSION >= 3
    XXX_MODE_C,
#endif
    XXX_MODE_LAST /* don't use this! */
};
This technique allows developers to control with confidence the dependencies of their program. Let's say you introduce a whizzy new version of libxxx. While developers may want to take advantage of these new features in apps they run themselves, they might not want to make it a hard dependency of their own software just yet. Maybe they think it hasn't got a large enough install base, maybe their software just doesn't need the new features.
Regardless, they should be able to know that unless they opt into the new version, they won't accidentally get a dependency on it when they compile. Requiring the developer to state which version they wish to target not only allows the compiler to check that they're not using any new symbols, it also allows you to redefine macros as shown above without silently rewriting the developer's code to depend on new functionality.
Notice that there is no XXX_MAJOR_VERSION macro. Why not? Because if you break backwards compatibility, you should provide a separate set of headers so users can still compile against the old code, so having a MAJOR_VERSION macro would be superfluous.
It is possible to break binary compatibility but not source compatibility, and vice-versa. For simplicity's sake, if you want to break backwards compatibility it's recommended to save up any breaking changes you want to make and do them all together. See the backwards compatibility section below for more info. If you don't do this, you may be tempted to "optimize" by releasing an incompatible version 2 of your library, but keeping the same headers. You can do this, but it's not recommended. The standard for free software is to give libraries parallel installable headers rather than use major version macros: it's best to stick to the standard to keep things simple.
The semantic interface is a bit more loosely defined. Simply put, it's the way your library behaves. Say you have a function which returns a list of strings, and the list was always sorted in a particular way. Now the implementation changes and for some reason the list isn't sorted, it's randomized. Could this break applications?
Perhaps surprisingly, the answer is yes. Here's a real world example: when the kernel (which is in a sense exposing an API like a library would) was patched in such a way that "." and ".." were no longer the first two items in a directory listing, some software broke. The developers had never seen a system in which this assumption was invalid, and coded with it in mind. They probably didn't even realise they were doing it. A common mistake was to take a directory listing, skip the first two entries, then go from there.
Breaking software in ways like this is remarkably easy. Very few APIs (almost none, in fact) are specced out to perfection in the documentation and so applications often subtly rely on implementation details without anybody noticing. You might be thinking "well, stupid programmers deserve to lose, they should fix their crappy software" but the programmers who make these assumptions aren't necessarily stupid - at least in my experience they are just as likely to be a veteran of the software industry as an inexperienced novice. Writing software to be totally robust in the face of semantic changes is extremely hard, and as such it's very rarely done. Living with this fact is just a part of writing shared libraries.
When you change the way a library works, think about how it will affect the developer. Think about how the developer may have coded their software to use the API, what common tasks with it are and how they might have been accomplished. For many library developers in the open source community this is easy - just go look at the code of your users!
Another way to detect semantic breaks is by testing the library against a wide selection of software that uses it. If you find that a change you made breaks things, think very carefully about whether to proceed. Often users of complex APIs end up subconsciously relying on implementation details without realising it. It could just as easily have been you who made the bad assumption, so don't jump to break their software because you feel they should have known better. Remember: it's ultimately the users who lose when stuff crashes or eats their work.
Common ways of breaking the semantics of your library are:
int i = xxx_ask_a_question(); /* display a dialog box */

if ((i == XXX_NO) || (i == XXX_CANCEL))
{
    printf("no\n");
}
else
{
    printf("yes\n");
}
Now introduce an XXX_SHOW_HELP return code. Oops! The program will think you clicked yes. If this question was "that might damage your hardware, really continue?" ... well, let's not go there :) This sort of problem is rather hard to avoid - the recommended way of dealing with it is to introduce a new function, for instance xxx_ask_a_question_with_help(), that can return the new code. You could simply blame the developer for not considering future API extensions, but this doesn't help the poor user who just saw their computer blow up.
Basically what this means is that if your library can run callbacks in user code, your library needs to be re-entrant: i.e. at any point at which you run a callback, execution can start again from the top of any of the library functions. If you do a callback while in the middle of manipulating a linked list, you'd better believe that somebody ... somewhere ... will decide to call xxx_add() or xxx_remove() from within that callback. Maybe they'll even rerun the function which triggered the callback in the first place!
If this used to work somehow, even if it was accidental, and you change it so it doesn't work anymore, then user code might break. Very few libraries document their level of re-entrancy; it is simply assumed that any library which performs callbacks is totally re-entrant. Be careful!
Example:
struct xxx_foo
{
    char *x, *y, *z;
    int num;

    /* abi padding */
    void (*reserved1)(void);
    void (*reserved2)(void);
    void (*reserved3)(void);
    void (*reserved4)(void);
};
Note the use of function pointers, rather than regular pointers or ints. Err on the cautious side, i.e. too much padding rather than too little. If you want to apply this technique to many structs, a macro for it may be helpful.
If you need to change what a function does or accepts, introduce a replacement under a new name, for instance xxx_some_function_with_foo() or xxx_some_function2(), rather than silently altering the original. If you must change semantics that existing code already uses, say the behaviour of xxx_random(), you might want to export a variable that enables the new semantics, which defaults to off. For instance, require the developer to write:
xxx_new_random_number_algorithm = TRUE;
....
int i = xxx_random();
Don't invert the meaning of this variable, i.e. don't make it a "please use the old way" setting. This is useless, as old binaries will still break and have to be modified, at which point they might as well be updated to use the new semantics anyway. Meanwhile the end user is suffering with broken software.
Symbols with generic names like clear or define_key (*cough* ncurses *cough* :) can interfere with other totally independent libraries due to the way symbol scoping works. Using symbol versioning doesn't mean you don't have to namespace your libraries; it simply helps avoid conflicts at runtime. They can still occur at link time. This is true even of symbols internal to your library. That means if you have a function that isn't declared static in a C file because it's used from another C file, it either has to be namespaced, or you have to be using symbol versioning to mark it as local so it won't be exported. Otherwise, even your library's internals can get their wires crossed!
Libs: -L${libdir} -lxxx -lz -lX11
Don't do this! By leaking information about how the library is implemented you are breaking encapsulation, a very important programming principle. It's not purely academic: it gives the application bogus dependencies upon libraries it doesn't even use, and if those libraries are upgraded or removed (perhaps your library doesn't need them anymore) now the application will refuse to start for no good reason. By only specifying the actual library on the link line you keep how the library is implemented away from the application itself, which can increase robustness.
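With pkg-config specifically, the conventional fix is to put implementation-only dependencies in Libs.private, which is consulted only when linking statically. A sketch of a .pc file (all names and paths hypothetical):

```
prefix=/usr
libdir=${prefix}/lib
includedir=${prefix}/include

Name: xxx
Description: An example library
Version: 1.2
# what applications actually link against:
Libs: -L${libdir} -lxxx
# implementation details, used only for static linking:
Libs.private: -lz -lX11
Cflags: -I${includedir}
```

Dynamically linked applications get only -lxxx, so upgrading or dropping zlib inside the library never touches them.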
Have fun!