Throughout this document, XXX will stand for the name of your library, Y will stand for the major version and Z will stand for the minor version. See below for an explanation of major and minor versions.
You should avoid changing the major version of your library as much as possible. Breaking backwards compatibility should be a last resort, not a first: doing so slows people's systems down by causing several unsharable copies of the code to be loaded at once, will cause end users pain as they install software only to find that their system is missing a sufficiently old version of the library, and can push application developers away from using your code.
Smart shared library authors avoid breaking backwards compatibility entirely. Tips for how to do this are provided later. It's the right thing to do, you know it in your heart ;)
For example, suppose Fred App needs version 3.2 of libgadgets installed. The linker can check that it's got a major version of 3, but not that its minor version is >= 2. This is a problem because functions are relocated on demand: a user could be using Fred App fine for an hour, then come to save their work and have Fred App crash because at runtime it was matched against a version of libgadgets that was too old. Letting the linker check the minor version as well means the user will get an error at startup if this happens, so at least there's no chance for the user to lose work.
For example, libpng is used for loading PNG files. If an application is linked against libpng.so.3, and also against a widget toolkit that is linked against libpng.so.2, you now have two incompatible versions of libpng loaded at once. Without symbol versioning this is a highly unstable situation: if a function or structure changed in a breaking manner, the widget toolkit will be incorrectly linked against version 3 and the app will most likely crash and burn as a result.
This is in contrast to how it works on Windows, where symbols are only searched for in the direct dependencies of an EXE or DLL. So be careful: what feels intuitively "obvious" will not happen if you don't use symbol versioning.
The version script consists of blocks that look like this:
XXX_1.0 {
    global:
        xxx_some_function;
        xxx_some_other_function;
    local:
        *;
};

XXX_1.1 {
    global:
        xxx_a_new_function;
        xxx_a_new_variable;
} XXX_1.0;
As you can see, the format is simple enough. Each set of symbols is assigned to a particular version by placing them inside a block. The version name after the closing brace is optional and, if present, represents a dependency, i.e. version 1.1 extends version 1.0. Watch out for the semi-colons after each block! You shouldn't ever have multiple major version numbers in a version script; if you break backwards compatibility like that, change the soname!
Make sure you don't miss any symbols out. You can use wildcards if your library already manages its namespace effectively, see the section on namespacing for more details. Any symbols that aren't in a global section will become invisible, and apps won't be able to link against them.
Then, add this to your link line:
-Wl,--version-script,versions.ldscript
which will pass the file
to the linker for processing. You might want to use an autoconf
check to ensure the linker in use supports this option. If it
doesn't, just don't provide it.
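Putting the pieces together, a complete (hypothetical) build might look like this. It also demonstrates that the local: * clause really does hide unversioned symbols:

```shell
cat > versions.ldscript <<'EOF'
XXX_1.0 {
    global:
        xxx_some_function;
    local:
        *;
};
EOF
cat > xxx.c <<'EOF'
int xxx_some_function(void) { return 42; }
int xxx_internal_detail;   /* swallowed by the "local: *" clause */
EOF
gcc -shared -fPIC -o libxxx.so.1.0 xxx.c \
    -Wl,--version-script,versions.ldscript

# Only the versioned symbol appears in the dynamic symbol table.
nm -D --defined-only libxxx.so.1.0
```

Note that xxx_internal_detail, despite not being static, is absent from the output: the version script kept it out of the exported interface.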
FIXME: show how to use this with C++/Java libraries.
Parallel installability doesn't just apply to libraries. Command line programs, file formats, directory layouts and so on are all interfaces which should be able to co-exist in parallel with older versions. Havoc says:
Parallel install should happen according to a very specific rule: when an app asks for an interface by name, it should always get something with the same interface and semantics it expected.
void xxx_some_func();

#if XXX_MINOR_VERSION >= 2
void xxx_a_new_func();
#endif

/* in the new version, we changed this macro to work in a
   different way, so compiling a program against the new
   headers could actually change its dependencies if not
   for these header guards. */
#if XXX_MINOR_VERSION >= 3
# define XXX_THE_ANSWER (xxx_consult_deep_thought())
#else
# define XXX_THE_ANSWER 42
#endif

enum xxx_modes
{
    XXX_MODE_A,
    XXX_MODE_B,
#if XXX_MINOR_VERSION >= 3
    XXX_MODE_C,
#endif
    XXX_MODE_LAST /* don't use this! */
};
This technique allows developers to control with confidence the dependencies of their program. Let's say you introduce a whizzy new version of libxxx. While developers may want to take advantage of these new features in apps they run themselves, they might not want to make it a hard dependency of their own software just yet. Maybe they think it hasn't got a large enough install base, maybe their software just doesn't need the new features.
Regardless, they should be able to know that unless they opt into the new version, they won't accidentally get a dependency on it when they compile. Requiring the developer to state which version they wish to target not only allows the compiler to check that they're not using any new symbols, it also allows you to redefine macros as shown above without silently rewriting the developer's code to depend on new functionality.
Notice that there is no XXX_MAJOR_VERSION macro. Why not? Because if you break backwards compatibility, you should provide a separate set of headers so users can still compile against the old code, so having a MAJOR_VERSION macro would be superfluous.
It is possible to break binary compatibility but not source compatibility, and vice-versa. For simplicity's sake, if you want to break backwards compatibility it's recommended to save up any breaking changes you want to make and do them all together. See the backwards compatibility section below for more info. If you don't do this, you may be tempted to "optimize" by releasing an incompatible version 2 of your library, but keeping the same headers. You can do this, but it's not recommended. The standard for free software is to give libraries parallel installable headers rather than use major version macros: it's best to stick to the standard to keep things simple.
The semantic interface is a bit more loosely defined. Simply put, it's the way your library behaves. Say you have a function which returns a list of strings, and the list was always sorted in a particular way. Now the implementation changes and for some reason the list isn't sorted, it's randomized. Could this break applications?
Perhaps surprisingly, the answer is yes. Here's a real world example: when the kernel (which is in a sense exposing an API like a library would) was patched in such a way that "." and ".." were no longer the first two items in a directory listing, some software broke. The developers had never seen a system in which this assumption was invalid, and coded with it in mind. They probably didn't even realise they were doing it. A common mistake was to take a directory listing, skip the first two entries, then go from there.
Breaking software in ways like this is remarkably easy. Very few APIs (almost none, in fact) are specced out to perfection in the documentation and so applications often subtly rely on implementation details without anybody noticing. You might be thinking "well, stupid programmers deserve to lose, they should fix their crappy software" but the programmers who make these assumptions aren't necessarily stupid - at least in my experience they are just as likely to be a veteran of the software industry as an inexperienced novice. Writing software to be totally robust in the face of semantic changes is extremely hard, and as such it's very rarely done. Living with this fact is just a part of writing shared libraries.
When you change the way a library works, think about how it will affect the developer. Think about how the developer may have coded their software to use the API, what common tasks with it are and how they might have been accomplished. For many library developers in the open source community this is easy - just go look at the code of your users!
Another way to detect semantic breaks is by testing the library against a wide selection of software that uses it. If you find that a change you made breaks things, think very carefully about whether to proceed. Often users of complex APIs end up subconsciously relying on implementation details without realising it. It could just as easily have been you who made the bad assumption, so don't jump to break their software because you feel they should have known better. Remember: it's ultimately the users who lose when stuff crashes or eats their work.
Common ways of breaking the semantics of your library are:
int i = xxx_ask_a_question(); /* display a dialog box */

if ((i == XXX_NO) || (i == XXX_CANCEL))
{
    printf("no\n");
}
else
{
    printf("yes\n");
}
Now introduce an XXX_SHOW_HELP return code. Oops! The program will think you clicked yes. If this question was "that might damage your hardware, really continue?" ... well, let's not go there :) This sort of problem is rather hard to avoid - the recommended way of dealing with it is to introduce a new function, for instance xxx_ask_a_question_with_help(), that can return the new code. You could simply blame the developer for not considering future API extensions, but this doesn't help the poor user who just saw their computer blow up.
Basically what this means is that if your library can run callbacks in user code, your library needs to be re-entrant: i.e. at any point at which you run a callback, execution can start again from the top of any of the library functions. If you do a callback while in the middle of manipulating a linked list, you'd better believe that somebody ... somewhere ... will decide to call xxx_add() or xxx_remove() from within that callback. Maybe they'll even rerun the function which triggered the callback in the first place!
If this used to work somehow, even if it was accidental, and you change it so it doesn't work anymore, then user code might break. Very few libraries document their level of re-entrancy; it is simply assumed that any library which performs callbacks is totally re-entrant. Be careful!
Example:
struct xxx_foo
{
    char *x, *y, *z;
    int num;

    /* abi padding */
    void (*reserved1)(void);
    void (*reserved2)(void);
    void (*reserved3)(void);
    void (*reserved4)(void);
};
Note the use of function pointers, rather than regular pointers or ints. Err on the cautious side, i.e. too much padding rather than too little. If you want to apply this technique to many structs, a macro for it may be helpful.
If you need to change what a function does or accepts, introduce a replacement under a new name, for instance xxx_some_function_with_foo() or xxx_some_function2(), rather than silently altering the original. If you must change semantics that existing code already uses, say the behaviour of xxx_random(), you might want to export a variable that enables the new semantics, which defaults to off. For instance, require the developer to write:
xxx_new_random_number_algorithm = TRUE;
....
int i = xxx_random();
Don't invert the meaning of this variable, i.e. don't make it a "please use the old way" setting. This is useless, as old binaries will still break and have to be modified, at which point they might as well be updated to use the new semantics anyway. Meanwhile the end user is suffering with broken software.
Symbols with generic names like clear or define_key (*cough* ncurses *cough* :) can interfere with other totally independent libraries due to the way symbol scoping works. Using symbol versioning doesn't mean you don't have to namespace your libraries; it simply helps avoid conflicts at runtime. They can still occur at link time. This is true even of symbols internal to your library. That means if you have a function that isn't declared static in a C file because it's used from another C file, it either has to be namespaced, or you have to be using symbol versioning to mark it as local so it won't be exported. Otherwise, even your library's internals can get their wires crossed!
Libs: -L${libdir} -lxxx -lz -lX11
Don't do this! By leaking information about how the library is implemented you are breaking encapsulation, a very important programming principle. It's not purely academic: it gives the application bogus dependencies upon libraries it doesn't even use, and if those libraries are upgraded or removed (perhaps your library doesn't need them anymore) now the application will refuse to start for no good reason. By only specifying the actual library on the link line you keep how the library is implemented away from the application itself, which can increase robustness.
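With pkg-config specifically, the conventional fix is to put implementation-only dependencies in Libs.private, which is consulted only when linking statically. A sketch of a .pc file (all names and paths hypothetical):

```
prefix=/usr
libdir=${prefix}/lib
includedir=${prefix}/include

Name: xxx
Description: An example library
Version: 1.2
# what applications actually link against:
Libs: -L${libdir} -lxxx
# implementation details, used only for static linking:
Libs.private: -lz -lX11
Cflags: -I${includedir}
```

Dynamically linked applications get only -lxxx, so upgrading or dropping zlib inside the library never touches them.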
Have fun!