Adventures with Shared Libraries on Linux

Shared libraries (the equivalent of dynamic link libraries in Windows) are pieces of code that sit outside a program and can be shared by multiple programs simultaneously. Unlike a static library, which gets copied into the program that you build, a shared library helps reduce memory usage, as the memory manager can share the same code pages across different programs.

They are an elegant solution when they work, but when messed up they can lead to a nightmare (a situation known as “DLL Hell” in Windows). Since I’m beginning to reuse a lot of code between projects, I figured it was time to invest a few hours (which turned into almost a week) in researching shared libraries.

The Problem

When many programs use the same shared libraries, the interface between the programs and the shared libraries needs to be fixed, so that a program can expect the functions it calls in the shared library to exist and behave in a certain manner.

Does this mean that shared libraries need to be extremely stable and can’t be changed once deployed? Not really. Shared libraries become hell when they aren’t version-tracked properly. Arguably, this is a version-tracking issue rather than a problem with shared libraries themselves.

The Solution

The solution is to version-track the interface of the shared library through what is known as a “soname”. The soname is how a program finds the library file that meets its needs (i.e. one that provides all the functions it calls and behaves consistently with the caller’s expectations). The soname is a combination of the name of the library and the version number of the interface.

Be aware that the version of the interface is not necessarily the same as the version of the shared library itself. It is possible to make changes to the shared library and, as long as its interface doesn’t change, programs that use it need not know the difference and can simply use the newer version. In such a case, the soname remains constant.
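As a rough illustration of the naming rule, here is a small shell sketch. The helper name `soname` and the library name “example” are made up for this illustration, and the “lib” prefix and “.so.” infix follow the usual Linux convention:

```shell
#!/bin/sh
# soname NAME IFACE: compose a soname from a library name and an interface
# version. "soname" is a made-up helper name, not an existing command.
soname() {
    printf 'lib%s.so.%s\n' "$1" "$2"
}

# Two releases of a hypothetical library "example" with the same interface
# share one soname, so programs built against either can use the newer file.
RELEASE_A=$(soname example 1)   # first release, interface 1
RELEASE_B=$(soname example 1)   # later release, interface still 1
RELEASE_C=$(soname example 2)   # release that changes the interface

echo "$RELEASE_A $RELEASE_B $RELEASE_C"
```

Only the third release gets a new soname; the first two are interchangeable as far as calling programs are concerned.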

Building a Shared Library

Building a shared library is actually quite easy. You start off with a number of *.c or *.cpp files containing the functions you want to reuse, and a *.h file that declares them. (Tip: make life easy for your users. Even if you have split the functions over many *.c/*.cpp files, combine their declarations into a single *.h file so your users need only #include one thing in their code.)

For example, say we have a library called “fantastic 2.0”, that is made from the following files:

fantastic.h
fantastic1.c
fantastic2.c

You start by compiling each *.c into a *.o with the “-fPIC” compiler option, which generates “position independent code”, a requirement for shared libraries:

gcc -c -fPIC fantastic1.c
gcc -c -fPIC fantastic2.c

Now you link the object files into a shared library. This is where we start dealing with version numbering.

In our example, the version of the software is 2.0. Since this is the first time we are publishing the interface, the interface is numbered 1.

The soname consists of the word “lib” + the name of the library + the extension “.so.” + the interface version, i.e. libfantastic.so.1. This soname gets embedded into the output shared library:

gcc -shared -Wl,-soname,libfantastic.so.1 -o libfantastic.so fantastic1.o fantastic2.o

That’s it as far as the actual building of a shared library goes.
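To confirm that the soname really was embedded, you can read it back from the library’s dynamic section. The following is a rough sketch, assuming gcc and readelf (from binutils) are installed. It builds a throwaway one-function library in a scratch directory, and the function `hello` is made up for the example:

```shell
#!/bin/sh
# Sketch: build a throwaway library and read back the soname the linker embedded.
# Assumes gcc and readelf (binutils) are installed; hello() is a made-up function.
if command -v gcc >/dev/null 2>&1 && command -v readelf >/dev/null 2>&1; then
    WORK=$(mktemp -d)
    printf 'int hello(void) { return 42; }\n' > "$WORK/fantastic1.c"
    gcc -c -fPIC "$WORK/fantastic1.c" -o "$WORK/fantastic1.o"
    gcc -shared -Wl,-soname,libfantastic.so.1 \
        -o "$WORK/libfantastic.so" "$WORK/fantastic1.o"
    # The dynamic section carries a SONAME entry with the embedded name.
    EMBEDDED=$(readelf -d "$WORK/libfantastic.so" | grep SONAME)
    echo "$EMBEDDED"
    rm -rf "$WORK"
else
    EMBEDDED="skipped: gcc or readelf not found"
    echo "$EMBEDDED"
fi
```

The SONAME line should mention libfantastic.so.1 even though the output file itself is called libfantastic.so, which shows that the soname is independent of the file name.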

Deploying the Shared Library

Something called the dynamic linker is responsible for marrying your own program with the shared library when you run it, and it expects the shared library to be in a certain location and named in a certain way:

The actual file for your shared library, with the soname and the version number, e.g. “libfantastic.so.1.2.0”, should be installed inside /usr/lib.

A symbolic link should be created between the soname and the actual file, e.g. libfantastic.so.1 -> libfantastic.so.1.2.0.

You then need to run “ldconfig” after setting up the above. This updates the dynamic linker’s cache file, /etc/ld.so.cache. The cache maps sonames to actual filenames, so the dynamic linker can find the actual library file for a given soname without scanning the entire /usr/lib folder every time. You can see which sonames map to which actual library files by running “ldconfig -v”, or print the current cache without rebuilding it with “ldconfig -p”.

Another step, optional at run time, is to add a “versionless” link from the bare library name to the soname, e.g. libfantastic.so -> libfantastic.so.1. Programs that are already built don’t need it, but developers do: it is the file the linker looks for when they build with -lfantastic.

I found myself doing the above steps often enough in my makefiles and packaging files that I’ve created the following script fragment:

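# Script fragment to install a shared lib (DESTDIR may be empty)
( FILENAME=libfantastic.so ; LIBNAME=fantastic ;
IFACE=1 ; VERSION=2.0 ;
# soname, e.g. libfantastic.so.1
SONAME="lib${LIBNAME}.so.${IFACE}" ;
# real file name, e.g. libfantastic.so.1.2.0
REALNAME="${SONAME}.${VERSION}" ;
DEST="${DESTDIR}/usr/lib" ;
# stop at the first failing step; -f lets a reinstall replace old links
install -D "${FILENAME}" "${DEST}/${REALNAME}" &&
cd "${DEST}" &&
ln -sf "${REALNAME}" "${SONAME}" &&
ln -sf "${SONAME}" "${FILENAME}" ;
)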

Don’t forget to also package your header file, e.g. fantastic.h. The final layout of things should be like this:

/usr/lib/libfantastic.so.1.2.0
/usr/lib/libfantastic.so.1 -> libfantastic.so.1.2.0
/usr/lib/libfantastic.so -> libfantastic.so.1
/usr/include/fantastic.h
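A mistyped link target is easy to miss in a layout like this, so it can be worth checking the chain before packaging. Here is a minimal sketch of such a check; `check_chain` is a hypothetical helper, and the empty file staged in the scratch directory is only a placeholder for a real library:

```shell
#!/bin/sh
# check_chain DIR NAME IFACE VERSION
# Hypothetical helper: succeeds only if, inside DIR, the versionless link
# points at the soname and the soname points at an existing real file.
check_chain() {
    dir=$1
    soname="lib$2.so.$3"
    real="$soname.$4"
    [ -f "$dir/$real" ] &&
    [ "$(readlink "$dir/$soname")" = "$real" ] &&
    [ "$(readlink "$dir/lib$2.so")" = "$soname" ]
}

# Stage the layout in a scratch directory; the empty file is a placeholder
# for the real shared object.
STAGE=$(mktemp -d)
mkdir -p "$STAGE/usr/lib"
: > "$STAGE/usr/lib/libfantastic.so.1.2.0"
ln -s libfantastic.so.1.2.0 "$STAGE/usr/lib/libfantastic.so.1"
ln -s libfantastic.so.1 "$STAGE/usr/lib/libfantastic.so"

if check_chain "$STAGE/usr/lib" fantastic 1 2.0; then
    STATUS="layout ok"
else
    STATUS="layout broken"
fi
echo "$STATUS"
rm -rf "$STAGE"
```

In a packaging script the same function could be pointed at ${DESTDIR}/usr/lib after the install step.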

If you’re packaging separate versions of your library for runtime and for development, the files should be split as follows:

fantastic:
/usr/lib/*.so.*
fantastic-devel:
/usr/lib/*.so
/usr/include/*.h

Using Your Shared Library

Now let’s say others want to use your shared library. A developer would need to install both the “fantastic” and the “fantastic-devel” packages.

Let’s say he has a program called “main.c”. It needs to include the fantastic.h file to declare the functions:

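// fantastic was built as a C library, so give its declarations C linkage
// (harmless if fantastic.h already does this itself)
extern "C" {
#include <fantastic.h>
}
#include <iostream>

using namespace std;

int main () {

    cout << "Hello, World!" << endl;

    // ... call some function in libfantastic ...

    return 0;
}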

Compile it with:

g++ -c main.c

And link it with:

g++ main.o -lfantastic -o main

Some behind-the-scenes activity happens here. When it sees -lfantastic on its command line, the linker searches its library path for a file called “libfantastic.so”. However, it doesn’t encode the path to this file into the program. Instead, when it finds a matching library file, it grabs the soname of that library and embeds it inside the program.

Since we installed the fantastic-devel package, we have the symlink chain /usr/lib/libfantastic.so -> /usr/lib/libfantastic.so.1 -> /usr/lib/libfantastic.so.1.2.0, and so the soname embedded will be “libfantastic.so.1”.

By specifying “-lfantastic” on the command line, we’re essentially saying: use the soname of the fantastic library from the currently installed fantastic-devel package. So we can control which interface version a program uses by installing the appropriate “devel” package before building. Bear in mind that only one “devel” package can be installed on a system at any time.

You can check which shared library sonames a program depends on, and which actual files they resolve to on the current system, by running “ldd main”. It lists the sonames a program requires and which library file/link provides each soname.
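The link step can be replayed end to end in a scratch directory to see which name ends up recorded in the program. This is only a sketch, assuming gcc and readelf are installed. It uses a plain C main and a made-up function `answer`, and finds the library with -L in the scratch directory rather than in /usr/lib:

```shell
#!/bin/sh
# Sketch: show that the linker records the soname, not the file it was given.
# Assumes gcc and readelf are available; answer() is a made-up function, and the
# library lives in a scratch directory found with -L instead of /usr/lib.
if command -v gcc >/dev/null 2>&1 && command -v readelf >/dev/null 2>&1; then
    WORK=$(mktemp -d)
    printf 'int answer(void) { return 42; }\n' > "$WORK/fantastic1.c"
    printf 'int answer(void);\nint main(void) { return answer() == 42 ? 0 : 1; }\n' \
        > "$WORK/main.c"

    gcc -c -fPIC "$WORK/fantastic1.c" -o "$WORK/fantastic1.o"
    gcc -shared -Wl,-soname,libfantastic.so.1 \
        -o "$WORK/libfantastic.so.1.2.0" "$WORK/fantastic1.o"
    ln -s libfantastic.so.1.2.0 "$WORK/libfantastic.so.1"
    ln -s libfantastic.so.1 "$WORK/libfantastic.so"

    # -lfantastic finds libfantastic.so through the versionless link.
    gcc "$WORK/main.c" -L"$WORK" -lfantastic -o "$WORK/main"

    # The NEEDED entries are the sonames the program asks for at run time.
    NEEDED=$(readelf -d "$WORK/main" | grep NEEDED)
    echo "$NEEDED"
    rm -rf "$WORK"
else
    NEEDED="skipped: gcc or readelf not found"
    echo "$NEEDED"
fi
```

The NEEDED list should name libfantastic.so.1, not libfantastic.so or libfantastic.so.1.2.0. These are the same sonames that ldd then resolves to actual files.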

Releasing a Newer Library

So now back to the library developer. Let’s say you wish to release a new version of fantastic. The first question you need to answer is: does this new version only change the implementation, or is the interface changed too? If only the implementation has changed, then you just increment the software version number. However, if the interface has changed, then the soname changes too, as the interface version is part of the soname.

So if only the implementation changes, the new version will appear as follows:

/usr/lib/libfantastic.so.1.2.1
/usr/lib/libfantastic.so.1 -> libfantastic.so.1.2.1
/usr/lib/libfantastic.so -> libfantastic.so.1
/usr/include/fantastic.h

But if the interface changes, you generate this:

/usr/lib/libfantastic.so.2.2.2
/usr/lib/libfantastic.so.2 -> libfantastic.so.2.2.2
/usr/lib/libfantastic.so -> libfantastic.so.2
/usr/include/fantastic.h
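Both kinds of release can be replayed in a scratch directory to see which links move. This is a sketch with empty placeholder files standing in for the real libraries, not a packaging recipe:

```shell
#!/bin/sh
# Sketch: replay both kinds of release in a scratch directory.
# Empty placeholder files stand in for the real shared objects.
LIB=$(mktemp -d)

# Starting point: interface 1, software version 2.0.
: > "$LIB/libfantastic.so.1.2.0"
ln -s libfantastic.so.1.2.0 "$LIB/libfantastic.so.1"
ln -s libfantastic.so.1 "$LIB/libfantastic.so"

# Implementation-only release 2.1: same soname, so just re-point its link.
: > "$LIB/libfantastic.so.1.2.1"
ln -sf libfantastic.so.1.2.1 "$LIB/libfantastic.so.1"
AFTER_IMPL=$(readlink "$LIB/libfantastic.so.1")

# Interface-changing release 2.2: new soname, and the versionless link moves.
: > "$LIB/libfantastic.so.2.2.2"
ln -s libfantastic.so.2.2.2 "$LIB/libfantastic.so.2"
ln -sf libfantastic.so.2 "$LIB/libfantastic.so"
AFTER_IFACE=$(readlink "$LIB/libfantastic.so")

echo "libfantastic.so.1 -> $AFTER_IMPL"
echo "libfantastic.so   -> $AFTER_IFACE"
rm -rf "$LIB"
```

After the first release only the soname link changes, so existing programs pick up the new file without relinking. After the second, the old soname link is left alone and only new builds see interface 2.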

Shared Library Coexistence

As far as non-devel packages go, there is no reason why you can’t have two different soname interfaces supported on the same system:

/usr/lib/libfantastic.so.1.2.1
/usr/lib/libfantastic.so.1 -> libfantastic.so.1.2.1
/usr/lib/libfantastic.so.2.2.2
/usr/lib/libfantastic.so.2 -> libfantastic.so.2.2.2

In this situation, older programs compiled against interface version 1 will look for the soname “libfantastic.so.1” and newer ones compiled against interface version 2 will look for the soname “libfantastic.so.2”.

This extends to packaging as well. There is actually no problem for rpm to have two or more packages of the same name installed simultaneously, provided the files they provide don’t conflict. You just need to install the second one with “rpm -ivh” rather than upgrade with “rpm -Uvh”, as an upgrade removes the earlier version.

If you have a binary executable that goes along with your library, it’s usually packaged separately, e.g. for postgresql there is:

postgresql-8.1.22-1.el5_5.1
postgresql-libs-8.1.22-1.el5_5.1
postgresql-devel-8.1.22-1.el5_5.1

You can install multiple postgresql-libs packages, but only one of each of the others. Caveat: RPM implicitly adds a “Provides” attribute for all the sonames in your shared library, so you can only install multiple library RPMs if they provide different sonames (which makes sense when you think about it).

This is why, when compiling your own programs, it is important to know which soname is being linked against. This is determined by which devel package is installed, as it provides the libfantastic.so -> libfantastic.so.N (unversioned to soname) link.

By implementing your shared libraries in this way, you can support multiple versions of the same package without needing any special hacks in naming or #ifdefs in your users’ code. It is possible to package all releases under the same package name as long as they are versioned properly.

Note: the only case where it is necessary to create a different package name is when you want to release different versions of executable programs (not just shared libraries) AND intend for multiple versions to coexist. I can’t think of many situations that call for this.

Afterthoughts

Don’t simply install devel packages; make sure you install the lowest soname for what your software requires. If you aren’t writing code that links against a library, don’t even install its devel package in the first place!

Be especially careful if you have a “test” program or utility: it should never be packaged together with your shared library. Put the utility in the main package, the libraries in main-libs, and the header and unversioned symlink in main-devel. You can make the main and main-devel packages dependent on main-libs, but main-libs should have as few dependencies as possible. Otherwise you limit the capability for “elegant coexistence”.

References:
http://tldp.org/HOWTO/Program-Library-HOWTO/shared-libraries.html
http://gcc.gnu.org/onlinedocs/gcc-4..2/gcc/
http://www.rpm.org/max-rpm/s1-rpm-depend-auto-depend.html

This post was originally published as a Facebook Note at 2011-02-10 18:00:59 +0800.
