I. Introduction

C++ programs must be compiled and linked before they can be executed. Compilation takes each human-readable .cc file as input and produces a machine readable .o file as output. Since a .cc file can use a function defined in another file, linking is necessary to match up the call sites of a function with its definition and produce the final executable.

It is not always obvious that linking is a separate step from compilation because command line tools like g++ do both the compilation and linking in one go.

II. Separate compilation and linking example

We’ll use this simple program as a running example.

To produce an executable file from the source code above, type:

$g++ square.cc main.cc # Compile and link square.cc and main.cc.  If you want to separately compile and then link you can type: $ g++ -c square.cc  # Compile square.cc to machine code.
$g++ -c main.cc # Compile main.cc to machine code.$ g++ square.o main.o  # Link square.o and main.o.


The -c flag tells g++ to compile the file without linking. When you pass .o files to g++ it will link them together.

g++ -c square.cc takes the source code in square.cc, converts in to machine code that can be executed by your computer and finally puts that exectuable code in the square.o file along with some bookkeeping information.

The square function is declared in main.cc but it is not defined in main.cc.

A declaration looks like int square(int x);, it tells the compiler the types of the return values and the arguments of the function. This allows the compiler to type check the function call square(3) without having to know how square is implemented1. Typically declarations will be in a header file.

A definition contains the actual body of the function. In our example, the square function is defined in square.cc.

After main.cc is compiled, the main.o file contains the machine code for the main function and some metadata which records that the square function is declared, but not defined, in main.o.

In the final step, g++ square.o main.o links the object files into an executable program by matching up the function declaration in main.o with the function defined in square.o.

Most libraries will have many source files, and therefore many .o files. When distributing a library on the internet, it is typical for the object files in the library to be bundled together into an archive file ending in the .a extension. A .a file bundles a bunch of .o files together for convenient linking.

When you install a C library on Linux, the headers for the library are typically placed in /usr/local/include and the .a file in /usr/local/lib.

The compiler will automatically look in /usr/local/include for headers (again this only applies to Linux, type g++ -E -Wp,-v - to see the full include path).

To tell the compiler to link with a .a in /usr/local/lib, pass the flag -l<name of library> to the compiler (type g++ -print-search-dirs to see the full link path).

For example, I recently installed FFTW on my computer. It added libfftw3.a to my /usr/local/lib directory. To link with this library I type:

g++ main.cc -lfftw3


Note the lib prefix is omitted, there is no space between -l and fftw3 and the -lfftw3 flag is after the source file which uses it.

IV. Inspecting object files

When researching this blog post, I found it really helpful to actually inspect the object files produced by the compiler. objdump can do this.

First, add some global variables to square.cc to make it more interesting.

To use objdump, type:

$g++ -c square.cc$ objdump --disassemble --full-contents --all-headers --section=.text --section=.rodata --section=.data square.o


The --disassemble flag shows the assembly for our functions, --full-contents shows the contents of each section of the object file in both hex and ascii, all-headers shows the symbol table and sections, --section=.text --section=.rodata --section=.data filters the results to only include the functions we defined, global read-only data and global data.

The output is:

square.o:     file format elf64-x86-64
square.o
architecture: i386:x86-64, flags 0x00000011:
HAS_RELOC, HAS_SYMS

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
0 .text         00000010  0000000000000000  0000000000000000  00000040  2**0
1 .data         00000004  0000000000000000  0000000000000000  00000050  2**2
3 .rodata       0000000e  0000000000000000  0000000000000000  00000058  2**3
SYMBOL TABLE:
0000000000000000 l    d  .text	0000000000000000 .text
0000000000000000 l    d  .data	0000000000000000 .data
0000000000000000 l    d  .rodata	0000000000000000 .rodata
0000000000000000 l     O .rodata	000000000000000e _ZL8greeting
0000000000000000 g     O .data	0000000000000004 x
0000000000000000 g     F .text	0000000000000010 _Z6squarei

Contents of section .text:
0000 554889e5 897dfc8b 45fc0faf 45fc5dc3  UH...}..E...E.].
Contents of section .data:
0000 03000000                             ....
Contents of section .rodata:
0000 48656c6c 6f2c2077 6f726c64 2100      Hello, world!.

Disassembly of section .text:

0000000000000000 <_Z6squarei>:
0:	55                   	push   %rbp
1:	48 89 e5             	mov    %rsp,%rbp
4:	89 7d fc             	mov    %edi,-0x4(%rbp)
7:	8b 45 fc             	mov    -0x4(%rbp),%eax
a:	0f af 45 fc          	imul   -0x4(%rbp),%eax
e:	5d                   	pop    %rbp
f:	c3                   	retq

Disassembly of section .data:

0000000000000000 <x>:
0:	03 00 00 00                                         ....

Disassembly of section .rodata:

0000000000000000 <_ZL8greeting>:
0:	48 65 6c 6c 6f 2c 20 77 6f 72 6c 64 21 00           Hello, world!.



The first line

square.o:     file format elf64-x86-64


tells us that the file is in the Executable and Linkable Format (ELF). This is the default object code format on Linux. On macOS it is Mach-O and on Windows there is COM, PE and PE32+.

The object file is organized into sections. Metadata about the sections are displayed in a table.

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
0 .text         00000010  0000000000000000  0000000000000000  00000040  2**0
1 .data         00000004  0000000000000000  0000000000000000  00000050  2**2
3 .rodata       0000000e  0000000000000000  0000000000000000  00000058  2**3


The symbol table is a table which contains metadata about every global variable and function. For example, the entry _Z6squarei is the entry for the square function. The compiler transforms the names of functions in a process called name mangling to encode type information (and potentially other data such as which namespace the function was declared in) into the function name. This ensures that even if we declare two different functions with same name such as int square(int x) and double square(double x), every entry in the symbol table will have a unique name. You can add the flag --demangle to objdump to make the names more human-readable.

SYMBOL TABLE:
0000000000000000 l    d  .text	0000000000000000 .text
0000000000000000 l    d  .data	0000000000000000 .data
0000000000000000 l    d  .rodata	0000000000000000 .rodata
0000000000000000 l     O .rodata	000000000000000e _ZL8greeting
0000000000000000 g     O .data	0000000000000004 x
0000000000000000 g     F .text	0000000000000010 _Z6squarei


The contents of each of the sections is shown in hexadecimal and ASCII. The .text section contains the machine code for the square function, the .data section contains our global variable x (which has value 3) and the .rodata section (read-only data) has our greeting variable (which has value “Hello, world!”).

Contents of section .text:
0000 554889e5 897dfc8b 45fc0faf 45fc5dc3  UH...}..E...E.].
Contents of section .data:
0000 03000000                             ....
Contents of section .rodata:
0000 48656c6c 6f2c2077 6f726c64 2100      Hello, world!.


The assembly code for the square function is shown at the end of the output.

Disassembly of section .text:

0000000000000000 <_Z6squarei>:
0:	55                   	push   %rbp
1:	48 89 e5             	mov    %rsp,%rbp
4:	89 7d fc             	mov    %edi,-0x4(%rbp)
7:	8b 45 fc             	mov    -0x4(%rbp),%eax
a:	0f af 45 fc          	imul   -0x4(%rbp),%eax
e:	5d                   	pop    %rbp
f:	c3                   	retq


V.

Many statically typed languages such as C, Rust and Swift follow the same model as C++: separate compilation of source files followed by linking. This means you can call e.g. C functions from Swift by compiling the human-readable source files into object files and linking them together. It’s useful to know about linking if you want to interop between C and more modern languages.

Even if you just stick to C++, some of the darker corners of language and the more inscrutable error messages from the compiler are due to C++’s linking model. Having a good mental model of linking and compilation can save hours of debugging.

P.S. To keep things simple I didn't discuss link-time optimization at all in this post. For a great overview of link-time optimization and its implementation in LLVM, see Teresa Johnson's talk ThinkLTO: Scalable and Incremental Link-Time Optimization. (This is one of my favorite CppCon talks ever!)

Comment on Hacker News

1 C does not require explicit declarations, C++ does. A C program calling a function that has not been declared will not compile with a C++ compiler.