Eigenstate : Writing Makefiles

Notes on Writing Makefiles

Make is an ancient part of Unix, said to have been present even when dinosaurs roamed the earth. Today, it is still widely used, shipped with every Unix out there in one form or another.

While it's certainly ancient, warty, and far from perfect, it is still a good choice for building software and running dependent commands.

There are many, many tutorials in the wild describing the basic use of Make. This document won't try to duplicate them. Instead, I would like to lay out some suggestions on how to write clean, maintainable, predictable makefiles. These rules can be summarized as follows:

Follow Conventions
Don't surprise people. Instead, provide a familiar interface.
Factor out common code
Make has acceptably good support for factoring rules out into libraries. Use these facilities.
Autogenerate Dependencies
Nothing's more annoying than a makefile that fails to rebuild code when it needs to. Writing out dependencies by hand is not only tedious, but also error-prone. Compilers can often integrate with make to solve this issue.
Disable magic
Make comes with a large number of default behaviors that can lead to surprising results. It's better to keep things explicit. If you put your rules into a library, it doesn't cost much in verbosity.

Make has plenty of limitations, of course. For example, if you need to build something that needs dependencies before the build can even begin, the only way to achieve this easily with make is to write a tool that will scrape the dependencies separately. And the dependency on a shell for running commands makes make an awkward tool if you want to use Windows, for which I would seriously look into cmake, or another similar build generation tool.

There are also lots of subtle points and cruft which have built up in make over the years, making it a not especially pretty option in many cases. But it's there, it does the job, and the replacements aren't exactly stellar either.

Tutorials

I'm not interested in giving a full tutorial or reference on make, as others have done it before, and have written it better than I could.

For an introduction to make, take a look at:

Makefile Tutorial
GNU Make Manual

An Aside: Code You Can Use

I've more or less turned the hairy bits of the makefile definition into a library that I use for most of my projects. It handles many of the more annoying pieces, implements the usual targets, and comes with a script to handle the most basic bits of configuration.

The implementation is relatively limited, directed towards producing one artifact per directory, but it serves my needs. It should be portable to most modern Unixes, and is regularly used on Linux, FreeBSD, and OSX.

My makefiles tend to look something like this:

INSTBIN=binary
OBJ= \
    obj.o \
    list.o \

DEPS=../lib/libconvenience.a

include ../config.mk
include ../mk/c.mk

Where all of the options for the build are in config.mk, and the rules for building and installing C, C++, and so on are in a generic library in mk/c.mk, and config.mk is generated by ./configure.

The code for it lives on github, at https://github.com/oridb/mk

Follow Conventions

Usual Targets

If you're writing makefiles, you should follow established conventions to whatever extent is reasonable. Your makefiles should support the usual targets:

all
Build all targets. Some people like to put these built files into a directory named build/, others like to put them into the same directory as the source files. I'm neutral on that issue. all should be the default target.
clean
Remove all built and autogenerated files. There's not much to be said for this one. This just undoes make all. If your makefiles are done correctly, it should be needed very rarely.
check
This runs the tests that you shipped with your code. You have tests, right?
install
This copies the binaries and data files to the right locations. There are two configuration knobs that this target typically supports: prefix and DESTDIR. More on this later.
uninstall
Uninstall, unsurprisingly, undoes the install target, removing files from the configured install directories.
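
As a rough sketch of how these targets fit together, a small single-binary project might start from a skeleton like the one below. The names prog, prog.c, and run-tests.sh are placeholders invented for illustration; install and uninstall are omitted here.

```make
# Sketch only: prog, prog.c, and run-tests.sh are hypothetical names.
.PHONY: all clean check

# The first target in the file is the default goal, so `all` comes first.
all: prog

prog: prog.o
	$(CC) -o $@ prog.o

prog.o: prog.c
	$(CC) $(CFLAGS) -c -o $@ prog.c

# Build everything, then run the (hypothetical) test script.
check: all
	./run-tests.sh

# Undo `make all`.
clean:
	rm -f prog prog.o
```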

Environment Variables

Makefiles should respect a number of environment variables. Not all of them apply to every target, but those that do should be honored.

DESTDIR
The directory to stage the install into. A full description of how this interacts with make install is in the section on installation below.
CC
The C compiler to use. This should not be overridden explicitly in the makefile, but left as the default for the make version.
CXX
The C++ compiler to use. As with $CC, this should not be overridden explicitly in the makefile.
CFLAGS
The flags passed to the C compiler. These should be appended to whatever mandatory flags are added, and are for the user to add include paths, set warnings, and so on.
CXXFLAGS
Unsurprisingly, the same as CFLAGS, only for C++ code. Again, don't override these flags, just append them.
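
One way to keep mandatory flags separate from the user's flags is to hold them in a variable of their own and expand both in the rule. The variable name REQ_CFLAGS is made up for this sketch.

```make
# Sketch only: REQ_CFLAGS is a hypothetical name for flags the build
# cannot work without.
REQ_CFLAGS = -std=c11

# CC and CFLAGS are never assigned here, so values from the environment
# or the command line are used as given.
%.o: %.c
	$(CC) $(REQ_CFLAGS) $(CFLAGS) -c -o $@ $<
```

Invoked as make CC=clang CFLAGS='-O2 -Wall' foo.o, this would run clang with -std=c11 -O2 -Wall. Assignments on the make command line take precedence over ordinary assignments in the makefile, which is another reason not to assign CFLAGS there.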

Configuration

Configuration is typically done with a small program, generally written in shell script, named configure. I typically prefer to write my code portably, instead of probing for large numbers of system specific workarounds, which means that I can feasibly keep the script simple. A static config file is another option, and not a bad one, but I tend to stick with a configure script for familiarity.

It's also possible to use GNU Autoconf for this. This is a large, hairy, and relatively ugly program, and generates a shell script with the same attributes, but it isn't fundamentally tied at the hip to Automake. If you need to probe for the quirks of every Unix under the sun, Autoconf has you covered.

Installation

make install should install to the usual directories on Unix, with $prefix prepended. $prefix/bin for executables, $prefix/lib for shared and static libraries, $prefix/share for data that is not executable, and is shared between architectures (icons, etc). Hard coded paths, as are typical on Unix, should include this prefix.

prefix is generally configured by running:

./configure --prefix=prefix

make install should also respect the DESTDIR environment variable, which redirects installation to another directory without affecting the prefix. This is used when packaging code for distributions. So, for example, if you configure with --prefix=/my/prefix, and install with DESTDIR=~/staging, then you should end up with files installed to ~/staging/my/prefix/bin/example.
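
A minimal sketch of install and uninstall rules that honor both knobs might look like this. The binary name example comes from the paragraph above; the bindir variable and the /usr/local fallback are assumptions of this sketch, and in practice prefix would be set by the configure step.

```make
# Sketch only. prefix normally comes from config.mk; /usr/local is an
# assumed fallback, and bindir is a name chosen for this example.
prefix ?= /usr/local
bindir = $(prefix)/bin

.PHONY: install uninstall

# DESTDIR is prepended to every install path, but never baked into the
# binary. It is empty for a normal install.
install: example
	mkdir -p $(DESTDIR)$(bindir)
	install -m 755 example $(DESTDIR)$(bindir)/example

uninstall:
	rm -f $(DESTDIR)$(bindir)/example
```

Assuming example has already been built by the rest of the makefile, running make install prefix=/my/prefix DESTDIR=$HOME/staging with this fragment would copy the binary to $HOME/staging/my/prefix/bin/example.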

For a full description, the Filesystem Hierarchy Standard lays out exactly where every file should go.

Factor out Common Code

Make has surprisingly adequate facilities for writing generic makefiles. Patterns can be defined to apply similar rules over many files, and includes can be used to share these generic rules between many different makefiles.

Special Variables: A Refresher

Make has a number of special, automatically defined variables that you can use in rules. I'm not going to define all of them here, but I'm going to give a refresher on the most common ones.

$@
The output target.
$*
The stem of the pattern in the rule, ie, the part that the '%' matched.
$<
The first dependency in the dependency list. This is useful if, for example, you have a .o file and a number of headers.
$^
The list of all dependencies.
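
To see these variables side by side, a throwaway rule that only echoes them can help. This is a sketch for experimentation, not a real build rule; common.h is a hypothetical header, and both foo.c and common.h are assumed to exist.

```make
# Sketch only: prints the automatic variables instead of compiling.
%.o: %.c common.h
	@echo "target:           $@"
	@echo "stem:             $*"
	@echo "first dependency: $<"
	@echo "all dependencies: $^"

# Running `make foo.o` would print:
#   target:           foo.o
#   stem:             foo
#   first dependency: foo.c
#   all dependencies: foo.c common.h
```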

Pattern Rules

Pattern rules are your friend, and are the core of abstraction in Makefiles. To refresh, you can define a pattern in the following form:

%.suffix: %.prefix other dependencies
        command $@ $<

where '%' matches any stem. $@ expands to the output file, and $< expands to the first input file. $^ is similar to $<, but it includes all of the dependencies.

So, for example, if I wanted to define a generic rule to build a .c file into a .o file, I might write something like:

%.o: %.c %.h whatever.h
	$(CC) -c -o $@ -std=c11 $(CFLAGS) $<

This conforms to the rules above. $(CC) is used, and $(CFLAGS) is appended to the mandatory cflags instead of overriding them.

Generate Dependencies

A large number of tools know how to interoperate with make, generating fragments of makefile that will accurately capture dependencies. With most Unix compilers, the flags to generate dependency information are the -M{stuff} family of flags. I tend to use -MMD -MP -MF .deps/filename.d, which will, as a side effect of the first compilation, generate a dependency file named .deps/filename.d. This must be included in order to have an effect on future compilations. So, for example, adding to the previous rule:

%.o: %.c
	$(CC) -c -o $@ -std=c11 -MMD -MP -MF .deps/$*.d $(CFLAGS) $<

-include .deps/*.d

You'll notice how the headers were dropped from the dependency list. This is because the first build never needs to determine staleness, and the autogenerated makefile fragment will include dependencies on all headers needed for subsequent builds.

The -include line is the line that includes all fragments for autogenerated dependencies. The -include variant is used so that the lack of dependencies will not cause a problem for the initial build.
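
For a sense of what gets included, here is roughly what the generated fragment could contain for a hypothetical foo.c that includes foo.h and util.h. The file names are invented, and the exact formatting varies by compiler.

```make
# Hypothetical contents of .deps/foo.d, as written by -MMD -MP.
# -MMD lists the source and the non-system headers it includes.
foo.o: foo.c foo.h util.h

# -MP adds an empty rule per header, so that deleting a header does not
# make the build fail with "No rule to make target".
foo.h:

util.h:
```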

Disable Funny Business

Make has a whole range of implicit rules, behaviors, and actions. We should probably turn off some of the more surprising bits. I usually turn off implicit rules, since it's far easier to debug what's going on when you only have to deal with the rules you defined. To turn them off, define an empty pseudorule:

.SUFFIXES:
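
GNU Make also ships a few built-in pattern rules that, as far as I know, an empty .SUFFIXES does not touch. A possible variant, assuming GNU Make and not something the setup described here relies on, is to request the equivalent of make -r from inside the makefile as well:

```make
# GNU Make only (assumption: other make dialects may not accept this).
# Equivalent to running `make -r`: no built-in implicit rules are loaded.
MAKEFLAGS += --no-builtin-rules

# Still clear the suffix list, as above.
.SUFFIXES:
```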

Then, I tend to turn off the auto-removal of intermediate files. This can trigger spurious rebuilds if you are building, for example, test targets that end up creating implicit rules. To turn that off, add another fake rule.

.SECONDARY:

Include Generic Make Code

There isn't too much to say here. If you have rules that you can make generic, you put them into a separate makefile, and include it. The only rough point is that there is no especially awesome way to pass information to the rules in the included makefiles, other than variables.

An Example

A makefile I've seen out in the wild is below. It's not eye-bleedingly bad, and far worse can be found. It's got the virtues of simplicity, but it also has a number of problems. Take some time, study it, and think about what you can improve.

# ---- Configuration options ----

CC = g++  # g++ or clang++ are both good
CXXFLAGS = -std=c++11  # Mandatory flags
CXXFLAGS += -Wall -O1

# ---- Global targets ----

EXECUTABLES = Base58CheckTest CurvePointTest EcdsaTest \
    FieldIntTest Ripemd160Test Sha256HashTest Sha256Test \
Sha512Test Uint256Test

# Default build target. This compiles all the test executables and runs them.
all: $(EXECUTABLES)
    for name in $(EXECUTABLES); do echo ./$$name; ./$$name; done

clean:
    rm -f -- $(EXECUTABLES) *.o

.PHONY: all clean


# ---- Executable files ----

Base58CheckTest:  Base58CheckTest.cpp  Base58Check.o \
    Utils.o Sha256Hash.o Sha256.o Uint256.o
    $(CC) $(CXXFLAGS) -o $@ $^

CurvePointTest:   CurvePointTest.cpp   Utils.o Uint256.o FieldInt.o CurvePoint.o
    $(CC) $(CXXFLAGS) -o $@ $^

<a bunch of other tests>

# ---- Module files ----

Base58Check.o: Base58Check.cpp Base58Check.hpp
    $(CC) $(CXXFLAGS) -c -o $@ $<

CurvePoint.o: CurvePoint.cpp CurvePoint.hpp
    $(CC) $(CXXFLAGS) -c -o $@ $<

<a bunch of other inputs>

Let's Fix It

There are a few things that jump out at me when I look at this.

In order to facilitate separating out the makefiles into something that is reusable, we might start by defining the inputs and outputs as variables:

LIB=bitcoincrypto
LIBOBJ=Base58Check.o CurvePoint.o Ecdsa.o FieldInt.o \
    Ripemd160.o Sha256.o Sha256Hash.o Sha512.o Uint256.o Utils.o 
TESTS=Base58CheckTest CurvePointTest EcdsaTest FieldIntTest \
      Ripemd160Test Sha256HashTest Sha256Test Sha512Test Uint256Test

Since this is trying to build a library, let's start by defining a generic rule for .o files, with dependency generation.

%.o: %.cpp .deps
    $(CXX) $(CXXFLAGS) -c -o $@ $< -MMD -MP -MF .deps/$*.d

.deps:
    mkdir .deps

-include .deps/*.d

There are also a whole bunch of binaries being built, with more or less the same options. Since the goal here is to build something that can be used as a library, we may as well actually build the library, and then use it for all of the binaries:

lib$(LIB).a: $(LIBOBJ)
    $(AR) -rcs $@ -- $(LIBOBJ)

And add in a generic rule for the binaries themselves:

%: %.o lib$(LIB).a
    $(CXX) -o $@ $< -L. -l $(LIB)

Then we can add in a rule for everything:

all: lib$(LIB).a $(TESTS)

We also want to follow the conventions, adding in clean and check rules:

all: lib$(LIB).a $(TESTS)

clean:
    rm -f -- lib$(LIB).a $(LIBOBJ) $(TESTS) $(TESTS:=.o)

check: $(TESTS)
    for name in $(TESTS); do echo ./$$name; ./$$name; done

Now, we also had been overriding $(CC) with a C++ compiler, which is wrong on two counts: CC is for a C compiler, and we should not be overriding the user's choices. Let's delete that line, and stop overriding CXXFLAGS while we're at it.

The Final Result

We can put this into two different files. In 'Makefile', we put the following code:

LIB=bitcoincrypto
LIBOBJ=Base58Check.o CurvePoint.o Ecdsa.o FieldInt.o \
    Ripemd160.o Sha256.o Sha256Hash.o Sha512.o Uint256.o Utils.o 
TESTS=Base58CheckTest CurvePointTest EcdsaTest FieldIntTest \
      Ripemd160Test Sha256HashTest Sha256Test Sha512Test Uint256Test

include mk/lib.mk

Then, in mk/lib.mk we can put the following:

# Append instead of clobber.
CXXFLAGS += -std=c++11  # Mandatory flags
CXXFLAGS += -Wall -O1

# All the targets
all: lib$(LIB).a $(TESTS)

%: %.o lib$(LIB).a
    $(CXX) -o $@ $< -L. -l $(LIB)

lib$(LIB).a: $(LIBOBJ)
    $(AR) -rcs $@ -- $(LIBOBJ)

clean:
	rm -f -- lib$(LIB).a $(LIBOBJ) $(TESTS) $(TESTS:=.o)

check: $(TESTS)
    for name in $(TESTS); do echo ./$$name; ./$$name; done

.PHONY: all clean check
.SUFFIXES:
.SECONDARY:
%.o: %.cpp .deps/stamp
	$(CXX) $(CXXFLAGS) -c -o $@ $< -MMD -MP -MF .deps/$*.d

# subtle: directories are listed as changed when entries are
# created, leading to spurious rebuilds.
.deps/stamp:
	mkdir -p .deps && touch .deps/stamp

-include .deps/*.d

An Aside: Dialects

Make is an old language, spoken by many. And like most languages with many speakers living in many different parts of the world, there are a huge number of dialects that are spoken.

GNU Make

GNU Make seems to be the most popular dialect of make. It's featureful, ubiquitous, and generally sufficient. It's the default make implementation on Linux and OSX, and available on most other systems as gmake. I'm going to be using GNU Make for the tutorial.

BSD Make

BSD make is, unsurprisingly, the default make program on most BSD implementations. The basic syntax is the same as GNU make, but the syntax for even the most basic extensions, like conditionals, is different. Everything discussed here should be possible to implement in BSD make.

Mk

Mk is not quite make. It's used by default on Plan 9, and is shipped for Unix as part of Plan9port. Being much newer than make, it's got quite a few cleanups over it. I mainly mention it because it served as the inspiration for some of these guidelines.

Others

There are a number of other make implementations which exist in the wild. You are unlikely to run into them. For further reading, there's always a list on Wikipedia.

Further Reading