A tutorial for using make

To be read after the tutorial about the compiler toolchain.

Abstract

The make utility formalizes the problem of compiling several source files as a directed graph. The vertices of the graph are files and the edges are the instructions to transform one file into another. Vertices with only outgoing edges are the source files. Vertices with only incoming edges are the target files. Vertices with both incoming and outgoing edges are intermediary files. The build process creates the target files according to the instructions attached to this graph. The edges of the graph are called rules. The make program comes with a large set of predefined rules, and these rules can be augmented or changed by writing a Makefile.
Disclaimer
The goal of this tutorial is to showcase the timeless beauty of Makefiles, not to give a set of recipes.
The make
program is an extremely elegant UNIX utility. It solves
the problem of building a complex program given a formally described set of
dependencies. With the -j
option, it builds as many files as
possible in parallel.
In this tutorial we explain several use cases. We start with the simplest possible case (compiling a single file), and we move to more complicated cases (like compiling several files, or linking against libraries).
It is not necessary to write any makefile to use make; for example, when we want to build a single file. We start with this simple example, which already illustrates almost all the features of make.
Imagine that you want to compile a simple program hello.c

#include <stdio.h>
int main(void) { return printf("hello world\n"); }
Then you run
make hello
And it will compile your program (by running cc hello.c -o hello
).
This is the whole story. Using make is always the same: you ask make to build one file, and it tries to do it in the most meaningful way. Everything else is minor details, like setting the compiler options; or minor variations, like building more than one file.
In the previous example, note that running make hello
a second
time does nothing. Since the file hello
is already built, there's
nothing else to do. The program make
is idempotent: running
make
two times is the same as running it once. This is a
fundamental property. How does the program know that there is no more work to
do? Because it looks at the dates of each file.
Thus, to force a recompilation, you can change the date of the source code
(e.g., by running touch hello.c
) and then make
will
build it again.
Now try the following:
rm -f hello    # delete the compiled file
make hello.o   # will run "cc -c hello.c -o hello.o"
make hello     # will run "cc hello.o -o hello"
Notice that, when the file hello.o already exists, make knows that the source is already compiled, and it only needs to link it to produce a final executable. It always tries to build the requested file with the least possible amount of work. If you touch hello.o and then run make hello, it will only do the linking, because the file hello is older than hello.o.
How does make know what to build from what? Because it has a secret list of implicit rules that tell it so. These rules form a directed graph, which in our case is the following:

          cc -c hello.c -o hello.o            cc hello.o -o hello
hello.c --------------------------> hello.o ----------------------> hello
The nodes of this graph are the filenames. The edges of this graph are the
instructions to create one file from another. When you run make
hello
, the program checks if the requested file appears in the graph,
and finds a directed path from a file that exists to the requested file.
Then, the program runs the instructions corresponding to each edge in the
path. It only runs the part of the path from the file that is newer than the
requested target, if any.
This is all that make
does.
The only control left to the user is the specification of the graph.
You can set variables to fine-tune the build process. The most important
variable is CC
, that specifies the C
compiler, and
the default value is typically "cc"
. There are two
ways to change the value of a variable (without writing a Makefile): either you set an environment variable of the same name, or you specify it as argument to the make
invocation.
# first technique: give argument to make
make CC=clang hello    # compile "hello" using clang

# second technique: specify environment variable
export CC=clang
make hello             # compile "hello" using clang
The following table summarizes the most important variables:

| variable | meaning | default | example assignment |
|---|---|---|---|
| CC | C compiler | cc | tcc |
| CXX | C++ compiler | c++ | clang |
| CFLAGS | flags for the C compiler | | -O2 -Wall |
| CXXFLAGS | flags for the C++ compiler | | -O2 -Wall |
| CPPFLAGS | flags for the preprocessor | | -I /path/to/my/includes -DMACRO=value |
| LDFLAGS | flags for the linker | | -L /path/to/my/libs |
| LDLIBS | libraries for the linker | | -lmylib |
Variables are very powerful. For example, the following shell script builds the same program using five C compilers and different compiler options (debug and release mode):
for i in gcc clang tcc icc suncc; do
	make CC=$i CFLAGS="-Wall -g" hello
	mv hello hello_debug_$i
	make CC=$i CFLAGS="-Wall -O3" hello
	mv hello hello_release_$i
done
This script creates 10 different executables with all the compilers and all the compiler options (assuming that the named compilers are installed). This kind of script is useful to check that your program gives zero warnings for a large set of compilers and compiler options. And we have not written a Makefile yet!
Making without makefiles may be an interesting exercise, but it is not more practical than calling the compiler directly. The real interest of make is that it allows you to compile many files into one program—or many programs—in a single stroke.
This is how you would specify the graph of the example above in the
Makefile
language:
hello.o: hello.c
	cc -c hello.c -o hello.o

hello: hello.o
	cc hello.o -o hello
That's a complete Makefile
.
It is a list of rules. Each rule has the following form:
target: source1 source2 ... sourcen
	instructions to build target from sources
Very important: the instructions are indented using one tab. Spaces will not work.
Once you have written a makefile, you can request to build a target by running
make target
. If you don't specify a target, the first target will
be built.
This simple makefile, where the rules are explicit, is actually less powerful
than an empty makefile that uses the implicit rules. There are two ways in
which it is less powerful: (1) we cannot change the compiler or the flags; and
(2) the name hello
is hardcoded, it does not specify how to build
files with different names. The first problem is solved using variables. The
second problem will be solved later using pattern rules.
You can define variables inside the makefile and use them later in the rules:
COMPILER = gcc
COMPILER_FLAGS = -O3

hello: hello.c
	$(COMPILER) $(COMPILER_FLAGS) hello.c -o hello
Variables defined inside the makefile are taken as default values. They can be overridden by redefining them as arguments in the call to make. Of course, the variable names that we have chosen in this example are preposterous. Here, we should have used the standard names, which are already given default values, so that other people can expect default behaviour from our makefile (such as taking into account their preferred compiler flags):
hello: hello.c
	$(CC) $(CFLAGS) hello.c -o hello
Notice that the makefile above is equivalent to an empty file, because it matches an implicit rule. However, for clarity we will work on this simple example and add more files to build. Until section 3.6, forget about implicit rules.
For now, we have just dealt with a single file to compile. In a typical
case, the source code of a program will span several files (let's say, three
files hello
, options
and lib
). Then,
we can specify the rules for building each file:
hello: hello.o options.o lib.o
	$(CC) hello.o options.o lib.o -o hello

hello.o: hello.c
	$(CC) $(CFLAGS) -c hello.c -o hello.o

options.o: options.c
	$(CC) $(CFLAGS) -c options.c -o options.o

lib.o: lib.c
	$(CC) $(CFLAGS) -c lib.c -o lib.o
Now, running make will call the compiler four times: once for each object file, and then once to link all the objects into one executable. This is embarrassingly parallelizable; indeed, running make -j will launch the compilation of the three object files in parallel.
In the example above, there is a lot of redundancy: the names of the files appear many times. The redundancy can be removed by using local or automatic variables:
| variable | meaning |
|---|---|
| $@ | name of the target |
| $^ | list of all prerequisites |
| $< | the first prerequisite |
| $* | the stem of a pattern rule |
Thus, when the name of the target or of the prerequisites appears inside a rule (which happens almost always), we can simplify the rule using local variables:
hello: hello.o options.o lib.o
	$(CC) $^ -o $@

hello.o: hello.c
	$(CC) $(CFLAGS) -c $< -o $@

options.o: options.c
	$(CC) $(CFLAGS) -c $< -o $@

lib.o: lib.c
	$(CC) $(CFLAGS) -c $< -o $@
The makefile that we have just written exhibits a higher-level type of redundancy: the rules themselves are all the same! Moreover, the names of each target and its prerequisite are the same, differing only by extension. Pattern rules allow us to express this kind of redundancy:
hello: hello.o options.o lib.o
	$(CC) $^ -o $@

%.o: %.c
	$(CC) $(CFLAGS) -c $< -o $@
The character %
is a placeholder for an arbitrary string. If you
request to build a file with the extension .o
, then this rule will
match, and it will try to build the .o
file from the
.c
file in the indicated way.
In the makefile above, notice that the rules for building the object files are unnecessary, because they do exactly the same thing as the implicit rules. Thus, an equivalent makefile is the following.
hello: hello.o options.o lib.o
	$(CC) $^ -o $@
Rather short, isn't it? You just say that you need file.o
, and
the implicit rules take care of building it from file.c
.
Now we are in the rarefied atmosphere of theories of excessive beauty and we are nearing a high plateau on which geometry, optics, mechanics, and wave mechanics meet on a common ground. Only concentrated thinking, and a considerable amount of re-creation, will reveal the full beauty of our subject in which the last word has not been spoken yet. —Cornelius Lanczos, The Variational Principles of Mechanics.
The two-line makefile above can be further shortened to this thing of beauty:
hello: hello.o options.o lib.o
This is a complete makefile, equivalent to the examples given above. How is that even possible? What kind of sorcery is going on here?
This works because the multiple prerequisites of the same target can be stated on separate lines, and they are simply accumulated (though only one rule per target may contain instructions). Thus, without using implicit rules and writing the prerequisites separately, this is equivalent to the following:
hello: hello.o
hello: options.o
hello: lib.o

%: %.o
	$(CC) $^ -o $@

%.o: %.c
	$(CC) $(CFLAGS) -c $< -o $@
When we put all the prerequisites on the same line and expand all the patterns, we recover EXACTLY the same text as in section 3.3.
We have talked before about a "secret" list of implicit rules. Actually, there is nothing secret about it. The implicit rules are defined explicitly by pattern rules and they look exactly like the last two patterns of the previous section. To look at the complete list of implicit rules run the following command:
make -p -f /dev/null > implicit_rules.txt
This will create a text file with the list of all implicit rules (and much other information). Running make without a makefile is exactly equivalent to using this file. Now, this file may seem overwhelming; it is probably very long, because make uses a lot of heuristics, and they are all specified here. But somewhere in the middle it must contain lines that look more or less like this:
%: %.o
	$(CC) $^ -o $@ $(LDLIBS)

%.o: %.c
	$(CC) $(CFLAGS) -c $< -o $@
...which is just the two pattern rules of section 3.6. See section 6.4 below for a more complete view of the default pattern rules.
It is highly recommended to print the list of implicit rules for your make setup, and read it thoroughly. Even if it is long, it is nothing more than a sequence of variable assignments and pattern rules.
It is not necessary that a rule creates any file; make will not verify it anyway, so you can run all sorts of crazy stuff in the instructions. The most typical is to have a clean target that, instead of creating a file called "clean", simply removes all the generated files. Or you can have a check target that runs the unit tests of your code.
This is then our fancy makefile:
OBJ = main.o options.o lib.o

hello: $(OBJ)
	$(CC) $(OBJ) -o hello

clean:
	rm -f $(OBJ) hello

check: hello
	./hello -test
Notice that the clean target has no dependencies. The check target has the file hello as a dependency, so it will compile hello if needed.
NOTE: if you have files named clean or check then all hell will break loose. To protect against this risk, you can precede these targets with a line that says ".PHONY: clean check". But I like to live on the edge.
Given the makefile above, the following shell script builds the program using five C compilers, and runs the test suite for each of the resulting executables, both in debug and in release mode (for a total of 10 checks of the test suite):
for i in gcc clang tcc icc suncc; do
	for m in "-O3 -DNDEBUG" "-g"; do
		make clean check CC=$i CFLAGS="$m"
	done
done
If you want to be really neat, design the test suite so that it is silent upon
success, and add the -s
(silent) option to make
.
Then, the script will only produce output when something fails.
Try doing that with cmake
!
In an ideal world (from the point of view of the makefile writer), your program is written from scratch using an old standard of the programming language, and it uses no external libraries. In practice, however, your program may rely on some modern features of the language—that require compiler flags—and it may need external libraries which are installed under strange names. Also, the dependencies between the source files may be somewhat convoluted, and writing them by hand is error-prone. Let's see what we can do about all these problems.
The program make will never try to find where external libraries are located; it is just not its job. In theory, this is not a problem at all. If your program requires, e.g., the libtiff library, then you simply add -ltiff to the compilation line.
The following makefile compiles a program that requires libtiff:
OBJ = main.o options.o lib.o

hello: $(OBJ)
	$(CC) $(CFLAGS) $(OBJ) -o hello -ltiff
This will work correctly as long as libtiff is installed on your system. What does it mean, exactly, "to be installed on your system"? Well, by definition, it means that this makefile works! More precisely, it means that the following three things are true:

1. When you write #include <tiffio.h> in your source code, the preprocessor finds this include file.
2. When you add -ltiff to the compilation line, the linker is able to find the library file.
3. When you run the compiled program, the dynamic loader finds libtiff.so in your system.
For example, if the program hello
of section 3.7 requires
libtiff
, then this is a complete makefile for compiling the
program
LDLIBS = -ltiff

hello : hello.o options.o lib.o
This works because $(LDLIBS)
is used in the implicit rule for
linking objects.
For GNU and BSD systems, libraries are often correctly installed: once you install a library using the package manager, this library becomes available to the compiler without further ado. In case the library is not installed, the compilation will produce a clear error message, which I suppose is the desired behaviour.
In other situations (e.g., bizarre systems without package managers like OSX, or user-installed libraries), you may want to use a library that is not "correctly installed" according to the three points above. Then, the solution is to correctly install it! This can be done by setting three environment variables:

1. Add the path of the include files to the variable CPATH.
2. Add the path of the library files (.so or .a) to the variable LIBRARY_PATH.
3. Add the path of the shared library files to the variable LD_LIBRARY_PATH.
Once these three variables are set, the library is correctly installed, and your makefile can be run. Notice that this task is independent of the usage of the makefile; the task is part of the installation of the library, for systems that do not have a decent package manager. These variables are recognized by all compilers that I know of (GCC, CLANG, TCC, INTEL and SUN compilers).
It is strongly advised to write portable code that compiles out of the box in any system. Today, this is much, much, easier than a few years ago because most systems are POSIX-compliant (with slightly different versions of the POSIX standard, though). Thus, horrible ``portability'' tools like automake, autoconf, and cmake are mostly unnecessary.
Still, there are a few situations where your code with a simple makefile is not straightforwardly portable to all the systems that you may want to support. I show two examples: how to compile ANSI C on older versions of GCC, and how to ``enable'' OpenMP on the platforms where it is available. Once you understand these hacks, you can easily rewrite them for other situations.
The basic idea is to run a shell command whose output is empty or not, according to the condition you want to check, and then capture this output from within a $(shell ...) directive:
# The following hack allows to compile modern ANSI C (C99 and newer) on
# very old and unmaintained versions of the gcc compiler (older than
# gcc 5.1, released in April 2015). These old versions of GCC are able
# to compile C99 and C11, but some features are not enabled by default, thus
# the hack enables these features explicitly if the compiler seems to
# be pre-ANSI C. The clang compiler does not typically need such a hack.
#
# hack for older compilers (adds gnu99 option to CFLAGS, if necessary)
ifeq (,$(shell $(CC) $(CFLAGS) -dM -E - < /dev/null | grep __STDC_VERSION__))
CFLAGS := $(CFLAGS) -std=gnu99
endif
# use OpenMP only if not clang
ifeq (0,$(shell $(CC) -v 2>&1 | grep -c "clang"))
CFLAGS := $(CFLAGS) -fopenmp
endif
If you have a favorite Makefile hack, you can send it to me and I will add it here.
Often, automatic generation of dependencies is unnecessary: each object file depends on the corresponding source file (which has the same name but a different extension). This case is already covered by the implicit rule %.o:%.c.
However, there are still some situations where there are other dependencies between source files. For example, when you #include another file from your source code. In principle, since the file is explicitly included, it does not need to appear on the command line, and the compilation will be successful without explicitly stating this dependency. Thus, the short answer is that even in that case, this dependency need not be known by make.
But when you are developing code, the situation is different: if you change the included file, you may want to recompile the object. Thus, make must know about this dependency.
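In the simplest case, one extra line in the makefile states the dependency; the recipe still comes from the implicit rule. A sketch (constants.h is a hypothetical header included by hello.c):

```make
# hello.o must be rebuilt whenever constants.h changes
# (constants.h is a hypothetical header included by hello.c)
hello.o : hello.c constants.h
```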
Fortunately, most compilers accept the -MM option, which prints the list of files included by a source file, conveniently formatted as makefile dependencies. The following makefile deals with this automatically:
# regular makefile stuff
LDLIBS = -ltiff
hello : hello.o lib.o options.o

# generation and inclusion of missing dependencies
deps.mk :
	$(CC) -MM $(shell ls *.c) > deps.mk

-include deps.mk
This is the case that we have solved above:
BIN = hello
OBJ = hello.o options.o lib.o

$(BIN) : $(OBJ)

clean :
	$(RM) $(BIN) $(OBJ)
This can also be done using only implicit rules.
BIN = foo bar baz

all: $(BIN)

clean:
	$(RM) $(BIN)
When you have a set of object files common to a set of separate executables.
BIN = foo bar baz
OBJ = lib1.o lib2.o lib3.o

default : $(BIN)

$(BIN) : $(OBJ)

clean :
	$(RM) $(BIN) $(OBJ)
This is just like the previous case but putting all the objects inside a static library for ease of linking.
BIN = foo bar baz
OBJ = lib1.o lib2.o lib3.o
LIB = libmine.a

default : $(BIN)

$(BIN) : $(LIB)

$(LIB) : $(LIB)($(OBJ))

clean :
	$(RM) $(BIN) $(LIB) $(OBJ)
You can ``enhance'' the example above with compiler options and additional libraries to obtain a very general Makefile:
# user-editable configuration
CFLAGS = -march=native -Os

# required libraries
LDLIBS = -lpng -ljpeg -ltiff

# files
BIN = foo bar baz
OBJ = lib1.o lib2.o lib3.o

# default target: build all the binaries
default : $(BIN)

# each binary depends on all the object files
$(BIN) : $(OBJ)

# bureaucracy
clean : ; $(RM) $(BIN) $(OBJ)
.PHONY : default clean
Notice that there is no harm at all in linking unused object files; the linker will simply ignore symbols that are unused. The same goes for libraries given as -llib.
pkg-config and the like
The pkg-config
tool provides a way for people to use libraries
that are not fully installed on their system. This is a simple program
that prints whatever horrible flags are necessary for compiling and linking
against the library. It has three relevant options, here shown with their
output on my system (for lib-poppler):
pkg-config poppler --cflags    # prints -I/usr/include/poppler
pkg-config poppler --libs      # prints -lpoppler
pkg-config poppler --version   # prints 0.26
The output of pkg-config is straightforward to use inside Makefiles.
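For example, assuming the poppler package ships a .pc file on your system, two lines pick up the right flags; the variable names below are the standard ones consumed by the implicit rules:

```make
# -I flags go to the preprocessor, -l flags go to the linker
CPPFLAGS += $(shell pkg-config poppler --cflags)
LDLIBS   += $(shell pkg-config poppler --libs)
```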
Other packages, such as gdal, prefer to avoid the standard pkg-config system and provide their own gdal-config with similar behaviour. Thus, if your program requires support for gdal, you simply do the following:
# variables
CFLAGS = -march=native -Os $(shell gdal-config --cflags)
LDLIBS = $(shell gdal-config --libs)

# files
BIN = foo bar baz
OBJ = lib1.o lib2.o lib3.o

# default target: build all the binaries
default : $(BIN)

# each binary depends on all the object files
$(BIN) : $(OBJ)

# bureaucracy
clean : ; $(RM) $(BIN) $(OBJ)
.PHONY : default clean
All the examples given above work on a flat directory structure. This simplification allows us to harness the full power of implicit rules. If you want to work with more complex directory structures, you will have to write the patterns yourself. Here we show an example of separate source and output directories (for the many objects/many executables case):
# files
BIN := prog1 prog2 prog3
OBJ := lib1.o lib2.o lib3.o

# add appropriate prefix to filenames
BIN := $(addprefix bin/,$(BIN))
OBJ := $(addprefix src/,$(OBJ))

# default target
default : $(BIN)

# rule to build each executable
bin/% : src/%.o $(OBJ)
	$(CC) $(LDFLAGS) -o $@ $^ $(LDLIBS)

# bureaucracy
clean : ; $(RM) $(BIN) $(OBJ)
.PHONY : default clean
I would advise to only split your source code into subdirectories when you have a lot of files (say, more than 100).
In what language is a Makefile written? The answer is: in three separate and different languages: the core makefile language, which describes the dependency graph; the shell language, used inside the rules; and the makefile macro language, which is expanded before everything else. Moreover, there is a set of pre-defined macros. This is actually very important, since it allows us to write extremely succinct makefiles.
The core makefile language describes a directed graph explicitly. Each edge is written either using a tab:
to : from
	edge
or a semicolon
to : from ; edge
The vertices are filenames, and the edges are shell instructions.
This core language is extremely portable along all the historical implementations of make.
The edges of the graph, or rules, are written in plain UNIX shell. This text is first pre-processed by the makefile macro language, which replaces the dollar-variables that it finds before sending the text to the shell. Thus, if you want the shell to receive dollar characters, you have to escape them (with another dollar character).
Formally, it is easy to distinguish between the make language and the shell language parts of a Makefile: lines starting with TAB are interpreted by the shell, and the other lines are interpreted by make. This is almost true; the shell is also used inside makefiles in one other place: as the first and only argument of the $(shell ...) directive.
If you really need to, you can change the actual shell used for running the rules by setting the make variable SHELL, for example SHELL=/bin/zsh. But it is strongly advised to use only POSIX shell features.
The makefile macro language is the ``fancy'' part of the Makefile. It is largely non-portable, but equivalent constructions exist between the two main implementations of make: BSD make and GNU make. While it is possible to write makefiles in a portable way, they do not tend to be beautiful (mainly because the implicit rules are slightly different).
You have to think of the macro language as a pre-processor of your makefile, just like the C preprocessor. It expands the macros in your makefile until only core language constructions remain. The following features are available:
# copy verbatim the text of the file, fail if it does not exist
include filename.mk

# copy the text of the file only if the file exists
-include filename.mk
$(VAR)    # value of variable VAR
${VAR}    # alternative syntax
OBJ = $(SRC:%.c=%.o) # change the extensions from .c to .o
VAR = value    # recursive expansion of value
VAR := value   # copy value without recursively expanding

Some other kinds of assignment (can be combined with the colon):
VAR += value            # append
VAR ?= value            # assign only if it does not have a value already
override VAR = value    # override an assignment given in the command line
export VAR = value      # exported as environment variable to the shell

Variables can also be assigned from the environment of the shell that calls make, or explicitly on the command line as arguments of the form VAR=value.
$(MAKEFILE_LIST)    # the file name of the current makefile
$(MAKE)             # the name of the make program
ifeq ($(CC),gcc)
libs = $(libs_for_gcc)
else
libs = $(normal_libs)
endif
SRC = $(shell ls *.c | grep -v notme.c ) # all C files except notme.c
SRC = $(wildcard *.c)                  # all C files
SRC := $(filter-out notme.c,$(SRC))    # all C files except notme.c
a%b : c%d
	instructions to build aXb from cXd
$(LIST) : a%b : c%d
	instructions to build aXb from cXd
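For instance, a static pattern rule can restrict the generic %.o: %.c rule to an explicit list of objects. A sketch with made-up file names:

```make
OBJ = foo.o bar.o baz.o

# applies only to the files in $(OBJ), not to every .o target
$(OBJ) : %.o : %.c
	$(CC) $(CFLAGS) -c $< -o $@
```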
There are many more features available in the GNU macro language. The GNU Make manual has more than 200 pages!
The output of make -p
is indeed overwhelming.
Yet, the only lines of concern for C and C++ are the following:
SHELL = /bin/sh    # shell to run the rules
RM = rm -f         # command to delete files
CC = cc            # default C compiler
CXX = c++          # default C++ compiler
CFLAGS =           # C compiler flags
CPPFLAGS =         # C preprocessor flags
CXXFLAGS =         # C++ compiler flags
LDFLAGS =          # linker flags
LDLIBS =           # libraries

# build an object from a C source file
%.o : %.c ; $(CC) $(CFLAGS) $(CPPFLAGS) -c -o $@ $<

# link an executable from an object
% : %.o ; $(CC) $(LDFLAGS) $^ $(LDLIBS) -o $@

# directly compile and link a C source file
% : %.c ; $(CC) $(CFLAGS) $(CPPFLAGS) $(LDFLAGS) $^ $(LDLIBS) -o $@

# build an object from a C++ source file
%.o : %.cc ; $(CXX) $(CXXFLAGS) $(CPPFLAGS) -c -o $@ $<

# directly compile and link a C++ source file
% : %.cc ; $(CXX) $(CXXFLAGS) $(CPPFLAGS) $(LDFLAGS) $^ $(LDLIBS) -o $@
When you run make without a Makefile, it is just as if this file were already present.
The human hand is perfectly optimized for grabbing a stone and throwing it to the eye of a mammoth. Using it for playing the violin is extremely awkward and unnatural; clearly not what it was made for. Yet, it is a beautiful act. Human civilization is all about using our God-given tools for purposes other than those they were designed for. —Cmdr. Armando Rampas.
Forget about compiling. The make
program is useful in a very
general situation: whenever you want to run a complex pipeline with many
intermediary files. Typically this task is correctly accomplished by writing a
shell script or (god forbid) a python script. However, we show that
using a makefile may be a better idea.
For example, consider the following simple script that registers several images and computes the fusion of them all:
# input images:  i{0..11}.png
# output image:  out_med.png
# intermediate:  i*.sift p*.txt h*.txt reg_*.png

IDX=`seq -w 0 11`
SIZE=`imprintf "%w %h" i00.png`

# compute sift descriptors of each image
for i in $IDX; do
	sift i$i.png > i$i.sift
done

# register each image to the first one
for i in $IDX; do
	siftu pairR 0.8 i00.sift i$i.sift p$i.txt    # match pairs
	ransac hom 1000 1 30 h$i.txt < p$i.txt       # find homography
	homwarp h$i.txt $SIZE i$i.png reg_$i.png     # warp
done

# compute the median value at each position
vecov med reg_*.png -o out_med.png
This is a classical computational photography problem, and this script is a perfectly acceptable way of solving it. Shell scripts are cool. Yet, we will see how to improve it a bit.
This script runs the tasks in series. This is wasteful on a large computer
with, say, 32 cores, because it could be running the tasks in parallel. No
problem, GNU parallel is very easy to use. You simply print the instructions
that you want to run, and pass the resulting text to parallel
:
IDX=`seq -w 0 11`
SIZE=`imprintf "%w %h" i00.png`

# 1. compute sift descriptors of each image
for i in $IDX; do
	echo "sift i$i.png > i$i.sift"
done | parallel

# 2. register each image to the reference one
# 2.1. match pairs
for i in $IDX; do
	echo siftu pairR 0.8 i00.sift i$i.sift p$i.txt
done | parallel

# 2.2. find homography
for i in $IDX; do
	echo "ransac hom 1000 1 30 h$i.txt < p$i.txt"
done | parallel

# 2.3. warp
for i in $IDX; do
	echo homwarp h$i.txt $SIZE i$i.png reg_$i.png
done | parallel

# 3. compute the median value at each position
vecov med reg_*.png -o out_med.png
This version of the script will run all the tasks in parallel and will be much faster.
Is this the best parallelization possible? No. Notice that if, for example, one of the tasks on step 2.1. takes much longer than the others, there will be a long wait between steps 2.1. and 2.2., during which only one processor will be working. How can we solve this problem? In this case it seems easy, we just have to parallelize at a coarser level, sending to GNU parallel lines that contain the whole computation for each file. But if the dependences between files are more complicated, the problem becomes difficult very soon.
Another issue with the first script is that it always runs all the steps. Imagine that you change the fusion criterion in the last line of the script. Then, when you re-run the script all the steps are performed, but this is wasteful because all intermediary files are identical. The typical solution to this problem is to COMMENT all the script except the lines that you want to re-run. But of course this is ugly. A slightly better option is to check whether the updated files will be changed or not before recomputing them:
IDX=`seq -w 0 11`
SIZE=`imprintf "%w %h" i00.png`

# 1. compute sift descriptors of each image
for i in $IDX; do
	test i$i.png -ot i$i.sift || sift i$i.png > i$i.sift
done

# 2. register each image to the reference one
# 2.1. match pairs
for i in $IDX; do
	test i$i.sift -ot p$i.txt || siftu pairR 0.8 i00.sift i$i.sift p$i.txt
done

# 2.2. find homography
for i in $IDX; do
	test p$i.txt -ot h$i.txt || ransac hom 1000 1 30 h$i.txt < p$i.txt
done

# 2.3. warp
for i in $IDX; do
	test h$i.txt -ot reg_$i.png || homwarp h$i.txt $SIZE i$i.png reg_$i.png
done

# 3. compute the median value at each position
test reg_0.png -ot out_med.png || vecov med reg_*.png -o out_med.png
Notice that this is just the original script, but with an explicit test before executing each line: if the input file is older than the output file, the line is not run. This simple modification turns the script into a much more useful one: re-running it only recomputes the files whose inputs have changed, and running it twice in a row does no work the second time.
Yet, it still has some problems. The steps are again run in series, with no parallelism. And the last test only looks at the date of reg_0.png, while the output really depends on all the reg_* files, not only the first one: if you delete the file reg_3.png and rerun, the script will recompute this file, but it will not re-build the output out_med.png.
There is, after all, a free lunch. If you rewrite your original script as a makefile:
# variables
SIZE := $(shell imprintf "%w %h" i00.png)
INPUTS := $(wildcard i*.png)
REGISTERED := $(INPUTS:i%.png=reg_%.png)

# default target
default: out_med.png

# 1. compute sift descriptors of each image
i%.sift: i%.png
	sift i$*.png > i$*.sift

# 2.1. match pairs
p%.txt: i00.sift i%.sift
	siftu pairR 0.8 i00.sift i$*.sift p$*.txt

# 2.2. find homography
h%.txt: p%.txt
	ransac hom 1000 1 30 h$*.txt < p$*.txt

# 2.3. warp
reg_%.png: i%.png h%.txt
	homwarp h$*.txt $(SIZE) i$*.png reg_$*.png

# 3. fusion
out_med.png: $(REGISTERED)
	vecov med $^ -o $@
Now you get, for free: parallel execution of independent steps (just run make -j), and re-computation of only the files that are out of date.
I say ``for free'' because this makefile has essentially the same length and complexity as the original script, and it is just as easy to write (once you are fluent in makefile language).
Notice that you can join several rules into the same target...
# variables
SIZE := $(shell imprintf "%w %h" i00.png)
INPUTS := $(shell ls i*.png)

# fusion of all registered images
out_med.png: $(addprefix reg_,$(INPUTS))
	vecov med $^ -o $@

# register each image to the first one
reg_%.png: i00.png i%.png
	sift i$*.png > i$*.sift
	siftu pairR 0.8 i00.sift i$*.sift p$*.txt
	ransac hom 1000 1 30 h$*.txt < p$*.txt
	homwarp h$*.txt $(SIZE) i$*.png reg_$*.png
...to obtain a very short makefile. Some people are really into this sort of thing. Personally, I prefer to give an explicit target for each intermediate file, but this is just a matter of taste. This is the kind of discussion to have among other native makefile speakers, sipping scotch next to the fireplace.
- The manual of make in your system: man make.
- The output of make -p -f /dev/null, which prints the set of implicit rules.