Writing a Real C++ Program – Part 6

August 12, 2011

This is the sixth instalment in a series of C++ programming tutorials that started here.

Introduction

Up until now, you have been building the scheck application using a command line that looks like this:

$ g++ -I inc src/main.cpp src/parser.cpp -o bin/scheck

All of the .cpp source files that form the project need to be on the command line, and as you add more source files, this gets beyond tiresome. In this instalment, I want to look at how you can automate the build process, but first I want to look at why it is necessary at all.

Why Multiple .cpp Files?

A question that occurs to most programmers when they start using C++ is why all these files are needed at all. Why not put all the code in one big file (or #include all the code into one big file, which effectively does the same thing)?

Well, you can in fact do that! But there are a number of reasons why this is not a good idea, and they mostly have to do with the workflow involved in writing and building an application in C++:

  • If you have one big file, then only one person can effectively work on it at once. Although modern version control systems support merging of changes, this is not something you want to have to do for every single minor change you make.
  • Compilation times for one big file may be prohibitive. You do not want to correct a spelling mistake in an obscure error message and then find you have to wait several hours for your code to recompile. Even compilation times of several minutes can be enough to put you off making changes to the code.
  • For truly large programs, you may find that your compiler simply cannot handle the size of the source that you are throwing at it, or handles it very slowly once it runs out of real memory.

For these and other reasons, all real C++ programs are split up into a number of source files. The downside of this is that the compilation process becomes considerably more complicated. As a refresher, here’s an outline of what goes on during compilation and linking:

[Figure: outline of the compile and link process]

The compiler (with some help from the preprocessor) takes each .cpp file and its associated header files and compiles them into object files, which will have the extension .o or .obj, depending on your platform. Object files contain machine code (code that your computer’s CPU can actually execute), but not in a directly executable form. Specifically, the addresses of functions and global variables are not known at this stage, because they depend on the content of other .cpp files, and the compiler only processes one .cpp at a time.

When all the .cpp files have been compiled, the linker is invoked. This looks at all of the object files that the compiler produced, plus any libraries your code may be linking with (you almost always link with some, although these may not be mentioned explicitly on the compiler command line). It fixes up all the unknown addresses in the object files and writes the fixed-up code out to a single executable.
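To make the "unknown addresses" idea concrete, here is a small sketch. The function names (count_words and words_in_greeting) are invented for illustration and are not part of scheck, and both halves are shown in one listing only so that it compiles on its own. In a real project the declaration would come from a header and the definition would live in a different .cpp file.

```cpp
#include <sstream>
#include <string>

// --- What the calling file sees (normally via a header) ---
// A declaration only: the compiler learns the function's name and type,
// but nothing about where its code will end up.
int count_words(const std::string& text);

// The compiler can compile this call knowing only the declaration above.
// If count_words were defined in another .cpp file, the object file would
// record the call with the address left open, and the linker would fill it in.
int words_in_greeting() {
    return count_words("hello brave new world");
}

// --- What would normally live in a separate .cpp file ---
// Counts whitespace-separated words.
int count_words(const std::string& text) {
    std::istringstream in(text);
    std::string word;
    int count = 0;
    while (in >> word) {
        ++count;
    }
    return count;
}
```

If you did split this across two files and then tried to link only the first one, the linker would complain about an undefined reference to count_words, which is exactly the fix-up it could not perform.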

Most of this is hidden from you by default – when you invoke the command:

$ g++ -I inc src/main.cpp src/parser.cpp -o bin/scheck

the g++ program, which is in fact not the compiler but a driver program (think of it as a really complicated shell script), calls the real compiler and the linker for you to create the final output executable. Unfortunately, in order to take advantage of the ability to use multiple source files, you need to dip beneath the surface and see what exactly is going on.

Separate Compilation

The current state of play for your source files should be something like this:

inc
   dictionary.h
   error.h
   parser.h
src
   main.cpp
   parser.cpp

To compile these separately, use the following two commands from the scheck root directory:

$ g++ -I inc -c src/main.cpp
$ g++ -I inc -c src/parser.cpp

The -c option tells the g++ driver to compile the source files to object files, but not to link them into an executable. If you look in the scheck root, you should see that two new files, main.o and parser.o, have appeared. You can now tell the driver to use the linker to link those two files to produce the final executable:

$ g++ main.o parser.o -o bin/scheck

Note that we didn’t need to specify the include directory, because the linker doesn’t know or care about header files.

This hardly seems much (if any) of an improvement, but consider – if you make a change to the parser, you only need to recompile the parser and then link; you don’t need to recompile main. This is not such a big deal when you only have two source files, but imagine that you have two hundred!

Dependencies

Unfortunately, a change to parser.cpp is not the only thing that forces you to recompile the parser. You also need to recompile it if any of the header files it includes changes. It’s easy to see why. Suppose you have a header that contains a declaration of this single function:

void f( int x = 42 );

If you ever change that default value from 42 to something else, then every file that includes that header needs to be recompiled so that it picks up the new default value. This is because the compiler substitutes the default value at each call site when it compiles the calling file. We say that all the files that need to be recompiled are dependent on this header file.
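Here is a minimal sketch of why the caller has to be recompiled. It is a variation on the declaration above, with two assumptions of mine: f returns its argument so that the effect is visible, and the helper call_with_default is an invented name. Everything is in one listing so that it compiles on its own.

```cpp
// Hypothetical header contents: the default value appears only here,
// in the declaration that callers see.
int f(int x = 42);

// A caller that relies on the default. The compiler rewrites f() as f(42)
// while compiling this code, so the value 42 ends up in the caller's
// object file.
int call_with_default() {
    return f();
}

// The definition, which would normally be in a different .cpp file.
// It never sees the default: it just receives whatever value was passed.
int f(int x) {
    return x;
}
```

If the header changed to a different default but the caller's object file was not rebuilt, the old object file would carry on passing 42.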

With that in mind, you can draw a dependency diagram for the current state of the scheck project:

[Figure: dependency diagram for the scheck project]

Here, arrows mean “depends-on”, so parser.cpp depends on error.h and parser.h, but not on dictionary.h, as the parser doesn’t use or know about dictionaries. One of the aims of good programming practice is to keep the number of dependencies to an absolute minimum – you should never include a header file because you think you “might need it later.”

Note that you should not normally consider dependencies on system library headers, such as <iostream> or <string>, as these are assumed not to change. If they do change, as when you upgrade your compiler, a complete rebuild will be needed in any case.
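The dependency diagram can also be written down as data. The sketch below records which headers each source file includes, and answers the question "which sources must be recompiled if this header changes?". The type and function names are my own, and the parser.cpp entry follows the description above, while the main.cpp entry is my reading of the project at this stage.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Source file -> the project headers it depends on.
using DependencyMap = std::map<std::string, std::set<std::string>>;

// The scheck project as described in this instalment (an assumption
// about main.cpp; system headers are deliberately left out).
DependencyMap scheck_dependencies() {
    DependencyMap deps;
    deps["src/main.cpp"] = {"inc/parser.h", "inc/error.h", "inc/dictionary.h"};
    deps["src/parser.cpp"] = {"inc/parser.h", "inc/error.h"};
    return deps;
}

// Returns the source files that depend on the given header,
// in alphabetical order.
std::vector<std::string> sources_affected_by(const DependencyMap& deps,
                                             const std::string& header) {
    std::vector<std::string> affected;
    for (const auto& entry : deps) {
        if (entry.second.count(header) > 0) {
            affected.push_back(entry.first);
        }
    }
    return affected;
}
```

With this data, a change to inc/dictionary.h affects only src/main.cpp, while a change to inc/error.h affects both source files.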

The make Utility

All these dependencies mean that it is easy to forget exactly what needs to be recompiled when something changes. This is a task for computer automation, and the premier tool in this area is the make utility. Basically, you give make a list of dependencies, and from then on it knows what needs to be recompiled, based on the timestamps of the files involved.
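The timestamp test can be sketched in a few lines. This is a simplified model and not how make is actually implemented: the "file system" here is just a map from file name to a modification time, and the names Timestamps and needs_rebuild are invented for the example.

```cpp
#include <map>
#include <string>
#include <vector>

// A stand-in for the file system: file name -> last modification time.
// Any number that grows over time will do for the illustration.
using Timestamps = std::map<std::string, long>;

// A target needs rebuilding if it does not exist, or if any of its
// dependencies was modified more recently than the target itself.
// Dependencies missing from the map are simply ignored in this sketch.
bool needs_rebuild(const Timestamps& files,
                   const std::string& target,
                   const std::vector<std::string>& dependencies) {
    auto target_entry = files.find(target);
    if (target_entry == files.end()) {
        return true;  // the target has never been built
    }
    for (const auto& dependency : dependencies) {
        auto dep_entry = files.find(dependency);
        if (dep_entry != files.end() &&
            dep_entry->second > target_entry->second) {
            return true;  // a dependency is newer than the target
        }
    }
    return false;
}
```

So an object file built at time 10 from sources last edited at time 5 is up to date, but editing one of its headers at time 20 makes it stale.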

There are lots of different varieties of make. The remainder of this instalment will describe the use of GNU make – if you are using Linux or if you followed these instructions to install the GCC compiler, or if you have the GCC compiler installed at all, you probably already have it. Note that this is by no means a full tutorial on GNU make – it simply describes how to use it in order to build the scheck project.

The dependency information that make needs in order to build your project is stored in a makefile. This is simply a text file, with the name (perhaps not surprisingly) of “makefile”. A makefile consists of a number of rules in this format:

target : dependency-list
    command1
    command2
    …

A target is something that we want to build – for example an object file or an executable. The dependency list contains the list of things that the target depends on – these will be the names of source files and of other targets. The commands (mostly there is only one) are what make runs to build the target.

Let’s consider the final executable that your code produces first. This is called bin/scheck.exe (if on Windows) or simply bin/scheck (if on Linux). It is built by linking together the two object files you created earlier, main.o and parser.o, and is directly dependent only on them. So you can write a simple makefile (create it with a text editor and save it in the scheck root directory under the name makefile) which looks like this (the executable file name being OS dependent):

bin/scheck.exe : main.o parser.o
    g++ main.o parser.o -o bin/scheck.exe

Once you have saved it, from a command prompt in the scheck root directory, type:

$ make

The make utility looks for an input file called makefile by default. Unfortunately, what you will probably now see is an error message:

makefile:2: *** missing separator.  Stop.

This is because, for reasons lost in the depths of time, make requires that all the commands for a rule be indented with a single TAB character – spaces will not do! Go back and edit the makefile, changing the spaces in front of the g++ command to a TAB. This is an extremely common problem when writing makefiles, so bear the experience in mind!
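Because spaces and TABs look identical in most editors, it can help to check for the problem mechanically. The function below is a rough sketch of such a check (the name space_indented_lines is made up): it reports every line that begins with a space. Treat it as a heuristic only, since a line starting with a space is not always an error in a makefile.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Returns the 1-based numbers of the lines in the given makefile text
// that begin with a space character. Lines beginning with a TAB, and
// lines with no indentation, are not reported.
std::vector<int> space_indented_lines(const std::string& makefile_text) {
    std::vector<int> result;
    std::istringstream in(makefile_text);
    std::string line;
    int line_number = 0;
    while (std::getline(in, line)) {
        ++line_number;
        if (!line.empty() && line[0] == ' ') {
            result.push_back(line_number);
        }
    }
    return result;
}
```

For the two-line makefile above, with the command indented by spaces, this reports line 2, which is the line number that appears in the error message from make.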

If you get the tabbing right, you will get this message:

make: `bin/scheck.exe' is up to date.

This is because you have already built an executable from these two object files, and make sees that there is nothing more for it to do. Try deleting the executable (not the object files) and run make again. You should get this output:

g++ main.o parser.o -o bin/scheck.exe

Great – make has executed the command you told it to, because it saw that the executable wasn’t there and so needed to be built. By default, make echoes each command as it executes it. Now delete the parser.o object file and make again – you will get this output:

make: *** No rule to make target `parser.o', needed by `bin/scheck.exe'.  Stop.

This, or something like it, is probably the most common error message you will get from make. It’s saying that it knows it needs parser.o in order to build bin/scheck.exe, but it doesn’t know how to build parser.o – because you have not yet told it. Let’s fix that. Edit the makefile so it looks like this:

bin/scheck.exe : main.o parser.o
	g++ main.o parser.o -o bin/scheck.exe

parser.o : src/parser.cpp inc/parser.h inc/error.h 
	g++ -I inc -c src/parser.cpp

Some things to note here. First, the rule for how to build parser.o goes after the one to build the executable. The make utility is not sensitive to target order, except that the first rule is the default rule and should normally be the one used to build the final product. Secondly, you need to provide the pathnames of the various dependency files – make knows nothing about include paths or about your project structure. Thirdly, you don’t just need the names of the header files – an object file is also dependent on its .cpp file. Lastly, the command used to build the object file is exactly the same as the one you used earlier to build the file manually.

With this new rule in place, try running make again – you should get this output:

g++ -I inc -c src/parser.cpp
g++ main.o parser.o -o bin/scheck.exe

What make has done here is look at the default target and see that it needs to build parser.o. It has then looked for a rule that has parser.o as a target, and used that rule’s command to build it. It has then gone back to the default rule and used its command to build the executable. It’s important to realise that make does not execute makefile rules sequentially – what happens is much more like function calls, which may nest other function calls.
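The function-call analogy can be taken quite literally. The sketch below is a toy model, not GNU make's real algorithm: it ignores timestamps and treats a target as stale if it is missing or if any of its dependencies had to be rebuilt. The names Rule, Makefile, build and scheck_makefile are invented for the example, and the rule for main.o assumes the dependencies that main.cpp has at this stage of the project.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// One makefile rule: what the target depends on, and how to build it.
struct Rule {
    std::vector<std::string> dependencies;
    std::string command;
};

// Target name -> its rule.
using Makefile = std::map<std::string, Rule>;

// "Builds" a target by appending the commands that would be run to log.
// Returns true if the target had to be rebuilt.
bool build(const Makefile& rules, const std::set<std::string>& existing_files,
           const std::string& target, std::vector<std::string>& log) {
    auto rule = rules.find(target);
    if (rule == rules.end()) {
        // No rule: a plain source file. Real make would stop with an
        // error if such a file did not exist; this sketch does not check.
        return false;
    }
    bool stale = existing_files.count(target) == 0;
    for (const auto& dependency : rule->second.dependencies) {
        // The nested "function call": deal with each dependency first.
        if (build(rules, existing_files, dependency, log)) {
            stale = true;
        }
    }
    if (stale) {
        log.push_back(rule->second.command);
    }
    return stale;
}

// The scheck rules used in this instalment, written as data.
Makefile scheck_makefile() {
    Makefile rules;
    rules["bin/scheck.exe"] = Rule{{"main.o", "parser.o"},
                                   "g++ main.o parser.o -o bin/scheck.exe"};
    rules["parser.o"] = Rule{{"src/parser.cpp", "inc/parser.h", "inc/error.h"},
                             "g++ -I inc -c src/parser.cpp"};
    rules["main.o"] = Rule{{"src/main.cpp", "inc/parser.h", "inc/error.h",
                            "inc/dictionary.h"},
                           "g++ -I inc -c src/main.cpp"};
    return rules;
}
```

Calling build for bin/scheck.exe, with bin/scheck.exe and main.o present but parser.o deleted, logs the compile command for parser.cpp followed by the link command. The order of the rules in the map plays no part in that result; only the chain of dependencies does.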

You also need a rule to build main.o, so edit the makefile again:

bin/scheck.exe : main.o parser.o
	g++ main.o parser.o -o bin/scheck.exe

parser.o : src/parser.cpp inc/parser.h inc/error.h 
	g++ -I inc -c src/parser.cpp

main.o : src/main.cpp inc/parser.h inc/error.h inc/dictionary.h
	g++ -I inc -c src/main.cpp

Run make to ensure that the executable is either built or up-to-date, and then try the following: edit the file dictionary.h to add a comment (it doesn’t matter what the comment is) and save it. Now run make again – you should get this output:

g++ -I inc -c src/main.cpp
g++ main.o parser.o -o bin/scheck.exe

You can see that make has recompiled main.cpp (because main.o is dependent on dictionary.h, which has just changed) but has not recompiled parser.cpp (because the parser knows nothing about the dictionary and none of the parser’s dependencies have changed). This is exactly what is wanted.

From now on, as you add files to the scheck project, you should also add relevant entries in the makefile. And instead of building using the g++ command you have used up until now, you should build simply by typing make.

Conclusion

That wraps it up for this instalment. I will be returning to make later to show how you can write a makefile that handles dependencies automatically. For now, you might like to try extending this makefile to place the object files in the bin directory, instead of having them cluttering up the root. Hint: you will need to use the -o compiler option to specify where you want them to go – you might also want to look at the (rather good) GNU make documentation.

Coming Next: Reporting, inheritance, pointers!

Sources for this and all other tutorials in the series are available here.
