Writing a Real C++ Program – Part 1

July 28, 2011

Introduction

This is the first in a series of tutorials that shows you how to go about writing real code in C++. It’s aimed at programmers who have learned the basics of C++ programming, but who have not yet done much in the way of multi-file programming, have not used C++ features such as exception handling, object-orientation, and the Standard Library, and who may not have much design experience. Somewhat more experienced programmers may find it interesting too.

In order to follow these tutorials I assume the following:

  • A basic knowledge of C++.
  • The ability to use a command line, a text editor and other common tools on your operating system of choice, which can be either Windows or Linux. The tutorials should also work on Macs, but I have no way of testing this, and little experience of using the Mac OS.
  • An installed C++ compiler, specifically the GCC C++ compiler. If you are on Windows, see the instructions here. If you are on Linux, you probably have all you need already.

A number of issues will not be covered in this series, despite the fact that they are absolutely a part of writing a real C++ program. The main reasons for excluding them are simply space and time, but I do have a few others. The major things that will be excluded are:

  • Version Control. Covering this to a useful depth would have at least doubled the size of the tutorial. However, I strongly advise you to install a VCS, and learn to use it before proceeding.
  • Coding Style. Specifically, coding standards for things like naming and indentation will not be addressed. This is because there is simply no agreement on these things, important though they are. I do however stick to a single style throughout the tutorials, which I suggest you emulate.
  • Comments. I make no recommendations about how to comment your code, and most of the code presented is uncommented. This is mostly for reasons of presentation and space – your real code should definitely be well commented, and I intend to provide a fully commented final version of the project.

 

The Task

You arrive in your office at a publisher of technical journals one morning to receive an email from your boss:

We are getting lots of submissions in plain ASCII text from non-native English speakers these days, and though the content is great, the spelling is sometimes terrible, and we’d like some way of automatically checking the quality before the stuff goes through to our editors. We’d like to pass each submission through a program that will tell us how many errors there are, and print them out in context. We’d also like to be able to have multiple special dictionaries for each journal that contain words that are commonly used technically in those journals, but which might not appear in a standard English dictionary.

After a bit of to and fro with your boss, you agree to produce a console program, whose command line looks something like this:

scheck [-n] [[-d special.dict] …] [file …]

where the -n option only prints the number of errors, and the -d option specifies a special dictionary. There can be more than one -d option. The checker checks the listed files (if none are specified it reads standard input) and produces output of the form:

article1.txt,2 errors
article1.txt,7,thre,perhaps thre is a reaction
article1.txt,12,phiton,low-energy phiton is emitted

Your boss also indicated that he might need XML output later on, but you should get the CSV format above working first. He also said there were a lot of submissions, and that checking a document should not take more than two or three times as long as listing it on the screen would do.

How To Get Started

In the real world, the first thing you should do is search the Web for an existing solution (I’m sure there is one) but, as we are in tutorial-land, I’m going to assume you did such a search and found nothing.

The next thing you should do is a modicum of design – you should definitely not start writing code yet. The approach I suggest for a problem like this is to take the boss’s emails, print them out (there is nothing like paper for doing design work!) and go through them with a marker, marking the interesting looking bits, like "dictionary" and "submission". Then bring these interesting terms together in diagram form.

What you want to think about when doing this is possible "knows about" or "uses" relationships between the "interesting bits". For example, does it seem likely that a "submission" from an external author knows about or uses the spell-checking dictionary? No, I don’t think so. Is it possible that the spell-checker uses the dictionary? Definitely. You can then construct a simple block diagram, freehand, on a single sheet of paper (don’t use a drawing tool) that shows these relationships (the arrows indicate who knows about what), adding extra information you think will be useful – here I’ve added the fact that there are N technical dictionaries, but (probably) only one English dictionary:

[Figure: freehand block diagram of the design]

For a project of this size, a diagram like this is almost all you are going to need, but it’s important that it exists. It’s something you can show your boss and it should give you and him some confidence that you have an idea of where you are going. It also allows you to identify the components of the system for the purposes of project planning and prioritisation. If you can’t produce such a diagram, stop. Don’t try to go on to write code. Instead, go back and ask more questions. If you can draw such a diagram, pin it up on your cubicle wall and astound your co-workers, most of whom will never have seen such a thing!

Important: Many new programmers when faced with a task go into a sort of fugue state where they think of one way of doing things, then another, then another, but never actually make anything. Don’t get into this state! Always produce an output (in this case the diagram) and when you have done so, go with that output for a bit. There is no need for it to be absolutely correct – it’s just a few marks on a bit of paper that you can crumple up and discard whenever you like. The chances of you producing some wonderful, correct design at the beginning of the development process are approximately zero, so don’t worry if you foresee a few problems with what you have produced at this point.

The next thing you should do is, given the design diagram, assign some level of complexity to each of the elements in the diagram (please notice I have not called them classes), and arrange them in order of complexity, together with any issues that may have already occurred to you. I tend to use a spreadsheet for this, but you could equally well do it on paper.

Element | Complexity | Issues
Dictionary | High | Where to get content? How to search quickly? How to load quickly? Different kinds of dictionary.
Checker | Moderate | How to read submissions?
Reports | Moderate | Formatting. Need to support XML later.
Command Line | Low | Maybe just use argc and argv.

You might well come up with different values, but I think it’s obvious that the dictionaries are the key to the whole thing, and probably the most difficult part. Keep this document updated as you go, adding new issues as they occur to you. You’ll use these values to prioritise which tasks to attack first – this is a simple project plan.

You should now try to forecast how long each bit is going to take, in days. If you are new to programming, this is the really difficult bit, but it’s well worth doing, so long as you don’t give your forecasts out to anyone! Just write them on a bit of paper and put them in a drawer – it will be interesting to see just how far out you were when the project is finished.

One thing you should not, now or ever, do when designing a real C++ program is to produce a hierarchy of C++ classes. The use of inheritance in C++ and other OO languages is an implementation tool (and not a great one at that) and not something that should be used when designing an application.

Can I Write Some Code Now?

Not yet! Before writing any code it is a very good idea to put some time into setting up the infrastructure for the project. At the very least, you should set up the project’s initial directory structure. It’s a lot easier to get this more or less right when initiating a project than it is to change it later, particularly for some version control systems that don’t really like project structures being changed too much.

I suggest you create the following top-level directory structure for this project:

scheck/
    bin/
    data/
    doc/
    inc/
    src/
    tests/

Where scheck is the project’s root, and the purposes of the subdirectories are as follows:

  • bin – where the scheck executables will be stored
  • data – place to put supporting data, like the dictionaries
  • doc – yes, you are going to have to write documentation!
  • inc – C++ header files go here
  • src – C++ source files (those with a .cpp extension) go here
  • tests – place to put tests we can run to make sure our app actually works

You’ll see you need further subdirectories later, but this is a good start.

Writing Scheck Version 0.1

You are now ready to write the first version of your software. Here’s what the entire source code for it looks like:

#include <iostream>
using namespace std;

int main() {
    cout << "scheck version 0.1" << endl;
}

Using a text editor, create this code and save it in the src subdirectory as … what?

One of the key skills of a programmer, as of a wizard in Ursula Le Guin’s Earthsea books, is to be good at naming things. Unfortunately, most people are quite bad at this (which is why wizards are special), and it’s worth putting a bit of thought into what to call even the simplest file, class or variable. As I see it, there are three alternatives:

  • scheck.cpp – this is after all the main file of the application.
  • scheck_main.cpp – are we going to prefix all file names with the name of the project?
  • main.cpp – has the virtue of simplicity, at least.

Cases can be made for all three, but for me simplicity wins out. There is also the issue that your boss will almost certainly insist on changing the name of the project to SubmissionMaster or some such thing, so encoding the project name in the project’s file names is not such a great idea. So we will save the file as main.cpp.

Some other source file naming guidelines: On no account use mixed case in file names; this can only lead to portability problems if you find (as you will) you need to move your code from case-respecting systems like Linux to or from case-ignoring systems like Windows. And always use .cpp as the extension for C++ source files; do not be tempted by the likes of .cxx, .cc, .C etc.

Time To Compile!

It’s now time to compile your first version of scheck. In later tutorials, you’ll see how to automate this using make, but for now open a command-line window in the scheck root directory, and enter this command (the $ sign is your command line prompt; don’t type it):

$ g++ src/main.cpp -o bin/scheck

This will compile the main.cpp file, link with the GCC C++ libraries, and place the executable (called scheck or scheck.exe, depending on your OS) in the bin directory. You can now test it:

$ bin/scheck
scheck version 0.1

If that worked, give yourself a pat on the back – you have overcome what Brian Kernighan describes as the biggest hurdle in programming – getting a simple program to compile and link.

Conclusion

That is it for this first tutorial. It may not seem that much has been covered here, but a lot has actually been achieved:

  • You have (I hope) a clear idea of the problem.
  • You have produced design documents and a rough project plan.
  • You have the basis for a sound project structure.
  • You actually have a shippable product (the executable) – admittedly, it is not exactly feature-rich!

If I could get all this done in a single day, I would be well pleased with myself. Even if it took a couple of days, which getting the problem specified correctly might well do, I’d still be feeling pretty happy.

Coming Next: A first attempt at the dictionary.

Sources for this and all other tutorials in the series are available here.
