Five Easy Pieces #2 – Picking Things

February 21, 2012

Introduction

This is the second article in a series of five, in each of which I will present a simple problem and show how it can be solved in C++. The solutions will illustrate various aspects of C++ and its Standard Library in what is hopefully an easily digestible form, and each will build on the solution to the previous problem.

You will need some basic knowledge of C++ in order to follow these articles. If you want to learn from them, I suggest you equip yourself with a C++ compiler (instructions for how to install one for Windows are here) and try to write and compile code to solve each problem as (or before!) I present the solution. This article builds on the first one, so it would be a good idea to read that before proceeding.


Problem

Write a program called rpick which picks at random from the strings provided as command line parameters. For example:

$  rpick 1 2 3 4 5 6
4

$ rpick heads tails
tails

$ rpick red green blue
red

Hint: You will need to use the parameters of the main() function.


The First Bit

In order to solve this problem, you obviously need to read the values that the user specifies on the command line. To do this, you need to use the parameters of main(). The main() function is a rather strange beast in C++ – for example, the C++ Standard forbids you from calling it from your own code, so you can’t write recursive main()s. The Standard also says that main() must return an int, and that the C++ implementation must allow versions of main that look like this:

int main() { … }

or this:

int main( int argc, char * argv[] ) { … }

It’s the latter of these two that you want, as it’s the one that gives you access to the command line parameters. When you run a program, the operating system and the C++ runtime code will populate the argc and argv parameters for you. Suppose you entered:

$ rpick red green blue

There are four things on the command line, so the argc parameter will take on the value 4. Note that because the program’s name normally counts as the first thing on the command line, argc will be at least 1 on any mainstream system (strictly speaking, the Standard only requires that argc be non-negative).

The rather intimidating argv parameter is an array of pointers to char, which gets set up like this:

[Image: diagram of the argv array, with each element pointing to one of the command line strings]

Each pointer in the array will point to the beginning of a null-terminated string representing one of the command line parameters. The name of the program will be pointed to by argv[0], the first actual parameter by argv[1], and so on. The last element of the array, argv[argc], will contain a NULL pointer, indicating there are no more parameters.

To check that you understand the argc and argv parameters, it’s a good idea to write a little program to print them out:

#include <iostream>
using namespace std;

int main( int argc, char * argv[] ) {
  for( int i = 0; i < argc; i++ ) {
    cout << "argv[" << i << "] is " << argv[i] << "\n";
  }
}

When I ran this on my Windows laptop, I got the following output:

argv[0] is c:\users\neilb\home\temp\rpick.exe
argv[1] is red
argv[2] is green
argv[3] is blue

Note that argv[0] (the program’s name) is not exactly what I typed in on the command line – the operating system has decided to supply the full path of the executable to the C++ program – exactly what you get in argv[0] will be implementation dependent. Another thing to note is that even though main() is declared as returning an int, I didn’t bother to do so. This is not an error – the C++ Standard says that an explicit return may be omitted for main(), and the result is as if you had said “return 0;”. I’ll have more to say about the return value a bit later.

So, you have the parameters of main(), now all you need to do is pick one of them at random. To do this, you simply have to generate a number in the range 1 to (argc-1), and print out the corresponding entry in the argv array. Luckily, the code from the previous article can easily be adapted:

int pick = 1 + rand() % (argc-1);

However, when code is so obviously reusable, it’s a good idea to make it into a function. It’s also a good idea in C++ to deal with what are known as half-open ranges. A half-open range is (informally) one that includes its starting value, typically zero, and ends one before a specified terminating value. So the half-open range 0 – 6 comprises the numbers 0, 1, 2, 3, 4, and 5, but _not_ 6. Such ranges are handy in C++ because of the zero-based nature of C++ arrays and vectors – for an array of size 6, the half-open range 0 – 6 specifies all the valid indices of the array.

With the above in mind, let’s write a function that returns random numbers in the half-open range 0 to n – it’s pretty easy:

int Random( int n ) {
    return rand() % n;
}

Why bother? Well, although the use of the remainder operator is a very simple way of generating random numbers in a range, it may not be the best. The footnote to this article shows another function you could use, which may give a more even distribution. By making both implementations into functions, you can easily swap different implementations in and out without re-writing much code. With that in mind, let’s do the same for the seed-setting code too, so if we need a better seed-setting method, we can just slot that in:

// Seeds the generator; declared void because there is nothing to return.
void Randomise() {
    srand( time(0) );
}

With these two functions in place, writing the rpick program would seem to be simple:

#include <iostream>
#include <cstdlib>
#include <ctime>
using namespace std;

int Random( int n ) {
    return rand() % n;
}

// Seeds the generator; declared void because there is nothing to return.
void Randomise() {
    srand( time(0) );
}

int main( int argc, char * argv[] ) {
    Randomise();
    cout << argv[ 1 + Random( argc - 1 ) ] << endl;
}

This would seem to work quite nicely:

$ rpick 1 2 3 4 5 6
4

$ rpick 1 2 3 4 5 6
1

But what happens if you simply do:

$ rpick

Well, on Windows at least, you get this highly unwelcome dialog:

[Screenshot: the Windows dialog shown when rpick crashes]

What has happened here is that I didn’t provide any command line parameters except for the program name, and so the value of the argc parameter of main is 1. The program logic then calls the Random() function like this:

Random( argc - 1 );

which results in the execution of the statement:

return rand() % 0;

The remainder operator is really just a fancy divide operator, so what I’m doing here is attempting to divide by zero, and as you probably know, dividing by zero is a bad thing to do. In fact, in C++ it results in what is known as “undefined behaviour”, which means that what the program does from this point on is outside the remit of the C++ Standard. Despite what you may have heard, undefined behaviour is unlikely to make your PC explode, or reformat your disk (though neither is an impossible outcome), but it’s unlikely to be what you want to happen.

To avoid this sort of thing, whenever you write a program that uses command-line parameters, you need to check that the user has supplied the parameters your application needs, and if they haven’t, fail gracefully. A common approach is to provide a usage message that tells the user how to use the program:

#include <iostream>
#include <cstdlib>
#include <ctime>
using namespace std;

int Random( int n ) {
    return rand() % n;
}

// Seeds the generator; declared void because there is nothing to return.
void Randomise() {
    srand( time(0) );
}

int main( int argc, char * argv[] ) {
    if ( argc == 1 ) {
        cerr << "usage: rpick value1 [value2 ... valueN]\n";
        return EXIT_FAILURE;
    }
    Randomise();
    cout << argv[ 1 + Random( argc - 1 ) ] << endl;
}

There are a few wrinkles. The text of a usage message should indicate, often using square brackets, which command line parameters are optional. The message should be written to the standard error stream cerr, so that if the user is re-directing standard output, they will still see the error message. And lastly, you need to return a value from main indicating that there was a problem – the Standard Library provides the EXIT_FAILURE constant to do this.

With the above code in place, the problematic use of rpick now looks like this:

$ ./rpick
usage: rpick value1 [value2 ... valueN]

and the program is complete.

Conclusion

In this article I’ve provided an introduction to using the command-line parameters of the main() function. Hopefully, you will have learned:

  • How to access the parameters using argc and argv.
  • How to check them for validity and report an error.
  • The meaning of the phrase “half-open range”.

In the next article, I’ll look at stream input, and see how to pick things from input streams and files.

Footnote

The Random() function presented in the body of this article is simple and easy to understand, but unless the number of values rand() can produce is an exact multiple of n, taking the remainder makes some results slightly more likely than others. To avoid this, you can replace the simple implementation of Random() with this one, which is adapted from one presented in the excellent Accelerated C++, by Koenig and Moo:

int Random( int n ) {
    const int bucket_size = RAND_MAX / n;
    int r;
    do {
        r = rand() / bucket_size;
    } while( r >= n );
    return r;
}