Writing a Real C++ Program – Part 9

August 26, 2011

This is the ninth instalment in a series of C++ programming tutorials that started here.

Introduction

This instalment was going to be about testing. However, I discovered that I couldn’t write the stuff I wanted to without having a working command line interface for scheck. So I’m afraid testing has been pushed back yet again, and this instalment will focus on providing a command line for scheck.

Designing The Command Line

Command line option design is something of a black art, and is rarely discussed, much less taught. Your best bet is to make some upfront decisions:

Should the command option indicator be a slash (as in /a) or a dash (as in -a)? Or either? Scheck will only support dashes.
Should we support long names as well as short options (i.e. both -v and --version)? Scheck will support option names of any length, but only one option name per option.
Do we need to support a list of input files? Yes, scheck will take a list of submissions to check, which come after all the options.
Do we need to support input from standard input? Yes, if no files are specified, scheck will read a submission from standard input.

Having made these general decisions, I usually write out some sample command lines to see how they look. Remember, we need to be able to specify a dictionary file (I’m going to ignore multiple dictionaries for now), specify report options (CSV or XML for now) and provide an optional list of files.

scheck                       Read submission from standard input, use default dictionary and output in CSV format.
scheck -d mydict.dat         As above, but use mydict.dat as the dictionary.
scheck f1.txt f2.txt         Read the two files f1.txt and f2.txt and check against the default dictionary, output as CSV.
scheck -xml f1.txt f2.txt    As above, but report in XML format.

These look pretty good – but how to implement them?

The Parameters Of main()

Most C++ programmers know that main has two alternative declarations:

int main( void );
int main( int argc, char * argv[] );

Your specific C++ implementation may provide other possible declarations, but these are the only ones specified by the C++ Standard. The parameter values are provided to the program by the operating system. The argc parameter is the number of parameters, including the program name (and so is at least one in any normal hosted environment, although the Standard itself permits zero) and the argv parameter is an array containing pointers to the parameter values. So, for the command line:

scheck -d mydict.dat

The value of argc will be 3 and the argv array will look like this:

[Figure: the argv array. argv[0] points to "scheck", argv[1] to "-d" and argv[2] to "mydict.dat".]

It is certainly possible to work directly with the argc and argv parameters, but doing so quickly becomes rather messy. Because of this, a number of libraries for dealing with the command line have been written, perhaps most notably the getopt function. If that is available, you should certainly consider using it, but for this tutorial I am going to show you how to implement a simple class to deal with argc and argv.
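To see why working on argc and argv directly gets messy, here is a rough sketch of hand-written parsing for just the two scheck options discussed so far. The RawOptions struct, the ParseRaw function and the default dictionary name are made up for illustration and are not part of scheck.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical holder for the parsed values; not part of scheck.
struct RawOptions {
  bool xml;
  std::string dict;
  std::vector<std::string> files;
  RawOptions() : xml( false ), dict( "dictionary.dat" ) {}  // assumed default name
};

// Walks argv by hand. Returns false if the command line is malformed.
bool ParseRaw( int argc, const char * const argv[], RawOptions & out ) {
  assert( argc == 0 || argv != 0 );
  int i = 1;                            // argv[0] is the program name
  for ( ; i < argc; ++i ) {
    std::string arg = argv[i];
    if ( arg == "-xml" ) {
      out.xml = true;
    }
    else if ( arg == "-d" ) {
      if ( i + 1 >= argc ) {
        return false;                   // -d must be followed by a value
      }
      out.dict = argv[++i];             // consume the value as well
    }
    else if ( ! arg.empty() && arg[0] == '-' ) {
      return false;                     // unknown option
    }
    else {
      break;                            // first non-option starts the file list
    }
  }
  for ( ; i < argc; ++i ) {
    out.files.push_back( argv[i] );     // everything left is a submission file
  }
  return true;
}
```

Even with only two options, the index bookkeeping (skipping the value after -d, spotting where the file list starts) is already tangled up with the decisions about what each option means. Every new option adds another branch to the same loop.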

The CmdLine Class

The basic idea behind the CmdLine class is that the program creates a CmdLine object from the parameters of main. It then calls an ExtractOpt function on the object to extract a specific option and (possibly) value from the command line. Extraction is destructive. So if the command line was:

scheck -xml -d mydict.dat

then after calling ExtractOpt( "-xml" ) the command line would contain:

scheck -d mydict.dat

The program simply keeps calling ExtractOpt until there are no more options to process.

The declaration for CmdLine looks like this:

class CmdLine {
  public:
    CmdLine( int argc, char *argv[] );
    bool HasOpt( const std::string & opt ) const;
    bool ExtractOpt( const std::string & opt );
    bool ExtractOpt( const std::string & opt, std::string & val );
    bool MoreOpts() const;
    int Argc() const;
    std::string Argv( unsigned int i ) const; 
  private:
    typedef std::vector <std::string> ArgVec;  
    typedef ArgVec::iterator Iter;
    Iter FindOpt( const std::string & opt );
    ArgVec mArgs;
};

The ExtractOpt functions allow for options with zero or one associated parameter, like the dictionary name, and return true if the extraction succeeds. The MoreOpts function returns true if there are any options left to process, and the Argc and Argv functions return the current command line size and values, analogous to argc and argv.

Note that in the implementation, I have typedefs for the type of the vector and its iterator. This is pretty common practice in C++, as most people will prefer to read and type Iter rather than std::vector<std::string>::iterator.

I don’t propose to show the full C++ implementation of this class – you might like to try implementing it yourself from this description, or you can simply download the code for this instalment via the link at the bottom of the page. Here are the important bits – the constructor:

CmdLine :: CmdLine( int argc, char *argv[] ) {
  for ( int i = 0; i < argc; i++ ) {
    mArgs.push_back( argv[i] );
  }
}

This simply populates the mArgs vector with all of the original command line parameters.

The FindOpt function finds a specific option on the command line and returns an iterator referencing it:

CmdLine::Iter CmdLine :: FindOpt( const string & opt ) {
  for ( int i = 1; i < Argc(); i++ ) {
    if ( opt == Argv( i  ) ) {
      return mArgs.begin() + i;
    }
  }
  return mArgs.end();
}

Notice that the function starts looking at position 1, because position 0 is the program name, which cannot be an option.

Lastly, the ExtractOpt function finds an option and removes it from the command line, returning true if it succeeds:

bool CmdLine :: ExtractOpt( const string & opt, string & val ) {
  Iter pos = FindOpt( opt );
  if ( pos != mArgs.end() && pos != mArgs.end() - 1  ) {
    val = *(pos+ 1);
    mArgs.erase( pos, pos + 2 );
    return true;
  }
  else {
    return false;
  }
}

This is the version that also extracts the value after the option, as in -d mydict.dat. There is an overloaded version that takes a single parameter and is used for extracting things like -xml.
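If you do want to try implementing the class yourself, the following is one possible complete version to compare against. It is a sketch, not the downloadable code. It uses an index-based Find helper instead of the iterator-returning FindOpt so that HasOpt can be const, and it assumes that MoreOpts treats any remaining argument beginning with a dash as an unprocessed option.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

class CmdLine {
  public:
    CmdLine( int argc, char *argv[] ) {
      for ( int i = 0; i < argc; i++ ) {
        mArgs.push_back( argv[i] );
      }
    }
    // True if opt appears anywhere after the program name.
    bool HasOpt( const std::string & opt ) const {
      return Find( opt ) != mArgs.size();
    }
    // Removes a flag such as -xml; returns true if it was present.
    bool ExtractOpt( const std::string & opt ) {
      std::size_t pos = Find( opt );
      if ( pos == mArgs.size() ) {
        return false;
      }
      mArgs.erase( mArgs.begin() + pos );
      return true;
    }
    // Removes an option and the value after it, such as -d mydict.dat.
    bool ExtractOpt( const std::string & opt, std::string & val ) {
      std::size_t pos = Find( opt );
      if ( pos + 1 >= mArgs.size() ) {  // not found, or no value follows it
        return false;
      }
      val = mArgs[pos + 1];
      mArgs.erase( mArgs.begin() + pos, mArgs.begin() + pos + 2 );
      return true;
    }
    // Assumption: an "option" is any remaining argument starting with '-'.
    bool MoreOpts() const {
      for ( std::size_t i = 1; i < mArgs.size(); i++ ) {
        if ( ! mArgs[i].empty() && mArgs[i][0] == '-' ) {
          return true;
        }
      }
      return false;
    }
    int Argc() const {
      return static_cast<int>( mArgs.size() );
    }
    std::string Argv( unsigned int i ) const {
      assert( i < mArgs.size() );
      return mArgs[i];
    }
  private:
    // Index of opt, or mArgs.size() if absent; position 0 is skipped.
    std::size_t Find( const std::string & opt ) const {
      for ( std::size_t i = 1; i < mArgs.size(); i++ ) {
        if ( mArgs[i] == opt ) {
          return i;
        }
      }
      return mArgs.size();
    }
    std::vector<std::string> mArgs;
};
```

With this version, extracting -xml and then -d from the command line scheck -xml -d mydict.dat f1.txt leaves only the program name and f1.txt, so Argc() returns 2 and MoreOpts() returns false.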

Here’s a small example of how you might use the class:

#include "cmdline.h"

int main( int argc, char *argv[] ) {
  CmdLine cl( argc, argv );
  bool xmlformat = cl.ExtractOpt( "-xml" );
  string dictname;
  if ( ! cl.ExtractOpt( "-d", dictname ) ) {
    dictname = "default.dat";
  }
  ...
}

However, if you look at the current state of the main function, you may well think that adding more code is finally going to push it over the brink into complete unreadability, and you would be right.  Some refactoring looks like it will be in order.

Refactoring main()

When faced with specific command line options for a particular application, it is often a good idea to create a Settings class which performs the option extraction and applies any semantic rules that may be necessary – for example, in scheck it does not make sense to specify that you want both XML and CSV output. A Settings class declaration for scheck looks like this:

class Settings {
  public:
    enum Report { rtCSV, rtXML };
    Settings( CmdLine & cl );
    Report ReportType() const; 
    std::string DictName() const;
  private:
    Report mRepType;
    std::string mDictName;
};

The Settings constructor handles all of the extraction, and sets the various settings variables:

const char * const DEF_DICT = "dictionary.dat";
const char * const DICT_OPT = "-d";
const char * const CSV_OPT =  "-csv";
const char * const XML_OPT =  "-xml";

Settings :: Settings( CmdLine & cl ) : mRepType( rtCSV ), mDictName( DEF_DICT )  {
  if ( cl.HasOpt( CSV_OPT ) && cl.HasOpt( XML_OPT ) ) {
    throw ScheckError( "Only one output type can be specified" );
  }
  if ( cl.ExtractOpt( CSV_OPT ) ) {
    mRepType = rtCSV;
  }
  if ( cl.ExtractOpt( XML_OPT ) ) {
    mRepType = rtXML;
  }
  cl.ExtractOpt( DICT_OPT, mDictName ) ;
  if ( cl.MoreOpts() ) {
     throw ScheckError( "Invalid command line" );
  }
}

 

Notice that the various options and default values are declared as consts. This is because you don’t want them changed accidentally, and also to limit their scope: namespace-scope consts like this have internal linkage in C++, so they are visible only within the file that defines them.

Another thing to notice is that the last thing that gets called is the MoreOpts() member function of CmdLine. At the point of call, you should have dealt with all the legal command line options, so anything that is left is not valid.
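The order of the checks matters, and a small self-contained sketch makes it easier to experiment with. Everything here is a stand-in: MiniCmdLine is a trimmed substitute for CmdLine that holds only the arguments after the program name, ScheckError is assumed to be an exception carrying a message, and ChooseReport and Rejects are hypothetical helpers that mirror the report-type part of the constructor above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Stand-in for the tutorial's exception type.
struct ScheckError : public std::runtime_error {
  explicit ScheckError( const std::string & msg ) : std::runtime_error( msg ) {}
};

// Trimmed stand-in for CmdLine; holds the arguments after the program name.
class MiniCmdLine {
  public:
    explicit MiniCmdLine( const std::vector<std::string> & args ) : mArgs( args ) {}
    bool HasOpt( const std::string & opt ) const {
      return std::find( mArgs.begin(), mArgs.end(), opt ) != mArgs.end();
    }
    bool ExtractOpt( const std::string & opt ) {
      std::vector<std::string>::iterator pos =
          std::find( mArgs.begin(), mArgs.end(), opt );
      if ( pos == mArgs.end() ) {
        return false;
      }
      mArgs.erase( pos );
      return true;
    }
    bool MoreOpts() const {
      for ( std::size_t i = 0; i < mArgs.size(); i++ ) {
        if ( ! mArgs[i].empty() && mArgs[i][0] == '-' ) {
          return true;
        }
      }
      return false;
    }
  private:
    std::vector<std::string> mArgs;
};

enum Report { rtCSV, rtXML };

// Same sequence of checks as the Settings constructor, report type only.
Report ChooseReport( MiniCmdLine & cl ) {
  if ( cl.HasOpt( "-csv" ) && cl.HasOpt( "-xml" ) ) {
    throw ScheckError( "Only one output type can be specified" );
  }
  Report rt = rtCSV;                    // CSV is the default
  cl.ExtractOpt( "-csv" );              // consume it so MoreOpts ignores it
  if ( cl.ExtractOpt( "-xml" ) ) {
    rt = rtXML;
  }
  assert( ! cl.HasOpt( "-xml" ) );      // extraction is destructive
  if ( cl.MoreOpts() ) {                // anything still starting with '-' is unknown
    throw ScheckError( "Invalid command line" );
  }
  return rt;
}

// True if ChooseReport rejects the given arguments.
bool Rejects( const std::vector<std::string> & args ) {
  MiniCmdLine cl( args );
  try {
    ChooseReport( cl );
  }
  catch ( const ScheckError & ) {
    return true;
  }
  return false;
}
```

The conflict test has to use HasOpt before anything is extracted, because once -csv has been removed there is no longer any way to tell that both options were given. The MoreOpts test has to come last for the opposite reason: it only makes sense once every legal option has been removed.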

Now to refactor main.cpp. The first thing to do is to take out the if-ladder that decides what kind of reporter object to create, and put it in a separate function:

Reporter * MakeReporter( Settings::Report rt ) {
  if ( rt == Settings::rtCSV ) {
    return new CSVReporter( cout );
  }
  else {
    return new XMLReporter( cout );
  }
}

A function like this, which constructs different subtypes of a base class dependent on some parameter(s), is known as a factory. Use of factories is a very common idiom in C++ OO programming.
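For readers who want to try the factory idiom in isolation, here is a self-contained sketch. The Reporter classes below are stand-ins with placeholder output, not the real scheck reporters, and the extra stream parameter and the PrintHeader helper are my own additions. It also returns std::unique_ptr, which needs C++11 or later, because auto_ptr was deprecated in C++11 and removed in C++17.

```cpp
#include <cassert>
#include <iostream>
#include <memory>
#include <ostream>

// Stand-in base class; the real Reporter in scheck has more members.
class Reporter {
  public:
    explicit Reporter( std::ostream & os ) : mOut( os ) {}
    virtual ~Reporter() {}              // virtual: objects are deleted via the base type
    virtual void ReportHeader() = 0;
  protected:
    std::ostream & mOut;
};

class CSVReporter : public Reporter {
  public:
    explicit CSVReporter( std::ostream & os ) : Reporter( os ) {}
    void ReportHeader() { mOut << "word,context,line,file\n"; }  // placeholder text
};

class XMLReporter : public Reporter {
  public:
    explicit XMLReporter( std::ostream & os ) : Reporter( os ) {}
    void ReportHeader() { mOut << "<report>\n"; }                // placeholder text
};

enum ReportType { rtCSV, rtXML };

// Factory: the caller owns the result and only ever sees the base class.
std::unique_ptr<Reporter> MakeReporter( ReportType rt, std::ostream & os ) {
  assert( rt == rtCSV || rt == rtXML );
  if ( rt == rtCSV ) {
    return std::unique_ptr<Reporter>( new CSVReporter( os ) );
  }
  return std::unique_ptr<Reporter>( new XMLReporter( os ) );
}

// Example use: print the header of whichever report type was asked for.
void PrintHeader( ReportType rt ) {
  std::unique_ptr<Reporter> rep = MakeReporter( rt, std::cout );
  rep->ReportHeader();
}
```

The code that calls MakeReporter never names CSVReporter or XMLReporter, so adding a third report format later means changing the factory and nothing else.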

Next, you should take the while-loop at the heart of the main function and turn that into a separate function too:

void CheckSubmission( const Dictionary & d, istream & sub, 
                      const std::string & subname, Reporter & rep ) {
  Parser p( sub );
  string word;
  rep.ReportHeader();    
  while( ( word = p.NextWord() ) != "" ) {
    if ( ! d.Check( word ) ) {
      rep.ReportError( word, p.Context(), p.LineNo(), subname );
    }
  }
  rep.ReportFooter();
}

 

You now need to rewrite main itself to use these functions. Remember that you have also got to add the ability to process a list of files and to read standard input if no files are specified! This sounds complicated, but with the help of the refactored functions above and of the Settings object, it is actually pretty easy. When reading this, remember that Argc() will never be zero, as Argv(0) is always the name of the program.

int main( int argc, char * argv[] ) {
  try {
    CmdLine cl( argc, argv );
    Settings s( cl );
    Dictionary d( s.DictName() );
    auto_ptr <Reporter> rep( MakeReporter( s.ReportType() ) );
    if ( cl.Argc() == 1 ) {
      CheckSubmission( d, cin, "stdin", *rep );
    }
    else {
      for ( int i = 1; i < cl.Argc(); i++ ) {
        ifstream sub( cl.Argv(i).c_str() );
        if ( ! sub.is_open() ) {
          throw ScheckError( "Cannot open file " + cl.Argv(i) );
        }
        CheckSubmission( d, sub, cl.Argv(i), *rep );
      }
    }
  }
  catch( const ScheckError & e ) {
    cerr << "Error: " << e.what() << endl;
    return 1;
  }
  catch( ... ) {
    cerr << "Error: unknown exception" << endl;
    return 2;
  }
}

 

The main() function is now pretty near its canonical form for C++, which is basically to create a few objects and then let these objects interact. It could certainly be simplified further, possibly by creating a Checker class and making the refactored functions, and most of the existing body of main, class members.

Conclusion

That wraps it for the command line. Quite a lot of code got written here and as yet there is no sign of how to test it. I promise, faithfully, that the next instalment will be all about testing!

Sources for this and all other tutorials in the series are available here.

6 Comments
  1. Bill

    I’m looking forward to the testing!

    Did you mean to pass the Settings::Report by (const) reference in MakeReporter?

    • > Did you mean to pass the Settings::Report by (const) reference in MakeReporter?

      Nope. There is no point in passing something small like an enum (or an int, or a double) as a reference, unless the function is going to change it.

      • Bill

        Fair enough, that makes sense. My approach is to assume that this kind of object will only get bigger, but in this case you probably have a good sense of the scope of the settings.

      • Bill

        Are you assuming the string will use copy-on-write, or that it won’t be large enough to matter?

  2. @Bill Sorry, what string? The parameter of MakeReporter is an enumeration value. I always pass strings by reference, unless a value has slipped through somewhere (always possible), in which case it is a bug!

    • Bill

      My mistake, I read MakeReporter(Settings), not MakeReporter(Settings::Report)!
