Skip to content

Common C++ Error Messages #2 – Undefined reference

Introduction

In this article I’ll be looking at the “undefined reference” error message (or “unresolved external symbol, for Visual C++ users). This is not actually a message from the compiler, but is emitted by the linker, so the first thing to do is to understand what the linker is, and what it does.

Linker 101

To understand the linker, you have to understand how C++ programs are built. For all but the very simplest programs, the program is composed of multiple C++ source files (also known as “translation units”). These are compiled separately, using the C++ compiler, to produce object code files (files with a .o or a .obj extension) which contain machine code. Each object code file knows nothing about the others, so if you call a function from one object file that exists in another, the compiler cannot provide the address of the called function.

This is where the the linker comes in. Once all the object files have been produced, the linker looks at them and works out what the final addresses of functions in the executable will be. It then patches up the addresses the compiler could not provide. It does the same for any libraries (.a and .lib files) you may be using. And finally it writes the executable file out to disk.

The linker is normally a separate program from the compiler (for example, the GCC linker is called ld) but will normally be called for you when you use your compiler suite’s driver program (so the GCC driver g++ will call ld for you).

Traditionally, linker technology has lagged behind compilers, mostly because it’s generally more fun to build a compiler than to build a linker. And linkers do not necessarily have access to the source code for the object files they are linking. Put together, you get a situation where linker errors, and the reasons for them, can be cryptic in the extreme.

Undefined reference

Put simply, the “undefined reference”  error means you have a reference (nothing to do with the C++ reference type) to a name (function, variable, constant etc.) in your program that the linker cannot find a definition for when it looks through all the object files and libraries that make up your project. There are any number of reasons why it can’t find the definition – we’ll look at the commonest ones now.

No Definition

Probably the most common reason for unresolved reference errors is that you simply have not defined the thing you are referencing. This code illustrates the problem:

int foo();

int main() {
    foo();
}

Here, we have a declaration of the function foo(), which we call in main(),  but no definition. So we get the error (slightly edited for clarity):

a.cpp:(.text+0xc): undefined reference to `foo()'
error: ld returned 1 exit status

The way to fix it is to provide the definition:

int foo();

int main() {
    foo();
}

int foo() {
    return 42;
}

 

Wrong Definition

Another common error is to provide a definition that does not match up with declaration (or vice versa). For example, if the code above we had provided a definition of foo() that looked like this:

int foo(int n) {
    return n;
}

then we would still get an error from the linker because the signatures (name, plus parameter list types) of the declaration and definition don’t match, so the definition actually defines a completely different function from the one in the declaration. To avoid this problem, take some care when writing declarations and definitions, and remember that things like references, pointers and const all count towards making a function signature unique.

Didn’t Link Object File

This is another common problem. Suppose you have two C++ source files:

// f1.cpp
int foo();

int main() {
    foo();
}

and:

// f2.cpp
int foo() {
    return 42;
}

If you compile f1.cpp on its own you get this:

f1.cpp:(.text+0xc): undefined reference to `foo()'

and if you compile f2.cpp on its own, you get this even more frightening one:

crt0_c.c:(.text.startup+0x39): undefined reference to `WinMain@16

In this situation, you need to compile both the the source files on the same command line, for example, using GCC:

$ g++ f1.cpp f2.cpp -o myprog

or if you have compiled them separately down to object files:

$ g++ f1.o f2.o -o myprog

For further information on compiling and linking multiple files in C++, particularly with GCC, please see my series of three blog articles starting here.

 

Wrong Project Type

The linker error regarding WinMain  above can occur in a number of situations, particularly when you are using a C++ IDE such as CodeBlocks or Visual Studio. These IDEs offer you a number of project types such as “Windows Application” and “Console Application”. If you want to write a program that has a int main() function in it, always make sure that you choose “Console Application”, otherwise the IDE may configure the linker to expect to find a WinMain() function instead.

No Library

To understand this issue, remember that a header file (.h) is not a library. The linker neither knows nor cares about header files – it cares about .a and .lib files. So if you get a linker error regarding a name that is in a library you are using, it is almost certainly because you have not linked with that library. To perform the linkage, if you are using an IDE you can normally simply add the library to your project, if using the command line, once again please see my series of blog articles on the GCC command line starting here, which describes some other linker issues you may have.

Conclusion

The unresolved reference error can have many causes, far from all of which have been described here. But it’s not magic – like all errors it means that you have done something wrong, in you code and/or your project’s configuration, and you need to take some time to sit down, think logically, and figure out what.

My C++ Interview Questions

Introduction

Over the years, I’ve done my share (more, it has often seemed) of interviewing candidates for C++ programming posts. During this time I’ve zeroed in on a small subset of questions I ask candidates. These have no correct answers, do not refer to manhole covers, and require no maths ability to answer. They have, however, proved effective in deciding if the candidate actually has some knowledge of C++. I present them for your delectation.

“Tell me about the copy constructor”

This is my start-off question. If people look blank when I ask it (and depressingly, lots do), then I write out the signature of the constructor for them. If they still look blank, the interview terminates – this has saved me a lot of time over the years.

What I’m looking for here is someone that knows how important copying is in C++, where it takes place, and how it can be avoided. I’d expect a decent C++ programmer to be able to talk for around 15 minutes on this, with me asking subsidiary questions.

“What are your favourite C++ books? And why are they your favourites?”

C++ is a very complex language, and to learn it thoroughly you simply have to read several good books – internet resources will not be enough. I don’t particularly care which books the candidate talks about, providing there is more than one of them, and they can come up with some convincing reasons for liking them.

“Write a program to…”

I’m not a big fan of getting candidates to write code in interviews. Often, the problem they are asked to solve is too small to prove anything, and they are unfamiliar with the setup on your specific workstations, so you are really testing how quickly they can get to grips with a toolset. However, sometimes HR or senior management will insist on a coding test. If that’s the case, then I set this problem, or something very like it, which tests the candidate’s familiarity with the C++ Standard Library in several ways:

Write a program that can be called from a command line environment like this:

myprog.bible.txt

The file bible.txt, which is provided, contains the text of the King James Bible. Your program should read this file and output the 10 most frequently occurring words, together with their frequency,  ignoring any punctuation and character case. Use the style of coding, commenting etc. that you would for a large program.

As with the other questions here, I’m not particularly interested in the details of the solution, provided they show a knowledge of Standard Library classes like strings and maps, and that the program actually works.

“What do you think the three most important features of C++11 are?”

Or similar – if the candidate has professed a knowledge of a library like Boost, I might ask about that instead. Once again, I don’t particularly care what the candidate thinks are important, only that they can talk about them articulately and knowledgably.

“If you were interviewing me, what question would you ask me”

This one is down to a guy I worked with at the now defunct Lehman Brothers (thank you, Mr James). He interviewed me, and after a couple of  questions said, “Well, you obviously know a lot more about C++ than I do, so tell me, if you were interviewing me, what questions would you ask?” I thought that was brilliant at the time, and still do – use it if you are interviewing a self-styled guru. Of course, you have to ask them why they would ask that question!

Conclusion

From the above, I think you can see that you do not have to ask candidates trick questions, or puzzles about manhole covers. Instead, you should be trying to get them to talk at length about their knowledge of the C++language and how it can and should be used.

Languages A-Z Challenge

I was futzing around with my CV today, and this question suddenly occurred to me:

Do I know a language for each letter of the alphabet?

And it turns out, no I don’t. here’s my best shot of an alphabetic  list of languages I do know reasonably well, and have been paid for writing programs in:

  • Awk
  • BASIC
  • COBOL
  • dBase II
  • E??????
  • FORTH
  • Gupta SQLWindows
  • H??????
  • IBM JCL
  • Java
  • K??????
  • Lahey FORTRAN
  • MASM
  • Nomad/2
  • Object Pascal
  • PHP
  • Quick C
  • Rexx
  • SeaChange
  • Transact SQL
  • U??????
  • Visual C++
  • W??????
  • X??????
  • Yacc
  • Z80 Assembler

The ?????? entries are ones I can’t provide.

Obviously, I know more languages than are listed here (many beginning with the letter C). The challenge I lay down is to produce a 26-element list of languages  given these rules:

  • You must have been paid for writing code in the language. For example, I have programmed in Haskell (which would give me the H entry), but nobody has ever paid me to do so.
  • You can use specific implementation names (like Visual C++), but a language (C++)  must be appear  only once in the list.

A small no-prize of  golden NULL pointer will be awarded for anyone that convincingly comes up with a 26 element list!

Less Than Obvious

Introduction

I was trying to revive some old C++ code at the weekend when I noticed a strange problem. I was using a std::set of objects of a class I’d created, I added objects to the set, but when I came to iterate through them a bit later, half of the objects seemed to have disappeared! What the heck could be going on? It took me quite while to realise what my problem was, because it was not at all obvious. This article describes the problem you may encounter when using ordered associative containers like std::set and std::map in your own code, but is not intended to be a complete guide to using these containers, or to sorting in general. You will need a little (but not much) C++11 knowledge in order to follow it. 

std::set 101

The std::set container class is part of the C++ Standard Library and implements a collection of unique key values which are maintained in sorted order. The std::map container is very similar, but associates a value with each key. You can use std::set like this:

#include <set>
#include <iostream>
using namespace std;

int main() {
    std::set <int> s;
    s.insert( 7 );
    s.insert( 2 );
    s.insert( 5 );
    for ( auto i : s ) {
        cout << i << endl;
    }
}

The for/auto stuff is C++11 syntax for iterating over the set,  which prints out:

2
5
7

If I added another insert statement:

s.insert( 2 );

then the insert would not take place, as the set already contains the value 2.

Sets are implemented internally as balanced binary search trees. If you have ever played with BSTs, you will know that you need to compare the value at each node with the thing you want to add to the tree in order to find the correct insertion place. For C++ sets, this comparison is performed by the Standard Library functor std::less, which itself uses the < (less-than) operator for the types in the set. This is fine and dandy if the things in the set actually have a less-than operator (as integers do in the case above), but what if they don’t?

A Point Struct

For the remainder of the article, I’ll look at the problems of creating a Point structure that you can store in a std::set. The Point structure implements a par of integer (x,y) coordinates, and a first attempt looks like this:

#include <iostream>
#include <set>
using namespace std;

struct Point {
    int x, y;
};

ostream & operator << ( ostream & os, const Point & p ) {
    return os << p.x << "," << p.y;
}

int main() {
    Point p1{1,2};
    Point p2{4,1};
    cout << p1 << " " << p2 << endl;
}

I’ve added a streaming output operator<<() for convenience. The braces around the values used to initialise the Point objects are part of C++11 uniform initialisation.

This code produces the output:

1,2 4,1

So far so good. But what happens if we try to create a std:;set of Point objects and to add the point p1 to the set:

int main() {
    Point p1{1,2};
    Point p2{4,1};

    set <Point> s;
    s.insert( p1 );
}

Well, if you try to compile that, your compiler will complain – in the case of my GCC compiler, it complains bitterly, producing one of those wall-of-text sequence of error messages we all know and love. The relevant message here is this one, which I’ve edited slightly:

required from here stl_function.h:235:20: 
error: no match for 'operator<' (operand types are 'const Point' and 'const Point')
  { return __x < __y; }

You can see that some code in the Standard Library implementation header stl_function.h wants to use the less-than operator on a couple of Point objects, but the compiler cannot find such an operator because we have not written one, and one is not provided by C++ by default. If we want to put Point objects in a set, we have to provide a less-than operator.

A less-than Operator

How to go about writing a less-than operator for our Point struct? There are actually several ways of going about doing this, but I am going to suggest that it should be implemented as a free, non-member function that takes two parameters. I don’t intend to go into the reasons for this here except to say that binary operators should generally be implemented as free functions, and that if you can avoid making something a member function, you probably should do so.  Our comparison operator in outline then should look like this:

bool operator < ( const Point & a, const Point & b ) {
    // what to put here?
}

What should the operator return? Put simply, it should return true if the parameter a is less than the parameter b. So your first thought might be to write something like this:

bool operator < ( const Point & a, const Point & b ) {
    return a.x < b.x && a.y < b.y;
}

and to test it like this:

int main() {
    set <Point> s = {{1,1},{1,2},{2,1},{2,2}};
    for ( auto p : s ) {
        cout << p << endl;
    }
}

This code compiles, as the compiler is now happy it has a less-than operator it can use, but unfortunately, when you run the program, you will get the following output:

1,1
2,2

where two of the Points seem to have disappeared! This was the problem I was having with my own code (though to be fair on myself, my code was considerably more complex than this Point structure). So, what is going on?

Strict Weak Ordering

The C++ Standard says that in order for a less-than operator to work correctly with sets and maps (and with some of the Standard Library algorithms), it must enforce what is known as "strict weak ordering". I don’t intend to go into any detail about exactly what this means here – if you want more details I suggest taking a look at this Wikipedia article. Suffice it to say that the comparison operator we have provided does not enforce it, and is in fact rather obviously not a great way of ordering 2D points.

But if what we have written does not enforce strict weak ordering, how to go about writing one that does? Unfortunately, doing so is both somewhat unintuitive (at least for me – you might like to test your own intuition by correcting the above comparison function) and very, very easy to get wrong if you are not methodical about your implementation. I’m going to suggest two approaches – a mechanical, algorithmic one, and a possibly better one using other C++ facilities.

The Mechanical Method

The mechanical method is simple. For each pair of values in the two parameters referred to as a and b in the less-than operator above, write code like this:

if ( a.x < b.x ) 
    return true;
if ( b.x < a.x ) 
    return false;

and finish off by returning false. So, using this method, our comparison operator becomes:

bool operator < ( const Point & a, const Point & b ) {
    if ( a.x < b.x ) 
        return true;
    if ( b.x < a.x ) 
        return false;
    if ( a.y < b.y ) 
        return true;
    if ( b.y < a.y ) 
        return false;
    return false;
}

If we decided to make Point a 3D point, containing a z coordinate, we would just add the statements:

if ( a.z < b.z ) 
    return true;
if ( b.z < a.z ) 
    return false;

With the above changes in place, the test program now produces the correct output:

1,1
1,2
2,1
2,2

This method may seem long-winded, but it has two big advantages. First, it is hard to get it it wrong – you can write the comparison function almost without thinking. And second, it only uses the less-than operator on the types it is comparing – this is important, as many types will not implement the full gamut of comparisons.

Using C++ Features

If you want to avoid the semi-boilerplate code above, C++ offers several other possibilities for providing things that you can store in a set.

One possibility is not to create your own Point struct at all. C++ provides several data structures that work much like a point, and which implement a suitable less-than operator out of the box. The most obvious of these is std::pair:

#include <iostream>
#include <set>
#include <utility>
using namespace std;

int main() {
    set <pair<int,int>> s = {{1,1},{1,2},{2,1},{2,2}};
    for ( auto p : s ) {
        cout << p.first << "," << p.second << endl;
    }
}

If you wanted to retain the Point struct, you could use std::pair to implement your custom less-than operator:

bool operator < ( const Point & a, const Point & b ) {
    pair <int,int> pa{a.x,a.y};
    pair <int,int> pb{b.x,b.y};
    return pa < pb;
}

You can do similar things with std::vector, or in C++11 with std::tuple and the very handy std::tie function:

bool operator < ( const Point & a, const Point & b ) {
    return tie( a.x, a.y ) < tie( b.x, b.y );
}

The latter would be my personal preference, and its what I ended up using to solve my own problem with providing a less-than operator.

Summary

  • If you want to conveniently put objects in a  set or a map, they must provide a less-than operator.
  • The less-than operator must enforce strict weak ordering.
  • Providing strict weak orderingis easy for human programmers to get wrong.
  • Use C++ features like pairs and tuples to provide it.

Thinking Like A Programmer

Introduction

Yesterday, someone posted a question on how to "think like a programmer" on /r/learnprogramming, and I gave this sage advice:

All programming comes down to writing functions. Always think about how you could solve a problem by writing more functions.

While I believe this to be good advice, it made me wonder what my own thought processes really are when I set out to tackle a problem. So I picked on of the other problems on yesterday’s /r/learnprogramming, and studied my own cogitations while I solved it, which may be of interest to beginner programmers. The solution is in C, and should be understandable by anyone with a modicum of experience in this language.

The Problem

The problem is given an array of integers containing the values 1 and zero, find the index of the longest contiguous sequence of the value 1, and the length of that sequence. In other words, given the array:

int v[] = {0,1,1,0,0,0,1,1,1,0,1};

we should end up with the values 6 for the starting index, and 3 for the length.

My Thoughts

Hmm, well I’m going to need a function which takes an array of ints as its parameter, and as we are using C, I’d better pass the length of the array as a parameter too. What shall I call the function? Let’s call it longest for now, so it will look something like this:

longest( int a[], int alen );

And I’ve got to have it return two values? Bugger. I guess I could pass a couple of pointers as parameters to the function and have it return via them?  Nah, lets make a structure to hold the return values. Something like this:

typedef struct {
    int start, length
} Seq;

Now I can write the complete (if empty) function definition:

Seq longest( int a[], int alen ) {
}

Now I’ll flesh this out a little by actually creating a Seq and returning it. And I’ll write a little main() to call it too:

#include <stdio.h>

typedef struct {
  int start, length;
} Seq;

Seq longest( int a[], int alen ) {
  Seq seq = {0,0};
  return seq;
}

int main() {
  int v[] = {0,1,1,0,0,0,1,1,1,0,1};
  int vlen =  sizeof(v) / sizeof(int);
  Seq s = longest( v, vlen );
  printf( "%d %d\n", s.start, s.length );
}

Now I’ll just compile that … yay – it compiles, and run it, which gives me the expected output:

0 0

So now I have to think how to find the longest sequence. I know I’m going to have to walk over that array somehow, so lets write something to do that:

Seq longest( int a[], int alen ) {
  Seq seq = {0,0};
  int i = 0;
  while( i < alen ) {
    i++;
  }
  return seq;
}

And that compiles too – good! Now, at each position in the loop I would really like to know the length of the sequence of the value 1 starting at that position. I’m going to need another function – I’ll call it seqlen. What am I going to pass to it as parameters? Well, it will need the array (and that damned length!) and the current loop index. And the return value – easy; it should return the length of the sequence, or zero if there isn’t one. So it will look like this:

int seqlen( int a[], int alen, int i )

So, how is this going to work? I need to make sure I’m not looking past the end of the array. Then, so long as a[i] contains the value 1 I’m in a sequence of 1’s, so I need to keep looping until I’m not. Something like this:

int seqlen( int a[], int alen, int i ) {
  int n = 0;
  while( i < len && a[i] == 1 ) {
    n++;
    i++;
  }
  return n;
}

Which also compiles! Great. Now I need to use it in my other function.

The problem says that I need to return the index of the longest sequence and its length. This means I’ve got to call seqlen repeatedly, and update the longest and length fields of my Seq struct whenever I find a new longest sequence. So I need to add some code to my while-loop in longest:

while( i < alen ) {
  int sl = seqlen( a, alen, i );
  if ( sl > seq.length ) {
    seq.length = sl;
    seq.start = i;
  }
  i++;
}

And when I run it it gives me the expected output:

6 3

That’s great, but looking at the code I notice that I’m incrementing the loop index in seqlen but then throwing away that increment when the function returns. This means that I’m doing a lot more work than necessary, though I do get the right result. I think a bit about whether this matters, and come to the conclusion that for small problem sizes it doesn’t, but it might for very large ones. In any case, the fix is easy – I just change the increment statement in longest to this:

i += sl ? sl : 1;

which adds the length of the sequence I found, or 1 if there wasn’t one and I just need to skip over a 0 value.

That change worked as  expected, and completes the code, which can be found here.

Summary

OK, so what did my thought processes and programming practices pan out as?

  • The very first thing I did was to think of a function I could write; I didn’t start writing loops or anything like that
  • I used a struct to return multiple values from a function, rather than doing so via pointers.
  • I wrote some simple code which didn’t solve the problem, but did compile. I then started adding small chunks of code to this.
  • I then thought of writing another function to handle a subsidiary part of the problem. If the problem was bigger, I would have come up with yet more functions.
  • The functions I wrote were short – in fact about typical of the length of functions in my code for "real" applications.

So, I guess my advice on /r/programming wasn’t so bad after all!