C++ Antipatterns

This page documents some common mistakes that I see again and again in bug reports and requests for help on sites like StackOverflow.

Reading from an `istream` without checking the result

A very frequently asked question from someone learning C++ looks like this:

I wrote this program but it doesn't work. It reads from the file correctly but does the calculation wrong.

#include <iostream>
#include <fstream>

int calculate(int a, int b, int c)
{
  // blah blah blah complex calculation
  return a + b + c;
}

int main()
{
  std::ifstream in("input.txt");
  if (!in.is_open())
  {
    std::cerr << "Failed to open file\n";
    return 1;
  }
  int i, j, k;
  in >> i >> j >> k;
  std::cout << calculate(i, j, k);
}

Why doesn't the calculation work?

In many, many cases the problem is that the in >> ... statement failed, so the variables contain garbage values and so the inputs to the calculation are garbage.

The program has no way to check the assumption "it reads from the file correctly", so attempts to debug the problem are often just based on guesswork.

The solution is simple, but seems to be rarely taught to beginners: always check your I/O operations.

This improved version of the code only calls calculate(i, j, k) if reading values into all three variables succeeds:

  int i, j, k;
  if (in >> i >> j >> k)
  {
    std::cout << calculate(i, j, k);
  }
  else
  {
    std::cerr << "Failed to read values from the file!\n";
    throw std::runtime_error("Invalid input file");
  }

Now if any of the input operations fails you don't get a garbage result, you get an error that makes the problem clear immediately. You can choose other forms of error handling rather than throwing an exception, the important bit is to check the I/O and not just keep going regardless when something fails.

Recommendation: always check that reading from an istream succeeds.

Testing for `istream.eof()` in a loop

A common mistake when using istreams is to try and use eof() to detect when there is no more input:

  while (!in.eof())
  {
    in >> x;
    process(x);
  }

This doesn't work because the eofbit is only set after you try to read from a stream that has already reached EOF. When all the input has been read the loop will run again, reading into x will fail, and then process(x) is called even though nothing was read.

The solution is to test whether the read succeeds, instead of testing for EOF:

  while (in >> x)
  {
    process(x);
  }

You should never read from an istream without checking the result anyway, so doing that correctly avoids needing to test for EOF.

Recommendation: test for successful reads instead of testing for EOF

Locking and unlocking a `std::mutex`

This is always wrong:

  std::mutex mtx;

  void func()
  {
    mtx.lock();
    // do things
    mtx.unlock();
  }

It should always be done using one of the RAII scoped lock types such as lock_guard or unique_lock e.g.

  std::mutex mtx;

  void func()
  {
    std::lock_guard<std::mutex> lock(mtx);
    // do things
  }

Using a scoped lock is exception-safe, you cannot forget to unlock the mutex if you return early, and it takes fewer lines of code.

Recommendation: always use a scoped lock object to lock and unlock a mutex

Be careful that you don't forget to give a scoped lock variable a name! This will compile, but doesn't do what you expect:

  std::mutex mtx;

  void func()
  {
    std::unique_lock<std::mutex> (mtx);  // OOPS!
    // do things, but the mutex is not locked!
  }

This default-constructs a unique_lock object called mtx, which has nothing to do with the global mtx object (the parentheses around (mtx) are redundant and so it's equivalent to simply std::unique_lock<std::mutex> mtx;).

A similar mistake can happen using braces instead of parentheses:

  std::mutex mtx;

  void func()
  {
    std::unique_lock<std::mutex> {mtx};  // OOPS!
    // do things, but the mutex is not locked!
  }

This does lock the global mutex mtx, but it does so in the constructor of a temporary unique_lock which immediately goes away and unlocks the mutex again.

Inserting into a container of smart pointers with `emplace_back(new X)`

When appending to a std::vector<std::unique_ptr<X>> you cannot just say v.push_back(new X), because there is no implicit conversion from X* to std::unique_ptr<X>.

A popular solution is to use v.emplace_back(new X) because that compiles (emplace_back constructs an element in-place from the arguments, and so can use explicit constructors).

However this is not safe. If the vector is full and needs to reallocate memory that could fail and throw a bad_alloc exception, in which case the pointer will be lost and will never be deleted.

The safe solution is to create a temporary unique_ptr that takes ownership of the pointer before the vector might try to reallocate:

v.push_back(std::unique_ptr<X>(new X))

(You could replace push_back with emplace_back but there is no advantage here because the only conversion is explicit anyway, and emplace_back is more typing!)

In C++14 you should just use std::make_unique and it's a non-issue:

v.push_back(std::make_unique<X>())

Recommendation: do not prefer emplace_back just because it allows you to call an explicit constructor. There might be a good reason the class designer made the constructor explicit that you should think about and not just take a short cut around it.

(Scott Meyers discusses this point as part of Item 42 in Effective Modern C++.)

Defining "less than" and other orderings correctly

When using custom keys in maps and sets a common mistake is to define a "less than" operation like so:

  struct X
  {
    int a;
    int b;
  };

  inline bool operator<(const X& l, const X& r)
  {
    if (l.a < r.a)
      return true;
    if (l.b < r.b)
      return true;
    return false;
  }

This operator< does not define a valid ordering. Consider how it behaves for X{1, 2} and X{2, 1}:

  X x1{1, 2};
  X x2{2, 1};
  assert( x1 < x2 );
  assert( x2 < x1 );

The operator< defined above means that x1 is less than x2 but also that x2 is less than x1, which should be impossible!

The problem is that the operator< says that l is less than r if any member of l is less than the corresponding member of r. That's like saying that 20 is less than 11 because when you compare the second digits '0' is less than '1' (or similarly, that the string "20" should be sorted before "11" because the second character '0' is less than the second character '1').

In technical terms the operator< above fails to define a Strict Weak Ordering.

Another way to define a bogus order is:

  inline bool operator<(const X& l, const X& r)
  {
    return l.a < r.a && l.b < r.b;
  }

Where the first example gave the nonsensical result x1 < x2 && x2 < 1 this definition gives the result !(x1 < x2) && !(x1 < x2) i.e. the two values are considered to be equivalent, and so only one of them could be inserted into a unique associative container such as a std::set. But then if you compare those values to X x3{1, 0} you find that x1 and x3 are equivalent, but x1 and x2 are not. So depending which of x1 and x2 is in the std::set affects whether or not you can add x3!

An invalid order like the ones above will cause undefined behaviour if it is used where the Standard Library expects a correct order, e.g. in std::sort, std::set, or std::map. Trying to insert the x1 and x2 values above into a std::set<X> will give strange results, maybe even crashing the program, because the invariant that a set's elements are always sorted correctly is broken if the comparison function is incapable of sorting correctly.

This is discussed further in Effective STL by Scott Meyers, and in the CERT C++ Coding Standard.

A correct implementation of the function would only consider the second member when the first members are equal i.e.

  inline bool operator<(const X& l, const X& r)
  {
    if (l.a < r.a)
      return true;
    if (l.a == r.a && l.b < r.b)
      return true;
    return false;
  }

Since C++11 defining an order correctly is trivial, just use std::tie:

  inline bool operator<(const X& l, const X& r)
  {
    return std::tie(l.a, l.b) < std::tie(r.a, r.b);
  }

This creates a tuple referring to the members of l and a tuple referring to the members of r, then compares the tuples, which does the right thing (only comparing later elements when the earlier ones are equal).

Recommendation: When writing your own less-than operator make sure it defines a Strict Weak Ordering, and write tests to ensure that it doesn't give impossible results like (a < b && b < a) or (a < a). Prefer using std::tie() to create tuples which can be compared, instead of writing your own error-prone comparison logic.

Consider whether defining a hash function and using std::unordered_set or std::unordered_map would be better than std::set or std::map anyway. Unordered containers require an equality operator, but that's harder to get wrong.

Dynamically allocating `std::thread` objects

  class Thread
  {
  public:
    // ...

    void start() {
      thr = new std::thread(&Thread::run, this);
    }

    void stop() {
      thr->join();
      delete thr;
      thr = nullptr;
    }

  private:
    void run();

    std::thread* thr;
  };

Firstly, this code could use std::unique_ptr<std::thread> so there is no need to do delete thr, but for some reason the people who use this anti-pattern never seem to use smart pointers either.

The real problem with the code above is that it creates a std::thread using dynamic allocation (i.e. on the heap). This is unnecessary and simply makes the code slower and more error-prone. The std::thread type has value semantics, so there is no need to refer to it indirectly through a pointer.

Reasons to refer to an object through a pointer include:

You need a "null" or "invalid" state, which a null pointer can represent.
The object is polymorphic, i.e. has virtual functions, and you want to refer to it through a pointer to a base-class.
The object cannot be copied or moved, so to transfer ownership you need to pass it around via a pointer.
The object is large, or expensive to copy, and you want to avoid copies.
You want a "non-owning" handle to an object managed by another part of the program.

None of these reasons apply here, because std::thread:

has a default constructor that creates an empty object not representing any thread of execution;
has no virtual functions and is not polymorphic;
is move constructible and move assignable, transferring ownership efficiently and safely;
is a lightweight handle that isn't copyable, only movable;
only supports two operations, joining or detaching, and neither of them is reasonable to do with a thread you don't own.

Instead of managing std::thread through a pointer just create it as a normal local variable or a normal class data member, assign new values to it, and let its destructor free its resources (after joining or detaching it if needed).

  class Thread
  {
  public:
    // ...

    void start() {
      thr = std::thread(&Thread::run, this);
    }

    void stop() {
      thr.join();
    }

  private:
    void run();

    std::thread thr;
  };

Recommendation: Treat std::thread as a value type, don't dynamically allocate it.

Using `std::bind` when constructing `std::thread` objects

  std::thread thr(std::bind(&Thread::run, this, arg1, arg2));

This is unnecessary and makes the code less readable for no reason. std::thread has a constructor that takes multple arguments and knows how to invoke them just like bind does. In fact, both std::bind and std::thread are defined in terms of the INVOKE protocol, so they behave similarly.

You can just pass the arguments straight to the thread constructor:

  std::thread thr(&Thread::run, this, arg1, arg2);

There is a difference between this and the version above that uses std::bind, but it's not a good one! The difference is that std::bind supports additional features for leaving some arguments unbound (by using the tags placeholders::_1, placeholders::_2 etc.) and for using nested bind expressions, but neither of those features is useful (or even usable) with std::thread. You pass all the arguments to the constructor, you don't leave some unbound to be provided later. So using std::bind requires additional code to handle placeholders and check for them when invoking the callable, which means more code has to be compiled and more code executes at run-time.

Recommendation: Don't use std::bind to create a std::thread from multiple arguments.

Using `std::string("")` to create empty `std::string` objects

The designers of std::string thought of a very clever way to create an empty string: the default constructor. The previous sentence is sarcasm. It's not particularly clever, it's completely obvious. So I don't know why people insist on writing std::string s = ""; or std::string s(""); when that's more typing, more code, and very possibly slower.

The std::string default constructor will be implemented something like this:

  template<typename CharT, typename Traits, typename Alloc>
    basic_string::basic_string() noexcept
    : m_ptr(m_local_buffer()), m_length(0), m_capacity(m_local_buffer_size())
    { *m_ptr = '\0' }

The constructor taking a const char* will be implemented something like this:

  template<typename CharT, typename Traits, typename Alloc>
    basic_string::basic_string(const char* s)
    {
      m_length = Traits::length(s);
      if (m_length < m_local_buffer_size())
      {
        m_ptr = m_local_buffer();
        m_capacity = m_local_buffer_size();
      }
      else
      {
        Alloc a = get_allocator();
        m_ptr = std::allocator_traits<Alloc>::allocate(a, m_length + 1);
        m_capacity = m_length;
      }
      Traits::copy(m_ptr, s, m_length + 1);
    }

Can you guess which constructor is likely to be faster?

Recommendation: Use the default constructor to create empty strings.