Adjacent circular iterators

Note: Part way through writing this post I discovered that the boost graph library had explored many of these concepts already. I decided to finish the post anyway because I think my design process might be interesting.

One of my personal projects is a radiosity renderer. The world doesn’t need another radiosity renderer, but I wanted to learn about them, and, as usual with my personal projects, there is a secondary reason for writing it – to play around with C++ and see what I can learn.

In order to have a renderer you need to have a scene to render, which means you also need to have some data structures describing that scene. For the time being my data structure is simple – a list of polygons. Each polygon object includes various useful bits of data – things like colour and reflectivity. A polygon also has a vector of vertices. The vertices define the bounds of the polygon.

In addition to the polygons and vertices I need to process edges. An edge is defined by the two vertices it connects. My canonical data structure is the polygons and vertices, so when I need edges I need to construct them from the vertices.

There is nothing too difficult about that – iterate through the vertices and create an edge using each vertex and the following vertex. The twist comes when we get to the end of the vertices. Once we are at the last vertex, the next vertex we want is actually at the beginning of the vector. The polygon is closed so there has to be an edge between the last vertex and the first vertex. I am going to refer to this concept as “adjacent circular” – we already have algorithms like std::adjacent_find and “circular” indicates that we are linking the end to the beginning.

For the sake of simplicity I’ll get rid of the extraneous details and slim things down so that our input is std::vector< char >, and our output is std::vector< std::pair< char, char > >. In other words, we are trying to write this function:

std::vector< std::pair< char, char > >
getEdges( std::vector< char > const& vertices );

(As usual, some using statements and typedefs would make this less verbose, and, as usual I want to be explicit about where everything is coming from so I am going to spell out every component in full. Also, when I get to the point of writing classes I am going to inline the function definitions – in reality I’d split them out from the declaration).

If the input is:

a, b, c, d, e

we expect the output of getEdges to be:

(a b), (b c), (c d), (d e), (e a)

It’s always good to start off with the straightforward solution:

std::vector< std::pair< char, char > >
getEdges( std::vector< char > const& vertices)
{
    std::vector< std::pair< char, char > > edges;

    std::vector< char >::const_iterator i( std::begin( vertices ) );
    std::vector< char >::const_iterator j( std::begin( vertices ) );

    while( i != std::end( vertices ) )
    {
        ++j;
        
        if( j == std::end( vertices ) )
        {
            j = std::begin( vertices );
        }

        edges.push_back( std::make_pair( *i, *j ) );

        ++i;
    }

    return edges;
}

The code is not particularly elegant or efficient but it does the job. At the very least it’s useful as a test baseline for the other solutions we’re going to come up with.

Let’s follow Sean Parent’s dictum of “no raw loops” and see what else we can come up with. The point of “no raw loops” is not that there are no loops at all, just that the loops themselves are encapsulated in a reusable algorithm.


Let’s write ourselves an algorithm:

template<
    typename ForwardIterator,
    typename BinaryOperation >
BinaryOperation for_adjacent_circular(
    ForwardIterator first,
    ForwardIterator last,
    BinaryOperation op)
{
    ForwardIterator i( first );
    ForwardIterator j( first );
    
    while( i != last )
    {
        ++j;
        
        if( j == last )
        {
            j = first;
        }
        
        op( *i, *j );

        ++i;
    }

    return op;
}

As we’d expect, for_adjacent_circular looks very much like our original function. When we use it, it looks very much like std::for_each:

std::vector< std::pair< char, char > >
getEdgesAlgorithm( std::vector< char > const& vertices)
{
    std::vector< std::pair< char, char > > edges;

    for_adjacent_circular(
        std::begin( vertices ),
        std::end( vertices ),
        [ &edges ]( char a, char b )
        {
            edges.push_back( std::make_pair( a, b ) );
        } );

    return edges;
}

This is good as far as it goes – the problem is that it doesn’t go far enough. We wrote an algorithm that is specialized for our original problem – converting a vector of vertices into a vector of edges. Given that it’s a for_each style algorithm we can fake it to do a lot of things that we’d like to do, but it feels like we’re abusing the algorithm and falling into the trap John Potter describes of making std::for_each the “goto of algorithms”.


There are many more algorithms we might want to use – it would be nicer to have iterators that we could feed into one of the existing standard algorithms. Here’s my first pass at an iterator. I’ll display the whole thing first then go through the interesting features of it.

template< typename Iterator >
class ConstAdjacentCircularIterator
{
public:
    ConstAdjacentCircularIterator( 
        Iterator begin, 
        Iterator position, 
        Iterator end )
    : begin_( begin )
    , position_( position )
    , end_( end )
    {
    }

    typedef std::forward_iterator_tag iterator_category;
    typedef std::pair< 
        typename Iterator::value_type const&, 
        typename Iterator::value_type const& > value_type;
    typedef typename Iterator::difference_type difference_type;
    typedef value_type* pointer;
    typedef value_type& reference;

    // Prefix
    ConstAdjacentCircularIterator& operator ++()
    {
        ++position_;

        return *this;
    }

    // Postfix
    ConstAdjacentCircularIterator operator ++( int )
    {
        ConstAdjacentCircularIterator original( *this );
        ++(*this);
        return original;
    }

    value_type operator *() const
    {
        Iterator next( position_ );
        ++next;
        if( next == end_ )
        {
            next = begin_;
        }
        
        return value_type( *position_, *next );
    }

private:

    template< typename I >
    friend bool operator ==( 
        ConstAdjacentCircularIterator< I > const& a, 
        ConstAdjacentCircularIterator< I > const& b )
    {
        return 
            a.begin_ == b.begin_ &&
            a.position_ == b.position_ &&
            a.end_ == b.end_;
    }

    Iterator begin_;
    Iterator position_;
    Iterator end_;
};

template< typename Iterator >
bool operator !=( 
    ConstAdjacentCircularIterator< Iterator > const& a, 
    ConstAdjacentCircularIterator< Iterator > const& b )
{
    return !( a == b );
}

(As usual I needed to break out my copy of C++ Templates by Vandervoorde and Josuttis to get the friend declarations correct. Actually I am still not certain they are correct – I got different errors out of two different compilers and spent some time getting to what might be the right solution. I like C++ and I like templates but the syntax often has me baffled).

I have written the const version of the iterator since that’s what I need for the moment. We’ll look at the non-const version later.


ConstAdjacentCircularIterator( 
    Iterator begin, 
    Iterator position, 
    Iterator end )
: begin_( begin )
, position_( position )
, end_( end )
{
}

The constructor for our adjacent circular iterator takes other iterators as input. I assume we have some kind of underlying sequence which has its own iterators. Interestingly, we have to specify three iterators to construct an adjacent circular iterator. Not only do we specify position, we also need to specify the range of the underlying sequence with begin and end. We’ll see why these are necessary when we implement the increment functions. This is a pattern I have come across before. When I’m writing iterators that do more than just directly reference the elements of an underlying sequence I often need to specify more information about the range that the iterator is iterating over.


Moving on, we have the typedefs that all iterators must have:

typedef std::forward_iterator_tag iterator_category;

I have chosen to make this a forward iterator simply because I didn’t want to implement more than operator ++. There is no reason why we can’t implement all of the other access methods and make the circular iterator as fully featured as the underlying iterator.


typedef std::pair< 
    typename Iterator::value_type const&, 
    typename Iterator::value_type const& > value_type;

value_type is interesting, and by “interesting” I mean “the first sign that the train is eventually going to come off the rails”.

Our basic problem is that the value we want (a pair of the values stored in the underlying container) is not actually what we have stored in the underlying container. To try and work around this I am storing references to the two values that make up our pair. This looks like a decent compromise – it is a decent compromise – but it’s still a compromise.

This is going to come up again later.


typedef typename Iterator::difference_type difference_type;

We take difference_type straight from our underlying iterator – nothing surprising here.


typedef value_type* pointer;
typedef value_type& reference;

pointer and reference should not be surprising. They’re just a pointer and reference to value_type. However, we are already uneasy about value_type so we need to keep an eye on these.


// Prefix
ConstAdjacentCircularIterator& operator ++()
{
    ++position_;

    return *this;
}

Our prefix increment implementation is not surprising. I am not implementing any sort of error checking – I assume that will be handled (if it’s handled at all) by the underlying iterator. Notice that the iterator itself is not wrapping around the container – we only wrap around when we defererence the iterator. Incrementing the iterator will never take us back to the beginning of the container. While it would be possible to create an iterator that wrapped around (although the question of what end would be is an interesting one), that isn’t what we’re doing here.


// Postfix
ConstAdjacentCircularIterator operator ++( int )
{
    ConstAdjacentCircularIterator original( *this );
    ++(*this);
    return original;
}

Postfix increment is just what we’d expect.


value_type operator *() const
{
    Iterator next( position_ );
    ++next;
    if( next == end_ )
    {
        next = begin_;
    }
    
    return value_type( *position_, *next );
}

The deference function is where the magic happens. Let’s get the easy stuff out of the way – we use the current iterator and next iterator, making sure that if we’re at the end of the range we wrap around to the beginning. This is why we need to have begin and end stored in the iterator itself – we can’t implement this function without them.

We dereference position_ and next which gives us references to the underlying values, then we construct a value_type from the two references and return it.

Now let’s look at the subtleties. Our problems with value_type really bite us here. operator * should return a reference to the value type – the standard says so (section 24.2.2 in the C++11 standard). We can’t return a reference to value_type because our underlying container (whatever that is) does not contain objects of that type.

value_type contains references to the underlying objects, but that isn’t the same thing and is not enough to satisfy the standard.


template< typename I >
friend bool operator ==( 
    ConstAdjacentCircularIterator< I > const& a, 
    ConstAdjacentCircularIterator< I > const& b )
{
    return 
        a.begin_ == b.begin_ &&
        a.position_ == b.position_ &&
        a.end_ == b.end_;
}

The equality operator is necessary for us to do almost anything useful with our iterator. There is one interesting point here. There are two possible ways of writing this function. We could just compare position_, however if we do that we can generate situations like this:

ConstAdjacentCircularIterator< 
    typename std::vector< char >::const_iterator > i0( 
        vc.begin(), vc.begin() + 2, vc.end() );

ConstAdjacentCircularIterator< 
    typename std::vector< char >::const_iterator > i1( 
        vc.begin(), vc.begin() + 2, vc.begin() + 3 );

i0 and i1 have the same position – vc.begin() + 2. If we were just comparing position, i0 == i1 is true. Unfortunately, *i0 == *i1 is false.

*i0 is (c, d)

*i1 is (c, a)

It is a normal restriction for operator == to be valid only if the iterators refer to the same underlying sequence (C++11 standard section 24.2.5). Even though i0 and i1 are based off the same underlying container, they are not referring to the same sequence since they have different end values.

All of that is a long-winded way of saying that we need to compare begin_, position_, and end_.


template< typename Iterator >
bool operator !=( 
    ConstAdjacentCircularIterator< Iterator > const& a, 
    ConstAdjacentCircularIterator< Iterator > const& b )
{
    return !( a == b );
}

I define inequality in terms of equality – that’s a common technique. Since this doesn’t need to be a member function it isn’t.


Our getEdges function is now simple (at least it would be if I were using typedefs):

template< typename T >
std::vector< std::pair< typename T::value_type, typename T::value_type > >
getEdgesIterator( T const& vertices)
{
    std::vector< std::pair< 
        typename T::value_type, 
        typename T::value_type > > edges;

    ConstAdjacentCircularIterator< typename T::const_iterator > b( 
        std::begin( vertices ), 
        std::begin( vertices ), 
        std::end( vertices ) );
    ConstAdjacentCircularIterator< typename T::const_iterator > e( 
        std::begin( vertices ), 
        std::end( vertices ), 
        std::end( vertices ) );

    std::copy( b, e, std::back_inserter( edges ) );

    return edges;
}

Even better, since we now have iterators it looks like we can use any standard algorithm that requires iterators – std::for_each, std::transform, std::find etc.

I will award bonus points to anyone who spotted the weasel words in that statement – “looks like”. That’s an important caveat and we’re going to come back to it later. For the time being let’s just say that ConstAdjacentCircularIterator is doing a reasonable impersonation of an iterator.


So much for the const version of the iterator, what about the non-const version? There’s one difference – the type of value_type:

typedef std::pair< 
    typename Iterator::value_type&, 
    typename Iterator::value_type& > value_type;

This time the references to the underlying data type are non-const. We can write to an AdjacentCircularIterator as well as read from it.

For example, if we run this code (I have output operators already defined for std::vector< char > and std::pair< char, char >):

std::vector< char > vc( { 'a', 'b', 'c', 'd', 'e' } );

AdjacentCircularIterator< 
    typename std::vector< char >::iterator > i( 
        vc.begin(), 
        vc.begin(), 
        vc.end() );

++i;

std::cout << vc << "\n";
std::cout << *i << "\n";

*i = std::make_pair( 'x', 'y' );

std::cout << vc << "\n";
std::cout << *i << "\n";

We get this output:

a, b, c, d, e
(b c)
a, x, y, d, e
(x y)

What about range based for? To make range-based for work we need to create a container having begin and end methods. Since we've already defined our iterator classes this is easy:

class AdjacentCircularContainer
{
public:

    typedef AdjacentCircularIterator< 
        std::vector< char >::iterator >::value_type value_type;

    typedef value_type& reference;
    typedef value_type const& const_reference;
    typedef value_type* pointer;
    typedef value_type const* const_pointer;

    typedef AdjacentCircularIterator< 
            typename std::vector< char >::iterator > iterator;
    typedef ConstAdjacentCircularIterator< 
            typename std::vector< char >::const_iterator > const_iterator;
    
    AdjacentCircularContainer( std::vector< char >* p )
    : pContainer_( p )
    {
    }

    const_iterator begin() const
    {
        return const_iterator( 
                pContainer_->begin(), 
                pContainer_->begin(), 
                pContainer_->end() );
    }

    const_iterator end() const
    {
        return const_iterator( 
            pContainer_->begin(), 
            pContainer_->end(), 
            pContainer_->end() );
    }
    
private:
    std::vector< char > *pContainer_;
};

(I am implementing the minimum necessary here - there's more we can, and should, add if we want this to be a full blown container).

We can use our container exactly as we'd expect:

template< typename T >
std::vector< std::pair< typename T::value_type, typename T::value_type > >
getEdgesContainer( T const& vertices)
{
    std::vector< std::pair< char, char > > result;

    AdjacentCircularContainer c( &v_char );

    for( std::pair< char, char > e : c )
    {
        result.push_back( e );
    }

    return result;
}

While we're at it, now we have our container we can start to use the Adobe Source Library algorithms:

template< typename T >
std::vector< std::pair< typename T::value_type, typename T::value_type > >
getEdgesAdobe( T const& vertices)
{
    std::vector< std::pair< char, char > > result;

    AdjacentCircularContainer c( &v_char );

    adobe::copy( c, std::back_inserter( result ) );

    return result;
}

Life is good. We have iterators and we have containers that implement our desired "circular adjacent" functionality.

Except that we don't.

I have dropped unsubtle hints along the way that we've been cheating. While we have things that do reasonable impersonations of iterators and containers (good enough to do the things I need for my current use case), we do not have standard compliant iterators and containers.

The fundamental problem is the one we identified way back when we started to come up with the type of value_type. Normally with a C++ sequence there are objects of type value_type stored somewhere, and they persist for long enough that returning and storing references to them is a sensible thing to do. We don't have that. The only time we ever have an actual pair of values is as a local value inside the dereferencing function, and, as we all know, passing back a reference to a local value (which goes out of scope as soon as the function ends) is a Bad Thing.

We want (and the standard requires) the declaration of operator * to be:

value_type const& operator *() const

but we have to make it:

value_type operator *() const

There are other problems (we can't implement front or back correctly), but they all stem from the same source.

If you're thinking - "this sounds like the problems with std::vector< bool >" - you're right. Because std::vector< bool > is allowed to be specialized so that the boolean values can be stored as a bitset you can't actually get a reference to an individual bool object because the container doesn't store individual bool objects. We can fake it with proxy objects, and more modern versions of C++ can fake it better than older versions can, but we're still faking it, and sooner or later the faking will catch up with us.

For more on std::vector< bool > checkout Howard Hinnant's comments on this Stack Overflow question and this blog post.


In conclusion, it's been an interesting investigation but is it really worth it? For my particular use case the answer is "no" (at least in terms of solving my original "vertices to edges" problem rather than writing a blog post and learning something). Our original simple solution is good enough for my needs. I have it wrapped up in the function getEdges so it's easy enough to swap out if I need something different.

Once I have my edges in a container I can run whatever algorithsm I need on a container that does actually contain edges rather than some fake sequence that just pretends to contain edges.

C++ links

Bjarne Stroustrup and Herb Sutter are putting together a set of C++ core guidelines. The main document is here, and it’s worth looking round the rest of the repository for more links, and slides of talks.

I haven’t read all of the guidelines yet (it’s a big document), but what I have looked at so far is good.

There’s a discussion about the core guidelines on Hacker News. Sadly, one of the comments argues that:

drawline(int,int,int,int);

is superior to:

drawline(Point,Point);

I despair of this industry on a regular basis. We certainly don’t know everything about programming (and I hope in 20 years we’ll be looking back and laughing at our naivete), but we don’t know nothing either. Grouping data values into bundles is not a new idea, it has many demonstrated benefits and yet people still argue against it. Writing good code is hard, and we just make it harder by ignoring even the most basic of guidelines.

I laughed at another of the comments:

I agree, these are too opinionated to be unconditionally useful.

It’s a document by Bjarne Stroustrup and Herb Sutter. I’d be disappointed if it wasn’t opinionated.

Videos from the 2015 CppCon are starting to appear on the CppCon YouTube channel. There are also plenty of videos from previous years.

Notes from Herb Sutter’s talk Writing good C++14… by default are here

Algorithms and variations 2

One way of categorizing the algorithms is by the types of iterators that they take as arguments, or return as results.

Let’s start with a reminder of the hierarchy of iterators (I am deliberately ignoring some of the new iterator functionality being proposed for C++17):

Random access iterator
Bidirectional iterator
Forward iterator
Input iterator Output iterator


The key quality of an input iterator is that the input sequence can only be traversed once. We can make a copy of an input iterator, but the moment we increment one of the copies, the other becomes invalid. The input value is available once – a common example is reading in keys from a keyboard.

An output iterator will only let us write a value once. An example is writing a value to a stream that’s being sent over the network (this example ignores a bunch of details likely to apply in practice but that doesn’t stop it from being a useful reminder for me).

A forward iterator lets us read or write as many times as we want. We can make copies of the iterator and traverse the sequence as many times as we want but we can only traverse it in one direction – forward. The classic example is a singly linked list.

A bidirectional iterator has all of the qualities of a forward iterator but will let us traverse backwards as well as forwards – e.g. a doubly linked list.

Finally, a random access iterator will let us traverse forwards, backwards and additionally, jump to any point in the sequence. std::vector lets us do all of this.


Input iterator

There are lots of algorithms that take input iterators as arguments so I have subdivided them further by their return type. Let’s start off with the algorithms that do not return an iterator:

T accumulate (InputIterator, InputIterator, T)
T accumulate (InputIterator, InputIterator, T, BinaryOperation)
bool all_of (InputIterator, InputIterator, Predicate)
bool any_of (InputIterator, InputIterator, Predicate)
difference_type count (InputIterator, InputIterator, T)
difference_type count_if (InputIterator, InputIterator, Predicate)
bool equal (InputIterator1, InputIterator1, InputIterator2)
bool equal (InputIterator1, InputIterator1, InputIterator2, BinaryPredicate)
Function for_each (InputIterator, InputIterator, Function)
bool includes (InputIterator1, InputIterator1, InputIterator2, InputIterator2)
bool includes (InputIterator1, InputIterator1, InputIterator2, InputIterator2, Compare)
T inner_product (InputIterator1, InputIterator1, InputIterator2, T)
T inner_product (InputIterator1, InputIterator1, InputIterator2, T, BinaryOperation1, BinaryOperation2)
bool is_partitioned (InputIterator, InputIterator, Predicate)
bool lexicographical_compare (InputIterator1, InputIterator1, InputIterator2, InputIterator2)
bool lexicographical_compare (InputIterator1, InputIterator1, InputIterator2, InputIterator2, Compare)
bool none_of (InputIterator, InputIterator, Predicate)

We could subdivide this list even further. We have the algorithms that return bool – they are asking questions (all_of, any_of, equal, includes, is_partitioned, lexicographical_compare, none_of). We have the count and count_if algorithms – also asking questions and getting back an answer in the form of an integer. There are the algorithms that return a single value computed from the input range (accumulate and inner_product) and finally, for_each which, because it allows its function object to change state, returns the final version of the function object.

All of the algorithms can be run in a single pass over the input range – remember that input iterators do not allow us to iterate twice over a range. These algorithms can also be run over any pair of iterators above an input iterator in the hierarchy – i.e. all iterators except output iterators.

Next up, algorithms that take input iterators and return an output iterator:

OutputIterator adjacent_difference (InputIterator, InputIterator, OutputIterator)
OutputIterator adjacent_difference (InputIterator, InputIterator, OutputIterator, BinaryOperation)
OutputIterator copy (InputIterator, InputIterator, OutputIterator)
OutputIterator copy_if (InputIterator, InputIterator, OutputIterator, Predicate)
OutputIterator copy_n (InputIterator, Size, OutputIterator)
OutputIterator merge (InputIterator1, InputIterator1, InputIterator2, InputIterator2, OutputIterator)
OutputIterator merge (InputIterator1, InputIterator1, InputIterator2, InputIterator2, OutputIterator, Compare)
OutputIterator move (InputIterator, InputIterator, OutputIterator)
OutputIterator partial_sum (InputIterator, InputIterator, OutputIterator)
OutputIterator partial_sum (InputIterator, InputIterator, OutputIterator, BinaryOperation)
pair<OutputIterator1, OutputIterator2> partition_copy (InputIterator, InputIterator, OutputIterator1, OutputIterator2, Predicate)
OutputIterator remove_copy (InputIterator, InputIterator, OutputIterator, T)
OutputIterator remove_copy_if (InputIterator, InputIterator, OutputIterator, Predicate)
OutputIterator replace_copy (InputIterator, InputIterator, OutputIterator, T, T)
OutputIterator replace_copy_if (InputIterator, InputIterator, OutputIterator, Predicate, T)
OutputIterator set_difference (InputIterator1, InputIterator1, InputIterator2, InputIterator2, OutputIterator)
OutputIterator set_difference (InputIterator1, InputIterator1, InputIterator2, InputIterator2, OutputIterator, Compare)
OutputIterator set_intersection (InputIterator1, InputIterator1, InputIterator2, InputIterator2, OutputIterator)
OutputIterator set_intersection (InputIterator1, InputIterator1, InputIterator2, InputIterator2, OutputIterator, Compare)
OutputIterator set_symmetric_difference (InputIterator1, InputIterator1, InputIterator2, InputIterator2, OutputIterator)
OutputIterator set_symmetric_difference (InputIterator1, InputIterator1, InputIterator2, InputIterator2, OutputIterator, Compare)
OutputIterator set_union (InputIterator1, InputIterator1, InputIterator2, InputIterator2, OutputIterator)
OutputIterator set_union (InputIterator1, InputIterator1, InputIterator2, InputIterator2, OutputIterator, Compare)
OutputIterator transform (InputIterator, InputIterator, OutputIterator, UnaryOperation)
OutputIterator transform (InputIterator1, InputIterator1, InputIterator2, OutputIterator, UnaryOperation)
OutputIterator unique_copy (InputIterator, InputIterator, OutputIterator)
OutputIterator unique_copy (InputIterator, InputIterator, OutputIterator, BinaryPredicate)

The algorithms that return an output iterator all take an output iterator as a parameter. The algorithms are taking an input range and converting it to an output range. The result is always the position after the last written element of the output range – i.e. the first element that was not overwritten.

Alex Stepanov had many design principles for the STL. One of them states that an algorithm that calculates a useful value should return that value even when the calculation of that value is not the main point of the algorithm. Take std::copy. The programmer already knows where the beginning and end of the input range are, and where the beginning of the output range is. What the programmer does not necessarily know is where the end of the output range is. Since the algorithm computes the end of the output range, and it is potentially useful information (I’ve used it several times), the algorithm returns the end of the output range.

As well as straightforward copying we also get variants on std::copy : std::copy, std::copy_if, std::copy_n, std::remove_copy, std::remove_copy_if, std::partition_copy (note that std::remove, std::remove_if and std::partition need more powerful iterators – it is the copying which lets them work with input iterators).

We also get std::merge – a building block for merge sort algorithms.

Finally for the input iterators, we have the algorithms that return an input iterator.

InputIterator find (InputIterator, InputIterator, T)
InputIterator find_first_of (InputIterator, InputIterator, ForwardIterator, ForwardIterator)
InputIterator find_first_of (InputIterator, InputIterator, ForwardIterator, ForwardIterator, BinaryPredicate)
InputIterator find_if (InputIterator, InputIterator, Predicate)
InputIterator find_if_not (InputIterator, InputIterator, Predicate)
pair<InputIterator1, InputIterator2> mismatch (InputIterator1, InputIterator1, InputIterator2)
pair<InputIterator1, InputIterator2> mismatch (InputIterator1, InputIterator1, InputIterator2, BinaryPredicate)

All of the algorithms return the position of something that was found, whether that was a given value or a mismatch.

Remember that an input iterator only gets to iterate over a range once. Once we have read a value and moved on that value is gone, however an algorithm like std::find only needs to read that value, it does not need to move on before returning the iterator.


Output iterator

There are only a couple of algorithms that take no iterators other than output iterators. Unsurprisingly, these are algorithms that generate some output:

OutputIterator fill_n (OutputIterator, Size, T)
OutputIterator generate_n (OutputIterator, Size, Generator)

There are other variants of std::fill and std::generate which we’ll look at later. These two algorithms write a sequence of n values. They are useful because we don’t always have the option of getting an “end” output iterator. Imagine appending data to a file, the end is wherever we decide to stop writing – there is no predetermined “end”.

In a previous blog post I mentioned the idea of adding iota_n. This is the category that iota_n would fit into. Incidentally, the boost algorithms library has boost::iota_n available, along with many other useful algorithms.


Forward iterator

ForwardIterator adjacent_find (ForwardIterator, ForwardIterator)
ForwardIterator adjacent_find (ForwardIterator, ForwardIterator, BinaryPredicate)
bool binary_search (ForwardIterator, ForwardIterator, T)
bool binary_search (ForwardIterator, ForwardIterator, T, Compare)
pair<ForwardIterator, ForwardIterator> equal_range (ForwardIterator, ForwardIterator, T)
pair<ForwardIterator, ForwardIterator> equal_range (ForwardIterator, ForwardIterator, T, Compare)
void fill (ForwardIterator, ForwardIterator, T)
ForwardIterator1 find_end (ForwardIterator1, ForwardIterator1, ForwardIterator2, ForwardIterator2)
ForwardIterator1 find_end (ForwardIterator1, ForwardIterator1, ForwardIterator2, ForwardIterator2, BinaryPredicate)
void generate (ForwardIterator, ForwardIterator, Generator)
void iota (ForwardIterator, ForwardIterator, T)
bool is_permutation (ForwardIterator1, ForwardIterator1, ForwardIterator2)
bool is_permutation (ForwardIterator1, ForwardIterator1, ForwardIterator2, BinaryPredicate)
bool is_sorted (ForwardIterator, ForwardIterator)
bool is_sorted (ForwardIterator, ForwardIterator, Compare)
bool is_sorted_until (ForwardIterator, ForwardIterator)
bool is_sorted_until (ForwardIterator, ForwardIterator, Compare)
void iter_swap (ForwardIterator1, ForwardIterator2)
ForwardIterator lower_bound (ForwardIterator, ForwardIterator, T)
ForwardIterator lower_bound (ForwardIterator, ForwardIterator, T, Compare)
ForwardIterator max_element (ForwardIterator, ForwardIterator)
ForwardIterator max_element (ForwardIterator, ForwardIterator, Compare)
ForwardIterator min_element (ForwardIterator, ForwardIterator)
ForwardIterator min_element (ForwardIterator, ForwardIterator, Compare)
pair<ForwardIterator, ForwardIterator> minmax_element (ForwardIterator, ForwardIterator)
pair<ForwardIterator, ForwardIterator> minmax_element (ForwardIterator, ForwardIterator, Compare)
ForwardIterator partition (ForwardIterator, ForwardIterator, Predicate)
ForwardIterator partition_point (ForwardIterator, ForwardIterator, Predicate)
ForwardIterator remove (ForwardIterator, ForwardIterator, T)
ForwardIterator remove_if (ForwardIterator, ForwardIterator, Predicate)
void replace (ForwardIterator, ForwardIterator, T, T)
void replace_if (ForwardIterator, ForwardIterator, Predicate, T)
ForwardIterator rotate (ForwardIterator, ForwardIterator, ForwardIterator)
OutputIterator rotate_copy (ForwardIterator, ForwardIterator, ForwardIterator, OutputIterator)
ForwardIterator1 search (ForwardIterator1, ForwardIterator1, ForwardIterator2, ForwardIterator2)
ForwardIterator1 search (ForwardIterator1, ForwardIterator1, ForwardIterator2, ForwardIterator2, BinaryPredicate)
ForwardIterator search_n (ForwardIterator, ForwardIterator, Size, T)
ForwardIterator search_n (ForwardIterator, ForwardIterator, Size, T, BinaryPredicate)
ForwardIterator2 swap_ranges (ForwardIterator1, ForwardIterator1, ForwardIterator2)
ForwardIterator unique (ForwardIterator, ForwardIterator)
ForwardIterator unique (ForwardIterator, ForwardIterator, BinaryPredicate)
ForwardIterator upper_bound (ForwardIterator, ForwardIterator, T)
ForwardIterator upper_bound (ForwardIterator, ForwardIterator, T, Compare)

Forward iterators give us a lot more options. Since forward iterators can iterate over a range more than once we can handle algorithms such as std::adjacent_find which require us to iterate over each value twice.

As I promised earlier, we have the other variants of std::generate and std::fill here, and we also get std::iota. Now we have moved into forward iterators we can start dealing with output ranges.

One suprise is that std::binary_search, std::equal_range, std::lower_bound and std::upper_bound are present in this list. We normally think of these algorithms as being O(log n) algorithms. Surely we cannot have a O(log n) algorithm working with forward iterators since the algorithm might have to iterate over the entire sequence making it O(n). The reason they are available for forward iterators is because the standard defines their complexity requirements as O(log n) on the number of comparisons, not the number of elements iterated over. We might regard this as cheating.

We are not stuck with one implementation of an algorithm though. There is also a great talk by Stephan T. Lavavej here where he explains how an implementation can select different algorithms for different iterator types so (for example) std::binary_search can use one implementation for forward iterators but a more efficient implementation for random access iterators.

Another surprise for me is that std::min_element, std::max_element and std::minmax_element all require forward iterators. Surely it is possible to find the maximum element of a sequence using input iterators? Well, it is possible to find the maximum element of a sequence using input iterators, however there are two problems:

  1. std::max_element and friends actually return the position of the desired element(s), not just the value of the desired element. The position is a more general piece of information
  2. If we had a version of std::max_element that ran on input iterators it would have to make a copy of the element – both for the comparison and to return the value of the element. Making a copy could be expensive, undesirable or impossible. Having said that, there is a Stack Overflow post here where making a copy increases the performance of the algorithm, probably due to a cache effect.

Working out how to implement some of these algorithms to achieve the required complexity guarantees is an interesting exercise.


Bidirectional iterator

BidirectionalIterator2 copy_backward (BidirectionalIterator1, BidirectionalIterator1, BidirectionalIterator2)
void inplace_merge (BidirectionalIterator, BidirectionalIterator, BidirectionalIterator)
void inplace_merge (BidirectionalIterator, BidirectionalIterator, BidirectionalIterator, Compare)
BidirectionalIterator2 move_backward (BidirectionalIterator1, BidirectionalIterator1, BidirectionalIterator2)
bool next_permutation (BidirectionalIterator, BidirectionalIterator)
bool next_permutation (BidirectionalIterator, BidirectionalIterator, Compare)
bool prev_permutation (BidirectionalIterator, BidirectionalIterator)
bool prev_permutation (BidirectionalIterator, BidirectionalIterator, Compare)
void reverse (BidirectionalIterator, BidirectionalIterator)
OutputIterator reverse_copy (BidirectionalIterator, BidirectionalIterator, OutputIterator)
BidirectionalIterator stable_partition (BidirectionalIterator, BidirectionalIterator, Predicate)

The bidirectional iterator algorithms are an interesting bunch. The algorithms need more functionality than provided by forward iterators, but don’t require the full power of random access iterators.

We get std::copy_backward. As we saw in part 1a, std::copy and std::copy_backward handle overlapping ranges (std::reverse_copy requires non-overlapping ranges). If you’re looking from some additional amusement, look into how using reverse iterators affects the overlapping range guarantees of std::copy and std::copy_backward (we can make similar arguments for the std::move set of algorithms).

We get std::inplace_merge, an algorithm that will make use of extra memory if it can get it. The standard does not specify where the extra memory comes from – usually C++ allows you to handle all memory via custom allocators, that does not appear to be the case here.


Random access iterator

bool is_heap (RandomAccessIterator, RandomAccessIterator)
bool is_heap (RandomAccessIterator, RandomAccessIterator, Compare)
RandomAccessIterator is_heap_until (RandomAccessIterator, RandomAccessIterator)
RandomAccessIterator is_heap_until (RandomAccessIterator, RandomAccessIterator, Compare)
void make_heap (RandomAccessIterator, RandomAccessIterator)
void make_heap (RandomAccessIterator, RandomAccessIterator, Compare)
void nth_element (RandomAccessIterator, RandomAccessIterator, RandomAccessIterator)
void nth_element (RandomAccessIterator, RandomAccessIterator, RandomAccessIterator, Compare)
void partial_sort (RandomAccessIterator, RandomAccessIterator, RandomAccessIterator)
void partial_sort (RandomAccessIterator, RandomAccessIterator, RandomAccessIterator, Compare)
void pop_heap (RandomAccessIterator, RandomAccessIterator)
void pop_heap (RandomAccessIterator, RandomAccessIterator, Compare)
void push_heap (RandomAccessIterator, RandomAccessIterator)
void push_heap (RandomAccessIterator, RandomAccessIterator, Compare)
void random_shuffle (RandomAccessIterator, RandomAccessIterator)
void random_shuffle (RandomAccessIterator, RandomAccessIterator, RandomNumberGenerator)
void shuffle (RandomAccessIterator, RandomAccessIterator, URandomNumberGenerator)
void sort (RandomAccessIterator, RandomAccessIterator)
void sort (RandomAccessIterator, RandomAccessIterator, Compare)
void sort_heap (RandomAccessIterator, RandomAccessIterator)
void sort_heap (RandomAccessIterator, RandomAccessIterator, Compare)
void stable_sort (RandomAccessIterator, RandomAccessIterator)
void stable_sort (RandomAccessIterator, RandomAccessIterator, Compare)

The random access algorithms are all about sorting and shuffling. I have rarely used the heap algorithms directly but they are useful building blocks for algorithms such as std::nth_element (previously discussed here), std::partial_sort_copy and std::partial_sort.


Input iterator and random access iterator

RandomAccessIterator partial_sort_copy (InputIterator, InputIterator, RandomAccessIterator, RandomAccessIterator)
RandomAccessIterator partial_sort_copy (InputIterator, InputIterator, RandomAccessIterator, RandomAccessIterator, Compare)

std::partial_sort_copy uses an interesting combination of input iterators and random access iterators.


No iterator

const T& max (T, T)
const T& max (T, T, Compare)
T max (InitializerList)
T max (InitializerList, Compare)
const T& min (T, T)
const T& min (T, T, Compare)
T min (InitializerList)
T min (InitializerList, Compare)
pair<const T&, const T&> minmax (T, T)
pair<const T&, const T&> minmax (T, T, Compare)
pair<T, T> minmax (InitializerList)
pair<T, T> minmax (InitializerList, Compare)

Finally, there are the core min and max algorithms that don’t use iterators at all.


Conclusion

I don’t have a conclusion for this post. I found it interesting to see which algorithms required which iterators – there were a couple that surprised me. The general rule is to write algorithms that use the weakest iterators possible to make the algorithms as widely useful as possible.

Algorithms and Variations 1a

In the first part of this series we looked at algorithms which guaranteed their order of operations vs. algorithms that don’t. On further reflection, there are two things missing from that article.

First of all, I should have included std::iota in the list of “order guaranteed” algorithms. I overlooked this partly because std::iota doesn’t take an input range, but also because the language used in the standard isn’t entirely clear. I am reasonably confident though that operator++ must be applied starting with the first value output, and continuing in sequence. (As always, if someone has some evidence I am wrong about this please tell me).

This raises the number of algorithms that guarantee ordering to 14.

Secondly, we didn’t look at “why”. Why should some algorithms guarantee their operation order while others don’t? Let’s look at the 14 algorithms in groups, starting with std::copy and std::move:

OutputIterator copy (InputIterator, InputIterator, OutputIterator)
BidirectionalIterator2 copy_backward (BidirectionalIterator1, BidirectionalIterator1, BidirectionalIterator2)
OutputIterator move (InputIterator, InputIterator, OutputIterator)
BidirectionalIterator2 move_backward (BidirectionalIterator1, BidirectionalIterator1, BidirectionalIterator2)

Copy and move have specific forward and backward versions. This is to handle overlapping input and output ranges. In order for copying or moving between overlapping ranges to work as expected, the order must be specified. The order will be different depending on whether the destination range overlaps the beginning or the end of the source range – hence the forward and backward versions.

(There was an issue raised in 1995 about the direction of std::copy. Read it here.)

Moving on, the next algorithm is std::for_each:

Function for_each (InputIterator, InputIterator, Function)

As Kreft and Langer point out here, std::for_each was unusual among the algorithms because it allows its function object to have side effects. I say was because the C++98 standard includes the phrase – op and binary_op shall not have any side effects – in the description of std::transform. The C++11 standard weakens that requirement.

Regardless of the changes to the wording of std::transform, once your function object can have side effects, it is possible for those side effects to vary depending on the order in which the range is processed. To get deterministic results you need a deterministic processing order, hence (at least historically) the order requirement for std::for_each.

That leaves us with 9 algorithms to explain:

T accumulate (InputIterator, InputIterator, T)
T accumulate (InputIterator, InputIterator, T, BinaryOperation)
OutputIterator adjacent_difference (InputIterator, InputIterator, OutputIterator)
OutputIterator adjacent_difference (InputIterator, InputIterator, OutputIterator, BinaryOperation)
T inner_product (InputIterator1, InputIterator1, InputIterator2, T)
T inner_product (InputIterator1, InputIterator1, InputIterator2, T, BinaryOperation1, BinaryOperation2)
void iota (ForwardIterator, ForwardIterator, T)
OutputIterator partial_sum (InputIterator, InputIterator, OutputIterator)
OutputIterator partial_sum (InputIterator, InputIterator, OutputIterator, BinaryOperation)

What do these algorithms have in common?

  1. They are all in the header numeric. In fact, they are the entire contents of the header numeric.
  2. They all involve combining elements of the input range in some way (or the output range for std::iota).

So we have “in order”, “numeric”, and “combining”. “Numeric” leads us to think of floating point values, and the combination of “in order” and “floating point” leads us to the fact that floating point arithmetic is neither associative nor commutative. [Edit: See John Payson’s comment where he points out that IEEE-754 floating point is commutative for addition and multiplication.] Running std::accumulate from front to back does not necessarily give us the same answer as running it from back to front. Once again, if we want a deterministic answer we need a deterministic processing order, hence the “in order” requirement for these algorithms.

As usual, if the standard algorithms do not supply us with what we want, we can write our own.

The C++11 standard also makes it clear that the compiler is limited in its rearrangement of operators:

[ Note: Operators can be regrouped according to the usual mathematical rules only where the operators really are associative or commutative.

C++11 standard 1.9


In case anyone thinks that floating point commutativity [Edit: I should have said associative here] only matters for very large, very small or very accurate calculations, it doesn’t. One of my home projects is a music typesetting program. It lays out music onto a typical sheet of paper (A4 or 8.5 x 11) and it doesn’t need micron accuracy, however I still got caught by floating point operations.

At one point I had two functions (A and B) that computed what was nominally the same value, but in two different ways. I sorted a vector using a comparison operator that called function A. I then attempted to do a binary search on the vector using a comparison operator that called function B. Fortunately the debug version of the STL I was using detected attempts to use a binary search on a vector that was not sorted – the vector was sorted according to A, it was not sorted according to B. Yet another reason why “don’t repeat yourself” is such a good rule.


In part 1 I mentioned the possibility of having some of the algorithms perform their calculations in parallel. The C++11 standard has this to say:

Unless otherwise specified, C++ standard library functions shall perform all operations solely within the current thread if those operations have effects that are visible (1.10) to users.
[ Note: This allows implementations to parallelize operations if there are no visible side effects. —end note ]

C++11 standard 17.6.5.9

I.e. no parallelism unless it’s invisible to the user.

There are some interesting proposals for explicitly parallel algorithms. For example, see section 6.6 Associativity/Commutativity of Binary Operators (p66) of A Parallel Algorithms Library which points out that some parallel algorithms will require associative and commutative operators.

Parallelizing the Standard Algorithms Library also looks at the topic of parallelization. For example, see section 5.2 std::accumulate versus thrust::reduce for a discussion of the need for associative and commutative operators.


Several of my references in this post came from this website which contains many papers from the C++ standards committee. If you have far too much time to kill, the site contains fascinating information – current and historical.

Auto

I am deeply suspicious of auto and decltype. I acknowledge that I am on the opposite side of the argument to people such as Bjarne Stroustrup, Herb Sutter and Scott Meyers. This is not a comfortable place to be. I picked these three luminaries because it is easy to find quotes from them – I am sure there are plenty of other people who agree with them.

Unless there is a good reason not to, use auto in small scopes

Bjarne Stroustrup The C++ Programming Language 4th edition, 6.3.6.1

Almost Always Auto

Herb Sutter Almost Always Auto

Prefer auto to explicit type declarations

Scott Meyers Effective Modern C++ (Early release) Item 5

This is not an attempt to trash any of these authors – I have quoted their conclusions however, they have well thought through reasons backing up those conclusions. I highly recommend reading the entire pieces that I took these quotes from – as always with Stroustrup, Sutter and Meyers, even when I disagree with their conclusions (which isn’t that often) I learn something from their arguments. All three authors talk about the problems that can arise from auto – note that even in the quotes I have picked there are phrases like “unless there is a good reason not to”, “almost always” and “prefer”.

I think my main disagreement is with the weighting given to some of the disadvantages. My objection to auto can be summarized as “makes code easier to write, makes it harder to read”, and I weight this very highly, as someone who has spent a fair amount of his career trying to get up to speed with large codebases written by someone else, and trying to get new team members up to speed with large codebases written partially by me.

For example, I joined a team with a large, pre-existing, pre-C++11 codebase that used compiler extensions to fake auto and decltype. Learning the codebase was made far more difficult by the absence of explicitly named types in many functions. The code would not have compiled unless all the types were compatible, that’s good, however I want to know more than that. I want to know what the types are so that I can go and look them up and find out what else they can do for me. I want to look at a function and know what types are the same and what aren’t. Sooner or later I am going to make changes to that function – the changes I can make will be constrained by what the types let me do (and if the types constrain me too much I am going to have to either extend the types or use different types). I am saying “types” over and over again – they’re a big deal.

I want to know what type everything is, and I weight that desire very highly.

There will be IDEs which will make it easy to discover the types being deduced, however as Scott Meyers acknowledges in Item 4 of “Effective Modern C++” they don’t do a perfect job (although I assume that will improve), and not all of us are in a position to use IDEs anyway.

Type deduction is not new in C++ – we’ve been using templates for years, and I have fewer objections to using templates. I am trying to work out why I find templates more acceptable than auto.

Reason #1 has to be that I am more used to templates than I am to auto. It takes me time to get used to a new feature. However, there is more to it than that.

Template types still have a name. Imagine this introduction to a template class:

template< typename ITERATOR >

Within the class we don’t know exactly what the type of ITERATOR is, however the name gives us a strong hint that it should be some sort of iterator.

The fact that the template types have a name also lets us see what relation (if any) there is between types in a piece of code. Contrast this:

auto i( someFunction() );
auto j( someOtherFunction() );

with

ITERATOR i( someFunction() );
ITERATOR j( someOtherFunction() );

The template version makes it clear that i and j are the same type.

On the one hand I dislike the argument “this feature can be misused so we shouldn’t ever use it”, on the other hand I have already seen this feature be massively misused, and I suspect that it will be misused more often than not. Herb Sutter writes:

Remember that preferring auto variables is motivated primarily by correctness, performance, maintainability, and robustness – and only lastly about typing convenience.

Herb Sutter Almost Always Auto

I think this is fine advice, I think that everyone in the world should stick to it. I do not believe that they will. In my experience (and I really hope that other people have different experiences), saving some typing often outweighs everything else. Delayed gratification is an unknown concept in many programming shops.

With more experience I might come around to auto. There are certainly people who have thought about this more deeply than I have who are in favour of auto. There are also definitely places where auto will help in template code. I know I dismissed the “reduces typing” argument earlier, but it would be really nice to avoid have to type typename quite so often.


A little more follow up on the team using the compiler extension versions of auto and decltype. Their use of auto was problematic, it was also exacerbated by a number of other factors. The team loved macros and loved token pasting (##). There were many classes whose declaration and definition were created using macros, and the name of the class was generated using token pasting. Combine that with their use of auto and there were classes whose full name never appeared in the code. They were impossible to grep for and I doubt that there were many IDEs (which that team wasn’t using anyway) that would have helped. I did not last very long on that team.

Google’s C++ style guide – follow up

My recent post about Google’s C++ style guide attracted some great comments. Billy O’Neal corrected my ignorance about C++ formatting tools by alerting me to the existence of clang-format; John Payson came up with a way of getting piecemeal from the “no exceptions allowed” state to the “exceptions allowed and all code able to handle them” state, and Titus Winters (a maintainer of the Google style guide) mentioned a talk he gave at CppCon on the subject.

The talk is on YouTube here, and is worth watching in its entirety, but here are a few highlights I took away from it:

1:42 “How should we format our code? BORING QUESTION!”. Titus introduces clang-format.
3:14 What is the purpose of a style guide?
5:15 Google’s publicly available style guide is “probably optimizing for a different set of values than your organization needs”
6:48 Google has an indexer – Kythe – which should be available as open source soon.
9:39 Google plan to be around for at least 10 years. Almost everywhere I have worked has assumed that they are going to be around for 10 years (and several of those places went bust a long time short of 10 years), but very few of my employers have made long term decisions about their code – they might have assumed that they will be around for 10 years but on the code side they certainly didn’t plan for it.
10:55 The philosophies of the style guide.
14:45 A future version of the style guide will allow streams.
28:10 “No non-const references as function arguments” is one of the most controversial rules in the style guide. I am not surprised by this, but I am disappointed. The arguments in favour of no non-const references always seemed good to me.
38:25 “I don’t care about throwing out of memory exceptions”. There are certainly products where an out of memory exception just translates to “die instantly”, however that isn’t universally true. I have never worked on a project that was able to recover and continue after running out of memory but I have worked on products that have wanted to know about the out of memory case so they can save the data the user is working on. It’s just another reminder that guidelines depend on the products they are being used for.
57:20 Probably the best answer to any formatting question that I have heard.
Q: What about tabs vs. spaces? What’s Google’s take on that?
A: I don’t care … I just use clang-format

The theme of the day is clang-format. I spent some time over the weekend playing around with clang-format and it is excellent. I haven’t worked out how to get it to do quite everything I want, however the consistency benefits far outweigh any minor deviations from my preferred style. Also, if it bothers me that much I have the source code and can add more options.

On a related note, this is the first time I have been able to get Clang and LLVM (mostly) compiling on Windows without spending a ridiculous amount of time on it. Windows has long been the afterthought platform for LLVM and Clang – it looks like it’s getting better.

Google’s C++ Style Guide

I read through Google’s C++ style guide recently and found some interesting recommendations. I am trying to avoid saying “right” or “wrong” – there are plenty of things in a style guide that are matters of opinion or are tailored to the needs of a specific company – so I am focusing on things I find interesting.

The direct link to the guide is http://google-styleguide.googlecode.com/svn/trunk/cppguide.html.

The guide is stored in a subversion repository which can be accessed at
http://google-styleguide.googlecode.com/svn

I am looking at revision 138 in the repository, checked in on 8 September 2014. At the very bottom of the document there’s a version number – 4.45.

I have no insider knowledge apart from one conversation with a Google engineer who told me that the “no exceptions” rule was the most controversial part of the guide.

It is remarkable that Google have a company-wide C++ style guide. I have worked in smaller companies (16 people, roughly half of whom were engineers) where we could not agree on anything code related. In my career, style guides (aka coding guidelines or coding standards) have created more divisive arguments than almost anything else, and enforcing them is the quickest way to upset everyone.

I do like the fact that many of the entries include pros, cons and a discussion of why a particular solution was chosen. As I have said before on this blog, even if you believe that you have made the right choice you should know what the drawbacks of that choice are.

The guide also talks a lot about consistency. I like consistency. There are times where the choices are almost arbitrary, however, it is much nicer not to have the code randomly select different choices.

Let’s start with the most controversial guideline.


Exceptions

We do not use C++ exceptions.

At least the guideline is clear. The writers of the guide also admit:

Things would probably be different if we had to do it all over again from scratch.

Taking away exceptions removes a very powerful and useful way of handling errors. It reduces the need to have error returns everywhere (which are rarely checked). Using exceptions does require your code to be written in an exception-safe manner, however that exception safe manner often has a bunch of other benefits – RAII helps with early returns and means you can’t forget the “clean up” on one code path. Smart pointers are a great way to manage lifetime (and smart pointers are encouraged by the style guide). I have worked at a number of places that banned exceptions for a variety of reasons, some good, some bad. It is possible to write decent C++ code without exceptions, but their absence does make life harder.

I have two major problems though:

  1. According to the given reasoning, there will never be a time when exceptions are allowed in Google’s code. The guideline states:

     
    Because most existing C++ code at Google is not prepared to deal with exceptions, it is comparatively difficult to adopt new code that generates exceptions.

    If the guideline is followed there will be even more code at Google that does not deal with exceptions making them even less likely to be allowed in the future. There is no way to get from here (no exceptions) to there (exceptions allowed). It is one thing to ban a feature because its current use causes problems, but for there to be no path to unbanning the feature is short-sighted in the extreme.

    You can’t even state that all new code must be exception safe. If you want code to be exception safe it has to be fully tested in the presence of exceptions. That would involve an additional build target with different settings, different versions of libraries and some additional specific tests for exception handling. That build is never going to exist.

  2. The knock on effects are horrible, in particular the effect on constructors.

Doing Work in Constructors

Avoid doing complex initialization in constructors.

First of all, the prohibition on calling virtual functions in constructors is perfectly reasonable. It’s one of those things that fall into the “don’t do that” category. The few times I have thought that I needed to do it were caused by errors in my design.

Global variables with complex constructors also fall into the “don’t do that” category, or at least “don’t do that unless you really know what you’re doing”. C++ is a harsh language if you don’t know what you’re doing. It’s a harsh language even if you do know what you’re doing.

Those are the points where I agree with Google, let’s look at the disagreements.

One of the joys of good constructors is that once the constructor has finished running you have a fully initialized object in a well-defined state that is ready to be used. That’s a simple rule and it works in many different creation scenarios. We can create the object on the stack:

SomeClass someObject( arg0, arg1, arg2 );

We can create it on the heap:

SomeClass* pSomeObject( new SomeClass( arg0, arg1, arg2 ) );

We can create the object as a member variable of another class:

SomeOtherClass::SomeOtherClass( some arguments )
: someObject_( arg0, arg1, arg2 )
{
}

We can create a temporary object for passing to a function:

someFunction( SomeClass( arg0, arg1, arg2 ) );

Two stage construction makes all of these examples longer, more fragile and less maintainable. Creating a temporary object is basically not possible – you have to create a named object then call init before passing it to a function (yes we can wrap construction and init inside a create function but why should we have to?)

The style guide states:

If the work fails, we now have an object whose initialization code failed, so it may be an [sic] indeterminate state.

With Google’s suggested “solution” (a separate init function) once the constructor has finished you have an object in an indeterminate (or at least not-fully-initialized) state. You then have to call the init function and you have to check any error return value from the init function (Note – have to. The call and the error check are not optional and you’d better not forget them or allow any future modifications to the code to accidentally delete them, or even move them too far from the original construction). Initializing const members is far more problematic with two stage construction.

In my career I have rarely had a problem with errors in constructors, although Google are certainly dealing with problems of scale that I have never encountered. Even so, I believe that this rule is hurting Google, not helping them.


Names and Order of Includes

Having praised Google for their use of pros and cons they let me down here. Google’s guideline is to start with the related header (most specific), then to jump to the least specific headers (C and C++ library includes), then gradually move to more specific headers.

The only rationale I have seen for a particular include order chooses the opposite order to Google – start with most specific, then move to less specific so that you finish with the C and C++ library headers. The rationale is to make sure that your own headers include all of the standard system files that they require – if you order the includes the other way it is easier to hide incorrect includes (or rather, lack of includes). Also, if for some reason you decided to #define linear_congruential( i ) ( i * 2 ) (or define any other symbol in the standard libraries), having the standard libraries last will almost certainly produce an error. Of course you shouldn’t be using macros in the first place, but for some inexplicable reason, people do.

In practice, I haven’t had many problems that are related to the order of includes (and when I have the problem has nearly always been that an include file wasn’t including all of the headers it required).

I do like their suggestion for alphabetical order within each section. I once worked with a developer who was diligent enough to maintain all the functions in his .cpp files in alphabetical order. I know that broke a bunch of guidelines about keeping related functions together, but damn, it was easy to find a function.


Function Names

Regular functions have mixed case; accessors and mutators match the name of the variable: MyExcitingFunction(), MyExcitingMethod(), my_exciting_member_variable(), set_my_exciting_member_variable().

I said at the beginning that I was going to try to avoid using words like “wrong”, but my tolerance has run out. This guideline is wrong.

How is it wrong? Let me count the ways:

  1. It breaks encapsulation. The name of the member function tells you something about the internal implementation.
  2. If the internal implementation changes so that the value is no longer stored in a member variable are you going to change the accessor / mutation functions and everywhere in the code that calls them? (The answer to that question is almost certainly “no”)
  3. Assume two classes exist, both of which have a concept of some number of files that they know about. One class stores the number of files in a member variable (so will have a function called num_files()), the other class calculates the number of files when necessary (so has a function called NumFiles()). You now have a template class that needs to be instantiated with a class that can tell you how many files it knows about. You cannot write a single template class that will work with both num_files() and NumFiles().
  4. In practice I am sure that if any of the first three points cause problems the name of the accessor will be changed one way or the other. There is a deeper problem though. If you’re thinking about the class in terms of accessors and mutators (or getters and setters) you are fundamentally thinking about the design incorrectly.

    Allen Holub’s provocatively named piece “Why getter and setter methods are evil” (and if anyone is supposed to not be evil it’s Google) is well worth reading. He makes some points that are beyond the scope of this post (“Don’t ask for the information you need to do the work; ask the object that has the information to do the work for you” has some interesting consequences that I am not wild about) , but he makes a killer point about “adding unnecessary capabilities to the classes”. He talks in terms of the capabilities of a class – he does that in other places as well (http://www.holub.com/goodies/rules.html), and in his book “Holub on Patterns: Learning Design Patterns by Looking at Code” where he states:

    An object is a bundle of capabilities … an object is defined by what it can do, not how it does it.

    To go back to our example, is it reasonable for our file handling class to be able to tell you how many files it knows about? If the answer to that question is “yes”, then it is quite reasonable to add a NumFiles() method. One of the capabilities of the class is that it can answer the question “how many files do you know about”. If, for some reason, (and this is a made up example which means I don’t have a good reason) it is reasonable for the class to be told that the number of files has changed, it is reasonable to have a NumFiles( int ) method. The class has the capability to accept a notification that the number of files has changed. As a result of having this information this task can perform its functions correctly. None of this says anything about the implementation of those capabilities.

In practice, I am sure that if #2 or #3 turned out to be problems the guideline will be ignored and the same name for the accessor function would be used in both places. Even if the guideline were changed to preserve encapsulation, #4 will probably still be ignored. Most development shops I have worked in do not think about class design in terms of capabilities. Google are no worse than everyone else, on the other hand, it would be nice if they were better.

For more on the subject of being evil, see the C++ FAQ.


Casting

Use C++ casts like static_cast<>(). Do not use other cast formats like int y = (int)x; or int y = int(x);

I am in complete agreement. I just want to add that in the “cons” section they state:

The syntax is nasty

Yes. The syntax is nasty. That was by design. To quote Bjarne Stroustrup from his FAQ:

An ugly operation should have an ugly syntactic form. That observation was part of the reason for choosing the syntax for the new-style casts. A further reason was for the new-style casts to match the template notation, so that programmers can write their own casts, especially run-time checked casts.

Maybe, because static_cast is so ugly and so relatively hard to type, you’re more likely to think twice before using one? That would be good, because casts really are mostly avoidable in modern C++.


Inline Functions

I am disappointed that Google don’t suggest separating inline declarations from inline definitions. C++ requires separate declarations because it was required to work with the C compilation model (C# for example does not require separate declarations). The C++ model involves repeating information – in general that’s a Bad Thing, however having declarations in a separate header file allows a class declaration to serve as effective documentation for the class and its capabilities (yes, I am ignoring the fact that the protected and private parts of the class are also visible in the declaration – it is far from perfect). Putting an inline (or template) function directly in the declaration clutters up that list of public member functions. You can place the definition later in the header file to get inlining (or template definitions) but still have an uncluttered declaration.


Operator Overloading

Do not overload operators except in rare, special circumstances. Do not create user-defined literals.

I don’t have enough experience yet with user defined literals to comment on whether they are a Good Thing or a Bad Thing, but I am prepared to put operator overloading firmly into the Good Thing category. Yes, operator overloading can be misused, but so can everything in C++. If we’re going to start banning things because of the potential for misuse we’re not going to end up with much of a language.

Overloaded operators are more playful names for functions that are less-colorfully named, such as Equals() or Add().

Oww. That’s the way to be condescending towards overloaded operators. For “playful” I would substitute “clear, obvious, concise and well defined”. When you overload an operator you cannot change the associativity, precedence or number of arguments that it takes. Who knows how many additional (but defaulted) parameters there are in that Equals function?

In particular, do not overload operator== or operator< just so that your class can be used as a key in an STL container; instead, you should create equality and comparison functor types when declaring the container.

As far as I am concerned, being used as a key or being searched for is a great reason to overload operator == and operator <. My usual practice is that operator == compares every independent value in the class and operator < gives an ordering that also depends on every independent value. If I need something different (for example an equality test that just compares the names of two objects) I’ll write a separate function.


Auto

Google likes auto. I still have significant reservations about it. At some point I will write up my concerns, but they can be summarized as “auto makes code easier to write but harder to read”.


Formatting

Coding style and formatting are pretty arbitrary

In many ways formatting is arbitrary, but I have always liked the formatting to be consistent. Even when the formatting seems arbitrary, there are still smart people who can find reasons for preferring one format to another. For example, in Object-Oriented Software Construction, Bertrand Meyer has the following to say about spaces:

The general rule, for simplicity and ease of remembering, is to follow as closely as possible the practice of standard written language. By default we will assume this language to be English, although it may be appropriate to adapt the conventions to the slightly different rules of other languages.

Here are some of the consequences. You will use a space:

  • Before an opening parenthesis, but not after:
    f (x) (not f(x), the C style, or f( x))

The argument isn’t a slam dunk “this is obviously the only sensible way to lay out parenthesis and if it isn’t followed the world will fall apart”, on the other hand it is an argument with an actual reason attached (as opposed to “I prefer it this way”).

Steve McConnell’s Code Complete 2nd edition has lots to say about formatting, backed up by actual reasons. For example, sections 31.3 and 31.4 come up with an interesting way of looking at block and brace placement and indentation. Unfortunately the brace placement that I favour for my personal projects fares badly under McConnell’s arguments.

Google’s preferred function declaration and definition syntax looks like this:

ReturnType ClassName::ReallyLongFunctionName(Type par_name1, Type par_name2,
                                             Type par_name3) {
  DoSomething();
  ...
}

In section 31.7 McConnell points out that this approach does not hold up well under modification – if the length of ReturnType, ClassName or ReallyLongFunctionName change then you have to re-indent the following lines. As McConnell says:

styles that are hard to maintain aren’t maintained

Ever since I started programming I have always heard of the idea of having a tool that automatically formats code to the house standard when it is checked in, and formats it to the individual developer’s standard when it is checked out. I think this is a fabulous idea but I have never come across any development shop that is actually using it – I would love somebody to point me to a working example. Of course the situation is exacerbated in C++ because formatting the language is extremely difficult. I tried a number of formatting tools a few years ago and they all had significant problems so I haven’t investigated them since. I hear good things about astyle – I will take a look at it some time.

I am jealous of Go. It has a standard format and a tool (gofmt) to lay code out in that format.


In conclusion, there are many parts of the guide I agree with (I haven’t mentioned most of them because that doesn’t fit my definition of “interesting”), but there are some major places where it looks like Google are hampering themselves. I said that just having a company-wide style guide was an achievement, but that style guide really needs to be working for them, not against them.

To finish on a positive note, I particularly like this statement at the end of the style guide:

The point of having style guidelines is to have a common vocabulary of coding so people can concentrate on what you are saying, rather than on how you are saying it.

Hear hear.

Include guards

In 1996 John Lakos published “Large-Scale C++ Software Design”, a book containing the lessons learned from a large scale C++ project developed at Mentor Graphics.

I want to make it clear that this post is not meant as a criticism of John Lakos or his book. First, I don’t have a copy of the book in front of me so I am working from memory – I might have things wrong. Second, John Lakos was describing his experience at a particular time and place. We are at a different time and place. Among other things, compilers, hard disks, processors, memory and networks have changed enormously over the last 20 years.

One of the lessons Lakos documents involves include guards. A common idiom is for the contents of every header file to be surrounded by include guards:

myheader.hpp

#ifndef MYHEADER_HPP
#define MYHEADER_HPP

// Contents of include file go here

#endif

Even if the file is #included multiple times it will only be processed once. Since the include guard is entirely internal to the header file this is known as an “internal include guard”. The problem is that a naive compiler will still have to open and read the file every time it is included in a translation unit. If opening a file is slow (perhaps because it is being opened over a network) there is a performance hit even though the contents of the file are not being processed. As a solution to this problem, Lakos suggested using external include guards as well as internal include guards:

myheader2.hpp

#ifndef MYHEADER_HPP
#include "myheader.hpp"
#endif

The external guards ensure that once myheader.hpp has been processed it doesn’t even have to be opened again. Keep the internal include guards to ensure accuracy but use the external include guards to improve compile performance.

It turns out that some compilers are way ahead of us – they implement the “include guard optimization”. When the compiler parses a header file it looks to see if the include guard idiom is in place. If it is, the compiler notes this fact and then does the equivalent of applying its own external include guards if the file is included a second time. In 1999 a discussion arose on comp.lang.c++.moderated about external include guards vs. internal include guards vs. the include guard optimization (I haven’t been able to track down the original discussion I’m afraid. Google’s Usenet search has me baffled). I decided to run some tests and put up a webpage with my results. Some other C++ programmers ran tests on their systems that allowed me to get results from compilers and systems that I don’t have access to.

The main page is here, if you want to skip straight to the results go here.

There are a few patterns. GCC has had the optimization for a long time, VC++ has never had the optimization (although the results for the latest VC++ compiler are strange – something odd is going on). Watcom had the optimization in the 1990s (incidentally, Watcom lives on as Open Watcom, although it looks like there hasn’t been a release for 4 years).

There are a couple of important points to note. All these tests are doing is testing the include times of a simple header – the header itself is very small so there is almost no time spent processing it. The tests are being run with tens if not hundreds of thousands of includes of the same file – a situation I have never encountered in practice

A few years ago I was on a team that was upset about its compilation times and wanted to look for speedups. Tests showed that the single greatest contributor to our build time was <windows.h>. The solution to that was in 2 parts – include <windows.h> in as few places as possible (and in particular try to keep it out of header files), and use precompiled headers. The include guards didn’t come close to being a problem.

As always, nothing takes the place of running tests on your own codebase, although I would start by measuring the processing time for several commonly used headers – that is far more likely to be the problem than include guards.

Unfortunately making sure we only include the necessary headers is time consuming to do, and even more time consuming to maintain. I have never come up with the perfect solution to this problem – at times I have used imperfect solutions but they have all been inconvenient and required serious amounts of run time.

Finally, John Lakos has a new book coming out in 2015 – Large-Scale C++ Volume I: Process and Architecture. It’ll be interesting to see what he finds to focus on in our current computing environment.

Inconsistent sorting

A few years ago I was working on a prototype for some 3D drawing code. I was using the painters algorithm – draw the objects from far to near so that the close objects will occlude the distant objects. I had my objects in a vector, I sorted them according to their distance from the camera, then drew the objects. Then I noticed something odd. When I first started the program I had two objects that would flicker. They would swap which one was in front – sometimes object A would be drawn in front of object B, sometimes B was drawn in front of A.

My start up coordinates for the camera and the objects meant that A and B were exactly the same distance from the camera, so when I did the sort, A and B would compare equal. Since I wasn’t using std::stable_sort I wasn’t expecting any particular ordering for A and B, but I was expecting the ordering to be consistent between calls to std::sort. After some debugging I discovered that giving the exact same input to std::sort would not always give the same output – objects with identical distances might be sorted differently on each call.

I ended up digging into the standard library implementation. I think I was using Metrowerks C++, but I wouldn’t bet on it. The implementation of std::sort used Quicksort – nothing unusual there. One of the problems with Quicksort though is that you have to make good choices of pivot element – the element that you use to to split the input into two partitions. For example, if you use the first element as the pivot and hand Quicksort an already sorted sequence the runtime ends up being O(n^2). Quicksort has great average case behaviour but bad worst case behaviour. There are various techniques for picking the pivot element including:

  • first element
  • last element
  • middle element
  • median of the first, last and middle elements

This particular implementation had found a different method of picking the pivot – it used rand. Effective since the chances of providing the sort with a worst case sequence were low, but it led to this odd behaviour – the same input to std::sort would not necessarily lead to the same output. I switched to std::stable_sort and all my problems went away.

As far as I can tell from the standard (section 25.4.1.1), this behaviour for std::sort is entirely legal. std::sort does not guarantee the order in which equivalent elements are placed, and I don’t see anything that forces std::sort to be consistent. I have never found this implementation in any other standard library, but I have heard of one person who was caught out like I was.

No raw loops – output

Let’s get back to raw loops, or rather the absence of raw loops. How do we write out the contents of a container so that each element appears on its own line? Assume we have some type T and a vector of T:

class T
{
   ...
};

std::vector< T > v_T;

This is the old style way to write the contents:

std::ostringstream ostr;
for( std::vector< T >::const_iterator i( std::begin( v_T ) ); 
    i != std::end( v_T ); ++i )
{
    ostr << *i << "\n";
}

The range-based for solution is straightforward:

for( T const& t : v_T )
{
    ostr << t << "\n";
}

Lambda functions also provide a solution:

std::for_each( 
    std::begin( v_T ), 
    std::end( v_T ), 
    [ &ostr ]( T const& t ){
        ostr << t << "\n";
    } );
adobe::for_each( 
    v_T, 
    [ &ostr ]( T const& t ){
        ostr << t << "\n";
    } );

Conveniently, output streams have iterators. We can use std::ostream_iterator as the destination for std::copy or adobe::copy :

std::copy( 
    std::begin( v_T ), 
    std::end( v_T ), 
    std::ostream_iterator< T >( 
        ostr, 
        "\n" ) );
adobe::copy( 
    v_T, 
    std::ostream_iterator< T >( 
        ostr, 
        "\n" ) );

Notice that we can pass a string (called the delimiter string in the standard) to the std::ostream_iterator constructor. The delimiter string is written out after every element.

Since std::ostream_iterator is an iterator with std::ostream as its underlying container, we might thank that we would be able to do this:

std::copy( 
    begin( v_T ), 
    end( v_T ), 
    std::begin( ostr ) );

We can't. The standard does not include a specialization of std::begin for std::ostream, and we are not allowed to add one ourselves because of Section 17.6.4.2.1 of the C++11 standard which states:

A program may add a template specialization for any standard library template to namespace std only if the declaration depends on a user-defined type and the specialization meets the standard library requirements for the original template and is not explicitly prohibited

In addition to that, std::begin does not allow for the additional delimiter string argument. Also, std::ostream_iterator is instantiated for the type of the source value which is not present in the single argument we pass to std::begin so cannot be deduced.

There is nothing stopping us writing our own function to construct a std::ostream_iterator from std::ostream. Since it's our function it can take an additional argument for the delimiter string, and so long as we don't put it into the standard namespace we can even call it begin. We do have to specify the type of our output when we call the function (the function cannot deduce it from the arguments because that type never appears in the arguments):

template< typename U >
std::ostream_iterator< U >
begin( std::ostream& ostr, const char* delimiterString )
{
    return std::ostream_iterator< U >( ostr, delimiterString );
}
std::copy( 
    begin( v_T ), 
    end( v_T ), 
    begin< T >( ostr, "\n" ) );

And an example that almost works. Since operator << is just a function, we can pass it to std::bind :

std::for_each(
    std::begin( v_T ), 
    std::end( v_T ),
    std::bind( operator <<, std::ref( ostr ), _1 ) );

This writes out the elements of the container, but it does not put each element on its own line - there is nowhere to plug the "\n" into. We can't achieve the each element appears on its own line requirement from our original problem statement.

We can write a function that takes an object of type T and a delimiter string and writes out both of them:

void
writePostText( std::ostream& o, T const& t, const char* text )
{
    o << t << text;
}
std::for_each(
    std::begin( v_T ), 
    std::end( v_T ),
    std::bind( writePostText, std::ref( ostr ), _1, "\n" ) );

Which of these solutions is the best? As usual there isn't a definitive answer to that question, but it looks like the range-based for solution looks has the lowest overhead (overhead being bits of boilerplate syntax that are not directly used to solve the problem), with adobe::for_each close behind.


There is one more thing to throw into the mix. The range-based for looks good, however it can only iterate over the entire container. Most of the time this isn't a problem, I estimate that 95% of my algorithm usage is over the entire container. In the case of writing things out though, there is a common use case where you want to iterate over a portion of the container. If you are writing out JSON, or some similar format where you need to put text between each pair of elements (i.e. the text cannot appear before the first element or after the last element), you have to be able to iterate over a sub range of the container.

A solution to the "separator text" problem might look like this:

void
writePreText( std::ostream& o, const char* text, T const& t )
{
    o << text << t;
}
if( !v_T.empty() )
{
    std::vector< T >::const_iterator i( std::begin( v_T ) ); 
    
    ostr << *i;

    ++i;
    std::for_each(
        i,
        std::cend( v_T ),
        std::bind( writePreText, std::ref( ostr ), ",\n", _1 ) );
}

We can't do this with range-based for. If we need to do both things within our source code - writing with text after each element and writing text between each element - do we want to do those two things differently - one of them using range-based for, one using a standard algorithm? There is a cost to being inconsistent, there is also a cost to being consistent but using a slightly larger solution for one of the cases.

I know that I am overanalyzing this. In practice I wrote a templated writeContainer function which lets you specify text that is written before the first element, text that is written after the last element and text that is written between elements. Problem solved and, with the exception of this blog post, I don't have to consider the subject any more.