One of my personal projects is a radiosity renderer. The world doesn’t need another radiosity renderer, but I wanted to learn about them, and, as usual with my personal projects, there is a secondary reason for writing it – to play around with C++ and see what I can learn.

In order to have a renderer you need to have a scene to render, which means you also need to have some data structures describing that scene. For the time being my data structure is simple – a list of polygons. Each polygon object includes various useful bits of data – things like colour and reflectivity. A polygon also has a vector of vertices. The vertices define the bounds of the polygon.

In addition to the polygons and vertices I need to process edges. An edge is defined by the two vertices it connects. My canonical data structure is the polygons and vertices, so when I need edges I need to construct them from the vertices.

There is nothing too difficult about that – iterate through the vertices and create an edge using each vertex and the following vertex. The twist comes when we get to the end of the vertices. Once we are at the last vertex, the next vertex we want is actually at the beginning of the vector. The polygon is closed so there has to be an edge between the last vertex and the first vertex. I am going to refer to this concept as “adjacent circular” – we already have algorithms like `std::adjacent_find`, and “circular” indicates that we are linking the end to the beginning.

For the sake of simplicity I’ll get rid of the extraneous details and slim things down so that our input is `std::vector< char >`, and our output is `std::vector< std::pair< char, char > >`. In other words, we are trying to write this function:

```
std::vector< std::pair< char, char > > getEdges( std::vector< char > const& vertices );
```

(As usual, some using statements and typedefs would make this less verbose, and, as usual, I want to be explicit about where everything is coming from so I am going to spell out every component in full. Also, when I get to the point of writing classes I am going to inline the function definitions – in reality I’d split them out from the declarations.)

If the input is:

a, b, c, d, e

we expect the output of `getEdges` to be:

(a b), (b c), (c d), (d e), (e a)

It’s always good to start off with the straightforward solution:

```
std::vector< std::pair< char, char > > getEdges( std::vector< char > const& vertices )
{
    std::vector< std::pair< char, char > > edges;

    std::vector< char >::const_iterator i( std::begin( vertices ) );
    std::vector< char >::const_iterator j( std::begin( vertices ) );

    while( i != std::end( vertices ) )
    {
        ++j;
        if( j == std::end( vertices ) )
        {
            j = std::begin( vertices );
        }

        edges.push_back( std::make_pair( *i, *j ) );

        ++i;
    }

    return edges;
}
```

The code is not particularly elegant or efficient but it does the job. At the very least it’s useful as a test baseline for the other solutions we’re going to come up with.

Let’s follow Sean Parent’s dictum of “no raw loops” and see what else we can come up with. The point of “no raw loops” is not that there are no loops at all, just that the loops themselves are encapsulated in a reusable algorithm.

Let’s write ourselves an algorithm:

```
template< typename ForwardIterator, typename BinaryOperation >
BinaryOperation for_adjacent_circular(
    ForwardIterator first,
    ForwardIterator last,
    BinaryOperation op )
{
    ForwardIterator i( first );
    ForwardIterator j( first );

    while( i != last )
    {
        ++j;
        if( j == last )
        {
            j = first;
        }

        op( *i, *j );

        ++i;
    }

    return op;
}
```

As we’d expect, `for_adjacent_circular` looks very much like our original function. When we use it, it looks very much like `std::for_each`:

```
std::vector< std::pair< char, char > > getEdgesAlgorithm( std::vector< char > const& vertices )
{
    std::vector< std::pair< char, char > > edges;

    for_adjacent_circular(
        std::begin( vertices ),
        std::end( vertices ),
        [ &edges ]( char a, char b )
        {
            edges.push_back( std::make_pair( a, b ) );
        } );

    return edges;
}
```

This is good as far as it goes – the problem is that it doesn’t go far enough. We wrote an algorithm that is specialized for our original problem – converting a vector of vertices into a vector of edges. Given that it’s a `for_each` style algorithm we can fake it to do a lot of things that we’d like to do, but it feels like we’re abusing the algorithm and falling into the trap John Potter describes of making `std::for_each` the “goto of algorithms”.

There are many more algorithms we might want to use – it would be nicer to have iterators that we could feed into one of the existing standard algorithms. Here’s my first pass at an iterator. I’ll display the whole thing first then go through the interesting features of it.

```
template< typename Iterator >
class ConstAdjacentCircularIterator
{
public:
    ConstAdjacentCircularIterator( Iterator begin, Iterator position, Iterator end )
    : begin_( begin )
    , position_( position )
    , end_( end )
    {
    }

    typedef std::forward_iterator_tag iterator_category;
    typedef std::pair<
        typename Iterator::value_type const&,
        typename Iterator::value_type const& > value_type;
    typedef typename Iterator::difference_type difference_type;
    typedef value_type* pointer;
    typedef value_type& reference;

    // Prefix
    ConstAdjacentCircularIterator& operator ++()
    {
        ++position_;
        return *this;
    }

    // Postfix
    ConstAdjacentCircularIterator operator ++( int )
    {
        ConstAdjacentCircularIterator original( *this );
        ++( *this );
        return original;
    }

    value_type operator *() const
    {
        Iterator next( position_ );
        ++next;
        if( next == end_ )
        {
            next = begin_;
        }

        return value_type( *position_, *next );
    }

private:
    template< typename I >
    friend bool operator ==(
        ConstAdjacentCircularIterator< I > const& a,
        ConstAdjacentCircularIterator< I > const& b )
    {
        return a.begin_ == b.begin_ &&
            a.position_ == b.position_ &&
            a.end_ == b.end_;
    }

    Iterator begin_;
    Iterator position_;
    Iterator end_;
};

template< typename Iterator >
bool operator !=(
    ConstAdjacentCircularIterator< Iterator > const& a,
    ConstAdjacentCircularIterator< Iterator > const& b )
{
    return !( a == b );
}
```

(As usual I needed to break out my copy of C++ Templates by Vandevoorde and Josuttis to get the friend declarations correct. Actually I am still not certain they are correct – I got different errors out of two different compilers and spent some time getting to what might be the right solution. I like C++ and I like templates but the syntax often has me baffled.)

I have written the const version of the iterator since that’s what I need for the moment. We’ll look at the non-const version later.

```
ConstAdjacentCircularIterator( Iterator begin, Iterator position, Iterator end )
: begin_( begin )
, position_( position )
, end_( end )
{
}
```

The constructor for our adjacent circular iterator takes other iterators as input. I assume we have some kind of underlying sequence which has its own iterators. Interestingly, we have to specify three iterators to construct an adjacent circular iterator. Not only do we specify `position`, we also need to specify the range of the underlying sequence with `begin` and `end`. We’ll see why these are necessary when we implement the dereference function. This is a pattern I have come across before. When I’m writing iterators that do more than just directly reference the elements of an underlying sequence I often need to specify more information about the range that the iterator is iterating over.

Moving on, we have the typedefs that all iterators must have:

```
typedef std::forward_iterator_tag iterator_category;
```

I have chosen to make this a forward iterator simply because I didn’t want to implement more than `operator ++`. There is no reason why we can’t implement all of the other access methods and make the circular iterator as fully featured as the underlying iterator.

```
typedef std::pair<
typename Iterator::value_type const&,
typename Iterator::value_type const& > value_type;
```

`value_type` is interesting, and by “interesting” I mean “the first sign that the train is eventually going to come off the rails”.

Our basic problem is that the value we want (a pair of the values stored in the underlying container) is not actually what we have stored in the underlying container. To try and work around this I am storing references to the two values that make up our pair. This looks like a decent compromise – it is a decent compromise – but it’s still a compromise.

This is going to come up again later.

```
typedef typename Iterator::difference_type difference_type;
```

We take `difference_type` straight from our underlying iterator – nothing surprising here.

```
typedef value_type* pointer;
typedef value_type& reference;
```

`pointer` and `reference` should not be surprising. They’re just a pointer and reference to `value_type`. However, we are already uneasy about `value_type` so we need to keep an eye on these.

```
// Prefix
ConstAdjacentCircularIterator& operator ++()
{
    ++position_;
    return *this;
}
```

Our prefix increment implementation is not surprising. I am not implementing any sort of error checking – I assume that will be handled (if it’s handled at all) by the underlying iterator. Notice that the iterator itself is not wrapping around the container – we only wrap around when we dereference the iterator. Incrementing the iterator will never take us back to the beginning of the container. While it would be possible to create an iterator that wrapped around (although the question of what `end` would be is an interesting one), that isn’t what we’re doing here.

```
// Postfix
ConstAdjacentCircularIterator operator ++( int )
{
    ConstAdjacentCircularIterator original( *this );
    ++( *this );
    return original;
}
```

Postfix increment is just what we’d expect.

```
value_type operator *() const
{
    Iterator next( position_ );
    ++next;
    if( next == end_ )
    {
        next = begin_;
    }

    return value_type( *position_, *next );
}
```

The dereference function is where the magic happens. Let’s get the easy stuff out of the way – we use the current iterator and the next iterator, making sure that if we’re at the end of the range we wrap around to the beginning. This is why we need to have `begin` and `end` stored in the iterator itself – we can’t implement this function without them.

We dereference `position_` and `next`, which gives us references to the underlying values, then we construct a `value_type` from the two references and return it.

Now let’s look at the subtleties. Our problems with `value_type` really bite us here. `operator *` should return a reference to the value type – the standard says so (section 24.2.2 in the C++11 standard). We can’t return a reference to `value_type` because our underlying container (whatever that is) does not contain objects of that type.

`value_type` contains references to the underlying objects, but that isn’t the same thing and is not enough to satisfy the standard.

```
template< typename I >
friend bool operator ==(
    ConstAdjacentCircularIterator< I > const& a,
    ConstAdjacentCircularIterator< I > const& b )
{
    return a.begin_ == b.begin_ &&
        a.position_ == b.position_ &&
        a.end_ == b.end_;
}
```

The equality operator is necessary for us to do almost anything useful with our iterator. There is one interesting point here. There are two possible ways of writing this function. We could just compare `position_`; however, if we do that we can generate situations like this:

```
ConstAdjacentCircularIterator< typename std::vector< char >::const_iterator > i0(
    vc.begin(), vc.begin() + 2, vc.end() );
ConstAdjacentCircularIterator< typename std::vector< char >::const_iterator > i1(
    vc.begin(), vc.begin() + 2, vc.begin() + 3 );
```

`i0` and `i1` have the same position – `vc.begin() + 2`. If we were just comparing position, `i0 == i1` is `true`. Unfortunately, `*i0 == *i1` is `false`.

`*i0` is `(c, d)`, while `*i1` is `(c, a)`.

It is a normal restriction for `operator ==` to be valid only if the iterators refer to the same underlying sequence (C++11 standard section 24.2.5). Even though `i0` and `i1` are based off the same underlying container, they are not referring to the same sequence since they have different `end` values.

All of that is a long-winded way of saying that we need to compare `begin_`, `position_`, and `end_`.

```
template< typename Iterator >
bool operator !=(
    ConstAdjacentCircularIterator< Iterator > const& a,
    ConstAdjacentCircularIterator< Iterator > const& b )
{
    return !( a == b );
}
```

I define inequality in terms of equality – that’s a common technique. Since this doesn’t need to be a member function it isn’t.

Our `getEdges` function is now simple (at least it would be if I were using typedefs):

```
template< typename T >
std::vector< std::pair< typename T::value_type, typename T::value_type > >
getEdgesIterator( T const& vertices )
{
    std::vector< std::pair< typename T::value_type, typename T::value_type > > edges;

    ConstAdjacentCircularIterator< typename T::const_iterator > b(
        std::begin( vertices ), std::begin( vertices ), std::end( vertices ) );
    ConstAdjacentCircularIterator< typename T::const_iterator > e(
        std::begin( vertices ), std::end( vertices ), std::end( vertices ) );

    std::copy( b, e, std::back_inserter( edges ) );

    return edges;
}
```

Even better, since we now have iterators it looks like we can use any standard algorithm that requires iterators – `std::for_each`, `std::transform`, `std::find`, etc.

I will award bonus points to anyone who spotted the weasel words in that statement – “looks like”. That’s an important caveat and we’re going to come back to it later. For the time being let’s just say that `ConstAdjacentCircularIterator` is doing a reasonable impersonation of an iterator.

So much for the const version of the iterator; what about the non-const version? There’s one difference – the type of `value_type`:

```
typedef std::pair<
typename Iterator::value_type&,
typename Iterator::value_type& > value_type;
```

This time the references to the underlying data type are non-const. We can write to an `AdjacentCircularIterator` as well as read from it.

For example, if we run this code (I have output operators already defined for `std::vector< char >` and `std::pair< char, char >`):

```
std::vector< char > vc( { 'a', 'b', 'c', 'd', 'e' } );

AdjacentCircularIterator< typename std::vector< char >::iterator > i(
    vc.begin(), vc.begin(), vc.end() );
++i;

std::cout << vc << "\n";
std::cout << *i << "\n";

*i = std::make_pair( 'x', 'y' );

std::cout << vc << "\n";
std::cout << *i << "\n";
```

We get this output:

```
a, b, c, d, e
(b c)
a, x, y, d, e
(x y)
```

What about range-based for? To make range-based for work we need to create a container having `begin` and `end` methods. Since we've already defined our iterator classes this is easy:

```
class AdjacentCircularContainer
{
public:
    typedef AdjacentCircularIterator< std::vector< char >::iterator >::value_type value_type;
    typedef value_type& reference;
    typedef value_type const& const_reference;
    typedef value_type* pointer;
    typedef value_type const* const_pointer;
    typedef AdjacentCircularIterator< typename std::vector< char >::iterator > iterator;
    typedef ConstAdjacentCircularIterator< typename std::vector< char >::const_iterator > const_iterator;

    AdjacentCircularContainer( std::vector< char >* p )
    : pContainer_( p )
    {
    }

    const_iterator begin() const
    {
        return const_iterator( pContainer_->begin(), pContainer_->begin(), pContainer_->end() );
    }

    const_iterator end() const
    {
        return const_iterator( pContainer_->begin(), pContainer_->end(), pContainer_->end() );
    }

private:
    std::vector< char >* pContainer_;
};
```

(I am implementing the minimum necessary here - there's more we can, and should, add if we want this to be a full blown container).

We can use our container exactly as we'd expect:

```
std::vector< std::pair< char, char > > getEdgesContainer( std::vector< char >& vertices )
{
    std::vector< std::pair< char, char > > result;

    AdjacentCircularContainer c( &vertices );

    for( std::pair< char, char > e : c )
    {
        result.push_back( e );
    }

    return result;
}
```

(The container is hardwired to `std::vector< char >` and stores a non-const pointer, so the function takes its argument by non-const reference.)

While we're at it, now that we have our container we can start to use the Adobe Source Library algorithms:

```
std::vector< std::pair< char, char > > getEdgesAdobe( std::vector< char >& vertices )
{
    std::vector< std::pair< char, char > > result;

    AdjacentCircularContainer c( &vertices );

    adobe::copy( c, std::back_inserter( result ) );

    return result;
}
```

Life is good. We have iterators and we have containers that implement our desired "circular adjacent" functionality.

Except that we don't.

I have dropped unsubtle hints along the way that we've been cheating. While we have things that do reasonable impersonations of iterators and containers (good enough to do the things I need for my current use case), we do not have standard compliant iterators and containers.

The fundamental problem is the one we identified way back when we started to come up with the type of `value_type`. Normally with a C++ sequence there are objects of type `value_type` stored somewhere, and they persist for long enough that returning and storing references to them is a sensible thing to do. We don't have that. The only time we ever have an actual pair of values is as a local value inside the dereferencing function, and, as we all know, passing back a reference to a local value (which goes out of scope as soon as the function ends) is a Bad Thing.

We want (and the standard requires) the declaration of `operator *` to be:

```
value_type const& operator *() const
```

but we have to make it:

```
value_type operator *() const
```

There are other problems (we can't implement `front` or `back` correctly), but they all stem from the same source.

If you're thinking "this sounds like the problems with `std::vector< bool >`" – you're right. Because `std::vector< bool >` is allowed to be specialized so that the boolean values can be stored as a bitset, you can't actually get a reference to an individual bool object because the container doesn't store individual bool objects. We can fake it with proxy objects, and more modern versions of C++ can fake it better than older versions can, but we're still faking it, and sooner or later the faking will catch up with us.

For more on `std::vector< bool >`, check out Howard Hinnant's comments on this Stack Overflow question and this blog post.

In conclusion, it's been an interesting investigation, but is it really worth it? For my particular use case the answer is "no" (at least in terms of solving my original "vertices to edges" problem rather than writing a blog post and learning something). Our original simple solution is good enough for my needs. I have it wrapped up in the function `getEdges` so it's easy enough to swap out if I need something different.

Once I have my edges in a container I can run whatever algorithms I need on a container that does actually contain edges rather than some fake sequence that just pretends to contain edges.

> Here’s how you could have made it easier on yourselves.

**Matt Smith**

A couple of years ago I took a class in improvisational theater from Matt Smith. If you are looking for a way to push yourself out of your comfort zone I highly recommend improv., particularly since in our case we did two public performances at the end of the class. If you’re in Seattle, I highly recommend Matt’s classes – the man is a genius.

(If you have a fear of public speaking, improv. might not cure you of that fear, but at least it will demonstrate that there is something worse than getting up on stage with a speech already prepared. Getting up on stage with nothing prepared is in a whole different category of scary).

In 60 hours of class time over 4 weeks, Matt never once told us that we were doing something wrong. Instead he used the quote above – “here’s how you could have made it easier on yourselves” (there’s a whole series of blog posts waiting to be written about what a great way of giving feedback that phrase is).

I was reminded of Matt’s words when I read this list of techniques for robust software (the author is given as Nick P – I would love to be able to attribute this more precisely, if anyone knows who Nick P is, please post in the comments).

It’s a good list of techniques that have been known for many years (and yet I see many teams failing to implement them), but what struck me is how many of the techniques come down to “making it easier on yourselves”. It’s well worth reading the entire list, but here are a few highlights:

- #2 “they used boring constructs that were easy to analyse”
- #3 “each module must fit in your head”, “control graph that’s pretty predictable”
- #6 “adopt languages that make robust development easier”
- #11 “brilliant tooling that could reliably find race conditions, deadlocks, or livelocks”

Software development is hard, we need to take every opportunity to make it easier on ourselves.

(You should also check out Matt’s TEDx talk on The Failure Bow – a concept that came up over and over again in our class and would improve many workplace cultures).

> Code that isn't tested == code that is broken.
>
> — w00t Dude (@w00tDude) February 1, 2011

> Corollary. Multi-threaded code that is tested != code that works.
>
> — w00t Dude (@w00tDude) February 1, 2011

Just in case the twitter links stop working:

Code that isn't tested == code that is broken.

Corollary. Multi-threaded code that is tested != code that works.

**Wayne Wooten**

Ah, the joys of parallel programming.

I did my undergraduate degree at Manchester University in the 1980s. In my “Parallel Computing” class we had a lecture about the Manchester Dataflow Machine – a parallel computing architecture that was going to take over the world.

I did my master’s degree at the University of Washington in the 2000s. In my “Parallel Computing” class we had a lecture about the Manchester Dataflow Machine, and why it didn’t take over the world.^{*}

In 20 years it seems like we came up with a lot of ideas for parallel computing, without ever hitting on the single idea that ties everything together. For serial computing we have the Von Neumann architecture, and even though we now have caches and branch prediction and pipelining and more, we can still regard the basic machine as a Von Neumann machine.

The most practical approach I have seen is the use of parallel programming patterns. Depending on who you talk to, and how they group the patterns, there are somewhere between 13 and 20 of these. I am partial to Dr. Michael McCool’s explanation of the patterns:

Parallel Programming Talk 82 – Michael McCool

Structured Parallel Programming with Deterministic Patterns

^{*} I am being cruel for comic effect. It isn’t necessary for a machine (particularly not a research machine) to take over the world in order to be groundbreaking, useful and educational. The dataflow model is still a very relevant one – we used a dataflow architecture for hooking together graphs of image processing algorithms on the Adobe Image Foundation project. Search for “Photoshop oil paint filter” for examples produced by a dataflow graph of 13 image processing kernels.

I haven’t read all of the guidelines yet (it’s a big document), but what I have looked at so far is good.

There’s a discussion about the core guidelines on Hacker News. Sadly, one of the comments argues that:

`drawline(int,int,int,int);`

is superior to:

`drawline(Point,Point);`

I despair of this industry on a regular basis. We certainly don’t know everything about programming (and I hope in 20 years we’ll be looking back and laughing at our naivete), but we don’t know nothing either. Grouping data values into bundles is not a new idea, it has many demonstrated benefits and yet people still argue against it. Writing good code is hard, and we just make it harder by ignoring even the most basic of guidelines.

I laughed at another of the comments:

> I agree, these are too opinionated to be unconditionally useful.

It’s a document by Bjarne Stroustrup and Herb Sutter. I’d be disappointed if it wasn’t opinionated.

Videos from the 2015 CppCon are starting to appear on the CppCon YouTube channel. There are also plenty of videos from previous years.

Notes from Herb Sutter’s talk Writing good C++14… by default are here

One point in the article bears some further discussion – objects on the moon don’t float gently to the ground. I programmed the simulation using Newton’s equations of motion. I ignored air resistance (for this problem it’s insignificant under earth gravity and non-existent on the moon anyway). I did A level^{*} physics and maths – I’ve solved plenty of problems using these equations. If you asked me to do the calculation I could have told you that if you throw something up at *x* m/s in constant gravity it will come down past your hand at –*x* m/s.

Even knowing all of the theory, it still surprised me when I got into the simulation and tried it. My instinct was that everything fell at a constant speed on the moon. I knew that wasn’t true but I hadn’t experienced it. This might seem obvious to everyone else, but it wasn’t obvious to me. Perhaps I had been deceived by cartoon physics. Perhaps I was thinking that everything on the moon would be as light as a feather (not true), and since feathers float gently to the ground on earth (due to air resistance) objects would float to the ground on the moon.

There is a story (that I haven’t been able to find a good reference for so might be false) that NASA scientists were surprised the first time an astronaut tried turning a bolt in space on the outside of the capsule. The astronaut turned, not the bolt.

This piece wouldn’t be complete without a link to the classic NASA video of dropping a hammer and a feather on the moon. Really on the moon this time. Notice that the hammer and the feather are both accelerating.

The conclusion? Theory is one thing, actually experiencing it is another.

^{*} For those of you who are not Brits but are familiar with Harry Potter, A levels are equivalent to NEWTS^{**}.

^{**} For those of you not familiar with Harry Potter, NEWTS are the exams you do at age 18 having specialized in the subject for 2 years.

Let’s start with a reminder of the hierarchy of iterators (I am deliberately ignoring some of the new iterator functionality being proposed for C++17):

| | |
| --- | --- |
| Random access iterator | |
| Bidirectional iterator | |
| Forward iterator | |
| Input iterator | Output iterator |

The key quality of an input iterator is that the input sequence can only be traversed once. We can make a copy of an input iterator, but the moment we increment one of the copies, the other becomes invalid. The input value is available once – a common example is reading in keys from a keyboard.

An output iterator will only let us write a value once. An example is writing a value to a stream that’s being sent over the network (this example ignores a bunch of details likely to apply in practice but that doesn’t stop it from being a useful reminder for me).

A forward iterator lets us read or write as many times as we want. We can make copies of the iterator and traverse the sequence as many times as we want but we can only traverse it in one direction – forward. The classic example is a singly linked list.

A bidirectional iterator has all of the qualities of a forward iterator but will let us traverse backwards as well as forwards – e.g. a doubly linked list.

Finally, a random access iterator will let us traverse forwards, backwards and additionally, jump to any point in the sequence. std::vector lets us do all of this.

There are lots of algorithms that take input iterators as arguments so I have subdivided them further by their return type. Let’s start off with the algorithms that do not return an iterator:

| Return type | Algorithm | Arguments |
| --- | --- | --- |
| T | accumulate | (InputIterator, InputIterator, T) |
| T | accumulate | (InputIterator, InputIterator, T, BinaryOperation) |
| bool | all_of | (InputIterator, InputIterator, Predicate) |
| bool | any_of | (InputIterator, InputIterator, Predicate) |
| difference_type | count | (InputIterator, InputIterator, T) |
| difference_type | count_if | (InputIterator, InputIterator, Predicate) |
| bool | equal | (InputIterator1, InputIterator1, InputIterator2) |
| bool | equal | (InputIterator1, InputIterator1, InputIterator2, BinaryPredicate) |
| Function | for_each | (InputIterator, InputIterator, Function) |
| bool | includes | (InputIterator1, InputIterator1, InputIterator2, InputIterator2) |
| bool | includes | (InputIterator1, InputIterator1, InputIterator2, InputIterator2, Compare) |
| T | inner_product | (InputIterator1, InputIterator1, InputIterator2, T) |
| T | inner_product | (InputIterator1, InputIterator1, InputIterator2, T, BinaryOperation1, BinaryOperation2) |
| bool | is_partitioned | (InputIterator, InputIterator, Predicate) |
| bool | lexicographical_compare | (InputIterator1, InputIterator1, InputIterator2, InputIterator2) |
| bool | lexicographical_compare | (InputIterator1, InputIterator1, InputIterator2, InputIterator2, Compare) |
| bool | none_of | (InputIterator, InputIterator, Predicate) |

We could subdivide this list even further. We have the algorithms that return bool – they are asking questions (all_of, any_of, equal, includes, is_partitioned, lexicographical_compare, none_of). We have the count and count_if algorithms – also asking questions and getting back an answer in the form of an integer. There are the algorithms that return a single value computed from the input range (accumulate and inner_product) and finally, for_each which, because it allows its function object to change state, returns the final version of the function object.

All of the algorithms can be run in a single pass over the input range – remember that input iterators do not allow us to iterate twice over a range. These algorithms can also be run over any pair of iterators above an input iterator in the hierarchy – i.e. all iterators except output iterators.

Next up, algorithms that take input iterators and return an output iterator:

| Return type | Algorithm | Arguments |
| --- | --- | --- |
| OutputIterator | adjacent_difference | (InputIterator, InputIterator, OutputIterator) |
| OutputIterator | adjacent_difference | (InputIterator, InputIterator, OutputIterator, BinaryOperation) |
| OutputIterator | copy | (InputIterator, InputIterator, OutputIterator) |
| OutputIterator | copy_if | (InputIterator, InputIterator, OutputIterator, Predicate) |
| OutputIterator | copy_n | (InputIterator, Size, OutputIterator) |
| OutputIterator | merge | (InputIterator1, InputIterator1, InputIterator2, InputIterator2, OutputIterator) |
| OutputIterator | merge | (InputIterator1, InputIterator1, InputIterator2, InputIterator2, OutputIterator, Compare) |
| OutputIterator | move | (InputIterator, InputIterator, OutputIterator) |
| OutputIterator | partial_sum | (InputIterator, InputIterator, OutputIterator) |
| OutputIterator | partial_sum | (InputIterator, InputIterator, OutputIterator, BinaryOperation) |
| pair<OutputIterator1, OutputIterator2> | partition_copy | (InputIterator, InputIterator, OutputIterator1, OutputIterator2, Predicate) |
| OutputIterator | remove_copy | (InputIterator, InputIterator, OutputIterator, T) |
| OutputIterator | remove_copy_if | (InputIterator, InputIterator, OutputIterator, Predicate) |
| OutputIterator | replace_copy | (InputIterator, InputIterator, OutputIterator, T, T) |
| OutputIterator | replace_copy_if | (InputIterator, InputIterator, OutputIterator, Predicate, T) |
| OutputIterator | set_difference | (InputIterator1, InputIterator1, InputIterator2, InputIterator2, OutputIterator) |
| OutputIterator | set_difference | (InputIterator1, InputIterator1, InputIterator2, InputIterator2, OutputIterator, Compare) |
| OutputIterator | set_intersection | (InputIterator1, InputIterator1, InputIterator2, InputIterator2, OutputIterator) |
| OutputIterator | set_intersection | (InputIterator1, InputIterator1, InputIterator2, InputIterator2, OutputIterator, Compare) |
| OutputIterator | set_symmetric_difference | (InputIterator1, InputIterator1, InputIterator2, InputIterator2, OutputIterator) |
| OutputIterator | set_symmetric_difference | (InputIterator1, InputIterator1, InputIterator2, InputIterator2, OutputIterator, Compare) |
| OutputIterator | set_union | (InputIterator1, InputIterator1, InputIterator2, InputIterator2, OutputIterator) |
| OutputIterator | set_union | (InputIterator1, InputIterator1, InputIterator2, InputIterator2, OutputIterator, Compare) |
| OutputIterator | transform | (InputIterator, InputIterator, OutputIterator, UnaryOperation) |
| OutputIterator | transform | (InputIterator1, InputIterator1, InputIterator2, OutputIterator, BinaryOperation) |
| OutputIterator | unique_copy | (InputIterator, InputIterator, OutputIterator) |
| OutputIterator | unique_copy | (InputIterator, InputIterator, OutputIterator, BinaryPredicate) |

The algorithms that return an output iterator all take an output iterator as a parameter. The algorithms are taking an input range and converting it to an output range. The result is always the position after the last written element of the output range – i.e. the first element that was *not* overwritten.

Alex Stepanov had many design principles for the STL. One of them states that an algorithm that calculates a useful value should return that value even when the calculation of that value is not the main point of the algorithm. Take std::copy. The programmer already knows where the beginning and end of the input range are, and where the beginning of the output range is. What the programmer does not necessarily know is where the end of the output range is. Since the algorithm computes the end of the output range, and it is potentially useful information (I’ve used it several times), the algorithm returns the end of the output range.

As well as straightforward copying we also get variants on std::copy : std::copy_if, std::copy_n, std::remove_copy, std::remove_copy_if and std::partition_copy (note that std::remove, std::remove_if and std::partition need more powerful iterators – it is the copying which lets these variants work with input iterators).

We also get std::merge – a building block for merge sort algorithms.

Finally for the input iterators, we have the algorithms that return an input iterator.

| Return type | Algorithm | Parameters |
| --- | --- | --- |
| InputIterator | find | (InputIterator, InputIterator, T) |
| InputIterator | find_first_of | (InputIterator, InputIterator, ForwardIterator, ForwardIterator) |
| InputIterator | find_first_of | (InputIterator, InputIterator, ForwardIterator, ForwardIterator, BinaryPredicate) |
| InputIterator | find_if | (InputIterator, InputIterator, Predicate) |
| InputIterator | find_if_not | (InputIterator, InputIterator, Predicate) |
| pair<InputIterator1, InputIterator2> | mismatch | (InputIterator1, InputIterator1, InputIterator2) |
| pair<InputIterator1, InputIterator2> | mismatch | (InputIterator1, InputIterator1, InputIterator2, BinaryPredicate) |

All of the algorithms return the position of something that was found, whether that was a given value or a mismatch.

Remember that an input iterator only gets to iterate over a range once. Once we have read a value and moved on, that value is gone. However, an algorithm like std::find only needs to read that value – it does *not* need to move on before returning the iterator.

There are only a couple of algorithms that take no iterators other than output iterators. Unsurprisingly, these are algorithms that generate some output:

| Return type | Algorithm | Parameters |
| --- | --- | --- |
| OutputIterator | fill_n | (OutputIterator, Size, T) |
| OutputIterator | generate_n | (OutputIterator, Size, Generator) |

There are other variants of std::fill and std::generate which we’ll look at later. These two algorithms write a sequence of *n* values. They are useful because we don’t always have the option of getting an “end” output iterator. Imagine appending data to a file: the end is wherever we decide to stop writing – there is no predetermined “end”.

In a previous blog post I mentioned the idea of adding `iota_n`. This is the category that `iota_n` would fit into. Incidentally, the boost algorithms library has boost::iota_n available, along with many other useful algorithms.

| Return type | Algorithm | Parameters |
| --- | --- | --- |
| ForwardIterator | adjacent_find | (ForwardIterator, ForwardIterator) |
| ForwardIterator | adjacent_find | (ForwardIterator, ForwardIterator, BinaryPredicate) |
| bool | binary_search | (ForwardIterator, ForwardIterator, T) |
| bool | binary_search | (ForwardIterator, ForwardIterator, T, Compare) |
| pair<ForwardIterator, ForwardIterator> | equal_range | (ForwardIterator, ForwardIterator, T) |
| pair<ForwardIterator, ForwardIterator> | equal_range | (ForwardIterator, ForwardIterator, T, Compare) |
| void | fill | (ForwardIterator, ForwardIterator, T) |
| ForwardIterator1 | find_end | (ForwardIterator1, ForwardIterator1, ForwardIterator2, ForwardIterator2) |
| ForwardIterator1 | find_end | (ForwardIterator1, ForwardIterator1, ForwardIterator2, ForwardIterator2, BinaryPredicate) |
| void | generate | (ForwardIterator, ForwardIterator, Generator) |
| void | iota | (ForwardIterator, ForwardIterator, T) |
| bool | is_permutation | (ForwardIterator1, ForwardIterator1, ForwardIterator2) |
| bool | is_permutation | (ForwardIterator1, ForwardIterator1, ForwardIterator2, BinaryPredicate) |
| bool | is_sorted | (ForwardIterator, ForwardIterator) |
| bool | is_sorted | (ForwardIterator, ForwardIterator, Compare) |
| ForwardIterator | is_sorted_until | (ForwardIterator, ForwardIterator) |
| ForwardIterator | is_sorted_until | (ForwardIterator, ForwardIterator, Compare) |
| void | iter_swap | (ForwardIterator1, ForwardIterator2) |
| ForwardIterator | lower_bound | (ForwardIterator, ForwardIterator, T) |
| ForwardIterator | lower_bound | (ForwardIterator, ForwardIterator, T, Compare) |
| ForwardIterator | max_element | (ForwardIterator, ForwardIterator) |
| ForwardIterator | max_element | (ForwardIterator, ForwardIterator, Compare) |
| ForwardIterator | min_element | (ForwardIterator, ForwardIterator) |
| ForwardIterator | min_element | (ForwardIterator, ForwardIterator, Compare) |
| pair<ForwardIterator, ForwardIterator> | minmax_element | (ForwardIterator, ForwardIterator) |
| pair<ForwardIterator, ForwardIterator> | minmax_element | (ForwardIterator, ForwardIterator, Compare) |
| ForwardIterator | partition | (ForwardIterator, ForwardIterator, Predicate) |
| ForwardIterator | partition_point | (ForwardIterator, ForwardIterator, Predicate) |
| ForwardIterator | remove | (ForwardIterator, ForwardIterator, T) |
| ForwardIterator | remove_if | (ForwardIterator, ForwardIterator, Predicate) |
| void | replace | (ForwardIterator, ForwardIterator, T, T) |
| void | replace_if | (ForwardIterator, ForwardIterator, Predicate, T) |
| ForwardIterator | rotate | (ForwardIterator, ForwardIterator, ForwardIterator) |
| OutputIterator | rotate_copy | (ForwardIterator, ForwardIterator, ForwardIterator, OutputIterator) |
| ForwardIterator1 | search | (ForwardIterator1, ForwardIterator1, ForwardIterator2, ForwardIterator2) |
| ForwardIterator1 | search | (ForwardIterator1, ForwardIterator1, ForwardIterator2, ForwardIterator2, BinaryPredicate) |
| ForwardIterator | search_n | (ForwardIterator, ForwardIterator, Size, T) |
| ForwardIterator | search_n | (ForwardIterator, ForwardIterator, Size, T, BinaryPredicate) |
| ForwardIterator2 | swap_ranges | (ForwardIterator1, ForwardIterator1, ForwardIterator2) |
| ForwardIterator | unique | (ForwardIterator, ForwardIterator) |
| ForwardIterator | unique | (ForwardIterator, ForwardIterator, BinaryPredicate) |
| ForwardIterator | upper_bound | (ForwardIterator, ForwardIterator, T) |
| ForwardIterator | upper_bound | (ForwardIterator, ForwardIterator, T, Compare) |

Forward iterators give us a lot more options. Since forward iterators can iterate over a range more than once we can handle algorithms such as std::adjacent_find which require us to iterate over each value twice.

As I promised earlier, we have the other variants of std::generate and std::fill here, and we also get std::iota. Now that we have moved to forward iterators these algorithms can take an output *range* (a begin/end pair) rather than just an output iterator.

One surprise is that std::binary_search, std::equal_range, std::lower_bound and std::upper_bound are present in this list. We normally think of these as O(log *n*) algorithms. Surely we cannot have an O(log *n*) algorithm working with forward iterators, since the algorithm might have to iterate over the entire sequence, making it O(*n*). The reason they are available for forward iterators is that the standard defines their complexity requirements as O(log *n*) in the number of *comparisons*, not the number of elements traversed. We might regard this as cheating.

We are not stuck with one implementation of an algorithm though. There is also a great talk by Stephan T. Lavavej here where he explains how an implementation can select different algorithms for different iterator types so (for example) std::binary_search can use one implementation for forward iterators but a more efficient implementation for random access iterators.

Another surprise for me is that std::min_element, std::max_element and std::minmax_element all require forward iterators. Surely it is possible to find the maximum element of a sequence using input iterators? It is possible to find the maximum *value* of a sequence using input iterators, however there are two problems:

- std::max_element and friends actually return the *position* of the desired element(s), not just the value. The position is a more general piece of information, and with a single-pass input iterator the position is gone once we move on.
- If we had a version of std::max_element that ran on input iterators it would have to make a copy of the element – both for the comparison and to return the value of the element. Making a copy could be expensive, undesirable or impossible. (Having said that, there is a Stack Overflow post here where making a copy increases the performance of the algorithm, probably due to a cache effect.)

Working out how to implement some of these algorithms to achieve the required complexity guarantees is an interesting exercise.

| Return type | Algorithm | Parameters |
| --- | --- | --- |
| BidirectionalIterator2 | copy_backward | (BidirectionalIterator1, BidirectionalIterator1, BidirectionalIterator2) |
| void | inplace_merge | (BidirectionalIterator, BidirectionalIterator, BidirectionalIterator) |
| void | inplace_merge | (BidirectionalIterator, BidirectionalIterator, BidirectionalIterator, Compare) |
| BidirectionalIterator2 | move_backward | (BidirectionalIterator1, BidirectionalIterator1, BidirectionalIterator2) |
| bool | next_permutation | (BidirectionalIterator, BidirectionalIterator) |
| bool | next_permutation | (BidirectionalIterator, BidirectionalIterator, Compare) |
| bool | prev_permutation | (BidirectionalIterator, BidirectionalIterator) |
| bool | prev_permutation | (BidirectionalIterator, BidirectionalIterator, Compare) |
| void | reverse | (BidirectionalIterator, BidirectionalIterator) |
| OutputIterator | reverse_copy | (BidirectionalIterator, BidirectionalIterator, OutputIterator) |
| BidirectionalIterator | stable_partition | (BidirectionalIterator, BidirectionalIterator, Predicate) |

The bidirectional iterator algorithms are an interesting bunch. The algorithms need more functionality than provided by forward iterators, but don’t require the full power of random access iterators.

We get std::copy_backward. As we saw in part 1a, std::copy and std::copy_backward handle overlapping ranges (std::reverse_copy requires non-overlapping ranges). If you’re looking for some additional amusement, look into how using reverse iterators affects the overlapping range guarantees of std::copy and std::copy_backward (we can make similar arguments for the std::move set of algorithms).

We get std::inplace_merge, an algorithm that will make use of extra memory if it can get it. The standard does not specify where the extra memory comes from – C++ usually lets you control all memory allocation via custom allocators, but that does not appear to be the case here.

| Return type | Algorithm | Parameters |
| --- | --- | --- |
| bool | is_heap | (RandomAccessIterator, RandomAccessIterator) |
| bool | is_heap | (RandomAccessIterator, RandomAccessIterator, Compare) |
| RandomAccessIterator | is_heap_until | (RandomAccessIterator, RandomAccessIterator) |
| RandomAccessIterator | is_heap_until | (RandomAccessIterator, RandomAccessIterator, Compare) |
| void | make_heap | (RandomAccessIterator, RandomAccessIterator) |
| void | make_heap | (RandomAccessIterator, RandomAccessIterator, Compare) |
| void | nth_element | (RandomAccessIterator, RandomAccessIterator, RandomAccessIterator) |
| void | nth_element | (RandomAccessIterator, RandomAccessIterator, RandomAccessIterator, Compare) |
| void | partial_sort | (RandomAccessIterator, RandomAccessIterator, RandomAccessIterator) |
| void | partial_sort | (RandomAccessIterator, RandomAccessIterator, RandomAccessIterator, Compare) |
| void | pop_heap | (RandomAccessIterator, RandomAccessIterator) |
| void | pop_heap | (RandomAccessIterator, RandomAccessIterator, Compare) |
| void | push_heap | (RandomAccessIterator, RandomAccessIterator) |
| void | push_heap | (RandomAccessIterator, RandomAccessIterator, Compare) |
| void | random_shuffle | (RandomAccessIterator, RandomAccessIterator) |
| void | random_shuffle | (RandomAccessIterator, RandomAccessIterator, RandomNumberGenerator) |
| void | shuffle | (RandomAccessIterator, RandomAccessIterator, URandomNumberGenerator) |
| void | sort | (RandomAccessIterator, RandomAccessIterator) |
| void | sort | (RandomAccessIterator, RandomAccessIterator, Compare) |
| void | sort_heap | (RandomAccessIterator, RandomAccessIterator) |
| void | sort_heap | (RandomAccessIterator, RandomAccessIterator, Compare) |
| void | stable_sort | (RandomAccessIterator, RandomAccessIterator) |
| void | stable_sort | (RandomAccessIterator, RandomAccessIterator, Compare) |

The random access algorithms are all about sorting and shuffling. I have rarely used the heap algorithms directly but they are useful building blocks for algorithms such as std::nth_element (previously discussed here), std::partial_sort_copy and std::partial_sort.

| Return type | Algorithm | Parameters |
| --- | --- | --- |
| RandomAccessIterator | partial_sort_copy | (InputIterator, InputIterator, RandomAccessIterator, RandomAccessIterator) |
| RandomAccessIterator | partial_sort_copy | (InputIterator, InputIterator, RandomAccessIterator, RandomAccessIterator, Compare) |

std::partial_sort_copy uses an interesting combination of input iterators and random access iterators.

| Return type | Algorithm | Parameters |
| --- | --- | --- |
| const T& | max | (T, T) |
| const T& | max | (T, T, Compare) |
| T | max | (InitializerList) |
| T | max | (InitializerList, Compare) |
| const T& | min | (T, T) |
| const T& | min | (T, T, Compare) |
| T | min | (InitializerList) |
| T | min | (InitializerList, Compare) |
| pair<const T&, const T&> | minmax | (T, T) |
| pair<const T&, const T&> | minmax | (T, T, Compare) |
| pair<T, T> | minmax | (InitializerList) |
| pair<T, T> | minmax | (InitializerList, Compare) |

Finally, there are the core `min` and `max` algorithms that don’t use iterators at all.

I don’t have a conclusion for this post. I found it interesting to see which algorithms required which iterators – there were a couple that surprised me. The general rule is to write algorithms that use the weakest iterators possible to make the algorithms as widely useful as possible.

---

A repository of modern C++ code samples curated by the community.

There’s a GitHub repo so that anyone can fork, add a new sample and make a pull request. There are some interesting samples up there already, and I look forward to seeing what gets added in the future.

Joseph Mansfield is the programmer behind C++ samples and his own blog has plenty of interesting posts.

---

First of all, I should have included std::iota in the list of “order guaranteed” algorithms. I overlooked this partly because std::iota doesn’t take an input range, but also because the language used in the standard isn’t entirely clear. I am reasonably confident though that operator++ must be applied starting with the first value output, and continuing in sequence. (As always, if someone has some evidence I am wrong about this please tell me).

This raises the number of algorithms that guarantee ordering to 14.

Secondly, we didn’t look at “why”. Why should some algorithms guarantee their operation order while others don’t? Let’s look at the 14 algorithms in groups, starting with std::copy and std::move:

| Return type | Algorithm | Parameters |
| --- | --- | --- |
| OutputIterator | copy | (InputIterator, InputIterator, OutputIterator) |
| BidirectionalIterator2 | copy_backward | (BidirectionalIterator1, BidirectionalIterator1, BidirectionalIterator2) |
| OutputIterator | move | (InputIterator, InputIterator, OutputIterator) |
| BidirectionalIterator2 | move_backward | (BidirectionalIterator1, BidirectionalIterator1, BidirectionalIterator2) |

Copy and move have specific forward and backward versions. This is to handle overlapping input and output ranges. In order for copying or moving between overlapping ranges to work as expected, the order must be specified. The order will be different depending on whether the destination range overlaps the beginning or the end of the source range – hence the forward and backward versions.

(There was an issue raised in 1995 about the direction of std::copy. Read it here.)

Moving on, the next algorithm is std::for_each:

| Return type | Algorithm | Parameters |
| --- | --- | --- |
| Function | for_each | (InputIterator, InputIterator, Function) |

As Kreft and Langer point out here, std::for_each was unusual among the algorithms because it allows its function object to have side effects. I say **was** because the C++98 standard includes the phrase –

Regardless of the changes to the wording of std::transform, once your function object can have side effects, it is possible for those side effects to vary depending on the order in which the range is processed. To get deterministic results you need a deterministic processing order, hence (at least historically) the order requirement for std::for_each.

That leaves us with 9 algorithms to explain:

| Return type | Algorithm | Parameters |
| --- | --- | --- |
| T | accumulate | (InputIterator, InputIterator, T) |
| T | accumulate | (InputIterator, InputIterator, T, BinaryOperation) |
| OutputIterator | adjacent_difference | (InputIterator, InputIterator, OutputIterator) |
| OutputIterator | adjacent_difference | (InputIterator, InputIterator, OutputIterator, BinaryOperation) |
| T | inner_product | (InputIterator1, InputIterator1, InputIterator2, T) |
| T | inner_product | (InputIterator1, InputIterator1, InputIterator2, T, BinaryOperation1, BinaryOperation2) |
| void | iota | (ForwardIterator, ForwardIterator, T) |
| OutputIterator | partial_sum | (InputIterator, InputIterator, OutputIterator) |
| OutputIterator | partial_sum | (InputIterator, InputIterator, OutputIterator, BinaryOperation) |

What do these algorithms have in common?

- They are all in the header `<numeric>`. In fact, they are the entire contents of `<numeric>`.
- They all involve combining elements of the input range in some way (or of the output range, in the case of std::iota).

So we have “in order”, “numeric”, and “combining”. “Numeric” leads us to think of floating point values, and the combination of “in order” and “floating point” leads us to the fact that floating point arithmetic is neither associative nor commutative. [Edit: See John Payson’s comment where he points out that IEEE-754 floating point *is* commutative for addition and multiplication.] Running std::accumulate from front to back does not necessarily give us the same answer as running it from back to front. Once again, if we want a deterministic answer we need a deterministic processing order, hence the “in order” requirement for these algorithms.

As usual, if the standard algorithms do not supply us with what we want, we can write our own.

The C++11 standard also makes it clear that the compiler is limited in its rearrangement of operators:

[ Note: Operators can be regrouped according to the usual mathematical rules only where the operators really are associative or commutative.

**C++11 standard** *1.9*

In case anyone thinks that floating point commutativity [Edit: I should have said associative here] only matters for very large, very small or very accurate calculations, it doesn’t. One of my home projects is a music typesetting program. It lays out music onto a typical sheet of paper (A4 or 8.5 x 11) and it doesn’t need micron accuracy, however I still got caught by floating point operations.

At one point I had two functions (A and B) that computed what was nominally the same value, but in two different ways. I sorted a vector using a comparison operator that called function A. I then attempted to do a binary search on the vector using a comparison operator that called function B. Fortunately the debug version of the STL I was using detected attempts to use a binary search on a vector that was not sorted – the vector was sorted according to A, it was not sorted according to B. Yet another reason why “don’t repeat yourself” is such a good rule.

In part 1 I mentioned the possibility of having some of the algorithms perform their calculations in parallel. The C++11 standard has this to say:

Unless otherwise specified, C++ standard library functions shall perform all operations solely within the current thread if those operations have effects that are visible (1.10) to users.

[ Note: This allows implementations to parallelize operations if there are no visible side effects. —end note ]

**C++11 standard** *17.6.5.9*

I.e. no parallelism unless it’s invisible to the user.

There are some interesting proposals for explicitly parallel algorithms. For example, see section 6.6 *Associativity/Commutativity of Binary Operators* (p66) of A Parallel Algorithms Library which points out that some parallel algorithms will require associative and commutative operators.

Parallelizing the Standard Algorithms Library also looks at the topic of parallelization. For example, see section 5.2 *std::accumulate versus thrust::reduce* for a discussion of the need for associative and commutative operators.

Several of my references in this post came from this website which contains many papers from the C++ standards committee. If you have far too much time to kill, the site contains fascinating information – current and historical.

---

… too tight a schedule will inevitably lead to the temptation to take shortcuts. These shortcuts *might* succeed in getting the system working on time – but only if everything goes right, which it rarely does.

**Gerald Weinberg** *The Psychology of Computer Programming Silver Anniversary Edition* p68

*The Psychology of Computer Programming* was originally published in 1971 but remains a great read and very relevant even in a very different computing world. I have the silver anniversary edition which reprints the original with additional commentary at the end of each chapter – what Weinberg calls “wisdom of hindsight remarks”. Weinberg’s comments – 25 years on – about the above quote state:

The wisdom of age has made me ashamed of this statement. “Rarely” is pussyfooting; something *always* goes wrong. Shilly-shallying like this by authors like me has perpetuated the myth of the “optimistic” estimate – how long a project would take if everything went right. Such “estimating” might just as well be based on the assumption that the Laws of Thermodynamics will be repealed during the project.

**Gerald Weinberg** *The Psychology of Computer Programming Silver Anniversary Edition* 5.i

---

I imposed on [Jack] Real the requirement that he try to design the helicopter so that it could be serviced with six simple tools – any six of his choice. This was more a challenge than an arbitrary decision. I think most good designers want to keep things simple, but sometimes, for the sheer engineering delight of creating, things become unnecessarily complex and cumbersome.

**Clarence L. “Kelly” Johnson** *Kelly: More than My Share of It All*

Kelly Johnson was the head of Lockheed’s Skunk Works and a renowned aeronautical engineer. His Wikipedia article contains many more details, including his 14 Rules of Management.

Kelly: More than My Share of It All is also worth reading. It includes stories of Amelia Earhart, the SR-71 Blackbird and a 400lb lion chasing Althea Johnson (Kelly Johnson’s wife) around a factory.

I am not going to search for a software equivalent of “six simple tools” – that would be an analogy too far, and as Johnson goes on to say, the “six tools” are less of a requirement than they are about setting up an attitude. Maintenance matters. Simplicity matters. What use is the best helicopter in the world if you can’t keep it running?

I think there is a software parallel in this comment from Chris Jaynes (commenting on Tim Bray’s post On Duct Tape, itself a response to Joel Spolsky’s post The Duct Tape Programmer that in turn was inspired by a chapter in the book Coders at Work profiling Jamie Zawinski):

Yes, shipping version 1 is a feature, but shipping version 2 is ALSO a feature!

To that I would add:

Being able to fix bugs is a feature.

Being able to maintain code is a feature.

Being able to add features is itself a feature.