Inconsistent sorting

A few years ago I was working on a prototype for some 3D drawing code. I was using the painters algorithm – draw the objects from far to near so that the close objects will occlude the distant objects. I had my objects in a vector, I sorted them according to their distance from the camera, then drew the objects. Then I noticed something odd. When I first started the program I had two objects that would flicker. They would swap which one was in front – sometimes object A would be drawn in front of object B, sometimes B was drawn in front of A.

My start up coordinates for the camera and the objects meant that A and B were exactly the same distance from the camera, so when I did the sort, A and B would compare equal. Since I wasn’t using std::stable_sort I wasn’t expecting any particular ordering for A and B, but I was expecting the ordering to be consistent between calls to std::sort. After some debugging I discovered that giving the exact same input to std::sort would not always give the same output – objects with identical distances might be sorted differently on each call.

I ended up digging into the standard library implementation. I think I was using Metrowerks C++, but I wouldn’t bet on it. The implementation of std::sort used Quicksort – nothing unusual there. One of the problems with Quicksort though is that you have to make good choices of pivot element – the element that you use to to split the input into two partitions. For example, if you use the first element as the pivot and hand Quicksort an already sorted sequence the runtime ends up being O(n^2). Quicksort has great average case behaviour but bad worst case behaviour. There are various techniques for picking the pivot element including:

  • first element
  • last element
  • middle element
  • median of the first, last and middle elements

This particular implementation had found a different method of picking the pivot – it used rand. Effective since the chances of providing the sort with a worst case sequence were low, but it led to this odd behaviour – the same input to std::sort would not necessarily lead to the same output. I switched to std::stable_sort and all my problems went away.

As far as I can tell from the standard (section 25.4.1.1), this behaviour for std::sort is entirely legal. std::sort does not guarantee the order in which equivalent elements are placed, and I don’t see anything that forces std::sort to be consistent. I have never found this implementation in any other standard library, but I have heard of one person who was caught out like I was.

std::vector vs. tbb::concurrent_vector – API design

My current home project does a lot of processing and since I paid money for a 4 core processor I would like to do as much of that in parallel as possible. I have been using Intel’s Threading Building Blocks (TBB) for a while now and although some of what TBB does has been taken over by the thread library in C++11 there are still some useful features in TBB. In particular, TBB provides a set of concurrent containers:

  • map
  • set
  • hashmap
  • queue
  • bounded queue
  • priority queue
  • vector

One of the interesting things about the concurrent containers is that they don’t just consist of a clever implementation behind the same API as our regular containers, the API itself needs to be different. std::vector has been around for 20 years or so. I have made plenty of use of it, and written some vector-like containers (usually because I want different speed / space tradeoffs) that have used the same API as std::vector so they can be drop in replacements. However, once we start trying to create a concurrent version of a vector we run into problems with the API. In this post I am going to compare a C++98 vector with a TBB 4.2 (update 2) concurrent_vector and see what changes Intel have made to the API.

(Disclaimer: I have no insider knowledge of the Intel’s design process for TBB, these are simply my own thoughts based on using TBB, reading Intel’s documentation and reading James Reinders’ book “Intel Threading Building Blocks”)

I highly recommend reading Eric Lippert’s post What is this thing you call “thread safe”?. Eric’s conclusion is that “thread safe” is a vague term and we need to get much more specific about the behaviours that we claim are “thread safe”. For this post I am going to define “thread safe” as meaning “has an API that allows us to do useful things on multiple threads simultaneously”.

What if we do the simplest and most naïve thing possible? Let’s just take our implementation of std::vector, add a mutex and put a lock around the implementation of every method. We might argue that the vector is now “thread safe”. Depending on our definition of “thread safe”, we might even be correct. What we have not done is create something that’s useful. Take this simple piece of code:

void fn( std::vector< int >& v_i )
{
    v_i.push_back( 7 );
}

Assume that v_i is shared across threads. Once we have called push_back( 7 ) we know nothing. We don’t know what the index of the element “7” is – we can’t use the size of the vector before or after the push_back – another thread might have changed the size of the vector, or deleted the element that we just pushed. We don’t even know whether our “7” is still in the vector (yes, we can take steps to write our multi threaded code to ensure this doesn’t happen, but the whole point of trying to create a thread safe vector is to make it easier for us to access it from multiple threads).

This leads to a principle of concurrent APIs – if you have data and/or operations that are connected you usually cannot access them separately. A single threaded vector’s push_back method does not return a value – if we want to know where the element is in the vector we can just use size() – 1. A concurrent vector’s push_back must return the position of the newly appended element.

Even if we add a return value to push_back we haven’t solved the problem of another thread changing the size of the vector. We call push_back, we know where the element we just pushed ended up, but we do not know if it is still there when we come to use it. Erasing elements and inserting elements in the middle of the vector are problematic.

We can’t even access anything in the vector safely – we can check that our index is less than the size of the vector before we call operator [], but the size of the vector might change between our call to size and our call to operator []. We have to add range checking to operator [] and throw an exception if the index we pass in is out of range. This gives us another principle of concurrent APIs – many more methods have to be capable of returning errors – the concurrent nature of the API makes it difficult or impossible to guarantee that the values passed to a method will be valid when the method actually executes.

Accessing the vector via iterators is no help either, another thread can come along, push_back a few elements and cause the vector to resize and reallocate, invalidating all iterators. The thread that is using the iterators has no way to guard against this.

On top of all of this, just putting a lock around every function is unlikely to give us good performance. Using the vector now becomes a bottleneck even if we are just reading the vector – remember that we are doing this for performance reasons.

Let’s see what Intel did to avoid of these problems.


Access

reference operator[]( size_type index )
const_reference operator[]( size_type index ) const

reference at( size_type index )
const_reference at( size_type index ) const

reference front()
const_reference front() const

reference back()
const_reference back() const

I think it’s safe to say that the bare minimum we can expect from a concurrent vector is that its elements can be safely accessed from multiple threads. We have a set of access methods that are very similar to those in std::vectoroperator [] does not throw an exception, at might throw an exception. The access methods can be used concurrently.


Concurrent growth

iterator push_back(const_reference item )

iterator grow_by( size_type delta );
iterator grow_by( size_type delta, const T& t );
template<typename ForwardIterator> 
iterator grow_by( ForwardIterator first, ForwardIterator last );

iterator grow_to_at_least( size_type n )
iterator grow_to_at_least( size_type n, const T& t )

There are three methods (plus a couple of variations) for appending new elements to the vector.

push_back is the function we know and love from std::vector, however it now returns an iterator to the newly appended element – when we push_back an element we will know where it has been placed.

grow_by appends a certain number of elements to the end of the vector – depending on which overloaded function we use we get default constructed elements, copies of a single element or a new set of elements from a given source sequence. It returns an iterator pointing to the start of the appended sequence.

grow_to_at_least makes sure we have at least n elements in the vector, and if necessary appends default constructed elements or copies of a given element. Again, it returns an iterator pointing to the start of the appended sequence.

As with the access methods, the growth methods can be used concurrently.

It looks like we have covered the basic properties of a vector however, even if we just use the accessors and the growth methods, there is the potential for problems. The Intel documentation states:

The methods described in this section [the access methods] may be concurrently invoked on the same vector as methods for concurrent growth. However, the returned reference may be to an element that is being concurrently constructed.

I.e. if you’re appending a sequence of elements on one thread and trying to access that sequence from another thread you might get back a reference to a partially constructed element. That seems strange. Let’s work out what is going on here by taking a look at the size method.


Size

size seems relatively benign, but the documentation tells us that size might include elements that have had their memory allocated but are still under construction by another thread. This is the same issue we saw earlier with the access methods – you can have a reference to a not-yet-constructed object. What’s going on?

I don’t know for sure why Intel made size work this way, but I can speculate. I’ll use a common technique – I’ll assume the opposite and see what happens. Let’s say that size must reflect the number of fully constructed objects – it will not include partially constructed objects.

Imagine that thread A uses grow_by to append 50 objects to the thread, while thread B appends 1000 objects. Assume that thread A “wins” – it gets access to the memory first, allocates space for 50 objects and starts constructing them. While thread A is constructing its objects, thread B can allocate space for its 1000 objects (because the whole point is to be able to do this concurrently) and thread B starts constructing its objects. Thread A finishes first (it only has 50 elements to construct), it updates size to reflect the additional 50 objects. Thread B finishes and it then updates size to reflect its 1000 objects. Everyone is happy.

What if things go differently? Let’s say that thread B “wins”, allocates space for its 1000 objects and starts constructing them. Thread A then allocates space for its 50 objects and starts constructing them. Thread A finishes first (only 50 objects). What should it update size to? Remember that in this hypothetical situation we want size to reflect the number of fully constructed objects. Unless thread A is prepared to wait until thread B has finished constructing its objects (which would kill our concurrent performance) it cannot change size because that would also include the partially constructed objects that thread B is working on. Since thread A cannot change size, thread B must change it when thread B has finished constructing its objects. Thread B needs to add on the number of elements it just appended and the number of elements that thread A appended. That’s going to add a bunch of complication internally to make sure that everything gets updated correctly, and I haven’t even started on threads C through Z which might also be growing and accessing the vector.


What isn’t there?

Notice what we haven’t seen. There are no erase methods and no insert methods. You cannot remove any elements from the vector (with the exception of clear which cannot be used concurrently) and you cannot insert elements in the middle – that means that you can’t change the position of any element once it is in the vector. There is only one place you can add elements to a concurrent vector – at the end.

Concurrent vector has the usual begin / end / rbegin / rend methods for getting iterators. It also guarantees that elements will never be moved – concurrent vector does not do the std::vector trick of allocating new memory and copying everything over (and invalidating iterators). Couple that with the lack of erase methods and you get the property that once you have a valid index, reference or iterator to an element, that index, reference or iterator will stay valid until the concurrent vector is cleared or destroyed. This means that one thread cannot invalidate the references on another thread – the API is giving us some safety that we didn’t have before.


Capacity

tbb::concurrent_vector has the normal operations around size and capacity:

size_type size() const
bool empty() const
size_type capacity() const
size_type max_size() const

To finish off the API there are the usual mixture of constructors and a destructor, none of which can be used concurrently (and it is possible to call constructors concurrently if you’re using placement new or have some half baked singleton implementation). There’s assignment (both operator = and assign methods), a specialization of swap, and a shrink_to_fit method (to match C++11’s shrink_to_fit method). There are also some parallel iteration methods (range) which make the concurrent vector easily usable in Intel’s concurrent algorithms.

So what’s the conclusion? Intel have stripped a lot of the functionality from std::vector but left in the core operations – growing the vector by appending elements, and accessing them. Even with that limited functionality it is still possible to do things incorrectly, however a little discipline around setup and tear down helps mitigate this. My code has an initial sequence that is single threaded, then it kicks off several threads, and on shutdown it ends up back on a single thread. I make sure that the non-concurrent methods are only called from a single thread.

Concurrent programming is hard to begin with, the fact that we need to rethink our APIs just makes it harder.

No raw loops – output

Let’s get back to raw loops, or rather the absence of raw loops. How do we write out the contents of a container so that each element appears on its own line? Assume we have some type T and a vector of T:

class T
{
   ...
};

std::vector< T > v_T;

This is the old style way to write the contents:

std::ostringstream ostr;
for( std::vector< T >::const_iterator i( std::begin( v_T ) ); 
    i != std::end( v_T ); ++i )
{
    ostr << *i << "\n";
}

The range-based for solution is straightforward:

for( T const& t : v_T )
{
    ostr << t << "\n";
}

Lambda functions also provide a solution:

std::for_each( 
    std::begin( v_T ), 
    std::end( v_T ), 
    [ &ostr ]( T const& t ){
        ostr << t << "\n";
    } );
adobe::for_each( 
    v_T, 
    [ &ostr ]( T const& t ){
        ostr << t << "\n";
    } );

Conveniently, output streams have iterators. We can use std::ostream_iterator as the destination for std::copy or adobe::copy :

std::copy( 
    std::begin( v_T ), 
    std::end( v_T ), 
    std::ostream_iterator< T >( 
        ostr, 
        "\n" ) );
adobe::copy( 
    v_T, 
    std::ostream_iterator< T >( 
        ostr, 
        "\n" ) );

Notice that we can pass a string (called the delimiter string in the standard) to the std::ostream_iterator constructor. The delimiter string is written out after every element.

Since std::ostream_iterator is an iterator with std::ostream as its underlying container, we might thank that we would be able to do this:

std::copy( 
    begin( v_T ), 
    end( v_T ), 
    std::begin( ostr ) );

We can't. The standard does not include a specialization of std::begin for std::ostream, and we are not allowed to add one ourselves because of Section 17.6.4.2.1 of the C++11 standard which states:

A program may add a template specialization for any standard library template to namespace std only if the declaration depends on a user-defined type and the specialization meets the standard library requirements for the original template and is not explicitly prohibited

In addition to that, std::begin does not allow for the additional delimiter string argument. Also, std::ostream_iterator is instantiated for the type of the source value which is not present in the single argument we pass to std::begin so cannot be deduced.

There is nothing stopping us writing our own function to construct a std::ostream_iterator from std::ostream. Since it's our function it can take an additional argument for the delimiter string, and so long as we don't put it into the standard namespace we can even call it begin. We do have to specify the type of our output when we call the function (the function cannot deduce it from the arguments because that type never appears in the arguments):

template< typename U >
std::ostream_iterator< U >
begin( std::ostream& ostr, const char* delimiterString )
{
    return std::ostream_iterator< U >( ostr, delimiterString );
}
std::copy( 
    begin( v_T ), 
    end( v_T ), 
    begin< T >( ostr, "\n" ) );

And an example that almost works. Since operator << is just a function, we can pass it to std::bind :

std::for_each(
    std::begin( v_T ), 
    std::end( v_T ),
    std::bind( operator <<, std::ref( ostr ), _1 ) );

This writes out the elements of the container, but it does not put each element on its own line - there is nowhere to plug the "\n" into. We can't achieve the each element appears on its own line requirement from our original problem statement.

We can write a function that takes an object of type T and a delimiter string and writes out both of them:

void
writePostText( std::ostream& o, T const& t, const char* text )
{
    o << t << text;
}
std::for_each(
    std::begin( v_T ), 
    std::end( v_T ),
    std::bind( writePostText, std::ref( ostr ), _1, "\n" ) );

Which of these solutions is the best? As usual there isn't a definitive answer to that question, but it looks like the range-based for solution looks has the lowest overhead (overhead being bits of boilerplate syntax that are not directly used to solve the problem), with adobe::for_each close behind.


There is one more thing to throw into the mix. The range-based for looks good, however it can only iterate over the entire container. Most of the time this isn't a problem, I estimate that 95% of my algorithm usage is over the entire container. In the case of writing things out though, there is a common use case where you want to iterate over a portion of the container. If you are writing out JSON, or some similar format where you need to put text between each pair of elements (i.e. the text cannot appear before the first element or after the last element), you have to be able to iterate over a sub range of the container.

A solution to the "separator text" problem might look like this:

void
writePreText( std::ostream& o, const char* text, T const& t )
{
    o << text << t;
}
if( !v_T.empty() )
{
    std::vector< T >::const_iterator i( std::begin( v_T ) ); 
    
    ostr << *i;

    ++i;
    std::for_each(
        i,
        std::cend( v_T ),
        std::bind( writePreText, std::ref( ostr ), ",\n", _1 ) );
}

We can't do this with range-based for. If we need to do both things within our source code - writing with text after each element and writing text between each element - do we want to do those two things differently - one of them using range-based for, one using a standard algorithm? There is a cost to being inconsistent, there is also a cost to being consistent but using a slightly larger solution for one of the cases.

I know that I am overanalyzing this. In practice I wrote a templated writeContainer function which lets you specify text that is written before the first element, text that is written after the last element and text that is written between elements. Problem solved and, with the exception of this blog post, I don't have to consider the subject any more.

Algorithms in action – iota and shuffle

When I was working on my sorting code for this post, I wanted a vector of randomly ordered, unique integers for testing.

The code is straightforward:

std::vector< int >
getRandomVector( int n )
{
    std::vector< int > v_i( n );

    std::iota( std::begin( v_i ), std::end( v_i ), 0 );

    std::default_random_engine dre;
    std::shuffle( std::begin( v_i ), std::end( v_i ), dre );

    return v_i;
}

C++11 added std::iota – an algorithm that assigns incrementing values to a sequence. In my example the values are integers starting at zero.

I was confused when I first saw iota. Initially I thought someone had misspelt itoa (not a standard function but sometimes available as an extension), then I thought since iota is atoi backwards there must be a link between them.

I was wrong on all counts. std::iota represents the Greek letter iota ( Ι, &#x03B9 ) which (depending on your font and which case you are examining) looks similar to a lower case “L”, an upper case “i”, a one “1”, or a vertical bar “|”.

The Greek letter iota is used in the programming language APL to generate a sequence of consecutive integers. APL has a wild and wonderful character set described here. My favourite part of the Wikipedia article is the statement “Most APL symbols are present in Unicode”. In other words, some APL symbols are not present in Unicode (the article goes on to identify the missing symbols). I think this is wonderful. There are over 100,000 characters in Unicode and you still can’t use it to write APL (I realize that APL predates Unicode so was not trying to fit into its character set, and also that there are implementations of APL that do limit themselves to characters generally available on a modern computer). I thought digraphs and trigraphs were bad enough – C++ programmers have it easy compared to APL programmers.

std::iota uses prefix operator ++ to generate successive values. I wish I had a good example other than integers to show. I don’t, but I’ll look out for one.

One final point, std::iota assumes that you have a sequence already created – it isn’t possible to use it with std::back_inserter (or any other inserter) to create the sequence. For my example it really doesn’t make any difference, but if there is a use case with a more complicated type, possibly one that doesn’t support a default constructor, it would be nice not to have to default construct the sequence first. Of course, if this ever becomes a real problem it is easy enough to write iota_n.


I am using std::shuffle (added in C++11) rather than std::random_shuffle (available since C++98) so that I can use the bright shiny new random number generators we get in C++11. (See Stephan T. Lavavej’s talk rand() Considered Harmful for more on why the new C++ random number generators are much better than rand).

For this example I haven’t bothered to seed the random number generator, and I am not that worried about perfect randomness anyway but I am trying to drag my programming style into the C++11 world. I suspect I will succeed at embracing C++11 at about the same time C++14 is unleashed.

While we are on the subject of randomness, here’s an interesting note from the C++11 standard. The description of the three shuffle algorithms includes this phrase:

Effects: Permutes the elements in the range [first,last) such that each possible permutation of those elements has equal probability of appearance.

Given a perfect random number generator the output from any of the shuffles should be perfectly uniform. Sounds reasonable enough, however there is at least one “obvious” shuffling algorithm that fails this test:

Iterate over the elements of the sequence and swap each element with a randomly chosen element (possibly itself).

Jeff Atwood goes into more details here, and also mentions a common algorithm which does produce uniform results, the Fisher–Yates shuffle, also known as the Knuth shuffle (both links take you to the same Wikipedia page).

Roughly speaking the Fisher–Yates shuffle is:

Pick an element randomly from the input sequence, add it to the result sequence and remove it from the input sequence. Repeat until the input sequence is empty.

It’s equivalent to picking a card at random from the deck, putting it into a separate “shuffled” pile and repeating until all the cards are in the “shuffled” pile.

Many years ago I looked at a couple of standard library implementations of shuffle algorithms and they were using the Fisher–Yates shuffle. I assumed this would still be true. Once again I was wrong. Both Visual Studio and GCC are using another shuffle algorithm which I haven’t been able to track down the name of (if anyone knows, please post a comment). As far as I can tell, the algorithms works something like this:

Assume you have a perfectly shuffled sequence of n elements. Add another element to the end of the sequence, then swap it with a randomly chosen element of the new (size n + 1) sequence (possibly itself).

I assume that this procedure produces another perfectly shuffled sequence (although I can’t prove it). Since we can start with a single element (perfectly shuffled) we can build up to any number of elements.

References for shuffling:

  • The C++11 standard Section 25.3.12 Random shuffle
  • Knuth: The Art of Computer Programming Volume 2, Seminumerical Algorithms 3rd edition Section 3.4.2 Random Sampling and Shuffling
  • Cormen, Leiserson, Rivest, Stein: Introduction to Algorithms 2nd edition Section 5.3 Randomized algorithms

std::bind and lambda functions 6

This is the last post in this series, and I am going to use it to wrap up a few things that didn’t quite fit in anywhere else.

As I said at the start, this series is not the ultimate guide to std::bind and lambda, it has focused on a few things that I find interesting. As always, if you want definitive information look at these three references:

As you’d expect, Herb Sutter has written and spoken about lambdas many times.


Passing values by reference

All of my examples have assumed that bound or captured arguments are captured by value. Take a look at this code:

double add( int i, float f )
{
    return i + f;
}

typedef std::function< double( int ) > SingleArgumentAdd;
float f( 10.0f );

SingleArgumentAdd fn0( std::bind( add, _1, f ) );
SingleArgumentAdd fn1( [ f ]( int i )
{ 
    return add( i, f );
} );

f = 20.0;

std::cout << fn0( 5 ) << ", ";
std::cout << fn1( 5 ) << "\n";

It writes out 15, 15 showing that the value of f that is used is the value when the function object is created.

We could pass in a pointer to f rather than f itself:

double add1( int i, float* pF )
{
    return i + *pF;
}
float f( 10.0f );
float* pF( &f );

SingleArgumentAdd fn0( std::bind( add1, _1, pF ) );
SingleArgumentAdd fn1( [ pF ]( int i )
{ 
    return add1( i, pF );
} );

f = 20.0;

std::cout << fn0( 5 ) << ", ";
std::cout << fn1( 5 ) << "\n";

This time we get 25, 25 - the pointer is not dereferenced until fn0 and fn1 are called.

What about passing f by reference? That should give us the same result as passing by pointer - the value of f is the value when the function is called. Here's the code:

float f( 10.0f );

SingleArgumentAdd fn0( std::bind( add, _1, std::cref( f ) ) );
SingleArgumentAdd fn1( [ &f ]( int i )
{ 
    return add( i, f );
} );

f = 20.0;

std::cout << fn0( 5 ) << ", ";
std::cout << fn1( 5 ) << "\n";

The code prints out 25, 25, just as we'd expect. We had to do two things differently though:

  1. We have to explicitly tell std::bind that f must be passed by reference (a const reference in this case). If we don't do this, std::bind will dereference f and use whatever value is in f when std::bind is called. std::cref is a function that returns an object that wraps the reference and makes sure that the reference is not dereferenced until fn0 and fn1 are called. See The C++ Programming Language, 4th Edition, section 33.5.1 for more information.
  2. Lambdas are much simpler. We write &f in the capture block to indicate that f should be captured by reference.

Whether we are using pointers, references, lambdas or std::bind there is one thing we need to keep in mind - when we dereference the reference, the object it refers to must still exist.


Nesting and chaining functions

When we have an object of type std::function it is a first class object in C++. We can copy it, pass it to functions, return it from functions, store it as a member variable and treat it like any other object.

This means that we can set up nested functions, we can chain functions together and we can use std::function objects for callbacks. As always, just because you can doesn't mean that you should. I am experimenting with chains of functions for my current project and, while I can make it work, it isn't pretty and it isn't clear what is going on.


operator () should be const for predicates

I am not going to go into this in detail because others have done a better job than I have. In general, when you write your own predicate function object, operator () should be const - it cannot update any stored state in the function object. There are two reasons for this:

  1. The standard does not place any restrictions on the number of times a predicate is copied (I think that std::for_each is the one exception). As Josuttis points out in The C++ Standard Library: A Tutorial and Reference (2nd Edition) section 10.1.4, a typical implementation of std::remove_if does copy the predicate and will give the wrong result if the predicate is storing state.
  2. By default, there is no guarantee that the algorithm will traverse the list from the beginning to the end. As far as I can tell, these are the only algorithms which are guaranteed to traverse from beginning to end: std::for_each, std::copy, std::move, std::accumulate, std::inner_product, std::partial_sum and std::adjacent_difference. I found these by searching the standard for the phrases starting from first and proceeding to last, and in order.

By default, the function object constructed by a lambda expression has a const operator (). If you really want it to be non-const you can declare the lambda mutable.


Why do we need std::find_if and std::count_if?

There are algorithms which have an overloaded predicate and non-predicate version, for example, std::is_sorted :

bool
is_sorted( ForwardIterator beg, ForwardIterator end );

bool
is_sorted( ForwardIterator beg, ForwardIterator end, BinaryPredicate op );

If we look at std::find we see why we can't overload it with a predicate and non-predicate version:

InputIterator
find( InputIterator beg, InputIterator end, const T& value );

Despite the fact that we can supply template specializations that will match a function, we can't know whether we are supplying a function because we are searching for a function in a container of functions or whether we are supplying a function to act as a predicate. That means we have to have a separate std::find_if (and std::count_if) function.


std::bind and member functions

In part 2 we looked at pointers to member functions, but I never demonstrated how these work with std::bind.

class MemberFunctionDemo
{
public:
    double add( int i, float f )
    {
        return i + f;
    }

    MemberFunctionDemo()
    {
        std::cout << std::bind( 
            &MemberFunctionDemo::add, 
            this, 
            _1, 
            20.0f )( 10 );
    }
};

The first argument to std::bind is always the function that will be called. For member functions we have to use & to get a pointer to the function, and the function name must be fully qualified. The second argument is the object to call the function on - it can be a pointer or a reference to the object, in my example, I am using this.


std::bind as a way of limiting scope

A colleague at Adobe pointed out that we can use a lambda to create a block of code that does not have access to all variables in the surrounding scope:

int i, j, k;

// Some code here...

[ i, j ](){
    // Only have access to i & j here.
    // No access to k
}();

I am not sure whether this is a Good Thing, a Bad Thing or just a Thing.

std::bind and lambda functions 5

Let’s start with a quick reminder of what a lambda expression looks like:

std::vector< int > v_i( functionReturningVector() );
std::vector< double > v_d;

float f = functionReturningFloat();

std::transform( 
    std::begin( v_i ), 
    std::end( v_i ), 
    std::back_inserter( v_d ), 
    [ f ]( int i ){
        return i + f;
    } );

The lambda expression is this part:

[ f ]( int i ){
    return i + f;
} 

The standard says “Lambda expressions provide a concise way to create simple function objects”. The lambda expression we have just seen corresponds to the Adder function object that we looked at earlier in this series:

class Adder
{
public:
    Adder( float f )
    : f_( f )
    {}
    
    double operator()( int i ) const
    {
        return i + f_;
    }
    
private:
    float f_;
};

Repeating what I said in this post:

The lambda function consists of three parts, contained in three different types of brackets. It starts with square brackets – [] – moves on to round brackets – () – then finishes off with curly brackets – {}.

The three parts correspond to the three essential features of the function object we created for solution #2:

  1. The value(s) passed into the constructor of the function object appear in the square brackets – []
  2. The value(s) passed into operator () appear in the round brackets – ()
  3. The code in operator () appears in the curly brackets – {}

If we take our three options: a custom function object; std::bind wrapping a function and a lambda expression, we can find the commonality between them:

  • f is the value passed in to the constructor of our function object.
  • f is the value bound by std::bind.
  • f is the value we place in the square brackets of the lambda (in standardese, the square brackets are called a lambda-capture, f is a capture-list with a single element).
  • i is the value passed to the function call operator defined on the callable object. An object of type Adder is a callable object, the result of std::bind is a callable object and a lambda expression is a callable object.
  • { return i + f; } is the code that we actually want to execute. For std::bind the code we want to execute has to be wrapped in a function, for a custom function object or a lambda we can just use the code directly.

If we want to, we can invoke the function call operator on a lambda directly:

float f = functionReturningFloat();

double d = [ f ]( int i )
{
    return i + f;
}( 7 );

(we probably don’t want to, although I am sure somebody has already come up with a use for this)


A lambda-expression is an expression, and expressions have a type. We tried to work out what the type of the result of std::bind was last time – let’s try the same exercise with a lambda expression.

Section 5.1.2 of the standard talks about the type of a lambda expression. The type is a unique, unnamed nonunion class type. The standard tells us plenty about the properties of the type without ever revealing what the type actually is. As with std::bind I can force a compiler error and see what type the compiler is using. For Visual Studio I get this:

test_bl5_1::<lambda_d35a8a85fcf19231607c0773125e04ed>

For GCC I get this:

test_bl5_1()::__lambda1

(The lambda in question is declared in function test_bl5_1)

As with the type of the return value of std::bind, we are clearly not meant to use these types deliberately – the standard is telling us so, they are different between different compilers, in the case of Visual Studio, the “type” isn’t really a type at all – it is some implementation magic, the GCC type starts with an underscore, and, in case I haven’t mentioned it enough, the standard tells us that we can’t use these types. Believe the standard – it’ll make life easier. This diversion into the actual types of std::bind and lambda has been exactly that – a diversion from our main purpose of actually using these things.

So we are in the same situation as we were with std::bind. We can’t use the type directly, we don’t know what it is. We do know what the properties of the type are – in particular we can apply operator () to objects of that type.

We can use the type deduction facilities of C++11 to handle lambdas without needing to know their type. We have already used a lambda with a templated function (std::transform) and auto and decltype work exactly as we would expect:

auto fn = [ f ]( int i ){ return i + f; };
typedef decltype( fn ) SingleArgumentAdd;

Last week we looked at std::function and saw that std::function can wrap a callable object. Since a lambda expression is a callable object, std::function can wrap a lambda expression:

typedef std::function< double( int ) > SingleArgumentAdd;

SingleArgumentAdd fn;

fn = [ f ]( int i ){ return i + f; };

std::cout << fn( 42 ); // And of course we can call it

If we need to save callable objects, std::function is the way to go. std::function can be the type of a member variable, it can be passed to and returned from functions. std::function is our all purpose "thing that supports operator ()"

Finally, there is one thing that I have glossed over. When I was describing the commonality between std::bind, a function object and a lambda I looked at every variable and every type except one - the type of the result of invoking operator ().

If we look at the lambda we've been working with, we can see that if we remove the capture block (the bit inside square brackets) we have something that is very close to being a function declaration:

( int i )
{ 
    return i + f;
};

It is missing the function name (which is what we'd expect - a lambda is an unnamed function), but it is also missing the return type. In this case, because the lambda consists of a single statement that is just a return we don't have to specify the return type, the compiler will deduce it for us - that gives us another place where the compiler does type deduction.

In our case, the result of i + f is a double - that is exactly what we want. But what if we did need a return type, either because the type of the expression was not what we wanted, or because we had more complicated code within the lambda?

Lambdas use the new trailing-return-type syntax so if we wanted to be explicit about the function call operator returning a double we would write this:

[ f ]
( int i ) -> double
{ 
    return i + f;
};

That's it for this post, and almost it for this series. I have one more post in mind which will tie up a few loose ends then I'll move on to something else.

std::bind and lambda functions 4

I have been sloppy in my use of language. I am going to try to be less sloppy. Instead of using function or function object (and I suspect getting them reversed in some cases) I am going to start using the standard-approved term callable object. As always, the standard provides a strict definition, for us it will be good enough to say that a callable object allows us to apply operator() to it. Callable objects include functions, function objects, whatever it is we get back from std::bind, and lambda functions.

In part 3 I said I would reveal the return type of std::bind. If we look up the relevant section of the standard (20.8.9.1.2) we see the following declaration:

template
unspecified bind(F&& f, BoundArgs&&... bound_args);

Unspecified. That doesn’t seem terribly helpful. Of course, this being the standard, there is a definition for unspecified. Section 1.3.25 of the standard states:

unspecified behavior behavior, for a well-formed program construct and correct data, that depends on the implementation [ Note: The implementation is not required to document which behavior occurs. The range of possible behaviors is usually delineated by this International Standard. —end note ]

Still not terribly helpful. Perhaps we can find a way to determine the type anyway. By setting up an error condition I can force the compiler to tell me what type it is actually using for the result of std::bind. Assuming that I am binding add in the same way I have been doing, on Visual Studio 2013 I get this for the type:

std::_Bind<true,double,double (__cdecl *const ) (int,float),int,float &>

and on GCC I get this:

std::_Bind_helper<false, double (&)(int, float), int, float&>::type 
{aka std::_Bind<double (*(int, float))(int, float)>}

There are only three problems.

  1. The types are different on the two different compilers (because they use different implementations of the standard library).
  2. Both compilers use types beginning with an underscore – any name starting with an underscore is reserved for the implementation (see 17.6.4.3.2 in the standard).
  3. The standard has already told us clearly and explicitly that the type is unspecified – it can vary between compilers, it can vary between releases of the same compiler, it can vary between updates to the standard library. We cannot rely on the type staying the same.

However much we twist and turn, determining the actual type of the returned value from std::bind is a non starter. Even when we can find out what the type is, we can’t rely on it.

Fortunately, C++11 gives us a way of avoiding knowing the actual type – we can do type deduction.

We can assign the result of std::bind to a variable using auto:

auto fn = std::bind( add, _1, f );

That works for the situations where we can use auto, but doesn’t help us when we want to store the object in a (non-template) class or pass it to a (non-template) function – we can’t declare a parameter as auto.

We’ve already seen one of the other ways we can use type deduction with std::bind. We used std::bind successfully with std::transform, because std::transform is a template function and does type deduction. If we are passing the result to a template function or template class we might can use the result of std::bind without ever needing to know what type it is.

There is a third option. C++11 added decltype. You hand decltype an expression (which it does not evaluate, it just uses it to deduce the type), and it gives us the type of that expression. For example:

auto fn = std::bind( add, _1, f );
typedef decltype( fn ) SingleArgumentAdd;

decltype( fn ) evaluates to the type of fn (at compile time), and we typedef the result to SingleArgumentAdd. This is useful, finally we have an actual type (as in type of an object) that we can type (as in hit keys on the keyboard). It still isn’t perfect though. The example above only works within the current scope – we can’t put the typedef into a header file (at least not without doing other things in the header file that we really shouldn’t). Turns out there is a solution to this problem too. We can do this:

typedef decltype( 
    std::bind( 
        add, 
        _1, 
        std::declval< float >() ) ) SingleArgumentAdd;

We have something new – std::declval. Let’s work out what’s going on here. decltype requires an expression. In our first example of its use we gave it the expression fn. In the second example we want to give it an expression involving std::bind. This means that we need to pass arguments of the right type to std::bind. The first argument is easy – it’s the callable object that we’re trying to wrap. The whole point of this exercise is to wrap that callable object so we need to have it visible. The second argument is also easy, it’s the placeholder _1. For the third argument we must pass it a float value (not the float type, but a value of type float). If we happened to have a float variable in the current scope we could use that. We could also pass it a float constant such as 1.0f. The constant would work well here because it is a nice simple value, but what if it wasn’t a float? What if it was something which required several arguments to its constructor? We don’t want an actual value, just something that represents the value and has the correct type. For that, we can use std::declval. To quote Stroustrup:

The intent is to use declval< X > as a type where the type of a variable of type X is needed.

(The C++ Programming Language Fourth Edition, section 35.4.2)

std::declval does return a value, but we cannot use that value. Since decltype does not evaluate its expression we are safe.

Having jumped through all of those hoops to get the type SingleArgumentAdd it turns out to be a pain to use. When we looked at the definition of unspecified from the standard it stated:

The range of possible behaviors is usually delineated by this International Standard.

That range of possible behaviors for SingleArgumentAdd is pretty small. We already know that the result of std::bind can have operator () applied to it, the standard adds requirements for MoveConstructable and CopyConstructable, but it adds no other requirements. In particular, default construction and assignment are not available:

float f( functionReturningFloat() );
float f1( otherFunctionReturningFloat() );

SingleArgumentAdd fn; // Error, cannot default construct

SingleArgumentAdd fn2 = std::bind( add, _1, f );

fn2 = std::bind( add, _1, f1 ); // Error, cannot assign

Even if we could default construct and assign SingleArgumentAdd it would still be unsatisfactory. Our current definition of SingleArgumentAdd only lets us store the return value from std::bind. Wouldn’t it be nice to have a type that will allow us to store any callable object that has the correct call signature.

A call signature is the name of a return type followed by a parenthesized comma-separated list of zero or more argument types. – C++11 standard, section 20.8.1

The C++11 standard gives us exactly what we want – std::function. Let’s set up a function with the correct parameter and return types:

double singleArgAdd( int i )
{
    return i + 7;
}

Now let’s look at what we can do with std::function:

typedef std::function< double( int ) > SingleArgumentAdd;

SingleArgumentAdd fn; // We can default construct it

fn = std::bind( add, _1, f );  // We can assign the result
                               // of std::bind to it

fn = singleArgAdd;  // We can assign a function 
                    // of the correct signature to it

std::cout << fn( 42 ); // And of course we can call it

std::function wraps a callable object. When we call operator () on a std::function object, it calls operator () on the callable object it is wrapping, known as the target. std::function therefore needs to know what the return type and the parameter types are for operator (). The syntax that we use for the call signature is very similar to the syntax for declaring a function pointer - std::function< double( int ) >

Finally, an uninitialized std::function object is known as empty (and is even displayed like that in the Visual Studio debugger). If we attempt to call an empty std::function, it will throw the exception std::bad_function_call. Fortunately, it is easy to test a std::function object to see if it is empty or not:

SingleArgumentAdd fn2;

// Some code that might or might not assign something to fn2

if( fn2 )
{
    fn2( 42 );
}

That's it for this week, next week we will finally get to lambda functions.

std::bind and lambda functions 3

In part 2 we saw that by wrapping a function inside a function object we could take a function that requires two arguments and turn it into a function object that requires one argument. The code we ended up with looks like this:

class Adder
{
public:
    Adder( float f )
    : f_( f )
    {}
    
    double operator()( int i ) const
    {
        return add( i, f_ );
    }
    
private:
    float f_;
};
std::vector< int > v_i( functionReturningVector() );
std::vector< double > v_d;

float f = functionReturningFloat();

std::transform( 
    std::begin( v_i ), 
    std::end( v_i ), 
    std::back_inserter( v_d ), 
    Adder( f ) );

There is a problem here – we had to write a lot of boilerplate in order to wrap one function – Adder is 15 lines long, the call to add is a single line. Fortunately, C++11 gives us a couple of ways to get the same effect but with much less boilerplate. We can use std::bind to adapt add as follows:

std::vector< int > v_i( functionReturningVector() );
std::vector< double > v_d;

float f = functionReturningFloat();

std::transform( 
    std::begin( v_i ), 
    std::end( v_i ), 
    std::back_inserter( v_d ), 
    std::bind( add, std::placeholders::_1, f ) );

[ Aside:

I have put in the fully scoped name of the placeholder _1 just to show where it comes from. Even though I am normally a fan of using the fully qualified name this one is just so ugly I will assume that:

using namespace std::placeholders;

is in use for the rest of my examples. This simplifies our loop to:

std::transform( 
    std::begin( v_i ), 
    std::end( v_i ), 
    std::back_inserter( v_d ), 
    std::bind( add, _1, f ) );

End aside ]


We have two std::transform loops, one of which uses Adder( f ), the other of which uses std::bind( add, _1, f ).

We know what the code Adder( f ) does. It creates a function object of type Adder which implements operator () to call add with one argument that comes from the function call operator, and another argument that was supplied in the constructor. We need the code std::bind( add, _1, f ) to do something very similar.

Let’s remove std::bind from the context of std::transform so we can focus on one thing at a time. Here’s some code we can use to look at std::bind in isolation:

float f = functionReturningFloat();
auto fn = std::bind( add, _1, f );
std::cout << fn( 7 ) << "\n";

The call to std::bind returns a function object that we assign to the variable fn. I am using auto because I don't want to get into the question of what the type of the return value of std::bind is yet (we'll look at it later). fn is a function object that takes a single argument to its function call operator. We can therefore invoke the function call operator with a single argument - fn( 7 ) and write out the result, which will be a double because the return value of add is a double.

There are a lot of moving parts here. I am going to walk through them.

We have three functions in play:

  • add The function we ultimately want to call. The function we are going to wrap.
  • fn The function (object) that wraps add. This is the thing that converts a two argument function into a single argument function.
  • std::bind The function that creates fn from add.

We call std::bind to create fn. Later we call fn which in turn will call add.

The function we want to wrap - add - is a two argument function. Anything that calls add must call it with two arguments. Therefore, fn must call add with two arguments. Since std::bind is creating fn, std::bind must know what the two arguments are. When we call std::bind we not only supply it with the function to be wrapped - add - we also supply it with the two arguments that must be passed to add. These two arguments are _1 and f.

The second argument is easy. f is a variable of type float and we already know that the second argument to add must be of type float. It all matches just as it should.

The first argument is more interesting - _1. Arguments of the form _n are called placeholders, that's why they live in the std::placeholders namespace. The placeholder argument links the argument we pass when we call fn to the first argument that gets passed to add. Let's look at the code again:

auto fn = std::bind( add, _1, f );
std::cout << fn( 7 ) << "\n";

_1 says "take the first argument that is passed to fn and pass it to add as the first argument". In this example, when we call fn( 7 ), that ultimately results in a call to add( 7, f )

Any argument that is passed to fn can be passed on to add in any position. The placeholder we use (_1, _2 etc.) tells us which argument to use from the argument list passed to fn::operator (). The position of the placeholder in the call to std::bind tells us which position that argument will end up in when we call add.

Let me lay out the code slightly differently:

float f = 10.0f;
auto fn = std::bind( 
    add,        // add is the function we are wrapping
    _1, f       // add has two arguments therefore 
                // we supply two arguments here.
    );

// At this point, std::bind has been called but 
// add and fn have not been called

std::cout << fn( 7 ) << "\n";   // Call fn, which in 
                                // turn calls add.

Here's an example with a different placeholder:

float f = 10.0f;
auto fn = std::bind( add, _2, f );
std::cout << "result = " << fn( 7, 12 ) << "\n";

We are now using placeholder _2. This means that we have to supply the call to fn with two arguments (and if we don't we get an unfriendly error message). The output of this piece of code is:

result = 22

showing that the second argument from the call fn is the one that is used. Of course this is a nonsensical example because there is no point in passing two arguments to fn in the first place - the first argument is ignored so there was no point in supplying it.

There are plenty of other cute tricks we can play. Since an int is convertible to a float we can do this:

auto fn = std::bind( add, _1, _2 );
std::cout << "result = " << fn( 7, 12 ) << "\n";
result = 19

or this:

auto fn = std::bind( add, _2, _1 );
std::cout << "result = " << fn( 7, 12 ) << "\n";
result = 19

Since add is commutative the result is the same - I am just showing that we can change the order in which the arguments are supplied to std::bind (and therefore the order of the arguments supplied to add).

We can use a placeholder more than once:

auto fn = std::bind( add, _1, _1 );
std::cout << "result = " << fn( 7 ) << "\n";
result = 14

Or not use a placeholder at all (yes, doing this in real life is a waste of time, I just want to show all the possibilities):

auto fn = std::bind( add, 6, 5 );
std::cout << "result = " << fn() << "\n";
result = 11

I can call the object returned from std::bind directly:

float f = 10.0f;
double d = std::bind( add, _1, f )( 10 );
std::cout << "result = " << d << "\n";
result = 20

For a not-pointless example, back in this post I used placeholders to swap the order of arguments to a comparator in order to reverse the sorting order:

std::partial_sort( 
    ReverseIter( pivotElementIterator ), 
    ReverseIter( displayBegin ), 
    ReverseIter( std::begin( files ) ), 
    std::bind( comparator, _2, _1 ) );

Errors

There are a number of things we can do wrong.

We can specify too many arguments in the std::bind call. add takes two arguments, let's try giving it three:

double d = std::bind( add, _1, f0, f1 )( 10 );

Visual Studio gives the surprisingly helpful error message:

// error C2197: 'double (__cdecl *)(int,float)' : too many arguments for call

The message from GCC is longer and more confusing, but does include the words "too many arguments to function".


We can specify too few arguments in the std::bind call:

double d = std::bind( add, _1 )( 10 );

I get the following (also very helpful) error message from Visual Studio:

error C2198: 'double (__cdecl *)(int,float)' : too few arguments for call

As before, the GCC message is more verbose, but does include the words "too few arguments to function".


We can specify too few arguments in the call to fn:

auto fn = std::bind( add, _1, f );
std::cout << "result = " << fn() << "\n";

Sadly the resulting error message is almost incomprehensible.


Specifying too many arguments in our call to fn does not result in an error:

auto fn = std::bind( add, _1, f );
std::cout << "result = " << fn( 
    7, 15.6, 
    std::complex< double >( 1.0, 2.0 ) ) << "\n";

As always, just because you can do something doesn't mean that you should do it.


We can give an argument of the wrong type to std::bind:

std::complex< double > c( 1.0, 2.0 );
auto fn = std::bind( add, _1, c );
fn( 7.0 );   // Error reported here
error C2664: 'double (int,float)' : cannot convert argument 2 from 'std::complex' to 'float'

Interestingly, the error is not reported until we try and call fn.


We can give an argument of the wrong type when we call fn:

std::complex< double > c( 1.0, 2.0 );
auto fn = std::bind( add, _1, f );
fn( c );
error C2664: 'double (int,float)' : cannot convert argument 1 from 'std::complex' to 'int'

Finally, let's get back to our std::transform call:

std::transform( 
    std::begin( v_i ), 
    std::end( v_i ), 
    std::back_inserter( v_d ), 
    std::bind( add, _1, f ) );

std::transform is a function. When we call a function, we evaluate all of the arguments to that function. One of those arguments is the result of a call to std::bind. As we have seen, the result of calling std::bind is a function object that wraps add and turns it into a single argument function. Look at the definition of std::transform (this is from the GCC implementation. I have removed some error checking and done some reformatting):

template<
    typename _InputIterator, 
    typename _OutputIterator,
    typename _UnaryOperation >
_OutputIterator
transform(
    _InputIterator __first, 
    _InputIterator __last,
    _OutputIterator __result, 
    _UnaryOperation __unary_op)
{
    for (; __first != __last; ++__first, ++__result)
        *__result = __unary_op(*__first);
    
    return __result;
}

std::transform loops from __first to __last, and each time around the loop invokes the function call operator on __unary_op with a single argument - __unary_op(*__first). We must supply std::transform with a thing-that-supports-the-function-call-operator-taking-a-single-argument. That is exactly what we have done using std::bind.

More to come in part 4, including the mysterious type of the return value of std::bind.

What header?

I am bad at remembering what header file I need to include to use which standard library facility. I can handle the obvious ones like vector, but that still leaves many where I have to go and look it up. To make looking it up a little easier, I have created a page that lists all of the symbols in the standard library sorted by various criteria – including what header file they are in. There is a C++98 version and a C++11 version. I am sure there are mistakes in the pages, I will correct any mistakes I am told about.

C++11 Standard Library Symbols
C++98 Standard Library Symbols

std::bind and lambda functions 2

In part 1 we looked at various things we can do with functions in C. Since C++ is mostly a superset of C we can do those things in C++ as well. C++ also gives us some other ways of creating functions though. Let’s dispense with the easy options quickly then get on to the more interesting possibilities.


A function in a namespace

namespace arithmetic
{
    double add( int i, float f )
    {
        return i + f;
    }
}
double (*pFn)( int, float ) = arithmetic::add;

A static function in a class

class TestClass
{
public:
    static double add( int i, float f )
    {
        return i + f;
    }
    
};
double (*pFn)( int, float ) = TestClass::add;

Nothing tricky about putting functions into a namespace, or a class acting as a namespace.


Overloaded functions

double add( int i, float f )
{
    return i + f;
}

double add( int i, double f )
{
    return i + f;
}
double (*pFn)( int, float ) = add;

Even overloaded functions work just as we’d expect. Section 13.4 of the C++11 standard states:

The function selected is the one whose type is identical to the function type of the target type required in the context.

Let’s not get too confident though, overloads are going to return to trouble us later.


Pointer to member function

class TestClass2
{
public:
    double add( int i, float f )
    {
        return i + f;
    }
    
};

Ignore the fact that add makes absolutely no sense as a member function (very few of these examples make any sense outside of a narrow context).

What is the type of a pointer to member function? We already know what the type of add would be if it wasn’t a member function:

double (*)( int, float );

The key thing missing is the name of the class of which the function was a member. The type has to specify the class for type safety – you don’t want anyone trying to call the add function for an object that doesn’t even have an add function (and yes, you can force this with casting, but you can force almost anything with casting). Given that * is in the place where the function name used to go, we put the class name in the obvious (ish) place:

double (TestClass2::*)( int, float );

We can write this code to get the address of the add function:

double (TestClass2::*pFn)( int, float ) = &TestClass2::add;

And we can call it like this:

TestClass2 c;
(c.*pFn)( 10, 20.0f );
TestClass2* p = &c;
(p->*pFn)( 10, 20.0f );

We have two new operators .* and ->*.

Notice that this time we had to use the address-of operator & to get the address of the function and, because the function call operator () is higher precedence than .* and ->*, we need the brackets around c.*pFn and p->*pFn.

(This also works with virtual functions).


There is one more trick we need to know about before we can start applying our knowledge to std::bind and lambda. At the beginning of part 1 I wrote:

The round brackets are known as the function call operator.

At the time, referring to the function call operator might have seemed like overkill. It’s a C function, you just call the thing by passing in the arguments in the brackets. That nomenclature is important though because C++ supplies us with a function call operator that we can use as a member function of a class:

class Adder
{
public:
    double operator()( int i, float f ) const
    {
        return add( i, f );
    }
    
};

(There is a reason why I am calling the add function rather than just doing the addition correctly – that reason will become apparent later).

We call the member function operator() by applying the function call operator to an object of type Adder :

Adder a;
a( 10, 20.0f );

While describing a function in part 1 I said:

we take a thing with a name … and apply an operator – () – to it.

Well, that’s what we have here. We have a thing with a name (a) and we’re applying the function call operator (()) to it. That calls the member function operator().

We have a thing that is acting like a function but is actually an object. It is often known as a function object or functor. At first glance a function object seems like a rather heavyweight way of creating a function, but objects have a really useful property – they can store state.

Let’s look at what we can do with that property of storing state. We don’t have to pass in both arguments to operator(), we can have one of the arguments specified at construction time:

class Adder
{
public:
    Adder( float f )
    : f_( f )
    {}
    
    double operator()( int i ) const
    {
        return add( i, f_ );
    }
    
private:
    float f_;
};
Adder a( 20.0f );
a( 10 );

Again, this just looks like a heavyweight way of creating a function-like thing. It’s more work, it’s not as convenient to call – why should we bother? We bother because we have taken a function that takes two arguments to the function call operator – add – and turned it into a function (object) that takes a single argument to the function call operator. The second argument is supplied in the constructor. This is the C++ version of currying.

The one-argument version is useful because there are algorithms that expect a function that takes a single argument, for example the single source sequence version of std::transform :

std::vector< int > v_i( functionReturningVector() );
std::vector< double > v_d;

float f = functionReturningFloat();
Adder a( f );
std::transform( 
    std::begin( v_i ), 
    std::end( v_i ), 
    std::back_inserter( v_d ), 
    a );

We don’t even need to create a named Adder object, we can just use a temporary:

std::transform( 
    std::begin( v_i ), 
    std::end( v_i ), 
    std::back_inserter( v_d ), 
    Adder( f ) );

The call to std::transform is equivalent to doing this:

for( 
    std::vector< int >::iterator i( std::begin( v_i ) ); 
    i != std::end( v_i ); 
    ++i )
{
    v_d.push_back( add( *i, f ) );
}

And this is why the fact that we took a two argument function and turned it into a single argument function is important. std::transform requires a single argument function. By wrapping add in a function object and specifying the second argument in the constructor we ended up with a single argument function (a unary operation in standardese).

If we take a look at the declaration of std::transform in the standard (again, just the single-source-sequence one), we see this:

template<
    class InputIterator, 
    class OutputIterator,
    class UnaryOperation >
OutputIterator
transform(
    InputIterator first, 
    InputIterator last,
    OutputIterator result, 
    UnaryOperation op);

The final parameter of the function std::transform is a unary operation – it takes one argument. Internally, std::transform is going to use the function call operator on op, i.e. it is going to perform op( elem ) for each element in the input sequence. It is all templated, and op follows the normal rules when passing an object to a templated parameter – whatever functions are called on op must be supported by the type of op type. Since the function that std::transform calls is operator() with a single argument, the type of op must support that. We know of two “things” in C++ world that can support the function call operator – functions and function objects.

What I have been trying to say in multiple different ways is that by wrapping a function (in this case add) in a function object we can turn a function into something requiring fewer arguments at the point where the function call operator is invoked. Of course the extra arguments have to come from somewhere, and in this case they are specified in the constructor of the function object. We can give the standard algorithm exactly what it needs (an operator() with the right number and type of parameters) but supply any necessary extra paraeters via the constructor of the function object.

Here’s another example. Let’s take the file metadata structure I used in a previous post:

struct FileMetaData
{
    std::string fileName_;
    std::string directory_;
    
    std::size_t size_;
    std::time_t lastWriteTime_;
};

Assume we have an unsorted vector of file metadata objects and we want to search for a file with a given file name. std::find_if looks like the obvious choice since it lets us specify our own predicate for the search.

It seems like this function would be useful:

bool compareFileName( 
    FileMetaData const& metaData, 
    std::string const& fileName )
{
    return metaData.fileName_ == fileName;
}

compareFileName will be useful, but it is a two argument function and std::find_if takes a UnaryPredicate. Unary means it expects a single argument, and according to Merriam-Webster a predicate is:

something that is affirmed or denied of the subject in a proposition in logic

For our purposes that comes down to “something returning a bool”. compareFileName is a predicate (it will affirm or deny whether the file names match), but it is not unary.

We’re going to use our “wrap it in a function object” trick to turn compareFileName into a single argument function (object):

class CompareFileName
{
public:
    CompareFileName( std::string const& fileName )
    : fileName_( fileName )
    {
    }
    
    bool operator()( FileMetaData const& metaData ) const
    {
        return compareFileName( metaData, fileName_ );
    }
    
private:
    std::string fileName_;
};

Then we can use CompareFileName like this:

std::vector< FileMetaData > files( getFiles() );
std::string fileNameToFind( getFileName() );

std::vector< FileMetaData >::iterator i(
    std::find_if(
        std::begin( files ),
        std::end( files ),
        CompareFileName( fileNameToFind ) ) );

There is one more thing. When we looked at overloaded functions before we had no problem assigning them to a function pointer. If we try and use overloaded functions in an algorithm we run into trouble.

Here is a pair of overloaded functions:

double negate( int i )
{
    return -i;
}

double negate( float f )
{
    return -f;
}

And here is a way in which we might want to use the integer version:

std::vector< int > v_i( functionReturningVector() );
std::vector< double > v_d;

std::transform( 
    std::begin( v_i ), 
    std::end( v_i ), 
    std::back_inserter( v_d ), 
    negate );

Just using the name of the function directly doesn’t work, we get this error message from Visual Studio:

error C2914: 'std::transform' : cannot deduce template argument as function argument is ambiguous

How come it worked before but doesn’t work now? The reason it worked before was because of this clause in the standard:

The function selected is the one whose type is identical to the function type of the target type required in the context.

The important words are target type. When we are supplying an argument to a template function we don’t know what the target type is. We can’t have a target type until we know the function type, but in order to pick the right function (and hence find out its type) we need to know the target type.

There are at least three ways around this. One is to assign the function to a function pointer where we know the target type and therefore won’t get an error:

double (*pNegateFn)( int ) = negate;

std::transform( 
    std::begin( v_i ), 
    std::end( v_i ), 
    std::back_inserter( v_d ), 
    pNegateFn );

Secondly, we can use what is possibly the most legitimate cast ever:

std::transform( 
    std::begin( v_i ), 
    std::end( v_i ), 
    std::back_inserter( v_d ), 
    static_cast< double (*)( int ) >( negate ) );

And finally we can specify the template arguments to std::transform explicitly. We don’t normally do this with functions, but in this case it gives us the target type that the compiler is looking for:

std::transform< 
  std::vector< int >::const_iterator, 
  std::back_insert_iterator< std::vector< double > >,
  double (*)( int ) >( 
    std::begin( v_i ), 
    std::end( v_i ), 
    std::back_inserter( v_d ), 
    negate );

Note that I had to specify the type of std::back_inserter for this to compile. When you are specifying template arguments explicitly you need to get them right.

In practice I rarely run into problems with overloaded functions, when I do, I use static_cast to get around them.


We’ve reached the end of part 2 and still haven’t talked about std::bind or lambda. At least we’re looking at the right language now, and we have laid all of the groundwork for the next exciting episode where I promise we will get to std::bind (and maybe lambda).