Quote of the week – object oriented vs. functional programming

Object oriented programming makes code understandable by encapsulating moving parts. Functional programming makes code understandable by minimizing moving parts.

Michael Feathers

This week’s quote comes with an article, to be specific, this piece by John Carmack on functional programming. There is plenty to like in the article but two points in particular struck me:

  1. A function doesn’t have to be 100% pure to get many of the benefits of purity. Of course you don’t want to use the fact that it’s not 100% pure as an excuse for making it even less pure either.
  2. A more functional programming style often results in more parameters to a function.

A functional style does lead to more parameters, but as Carmack notes, it is much easier to test the functions in isolation. Also, if a function needs a certain set of data to do its job, it has to obtain that data from somewhere. If the data doesn’t come from the arguments passed in, it either comes from globals, or member variables of a class, and that leads us right into the issues Carmack describes with state – storing it and updating it. There are definitely costs associated with all the options, but I am beginning to think that a couple more parameters on a function might not be the worst thing in the world.

std::bind and lambda functions 4

I have been sloppy in my use of language. I am going to try to be less sloppy. Instead of using function or function object (and I suspect getting them reversed in some cases) I am going to start using the standard-approved term callable object. As always, the standard provides a strict definition, for us it will be good enough to say that a callable object allows us to apply operator() to it. Callable objects include functions, function objects, whatever it is we get back from std::bind, and lambda functions.

In part 3 I said I would reveal the return type of std::bind. If we look up the relevant section of the standard (20.8.9.1.2) we see the following declaration:

template
unspecified bind(F&& f, BoundArgs&&... bound_args);

Unspecified. That doesn’t seem terribly helpful. Of course, this being the standard, there is a definition for unspecified. Section 1.3.25 of the standard states:

unspecified behavior behavior, for a well-formed program construct and correct data, that depends on the implementation [ Note: The implementation is not required to document which behavior occurs. The range of possible behaviors is usually delineated by this International Standard. —end note ]

Still not terribly helpful. Perhaps we can find a way to determine the type anyway. By setting up an error condition I can force the compiler to tell me what type it is actually using for the result of std::bind. Assuming that I am binding add in the same way I have been doing, on Visual Studio 2013 I get this for the type:

std::_Bind<true,double,double (__cdecl *const ) (int,float),int,float &>

and on GCC I get this:

std::_Bind_helper<false, double (&)(int, float), int, float&>::type 
{aka std::_Bind<double (*(int, float))(int, float)>}

There are only three problems.

  1. The types are different on the two different compilers (because they use different implementations of the standard library).
  2. Both compilers use types beginning with an underscore – any name starting with an underscore is reserved for the implementation (see 17.6.4.3.2 in the standard).
  3. The standard has already told us clearly and explicitly that the type is unspecified – it can vary between compilers, it can vary between releases of the same compiler, it can vary between updates to the standard library. We cannot rely on the type staying the same.

However much we twist and turn, determining the actual type of the returned value from std::bind is a non starter. Even when we can find out what the type is, we can’t rely on it.

Fortunately, C++11 gives us a way of avoiding knowing the actual type – we can do type deduction.

We can assign the result of std::bind to a variable using auto:

auto fn = std::bind( add, _1, f );

That works for the situations where we can use auto, but doesn’t help us when we want to store the object in a (non-template) class or pass it to a (non-template) function – we can’t declare a parameter as auto.

We’ve already seen one of the other ways we can use type deduction with std::bind. We used std::bind successfully with std::transform, because std::transform is a template function and does type deduction. If we are passing the result to a template function or template class we might can use the result of std::bind without ever needing to know what type it is.

There is a third option. C++11 added decltype. You hand decltype an expression (which it does not evaluate, it just uses it to deduce the type), and it gives us the type of that expression. For example:

auto fn = std::bind( add, _1, f );
typedef decltype( fn ) SingleArgumentAdd;

decltype( fn ) evaluates to the type of fn (at compile time), and we typedef the result to SingleArgumentAdd. This is useful, finally we have an actual type (as in type of an object) that we can type (as in hit keys on the keyboard). It still isn’t perfect though. The example above only works within the current scope – we can’t put the typedef into a header file (at least not without doing other things in the header file that we really shouldn’t). Turns out there is a solution to this problem too. We can do this:

typedef decltype( 
    std::bind( 
        add, 
        _1, 
        std::declval< float >() ) ) SingleArgumentAdd;

We have something new – std::declval. Let’s work out what’s going on here. decltype requires an expression. In our first example of its use we gave it the expression fn. In the second example we want to give it an expression involving std::bind. This means that we need to pass arguments of the right type to std::bind. The first argument is easy – it’s the callable object that we’re trying to wrap. The whole point of this exercise is to wrap that callable object so we need to have it visible. The second argument is also easy, it’s the placeholder _1. For the third argument we must pass it a float value (not the float type, but a value of type float). If we happened to have a float variable in the current scope we could use that. We could also pass it a float constant such as 1.0f. The constant would work well here because it is a nice simple value, but what if it wasn’t a float? What if it was something which required several arguments to its constructor? We don’t want an actual value, just something that represents the value and has the correct type. For that, we can use std::declval. To quote Stroustrup:

The intent is to use declval< X > as a type where the type of a variable of type X is needed.

(The C++ Programming Language Fourth Edition, section 35.4.2)

std::declval does return a value, but we cannot use that value. Since decltype does not evaluate its expression we are safe.

Having jumped through all of those hoops to get the type SingleArgumentAdd it turns out to be a pain to use. When we looked at the definition of unspecified from the standard it stated:

The range of possible behaviors is usually delineated by this International Standard.

That range of possible behaviors for SingleArgumentAdd is pretty small. We already know that the result of std::bind can have operator () applied to it, the standard adds requirements for MoveConstructable and CopyConstructable, but it adds no other requirements. In particular, default construction and assignment are not available:

float f( functionReturningFloat() );
float f1( otherFunctionReturningFloat() );

SingleArgumentAdd fn; // Error, cannot default construct

SingleArgumentAdd fn2 = std::bind( add, _1, f );

fn2 = std::bind( add, _1, f1 ); // Error, cannot assign

Even if we could default construct and assign SingleArgumentAdd it would still be unsatisfactory. Our current definition of SingleArgumentAdd only lets us store the return value from std::bind. Wouldn’t it be nice to have a type that will allow us to store any callable object that has the correct call signature.

A call signature is the name of a return type followed by a parenthesized comma-separated list of zero or more argument types. – C++11 standard, section 20.8.1

The C++11 standard gives us exactly what we want – std::function. Let’s set up a function with the correct parameter and return types:

double singleArgAdd( int i )
{
    return i + 7;
}

Now let’s look at what we can do with std::function:

typedef std::function< double( int ) > SingleArgumentAdd;

SingleArgumentAdd fn; // We can default construct it

fn = std::bind( add, _1, f );  // We can assign the result
                               // of std::bind to it

fn = singleArgAdd;  // We can assign a function 
                    // of the correct signature to it

std::cout << fn( 42 ); // And of course we can call it

std::function wraps a callable object. When we call operator () on a std::function object, it calls operator () on the callable object it is wrapping, known as the target. std::function therefore needs to know what the return type and the parameter types are for operator (). The syntax that we use for the call signature is very similar to the syntax for declaring a function pointer - std::function< double( int ) >

Finally, an uninitialized std::function object is known as empty (and is even displayed like that in the Visual Studio debugger). If we attempt to call an empty std::function, it will throw the exception std::bad_function_call. Fortunately, it is easy to test a std::function object to see if it is empty or not:

SingleArgumentAdd fn2;

// Some code that might or might not assign something to fn2

if( fn2 )
{
    fn2( 42 );
}

That's it for this week, next week we will finally get to lambda functions.

Quote of the week – overloading

overloading n. 1 Semantics Ganging up on a poor signifier until it collapses from excessive signification. 2 OOP Assigning unlikely meaning to well-known operators. Ideally, for maximum confusion, the overloading definitions should be hidden.

Stan Kelly-Bootle The Computer Contradictionary

Sadly, Stan Kelly-Bootle died in April 2014. He was a major figure in the computer world and the folk song world. Obituaries can be found here, here and here.

I could quote Stan Kelly-Bootle infinitely, and when I say “infinitely” I do mean infinitely. The Devil’s DP Dictionary gives the following definition:

recursive adj. See RECURSIVE.

although this was later modified in The Computer Contradictionary to:

recursive adj. while (unclear) { unclear--; See RECURSIVE }

Of course the definition of operator -- is overloaded and hidden.

Unfortunately The Devil’s DP Dictionary and The Computer Contradictionary are both out of print but there are plenty of second hand copies available from Amazon or Alibris.

std::bind and lambda functions 3

In part 2 we saw that by wrapping a function inside a function object we could take a function that requires two arguments and turn it into a function object that requires one argument. The code we ended up with looks like this:

class Adder
{
public:
    Adder( float f )
    : f_( f )
    {}
    
    double operator()( int i ) const
    {
        return add( i, f_ );
    }
    
private:
    float f_;
};
std::vector< int > v_i( functionReturningVector() );
std::vector< double > v_d;

float f = functionReturningFloat();

std::transform( 
    std::begin( v_i ), 
    std::end( v_i ), 
    std::back_inserter( v_d ), 
    Adder( f ) );

There is a problem here – we had to write a lot of boilerplate in order to wrap one function – Adder is 15 lines long, the call to add is a single line. Fortunately, C++11 gives us a couple of ways to get the same effect but with much less boilerplate. We can use std::bind to adapt add as follows:

std::vector< int > v_i( functionReturningVector() );
std::vector< double > v_d;

float f = functionReturningFloat();

std::transform( 
    std::begin( v_i ), 
    std::end( v_i ), 
    std::back_inserter( v_d ), 
    std::bind( add, std::placeholders::_1, f ) );

[ Aside:

I have put in the fully scoped name of the placeholder _1 just to show where it comes from. Even though I am normally a fan of using the fully qualified name this one is just so ugly I will assume that:

using namespace std::placeholders;

is in use for the rest of my examples. This simplifies our loop to:

std::transform( 
    std::begin( v_i ), 
    std::end( v_i ), 
    std::back_inserter( v_d ), 
    std::bind( add, _1, f ) );

End aside ]


We have two std::transform loops, one of which uses Adder( f ), the other of which uses std::bind( add, _1, f ).

We know what the code Adder( f ) does. It creates a function object of type Adder which implements operator () to call add with one argument that comes from the function call operator, and another argument that was supplied in the constructor. We need the code std::bind( add, _1, f ) to do something very similar.

Let’s remove std::bind from the context of std::transform so we can focus on one thing at a time. Here’s some code we can use to look at std::bind in isolation:

float f = functionReturningFloat();
auto fn = std::bind( add, _1, f );
std::cout << fn( 7 ) << "\n";

The call to std::bind returns a function object that we assign to the variable fn. I am using auto because I don't want to get into the question of what the type of the return value of std::bind is yet (we'll look at it later). fn is a function object that takes a single argument to its function call operator. We can therefore invoke the function call operator with a single argument - fn( 7 ) and write out the result, which will be a double because the return value of add is a double.

There are a lot of moving parts here. I am going to walk through them.

We have three functions in play:

  • add The function we ultimately want to call. The function we are going to wrap.
  • fn The function (object) that wraps add. This is the thing that converts a two argument function into a single argument function.
  • std::bind The function that creates fn from add.

We call std::bind to create fn. Later we call fn which in turn will call add.

The function we want to wrap - add - is a two argument function. Anything that calls add must call it with two arguments. Therefore, fn must call add with two arguments. Since std::bind is creating fn, std::bind must know what the two arguments are. When we call std::bind we not only supply it with the function to be wrapped - add - we also supply it with the two arguments that must be passed to add. These two arguments are _1 and f.

The second argument is easy. f is a variable of type float and we already know that the second argument to add must be of type float. It all matches just as it should.

The first argument is more interesting - _1. Arguments of the form _n are called placeholders, that's why they live in the std::placeholders namespace. The placeholder argument links the argument we pass when we call fn to the first argument that gets passed to add. Let's look at the code again:

auto fn = std::bind( add, _1, f );
std::cout << fn( 7 ) << "\n";

_1 says "take the first argument that is passed to fn and pass it to add as the first argument". In this example, when we call fn( 7 ), that ultimately results in a call to add( 7, f )

Any argument that is passed to fn can be passed on to add in any position. The placeholder we use (_1, _2 etc.) tells us which argument to use from the argument list passed to fn::operator (). The position of the placeholder in the call to std::bind tells us which position that argument will end up in when we call add.

Let me lay out the code slightly differently:

float f = 10.0f;
auto fn = std::bind( 
    add,        // add is the function we are wrapping
    _1, f       // add has two arguments therefore 
                // we supply two arguments here.
    );

// At this point, std::bind has been called but 
// add and fn have not been called

std::cout << fn( 7 ) << "\n";   // Call fn, which in 
                                // turn calls add.

Here's an example with a different placeholder:

float f = 10.0f;
auto fn = std::bind( add, _2, f );
std::cout << "result = " << fn( 7, 12 ) << "\n";

We are now using placeholder _2. This means that we have to supply the call to fn with two arguments (and if we don't we get an unfriendly error message). The output of this piece of code is:

result = 22

showing that the second argument from the call fn is the one that is used. Of course this is a nonsensical example because there is no point in passing two arguments to fn in the first place - the first argument is ignored so there was no point in supplying it.

There are plenty of other cute tricks we can play. Since an int is convertible to a float we can do this:

auto fn = std::bind( add, _1, _2 );
std::cout << "result = " << fn( 7, 12 ) << "\n";
result = 19

or this:

auto fn = std::bind( add, _2, _1 );
std::cout << "result = " << fn( 7, 12 ) << "\n";
result = 19

Since add is commutative the result is the same - I am just showing that we can change the order in which the arguments are supplied to std::bind (and therefore the order of the arguments supplied to add).

We can use a placeholder more than once:

auto fn = std::bind( add, _1, _1 );
std::cout << "result = " << fn( 7 ) << "\n";
result = 14

Or not use a placeholder at all (yes, doing this in real life is a waste of time, I just want to show all the possibilities):

auto fn = std::bind( add, 6, 5 );
std::cout << "result = " << fn() << "\n";
result = 11

I can call the object returned from std::bind directly:

float f = 10.0f;
double d = std::bind( add, _1, f )( 10 );
std::cout << "result = " << d << "\n";
result = 20

For a not-pointless example, back in this post I used placeholders to swap the order of arguments to a comparator in order to reverse the sorting order:

std::partial_sort( 
    ReverseIter( pivotElementIterator ), 
    ReverseIter( displayBegin ), 
    ReverseIter( std::begin( files ) ), 
    std::bind( comparator, _2, _1 ) );

Errors

There are a number of things we can do wrong.

We can specify too many arguments in the std::bind call. add takes two arguments, let's try giving it three:

double d = std::bind( add, _1, f0, f1 )( 10 );

Visual Studio gives the surprisingly helpful error message:

// error C2197: 'double (__cdecl *)(int,float)' : too many arguments for call

The message from GCC is longer and more confusing, but does include the words "too many arguments to function".


We can specify too few arguments in the std::bind call:

double d = std::bind( add, _1 )( 10 );

I get the following (also very helpful) error message from Visual Studio:

error C2198: 'double (__cdecl *)(int,float)' : too few arguments for call

As before, the GCC message is more verbose, but does include the words "too few arguments to function".


We can specify too few arguments in the call to fn:

auto fn = std::bind( add, _1, f );
std::cout << "result = " << fn() << "\n";

Sadly the resulting error message is almost incomprehensible.


Specifying too many arguments in our call to fn does not result in an error:

auto fn = std::bind( add, _1, f );
std::cout << "result = " << fn( 
    7, 15.6, 
    std::complex< double >( 1.0, 2.0 ) ) << "\n";

As always, just because you can do something doesn't mean that you should do it.


We can give an argument of the wrong type to std::bind:

std::complex< double > c( 1.0, 2.0 );
auto fn = std::bind( add, _1, c );
fn( 7.0 );   // Error reported here
error C2664: 'double (int,float)' : cannot convert argument 2 from 'std::complex' to 'float'

Interestingly, the error is not reported until we try and call fn.


We can give an argument of the wrong type when we call fn:

std::complex< double > c( 1.0, 2.0 );
auto fn = std::bind( add, _1, f );
fn( c );
error C2664: 'double (int,float)' : cannot convert argument 1 from 'std::complex' to 'int'

Finally, let's get back to our std::transform call:

std::transform( 
    std::begin( v_i ), 
    std::end( v_i ), 
    std::back_inserter( v_d ), 
    std::bind( add, _1, f ) );

std::transform is a function. When we call a function, we evaluate all of the arguments to that function. One of those arguments is the result of a call to std::bind. As we have seen, the result of calling std::bind is a function object that wraps add and turns it into a single argument function. Look at the definition of std::transform (this is from the GCC implementation. I have removed some error checking and done some reformatting):

template<
    typename _InputIterator, 
    typename _OutputIterator,
    typename _UnaryOperation >
_OutputIterator
transform(
    _InputIterator __first, 
    _InputIterator __last,
    _OutputIterator __result, 
    _UnaryOperation __unary_op)
{
    for (; __first != __last; ++__first, ++__result)
        *__result = __unary_op(*__first);
    
    return __result;
}

std::transform loops from __first to __last, and each time around the loop invokes the function call operator on __unary_op with a single argument - __unary_op(*__first). We must supply std::transform with a thing-that-supports-the-function-call-operator-taking-a-single-argument. That is exactly what we have done using std::bind.

More to come in part 4, including the mysterious type of the return value of std::bind.

Quote of the week – design

Without requirements or design, programming is the art of adding bugs to an empty text file.

Louis Srygley

If I were a cynical person I would point out that this quote implies that programming is better with requirements and design, an implication that is too often false. Fortunately I am not a cynical person, so I won’t mention it.

Design of software has always interested me – not just coming up with the design, but also communicating the design and modifying it as the project inevitably evolves. How do you get a design from person A’s head into the heads of persons B-Z?

What header?

I am bad at remembering what header file I need to include to use which standard library facility. I can handle the obvious ones like vector, but that still leaves many where I have to go and look it up. To make looking it up a little easier, I have created a page that lists all of the symbols in the standard library sorted by various criteria – including what header file they are in. There is a C++98 version and a C++11 version. I am sure there are mistakes in the pages, I will correct any mistakes I am told about.

C++11 Standard Library Symbols
C++98 Standard Library Symbols

std::bind and lambda functions 2

In part 1 we looked at various things we can do with functions in C. Since C++ is mostly a superset of C we can do those things in C++ as well. C++ also gives us some other ways of creating functions though. Let’s dispense with the easy options quickly then get on to the more interesting possibilities.


A function in a namespace

namespace arithmetic
{
    double add( int i, float f )
    {
        return i + f;
    }
}
double (*pFn)( int, float ) = arithmetic::add;

A static function in a class

class TestClass
{
public:
    static double add( int i, float f )
    {
        return i + f;
    }
    
};
double (*pFn)( int, float ) = TestClass::add;

Nothing tricky about putting functions into a namespace, or a class acting as a namespace.


Overloaded functions

double add( int i, float f )
{
    return i + f;
}

double add( int i, double f )
{
    return i + f;
}
double (*pFn)( int, float ) = add;

Even overloaded functions work just as we’d expect. Section 13.4 of the C++11 standard states:

The function selected is the one whose type is identical to the function type of the target type required in the context.

Let’s not get too confident though, overloads are going to return to trouble us later.


Pointer to member function

class TestClass2
{
public:
    double add( int i, float f )
    {
        return i + f;
    }
    
};

Ignore the fact that add makes absolutely no sense as a member function (very few of these examples make any sense outside of a narrow context).

What is the type of a pointer to member function? We already know what the type of add would be if it wasn’t a member function:

double (*)( int, float );

The key thing missing is the name of the class of which the function was a member. The type has to specify the class for type safety – you don’t want anyone trying to call the add function for an object that doesn’t even have an add function (and yes, you can force this with casting, but you can force almost anything with casting). Given that * is in the place where the function name used to go, we put the class name in the obvious (ish) place:

double (TestClass2::*)( int, float );

We can write this code to get the address of the add function:

double (TestClass2::*pFn)( int, float ) = &TestClass2::add;

And we can call it like this:

TestClass2 c;
(c.*pFn)( 10, 20.0f );
TestClass2* p = &c;
(p->*pFn)( 10, 20.0f );

We have two new operators .* and ->*.

Notice that this time we had to use the address-of operator & to get the address of the function and, because the function call operator () is higher precedence than .* and ->*, we need the brackets around c.*pFn and p->*pFn.

(This also works with virtual functions).


There is one more trick we need to know about before we can start applying our knowledge to std::bind and lambda. At the beginning of part 1 I wrote:

The round brackets are known as the function call operator.

At the time, referring to the function call operator might have seemed like overkill. It’s a C function, you just call the thing by passing in the arguments in the brackets. That nomenclature is important though because C++ supplies us with a function call operator that we can use as a member function of a class:

class Adder
{
public:
    double operator()( int i, float f ) const
    {
        return add( i, f );
    }
    
};

(There is a reason why I am calling the add function rather than just doing the addition correctly – that reason will become apparent later).

We call the member function operator() by applying the function call operator to an object of type Adder :

Adder a;
a( 10, 20.0f );

While describing a function in part 1 I said:

we take a thing with a name … and apply an operator – () – to it.

Well, that’s what we have here. We have a thing with a name (a) and we’re applying the function call operator (()) to it. That calls the member function operator().

We have a thing that is acting like a function but is actually an object. It is often known as a function object or functor. At first glance a function object seems like a rather heavyweight way of creating a function, but objects have a really useful property – they can store state.

Let’s look at what we can do with that property of storing state. We don’t have to pass in both arguments to operator(), we can have one of the arguments specified at construction time:

class Adder
{
public:
    Adder( float f )
    : f_( f )
    {}
    
    double operator()( int i ) const
    {
        return add( i, f_ );
    }
    
private:
    float f_;
};
Adder a( 20.0f );
a( 10 );

Again, this just looks like a heavyweight way of creating a function-like thing. It’s more work, it’s not as convenient to call – why should we bother? We bother because we have taken a function that takes two arguments to the function call operator – add – and turned it into a function (object) that takes a single argument to the function call operator. The second argument is supplied in the constructor. This is the C++ version of currying.

The one-argument version is useful because there are algorithms that expect a function that takes a single argument, for example the single source sequence version of std::transform :

std::vector< int > v_i( functionReturningVector() );
std::vector< double > v_d;

float f = functionReturningFloat();
Adder a( f );
std::transform( 
    std::begin( v_i ), 
    std::end( v_i ), 
    std::back_inserter( v_d ), 
    a );

We don’t even need to create a named Adder object, we can just use a temporary:

std::transform( 
    std::begin( v_i ), 
    std::end( v_i ), 
    std::back_inserter( v_d ), 
    Adder( f ) );

The call to std::transform is equivalent to doing this:

for( 
    std::vector< int >::iterator i( std::begin( v_i ) ); 
    i != std::end( v_i ); 
    ++i )
{
    v_d.push_back( add( *i, f ) );
}

And this is why the fact that we took a two argument function and turned it into a single argument function is important. std::transform requires a single argument function. By wrapping add in a function object and specifying the second argument in the constructor we ended up with a single argument function (a unary operation in standardese).

If we take a look at the declaration of std::transform in the standard (again, just the single-source-sequence one), we see this:

template<
    class InputIterator, 
    class OutputIterator,
    class UnaryOperation >
OutputIterator
transform(
    InputIterator first, 
    InputIterator last,
    OutputIterator result, 
    UnaryOperation op);

The final parameter of the function std::transform is a unary operation – it takes one argument. Internally, std::transform is going to use the function call operator on op, i.e. it is going to perform op( elem ) for each element in the input sequence. It is all templated, and op follows the normal rules when passing an object to a templated parameter – whatever functions are called on op must be supported by the type of op type. Since the function that std::transform calls is operator() with a single argument, the type of op must support that. We know of two “things” in C++ world that can support the function call operator – functions and function objects.

What I have been trying to say in multiple different ways is that by wrapping a function (in this case add) in a function object we can turn a function into something requiring fewer arguments at the point where the function call operator is invoked. Of course the extra arguments have to come from somewhere, and in this case they are specified in the constructor of the function object. We can give the standard algorithm exactly what it needs (an operator() with the right number and type of parameters) but supply any necessary extra paraeters via the constructor of the function object.

Here’s another example. Let’s take the file metadata structure I used in a previous post:

struct FileMetaData
{
    std::string fileName_;
    std::string directory_;
    
    std::size_t size_;
    std::time_t lastWriteTime_;
};

Assume we have an unsorted vector of file metadata objects and we want to search for a file with a given file name. std::find_if looks like the obvious choice since it lets us specify our own predicate for the search.

It seems like this function would be useful:

bool compareFileName( 
    FileMetaData const& metaData, 
    std::string const& fileName )
{
    return metaData.fileName_ == fileName;
}

compareFileName will be useful, but it is a two argument function and std::find_if takes a UnaryPredicate. Unary means it expects a single argument, and according to Merriam-Webster a predicate is:

something that is affirmed or denied of the subject in a proposition in logic

For our purposes that comes down to “something returning a bool”. compareFileName is a predicate (it will affirm or deny whether the file names match), but it is not unary.

We’re going to use our “wrap it in a function object” trick to turn compareFileName into a single argument function (object):

class CompareFileName
{
public:
    CompareFileName( std::string const& fileName )
    : fileName_( fileName )
    {
    }
    
    bool operator()( FileMetaData const& metaData ) const
    {
        return compareFileName( metaData, fileName_ );
    }
    
private:
    std::string fileName_;
};

Then we can use CompareFileName like this:

std::vector< FileMetaData > files( getFiles() );
std::string fileNameToFind( getFileName() );

std::vector< FileMetaData >::iterator i(
    std::find_if(
        std::begin( files ),
        std::end( files ),
        CompareFileName( fileNameToFind ) ) );

There is one more thing. When we looked at overloaded functions before we had no problem assigning them to a function pointer. If we try and use overloaded functions in an algorithm we run into trouble.

Here is a pair of overloaded functions:

double negate( int i )
{
    return -i;
}

double negate( float f )
{
    return -f;
}

And here is a way in which we might want to use the integer version:

std::vector< int > v_i( functionReturningVector() );
std::vector< double > v_d;

std::transform( 
    std::begin( v_i ), 
    std::end( v_i ), 
    std::back_inserter( v_d ), 
    negate );

Just using the name of the function directly doesn’t work, we get this error message from Visual Studio:

error C2914: 'std::transform' : cannot deduce template argument as function argument is ambiguous

How come it worked before but doesn’t work now? The reason it worked before was because of this clause in the standard:

The function selected is the one whose type is identical to the function type of the target type required in the context.

The important words are target type. When we are supplying an argument to a template function we don’t know what the target type is. We can’t have a target type until we know the function type, but in order to pick the right function (and hence find out its type) we need to know the target type.

There are at least three ways around this. One is to assign the function to a function pointer where we know the target type and therefore won’t get an error:

double (*pNegateFn)( int ) = negate;

std::transform( 
    std::begin( v_i ), 
    std::end( v_i ), 
    std::back_inserter( v_d ), 
    pNegateFn );

Secondly, we can use what is possibly the most legitimate cast ever:

std::transform( 
    std::begin( v_i ), 
    std::end( v_i ), 
    std::back_inserter( v_d ), 
    static_cast< double (*)( int ) >( negate ) );

And finally we can specify the template arguments to std::transform explicitly. We don’t normally do this with functions, but in this case it gives us the target type that the compiler is looking for:

std::transform< 
  std::vector< int >::const_iterator, 
  std::back_insert_iterator< std::vector< double > >,
  double (*)( int ) >( 
    std::begin( v_i ), 
    std::end( v_i ), 
    std::back_inserter( v_d ), 
    negate );

Note that I had to specify the type of std::back_inserter for this to compile. When you are specifying template arguments explicitly you need to get them right.

In practice I rarely run into problems with overloaded functions, when I do, I use static_cast to get around them.


We’ve reached the end of part 2 and still haven’t talked about std::bind or lambda. At least we’re looking at the right language now, and we have laid all of the groundwork for the next exciting episode where I promise we will get to std::bind (and maybe lambda).

std::bind and lambda functions 1

In the “no raw loops” series I have been glossing over the details of std::bind and lambda. I want to dive a little more deeply into some of those details.

I am not going to explore all of the details, as with everything I post I am going to focus on the parts that are interesting and useful to me. If you are looking for definitive descriptions of what is going on, I highly recommend these three references:

The standard looks intimidating, it is written in a very precise subset of English, but with a little practice at reading “standardese” it is very approachable, well written and useful. It is worth taking the time to get used to it.

Let’s go back to basics. If you understand C function pointers skip this installment, the C++ stuff won’t arrive until later.

Let’s go back to C and a simple function in C:

double add( int i, float f )
{
    return i + f;
}

What can we do with this function? The first obvious thing we can do is to call it:

int i = 7;
float f = 10.5f;
double d = add( i, f );
printf( "%f\n", d );

How do we call it? We follow the name of the function with round brackets () and put the list of arguments to the function inside the round brackets. The result of the expression add( i, f ) is the return value and type of the function.

The round brackets are known as the function call operator. Notice that word operator, it’s a word that comes up a lot in C and C++. In general you have something being operated upon (maybe several somethings) and an operator that defines what is happening. For example:

int i = 7;
++i;

We take a thing with a name – i – of type int and apply an operator – ++ – to it.

In the case of our function we take a thing with a name – add – of type unknown (at least at the moment it’s unknown, we’ll get to it soon) and apply an operator – () – to it.

There is definitely some common ground between an int and our function. Is there more common ground? C provides us with something else we can do with functions. We can create a function pointer and point it at a function. The syntax is wacky (but if we had a problem with wacky syntax we shouldn’t be programming in C or C++):

double (*pFn)( int, float ) = &add;

To obtain the type of the function pointer we take the function prototype and remove the function name double ( int, float ), then we add (*) where the function name used to be to indicate that this is a pointer to a function double (*)( int, float ). That gives us the type of the function. In order to get a variable of that type we put in the name of the variable next to the (*)double (*pFn)( int, float ).

So we have pFn – a pointer to a function. We can dereference the pointer and apply the function call operator to it:

int i = 7;
float f = 10.5f;
double d = (*pFn)( i, f );
printf( "%f\n", d );

Given that it’s a pointer we should be able to change the function that it is pointing at. Here’s an unrealistic but illustrative example:

double mul( int i, float f )
{
    return i * f;
}

enum Operator
{
    OPERATOR_ADD,
    OPERATOR_MUL
};

double callOperator( enum Operator op, int i, float f )
{
    double (*pFn)( int, float );
    if( op == OPERATOR_ADD )
    {
        pFn = &add;
    }
    else 
    {
        pFn = &mul;
    }

    return (*pFn)( i, f );
}

Since pFn is a pointer, we might wonder what happens if we set it to NULL, then try to call it:

pFn = NULL;
d = (*pFn)( i, f );

As you might expect, calling a NULL function pointer falls squarely in the “don’t do that” category. In fact, it’s like attempting to dereference any other NULL pointer. On Visual Studio we get this:

Unhandled exception at 0x0000000000000000 in Blog.exe: 0xC0000005: Access violation executing location 0x0000000000000000.

On cygwin we get this:

Segmentation fault (core dumped)

We can make life simpler for ourselves in a couple of ways. The C standard allows us to omit the & (address of) operator when we assign a function to a pointer so instead of:

pFn = &add;

We can write:

pFn = add;

In a similar vein, we can omit the * (indirection) operator. Instead of:

d = (*pFn)( i, f );

We can write:

d = pFn( i, f );

Finally, we can use typedef to keep the wacky function pointer type syntax in one place:

typedef double (*FunctionPtr)( int, float );
FunctionPtr pFn = add;

We can now store a pointer to a function in a variable. That means we can save that function and use it later. We can pass the pointer to a function into another function, and return it from a function. Orginally the only thing we could do with our function was to call it (apply the function call operator to it), now we can pass it around in the same way as we pass ints around. This gets us into the world of “first class citizens” or “first class functions”.

There doesn’t appear to be a precise definition for a first class function, but it is safe to assume that if a function can only be called it is not a first class function, the function must be able to be stored and passed around to have any hope of being a first class function. There are a couple of Wikipedia articles with useful information:

First class citizen

First class functions

That’s it for part #1. We haven’t mentioned std::bind and lambdas yet, in fact we haven’t even made it into the right language. Stay tuned for part #2 where we move on to C++.

Quote of the week – quality

Quality is a team issue.

Andrew Hunt and David Thomas The Pragmatic Programmer

I have worked on projects where quality was the sole responsibility of the QA department. I have worked on projects where quality was (in theory) injected during the two weeks before shipping. None of these projects have produced quality code or quality products.