No raw loops – output | Random programming

Let’s get back to raw loops, or rather the absence of raw loops. How do we write out the contents of a container so that each element appears on its own line? Assume we have some type T and a vector of T:

class T
{
   ...
};

std::vector< T > v_T;

This is the old style way to write the contents:

std::ostringstream ostr;
for( std::vector< T >::const_iterator i( std::begin( v_T ) ); 
    i != std::end( v_T ); ++i )
{
    ostr << *i << "\n";
}

The range-based for solution is straightforward:

for( T const& t : v_T )
{
    ostr << t << "\n";
}

Lambda functions also provide a solution:

std::for_each( 
    std::begin( v_T ), 
    std::end( v_T ), 
    [ &ostr ]( T const& t ){
        ostr << t << "\n";
    } );

adobe::for_each( 
    v_T, 
    [ &ostr ]( T const& t ){
        ostr << t << "\n";
    } );

Conveniently, output streams have iterators. We can use std::ostream_iterator as the destination for std::copy or adobe::copy :

std::copy( 
    std::begin( v_T ), 
    std::end( v_T ), 
    std::ostream_iterator< T >( 
        ostr, 
        "\n" ) );

adobe::copy( 
    v_T, 
    std::ostream_iterator< T >( 
        ostr, 
        "\n" ) );

Notice that we can pass a string (called the delimiter string in the standard) to the std::ostream_iterator constructor. The delimiter string is written out after every element.

Since std::ostream_iterator is an iterator with std::ostream as its underlying container, we might thank that we would be able to do this:

std::copy( 
    begin( v_T ), 
    end( v_T ), 
    std::begin( ostr ) );

We can't. The standard does not include a specialization of std::begin for std::ostream, and we are not allowed to add one ourselves because of Section 17.6.4.2.1 of the C++11 standard which states:

A program may add a template specialization for any standard library template to namespace std only if the declaration depends on a user-defined type and the specialization meets the standard library requirements for the original template and is not explicitly prohibited

In addition to that, std::begin does not allow for the additional delimiter string argument. Also, std::ostream_iterator is instantiated for the type of the source value which is not present in the single argument we pass to std::begin so cannot be deduced.

There is nothing stopping us writing our own function to construct a std::ostream_iterator from std::ostream. Since it's our function it can take an additional argument for the delimiter string, and so long as we don't put it into the standard namespace we can even call it begin. We do have to specify the type of our output when we call the function (the function cannot deduce it from the arguments because that type never appears in the arguments):

template< typename U >
std::ostream_iterator< U >
begin( std::ostream& ostr, const char* delimiterString )
{
    return std::ostream_iterator< U >( ostr, delimiterString );
}

std::copy( 
    begin( v_T ), 
    end( v_T ), 
    begin< T >( ostr, "\n" ) );

And an example that almost works. Since operator << is just a function, we can pass it to std::bind :

std::for_each(
    std::begin( v_T ), 
    std::end( v_T ),
    std::bind( operator <<, std::ref( ostr ), _1 ) );

This writes out the elements of the container, but it does not put each element on its own line - there is nowhere to plug the "\n" into. We can't achieve the each element appears on its own line requirement from our original problem statement.

We can write a function that takes an object of type T and a delimiter string and writes out both of them:

void
writePostText( std::ostream& o, T const& t, const char* text )
{
    o << t << text;
}

std::for_each(
    std::begin( v_T ), 
    std::end( v_T ),
    std::bind( writePostText, std::ref( ostr ), _1, "\n" ) );

Which of these solutions is the best? As usual there isn't a definitive answer to that question, but it looks like the range-based for solution looks has the lowest overhead (overhead being bits of boilerplate syntax that are not directly used to solve the problem), with adobe::for_each close behind.

There is one more thing to throw into the mix. The range-based for looks good, however it can only iterate over the entire container. Most of the time this isn't a problem, I estimate that 95% of my algorithm usage is over the entire container. In the case of writing things out though, there is a common use case where you want to iterate over a portion of the container. If you are writing out JSON, or some similar format where you need to put text between each pair of elements (i.e. the text cannot appear before the first element or after the last element), you have to be able to iterate over a sub range of the container.

A solution to the "separator text" problem might look like this:

void
writePreText( std::ostream& o, const char* text, T const& t )
{
    o << text << t;
}

if( !v_T.empty() )
{
    std::vector< T >::const_iterator i( std::begin( v_T ) ); 
    
    ostr << *i;

    ++i;
    std::for_each(
        i,
        std::cend( v_T ),
        std::bind( writePreText, std::ref( ostr ), ",\n", _1 ) );
}

We can't do this with range-based for. If we need to do both things within our source code - writing with text after each element and writing text between each element - do we want to do those two things differently - one of them using range-based for, one using a standard algorithm? There is a cost to being inconsistent, there is also a cost to being consistent but using a slightly larger solution for one of the cases.

I know that I am overanalyzing this. In practice I wrote a templated writeContainer function which lets you specify text that is written before the first element, text that is written after the last element and text that is written between elements. Problem solved and, with the exception of this blog post, I don't have to consider the subject any more.

Random programming

Random programming

Bob Archer's thoughts about programming

No raw loops – output

Leave a Reply Cancel reply