CMM: Generic programming tutorial: Generic reduction and initialization

When initializing the result, new (old) problems occur. We need the neutral element for arbitrary combinations of value type and operator.
Author: Guntram Berti

The trouble with init

Geschrieben steht: 'Am Anfang war das Wort!' Hier stock ich schon! Wer hilft mir weiter fort?

Johann Wolfgang von Goethe, Faust I

As we just saw, with reduce our old Problem with init pops up again: How should the initialization for arbitrary operators and value type be implemented?

More concretely, we search a sensible value e for init for the maximum operator. This question is of interest independently of the generic implementation of reduce! We must have for all x: max(x,e) == x, that is, for all x e <= x. In mathematics we would set e equal to -∞. As we are dealing with data types of finite value range, it is sufficient to choose the smallest possible value. Fortunately, the standard library seems to come to rescue by offering numeric_limits, as this class is specialized for the built-in numeric types and provides min() and max() members. Unfortunately however, these functions behave differently for integer and floating point types: While for integer types, min() indeed returns the smallest possible value, it returns the smallest positive value for floating point types. Thus, we must be extremely careful:

double ad[N] = { ... };
int    ai[N] = { ... };
double max_ad = reduce(a, a+N, -numeric_limits<double>::max(), std::max<double>);
int    max_ai = reduce(a, a+N,  numeric_limits<int>   ::min(), std::max<int>);

This is ugly, extremely error prone and not usable in a generic context (where we do not know the concrete type when calling reduce). So what shall we do?

`init` and arbitrary operators

On the one hand, we could create a template neutral_element covering the maximum operator and use this to explicitly provide a value for init. As discussed before, another option is to implement a variant without init argument. This could either use the first element for initialization (and thus work with non-empty sequences only), or internally use the extension of neutral_element just mentioned. The explicit use of neutral_element would look like that:

double ad[N] = { ... };
int    ai[N] = { ... };
double max_ad = reduce(a, a+N, neutral_element<double,std::max<double> >::value(), std::max<double>);
int    max_ai = reduce(a, a+N, neutral_element<int,   std::max<int>    >::value(), std::max<int>);

This does make the call safer (albeit not more beautiful). Also, the question is whether the operator should be passed via a template argument to neutral_element as above, or if we would prefer operator-specific template classes, as below:

template<typename T> struct neutral_element_addition;
template<typename T> struct neutral_element_max;
// ... etc ...

The operator-specific variant seems more natural in that the neutral element does not (normally) depend on the concrete operator type, but rather on the mathematical operation it implements (there could be several implementations of the maximum operator). On the other hand, there is again the argument of the generic context: If I don't know which (mathematical) operator is passed to reduce, I cannot select the right neutral_element_xxx class template. The most sensible approach thus seems to be to implement both variants, and define neutral_element on top of neutral_element_xxx.

Implementing `neutral_element` for the maximum operator

In the case of the maximum operator this means, we use neutral_element_max as basis to implement specializations of neutral_element for concrete implementations of the maximum operator. If there is a version of the maximum which cannot use the standard implementation offered by neutral_element_max, we can still provide a full ad hoc specialization of neutral_element. (This could be the case if a different, non-standard comparison is used, e.g. for strings.)

template<typename T> struct neutral_element_max { ... };

template<typename T, typename Op> struct neutral_element {};

template<typename T> 
struct neutral_element<T, std::max<T> > : public neutral_element_max<T> {};

The implementation of neutral_element_max can relay on std::numeric_limits. Fortunately, std::numeric_limits also provides classifications whether it is specialized at all for a type and whether this type is integral. Thus, our logic for neutral_element_max looks as follows:

If std::numeric_limits is specialized for T
1. if T is an integral type: value() returns std::numeric_limits::min()
2. else (T is floating point type): value() returns -std::numeric_limits::max()
else (not specialized for T) neutral_element_max is left undefined (no default implementation)

This logic translates into a somewhat lengthy piece of template code, using partial specialization (left as an exercise). For cases which are not covered by std::numeric_limits we must provide separate specializations of neutral_element_max. For instance, for std::string the empty string is the neutral element wrt. to maximum.

... and if we cannot find a neutral element?

It will not be possible to find a sensible neutral element for all combinations of types and operators. For instance, the minimum operator for strings would have as neutral element the maximal string, which would contain the character with maximal ASCII-code a maximal number of times (i.e. std::string::max_size() times) which is not practical. If no specialization of neutral_element for a combination of value type and operator exists, an implementation of reduce must assume that no neutral element exists. In those cases, an implementation should be chosen at compile time which simply uses the first element for initialization (as discussed above).

Reduce with automatic `init`

Thus we have all building blocks ready to implement a version of reduce with automatic initialization:

// If combination of BinOp and value_type has no neutral element,
// we require begin != end, i.e. a non-empty input range.
template<class Iterator, class BinOp>
typename value_type<Iterator>::result
reduce(Iterator begin, Iterator end, BinOp op);

Now we define 2 variants reduce_non_empty (if no neutral element can be deduced) and reduce_neutral_element (if there is a neutral element). We assume that we can query this using neutral_element<T,Op>::is_specialized (it is straightforward to extend the implementation of neutral_element accordingly). In order to conveniently select one of both versions, we define a helper class reduce_select with a Boolean template parameter (signaling neutral_element specialized or not).

// Version requiring a non-empty range [begin,end)
template<typename Iterator, typename BinOp>
typename value_type<Iterator>::result
reduce_non_empty(Iterator begin, Iterator end, BinOp op)
{ 
  typename value_type<Iterator>::result res = *begin;
  ++begin;
  while(begin != end) {
    res = op(res, *begin);
    ++begin;
  } 
  return res;
}

// Version relying on neutral_element<value_t, BinOp> being specialized
template<class Iterator, class BinOp>
typename value_type<Iterator>::result
reduce_has_neutral_element(Iterator begin, Iterator end, BinOp op)
{ 
  typedef typename value_type<Iterator>::result value_t;
  return reduce(begin, end, neutral_element<value_t, BinOp>::value(), op); 
}


// helper class to select:
// reduce_non_empty           (IsSpecialized = false)
// reduce_has_neutral_element (IsSpecialized = true)
template<bool IsSpecialized> struct reduce_select;

// select reduce_non_empty
template<> struct reduce_select<false>
{ 
  template<typename Iterator, typename BinOp>
  static typename value_type<Iterator>::result
  reduce(Iterator begin, Iterator end, BinOp op) 
  {  
    return reduce_non_empty(begin, end, op); 
  }
};

// select reduce_has_neutral_element
template<> struct reduce_select<true>
{ 
  template<typename Iterator, typename BinOp>
  static typename value_type<Iterator>::result
  reduce(Iterator begin, Iterator end, BinOp op) 
  {  
    typedef typename value_type<Iterator>::result value_t;
    return reduce(begin, end, neutral_element<value_t, BinOp>::value(), op); 
  }
};


// Select either reduce_non_empty or reduce_has_neutral_element
template<class Iterator, class BinOp>
typename value_type<Iterator>::result
reduce(Iterator begin, Iterator end, BinOp op) 
{ 
  typedef typename value_type<Iterator>::result value_t;
  return reduce_select<neutral_element<value_t, BinOp>::is_specialized>::reduce(begin, end, op);
}

This looks rather involved, although in principle the logic is quite simple. Using support libraries for geneneric programming like boost::enable_if we can shorten this "pedestrian" version considerably. Fortunately, a user need only see the top-level reduce and the variant without init-Argument). Is there something left to say about reduce?

The trouble with init

init and arbitrary operators

Implementing neutral_element for the maximum operator

... and if we cannot find a neutral element?

Reduce with automatic init

`init` and arbitrary operators

Implementing `neutral_element` for the maximum operator

Reduce with automatic `init`