1235. Issue with C++0x random number proposal

Section: 29.5.3.6 [rand.req.dist] Status: NAD Submitter: Matthias Troyer Opened: 2009-10-12 Last modified: 2019-02-26

Priority: Not Prioritized

View all other issues in [rand.req.dist].

View all issues with NAD status.

Discussion:

There exist optimized, vectorized vendor libraries for the creation of random number generators, such as Intel's MKL [1] and AMD's ACML [2]. In timing tests we have seen a performance gain of a factor of up to 80 (eighty) compared to a pure C++ implementation (in Boost.Random) when using these generator to generate a sequence of normally distributed random numbers. In codes dominated by the generation of random numbers (we have application codes where random number generation is more than 50% of the CPU time) this factor 80 is very significant.

To make use of these vectorized generators, we use a C++ class modeling the RandomNumberEngine concept and forwarding the generation of random numbers to those optimized generators. For example:

namespace mkl {
 class mt19937 {.... };
}

For the generation of random variates we also want to dispatch to optimized vectorized functions in the MKL or ACML libraries. See this example:

mkl::mt19937 eng;
std::normal_distribution<double> dist;

double n = dist(eng);

Since the variate generation is done through the operator() of the distribution there is no customization point to dispatch to Intel's or AMD's optimized functions to generate normally distributed numbers based on the mt19937 generator. Hence, the performance gain of 80 cannot be achieved.

Contrast this with TR1:

mkl::mt19937 eng;
std::tr1::normal_distribution<double> dist;
std::tr1::variate_generator<mkl::mt19937,std::tr1::normal_distribution<double> > rng(eng,dist);
double n = rng();

This - admittedly much uglier from an aestethic point of view - design allowed optimization by specializing the variate_generator template for mkl::mt19937:

namespace std { namespace tr1 {

template<>
class variate_generator<mkl::mt19937,std::tr1::normal_distribution<double> > { .... };

} }

A similar customization point is missing in the C++0x design and prevents the optimized vectorized version to be used.

Suggested resolution:

Add a customization point to the distribution concept. Instead of the variate_generator template this can be done through a call to a free function generate_variate found by ADL instead of operator() of the distribution:

template <RandomNumberDistribution, class RandomNumberEngine>
typename RandomNumberDistribution ::result_type
generate_variate(RandomNumberDistribution const& dist, RandomNumberEngine& eng);

This function can be overloaded for optimized enginges like mkl::mt19937.

[ 2009-10 Santa Cruz: ]

NAD Future. No time to add this feature for C++0X.

[LEWG Kona 2017]

Recommend NAD: The standard has changed enough that the issue doesn't make sense anymore. Write a paper proposing a way to get this performance as changes to the current library.

[Kona 2019: Jonathan notes:]

Libstdc++ has the following non-standard extensions for more efficient generation of large numbers of random numbers:

template<typename ForwardIterator, typename UniformRandomNumberGenerator>
void __generate(ForwardIterator, ForwardIterator, 
                UniformRandomNumberGenerator&);

template<typename ForwardIterator, typename UniformRandomNumberGenerator>
void __generate(ForwardIterator, ForwardIterator, 
                UniformRandomNumberGenerator&, const param_type&);

template<typename UniformRandomNumberGenerator>
void __generate(result_type*, result_type*,
                UniformRandomNumberGenerator&, const param_type&);

Proposed resolution: