Section: 29.5.3.6 [rand.req.dist] Status: NAD Submitter: Matthias Troyer Opened: 2009-10-12 Last modified: 2019-02-26
Priority: Not Prioritized
Discussion:
There exist optimized, vectorized vendor libraries for random number generation, such as Intel's MKL [1] and AMD's ACML [2]. In timing tests we have seen a performance gain of up to a factor of 80 (eighty) compared to a pure C++ implementation (in Boost.Random) when using these generators to generate a sequence of normally distributed random numbers. In codes dominated by random number generation (we have application codes where it accounts for more than 50% of the CPU time), this factor of 80 is very significant.
To make use of these vectorized generators, we use a C++ class modeling the RandomNumberEngine concept and forwarding the generation of random numbers to those optimized generators. For example:
namespace mkl { class mt19937 {.... }; }
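The forwarding idea can be sketched as follows. This is only an illustration under stated assumptions: the namespace sketch, the class block_engine, and the block size of 256 are hypothetical names chosen here, and std::mt19937 stands in for the vendor's bulk generation call, since the real MKL or ACML interface is not shown in this issue. The class provides only the generator subset of the engine requirements (result_type, min, max, operator()); a complete RandomNumberEngine would also need seeding, discard, comparison, and stream operators.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>

namespace sketch {  // hypothetical namespace, not part of any vendor library

// Engine-like wrapper that refills a whole block at a time, so that a
// vectorized bulk routine could be called once per block instead of once
// per number.
class block_engine {
public:
    using result_type = std::uint32_t;
    static constexpr std::size_t block_size = 256;  // assumed block size

    static constexpr result_type min() { return 0u; }
    static constexpr result_type max() { return 0xFFFFFFFFu; }

    explicit block_engine(result_type seed = 5489u) : source_(seed) {}

    result_type operator()() {
        if (next_ == block_size) refill();
        return buffer_[next_++];
    }

private:
    void refill() {
        // Placeholder: a real wrapper would make one vectorized vendor call
        // here. std::mt19937 is used only as a stand-in source.
        for (auto& v : buffer_) v = static_cast<result_type>(source_());
        next_ = 0;
    }

    std::mt19937 source_;
    std::array<result_type, block_size> buffer_{};
    std::size_t next_ = block_size;  // forces a refill on first use
};

}  // namespace sketch

// Usage check: the wrapper can drive a standard distribution.
inline void demo_block_engine() {
    sketch::block_engine eng(123u);
    std::normal_distribution<double> dist;
    double n = dist(eng);
    assert(n == n);  // the variate is not NaN
}
```

Because the stand-in source is std::mt19937, the wrapper returns the same sequence as a plain std::mt19937 with the same seed; only the batching differs.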
For the generation of random variates we also want to dispatch to optimized vectorized functions in the MKL or ACML libraries. See this example:
mkl::mt19937 eng;
std::normal_distribution<double> dist;
double n = dist(eng);
Since the variate generation is done through the operator() of the distribution, there is no customization point to dispatch to Intel's or AMD's optimized functions to generate normally distributed numbers based on the mt19937 generator. Hence, the factor-of-80 performance gain cannot be achieved.
Contrast this with TR1:
mkl::mt19937 eng;
std::tr1::normal_distribution<double> dist;
std::tr1::variate_generator<mkl::mt19937, std::tr1::normal_distribution<double> > rng(eng, dist);
double n = rng();
This design, admittedly much uglier from an aesthetic point of view, allowed optimization by specializing the variate_generator template for mkl::mt19937:
namespace std { namespace tr1 {
template<>
class variate_generator<mkl::mt19937, std::tr1::normal_distribution<double> > { .... };
} }
A similar customization point is missing in the C++0x design, which prevents the optimized vectorized version from being used.
Suggested resolution:
Add a customization point to the distribution concept. Instead of the variate_generator template this can be done through a call to a free function generate_variate found by ADL instead of operator() of the distribution:
template <class RandomNumberDistribution, class RandomNumberEngine>
typename RandomNumberDistribution::result_type
generate_variate(RandomNumberDistribution const& dist, RandomNumberEngine& eng);
This function can be overloaded for optimized engines like mkl::mt19937.
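A minimal sketch of how such ADL dispatch could look is given below. It is an illustration, not the proposed wording: the namespaces rnd and vendor, the type fast_engine, the counter vendor_calls, and the helper draw are all hypothetical, and the "optimized" overload merely forwards to a wrapped std::mt19937 where a real implementation would call the vendor routine. Note one deviation from the signature above: the distribution is taken by non-const reference, because operator() of the standard distributions is a non-const member function.

```cpp
#include <cassert>
#include <random>

namespace rnd {  // hypothetical home of the generic fallback

// Generic fallback: simply forwards to the distribution's operator().
template <class Distribution, class Engine>
typename Distribution::result_type
generate_variate(Distribution& dist, Engine& eng) {
    return dist(eng);
}

}  // namespace rnd

namespace vendor {  // hypothetical stand-in for a namespace such as mkl

struct fast_engine {
    using result_type = std::mt19937::result_type;
    static constexpr result_type min() { return std::mt19937::min(); }
    static constexpr result_type max() { return std::mt19937::max(); }
    result_type operator()() { return base(); }

    std::mt19937 base{42u};  // stand-in for the vendor generator state
    int vendor_calls = 0;    // counts uses of the overload below
};

// Overload found by ADL because fast_engine lives in this namespace.
// Placeholder body: a real version would call the vendor's optimized routine.
inline double generate_variate(std::normal_distribution<double>& dist,
                               fast_engine& eng) {
    ++eng.vendor_calls;
    return dist(eng.base);
}

}  // namespace vendor

// Caller: the using-declaration makes the fallback visible, while the
// unqualified call lets ADL pick up a more specific overload.
template <class Distribution, class Engine>
typename Distribution::result_type draw(Distribution& dist, Engine& eng) {
    using rnd::generate_variate;
    return generate_variate(dist, eng);
}

// Usage check: the vendor overload is selected for the vendor engine.
inline void demo_dispatch() {
    vendor::fast_engine fast;
    std::normal_distribution<double> normal;
    draw(normal, fast);
    assert(fast.vendor_calls == 1);
}
```

With a plain std::mt19937, or with any distribution other than std::normal_distribution<double>, the call falls back to the generic template, so existing code keeps working unchanged.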
[ 2009-10 Santa Cruz: ]
NAD Future. No time to add this feature for C++0x.
[LEWG Kona 2017]
Recommend NAD: The standard has changed enough that the issue doesn't make sense anymore. Write a paper proposing a way to get this performance as changes to the current library.
[Kona 2019: Jonathan notes:]
Libstdc++ has the following non-standard extensions for more efficient generation of large numbers of random numbers:
template<typename ForwardIterator, typename UniformRandomNumberGenerator>
void __generate(ForwardIterator, ForwardIterator, UniformRandomNumberGenerator&);

template<typename ForwardIterator, typename UniformRandomNumberGenerator>
void __generate(ForwardIterator, ForwardIterator, UniformRandomNumberGenerator&, const param_type&);

template<typename UniformRandomNumberGenerator>
void __generate(result_type*, result_type*, UniformRandomNumberGenerator&, const param_type&);
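As a hedged usage sketch, the three-argument form might be called as a member of a distribution object, as below. This assumes the extension is exposed as a member function of std::normal_distribution in libstdc++; since it is non-standard, the call is guarded by the __GLIBCXX__ macro and a portable loop is used elsewhere. The function name fill_normal is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Fills a vector with n standard-normal variates.
inline std::vector<double> fill_normal(std::size_t n, unsigned seed) {
    std::mt19937 eng(seed);
    std::normal_distribution<double> dist(0.0, 1.0);
    std::vector<double> out(n);
#if defined(__GLIBCXX__)
    // Non-standard libstdc++ extension: generate the whole range in one call.
    dist.__generate(out.begin(), out.end(), eng);
#else
    // Portable fallback: one operator() call per element.
    for (double& x : out) x = dist(eng);
#endif
    return out;
}

// Usage check: the requested number of variates is produced.
inline void demo_fill() {
    std::vector<double> v = fill_normal(1000, 7u);
    assert(v.size() == 1000);
}
```

The two branches are not guaranteed to produce identical values for the same seed, so code that depends on reproducible sequences across standard libraries should not rely on the extension.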
Proposed resolution: