Section: 29.5.9.6.1 [rand.dist.samp.discrete] Status: NAD Submitter: Stephan Tolksdorf Opened: 2007-09-21 Last modified: 2016-01-28
Priority: Not Prioritized
Discussion:
a) discrete_distribution requires the member probabilities() to return a vector of standardized probabilities, which forces the implementation to divide each probability by the sum of all probabilities every time, as the sum will in practice almost never be exactly 1.0. This is unnecessarily inefficient, as the implementation would otherwise not need to compute the standardized probabilities at all and could instead work with the non-standardized probabilities and their sum. If there were no standardization, the user would just get back the probabilities that were previously supplied to the distribution object, which to me seems to be the more obvious solution.
b) discrete_distribution is not specified in case the number of given probabilities is larger than the maximum number representable by the IntType.
Possible resolution: I propose to change the specification such that the non-standardized probabilities need to be returned and that an additional requirement is included for the number of probabilities to be smaller than the maximum of IntType.
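The standardization described above can be observed directly. The following is a minimal sketch, assuming a conforming std::discrete_distribution<int>; the helper names reported_probabilities and sum_of are made up for illustration and are not part of any library.

```cpp
#include <cassert>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper: returns what std::discrete_distribution reports
// through probabilities() for the given weights.
std::vector<double> reported_probabilities(const std::vector<double>& weights) {
    assert(!weights.empty());  // stay away from the empty-range special case
    std::discrete_distribution<int> dist(weights.begin(), weights.end());
    return dist.probabilities();
}

// Hypothetical helper: sum of a vector of doubles.
double sum_of(const std::vector<double>& v) {
    return std::accumulate(v.begin(), v.end(), 0.0);
}
```

With weights such as {1.0, 2.0, 3.0, 4.0}, the returned vector is expected to sum to approximately 1 and therefore to differ from the vector that was passed in, which is the behavior the submitter objects to.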
[ Stephan Tolksdorf adds pre-Bellevue: ]
In reply to the discussion in N2424 of this issue:
Rescaled floating-point parameter vectors cannot be expected to compare equal because of the limited precision of floating-point numbers. My proposal would at least guarantee that a parameter vector (of type double) passed into the distribution would compare equal with the one returned by the probabilities() method. Furthermore, I do not understand why "the changed requirement would lead to a significant increase in the amount of state in the distribution object". A typical implementation's state would increase by exactly one number: the sum of all probabilities. The textual representation for serialization would not need to grow at all. Finally, the proposed replacement "0 < n <= numeric_limits<IntType>::max() + 1" makes the implementation unnecessarily complicated; "0 < n <= numeric_limits<IntType>::max()" would be better.
[ Bellevue: ]
In N2424. We agree with the observation and the proposed resolution to part b). We recommend the wording n > 0 be replaced with 0 < n <= numeric_limits<IntType>::max() + 1. However, we disagree with part a), as it would interfere with the definition of parameters' equality. Further, the changed requirement would lead to a significant increase in the amount of state of the distribution object.
As it stands now, it is convenient, and the changes proposed make it much less so.
NAD. Part a): the current behavior is desirable. Part b): any constructor can fail, but the rules under which it can fail do not need to be listed here.
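The disagreement over part a) turns on how much state an implementation would need to carry. The following is a minimal sketch of the submitter's side of that argument, assuming an implementation that keeps the supplied weights plus their sum. The class raw_weight_distribution and its members are hypothetical and do not reflect any actual library implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Hypothetical sketch, not a standard library type: stores the weights as
// supplied plus one extra number, their sum.
class raw_weight_distribution {
public:
    explicit raw_weight_distribution(std::vector<double> weights)
        : weights_(std::move(weights)),
          sum_(std::accumulate(weights_.begin(), weights_.end(), 0.0)) {
        assert(!weights_.empty() && sum_ > 0.0);
    }

    // Returns the weights exactly as supplied, with no rescaling.
    const std::vector<double>& weights() const { return weights_; }
    double sum() const { return sum_; }

    template <class URBG>
    int operator()(URBG& g) const {
        // Draw a point in [0, sum) and walk the weights until it is covered.
        std::uniform_real_distribution<double> u(0.0, sum_);
        double x = u(g);
        for (std::size_t k = 0; k + 1 < weights_.size(); ++k) {
            if (x < weights_[k]) return static_cast<int>(k);
            x -= weights_[k];
        }
        return static_cast<int>(weights_.size() - 1);
    }

private:
    std::vector<double> weights_;  // declared first so it is initialized before sum_
    double sum_;
};
```

In this sketch the vector handed to the constructor compares equal to the one returned by weights(), since nothing is divided. The linear scan is chosen only for brevity; it says nothing about how a production implementation would sample.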
Proposed resolution:
See N2424 for the proposed resolution.
[ Stephan Tolksdorf adds pre-Bellevue: ]
In 29.5.9.6.1 [rand.dist.samp.discrete]:
Proposed wording a):
Change in para. 2:
Constructs a discrete_distribution object with n = 1 and p0 = w0 = 1.
and change in para. 5:
Returns: A vector<double> whose size member returns n and whose operator[] member returns the weight wk (replacing pk) as a double value when invoked with argument k for k = 0, ..., n-1.
Proposed wording b):
Change in para. 3:
If firstW == lastW, let the sequence w have length n = 1 and consist of the single value w0 = 1. Otherwise, [firstW,lastW) shall form a sequence w of length n such that 0 < n <= numeric_limits<IntType>::max() (replacing "n > 0"), and *firstW shall yield a value w0 convertible to double. [Note: The values wk are commonly known as the weights. -- end note]
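The length requirement in proposed wording b) could be checked along the following lines. This is only a sketch: weight_count_ok and require_weight_count are hypothetical helpers, and the standard does not require an implementation to perform such a check. The limit is widened to std::uintmax_t before comparing, so the comparison does not depend on the width of IntType. The alternative bound with "+ 1" would need extra care if evaluated in IntType, since numeric_limits<IntType>::max() + 1 is not representable in IntType itself, which is presumably part of what the submitter means by "unnecessarily complicated".

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: true when n satisfies
// 0 < n <= numeric_limits<IntType>::max().
template <class IntType>
bool weight_count_ok(std::uintmax_t n) {
    static_assert(std::numeric_limits<IntType>::is_integer,
                  "IntType must be an integer type");
    // Widen the limit so the comparison is done in one unsigned type.
    const std::uintmax_t limit =
        static_cast<std::uintmax_t>(std::numeric_limits<IntType>::max());
    return n > 0 && n <= limit;
}

// Hypothetical precondition check built on the helper above.
template <class IntType>
void require_weight_count(std::uintmax_t n) {
    assert(weight_count_ok<IntType>(n));
}
```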