Re: Odd Division By Zero Error
Paul Khuong <pvk <at> pvk.ca>
2012-07-10 16:33:53 GMT
In article <E1SoKEb-000696-PN <at> hera.math.uni.wroc.pl>,
Waldek Hebisch <hebisch <at> math.uni.wroc.pl> wrote:
> Christophe Rhodes wrote:
> > The issue at hand is whether the "extra" density of floats around 0
> > should be used by the RNG. At first, it seems obvious that it should,
> > because well, why not?
>
> One paradigm for floating point operations is that the operation is
> first performed exactly and after that the result is rounded
> to a representable float.
According to this view, the probability of (random 1.0) returning 1.0
should be positive. In fact, it should be equal to the probability of
the random float being < 2^{-25}. I detect a certain tension with the
spec here.
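
To make the 2^{-25} figure concrete, here is a minimal sketch in Python (used only for illustration; it is not SBCL's implementation). It assumes single floats and round-to-nearest-even, and emulates single-float rounding with the standard `struct` module. The helper name `to_f32` is made up for this example.

```python
import struct

def to_f32(x: float) -> float:
    """Round a Python float (a double) to the nearest single float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

below_one = 1.0 - 2.0**-24   # largest single float below 1.0
midpoint = 1.0 - 2.0**-25    # halfway between below_one and 1.0

# Everything in [midpoint, 1.0) rounds up to 1.0 (the tie goes to the
# even significand, which is the one belonging to 1.0); that interval
# has width 2^-25.
rounds_to_one = [to_f32(midpoint), to_f32(1.0 - 2.0**-30)]

# Just below the midpoint, rounding goes down instead.
rounds_down = to_f32(midpoint - 2.0**-40)

# [0, 2^-25) has the same width, so an exact uniform variate lands
# there with the same probability.
print(rounds_to_one, rounds_down == below_one)
```

Under the "compute exactly, then round" view, an exact uniform on [0, 1) would therefore round to 1.0 with probability 2^{-25}.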
> > On the other hand, imagine a simple use of a RNG
> > to generate samples from a distribution using the CDF and a lookup
> > table: generate a float between 0 and 1 and transform according to the
> > inverse of the CDF. Ignoring for the moment the actual generation of
> > zeros, if the RNG exploits the wide range of floats around 0, the lower
> > tail of the distribution will be much, much more explored than the upper
> > tail, because the floating point resolution around 0 is far greater than
> > it is around 1.
>
> I am not sure what you mean by "more explored".
Many more distinct values in the left tail than in the right one.
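
A rough way to quantify "many more distinct values", assuming IEEE single floats: for non-negative floats the bit pattern increases with the value, so subtracting bit patterns counts the representable values in an interval. The tail width 2^-7 and the helper names below are arbitrary choices for this sketch.

```python
import struct

def f32_bits(x: float) -> int:
    """Bit pattern of x as a single float; increasing in x for x >= 0."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def count_f32(lo: float, hi: float) -> int:
    """Number of single floats in [lo, hi), for 0 <= lo <= hi that are
    both exactly representable as single floats."""
    return f32_bits(hi) - f32_bits(lo)

tail = 2.0**-7
left = count_f32(0.0, tail)          # includes zero and the denormals
right = count_f32(1.0 - tail, 1.0)
print(left, right, left // right)
```

With these assumptions the left tail holds 7680 times as many distinct single floats as a right tail of the same width.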
> When the RNG makes good use of the extra
> absolute precision available close to 0, the tail of the transformed
> distribution is much closer to the true exponential distribution.
Ah, but what happens if I need my extra precision at the other end? Or
what if I'm working with a symmetric PDF? Or, what if my uniform's lower
bound isn't 0?
> Of course, the user may do something stupid, like using log(1 - x)
> with x uniform in (0,1) to generate an exponential distribution. Then
> the extra effort spent close to 0 is wasted.
See antithetic variates (http://en.wikipedia.org/wiki/Antithetic_variates).
Using 1 - x alongside x is not stupid, but *useful*; sophisticated, one
might even say.
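
As a minimal sketch of the technique (in Python, with made-up function names and an arbitrary seed), the estimator below pairs each uniform u with 1 - u when sampling an exponential by inversion. Both -log(u) and -log(1 - u) are evaluated, so precision near 1 matters exactly as much as precision near 0.

```python
import math
import random

def uniform_open(rng: random.Random) -> float:
    """Uniform variate in (0, 1): reject the rare exact 0.0 so that
    both log(u) and log(1 - u) are finite."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return u

def exp_mean_antithetic(n_pairs: int, seed: int = 1) -> float:
    """Estimate E[X] for X ~ Exp(1) from antithetic pairs (u, 1 - u)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_pairs):
        u = uniform_open(rng)
        # The two samples are negatively correlated, which lowers the
        # variance of their average.
        total += -math.log(u) - math.log(1.0 - u)
    return total / (2 * n_pairs)

estimate = exp_mean_antithetic(50_000)
print(estimate)  # should be close to the true mean, 1.0
```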
> Given the above I think that RNG which makes use of extra precision
> around 0 is better than one which does not.
I like the current behaviour for two reasons: it's simple to explain and
to reason about (we generate random fractions with a fixed denominator,
and express them as floats), and it's the most common way to do it.
Simplicity is important to me because, as I noted above, getting small
values around 0 right isn't enough: the exact same problems, modulo
trivial differences, still crop up at the other end of the range, or
when the lower bound isn't 0. The gains in correctness for intuitive
code are only partial, and contingent on avoiding intuitively
equivalent reformulations.
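
The "random fractions with a fixed denominator" description above can be sketched as follows. This is an illustration in Python, not SBCL's code; the 24-bit width is an assumption matching the significand of a single float, and the names are hypothetical.

```python
import random

BITS = 24                 # assumed: significand width of a single float
DENOM = 1 << BITS

def fixed_denominator_uniform(rng: random.Random) -> float:
    """Return k / 2^24 for a uniform integer k in [0, 2^24)."""
    return rng.getrandbits(BITS) / DENOM

rng = random.Random(2012)
samples = [fixed_denominator_uniform(rng) for _ in range(10_000)]

# Every sample sits on the grid of multiples of 2^-24, inside [0, 1).
on_grid = all((s * DENOM).is_integer() for s in samples)
in_range = all(0.0 <= s < 1.0 for s in samples)

# The smallest nonzero value is 2^-24 and the largest is 1 - 2^-24,
# so both ends of the interval get the same absolute resolution.
print(on_grid, in_range, 1 / DENOM, (DENOM - 1) / DENOM)
```

The grid is the same near 0 and near 1, which is what makes u and 1 - u interchangeable in this scheme.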
This plays into the second reason: AFAICT, it's by far the most common way
to generate uniforms (either the U[1, 2)-1 trick, or by taking
exactly-represented integers *and scaling them by a reciprocal*). Any
issue with this way of doing things is common and language independent,
and workarounds are likely to be known. Intuitively, any solution would
still be applicable when generating tiny values; realistically, I know
better than to trust my intuition here. I could certainly believe that
there are strange side-effects on statistics when using these variates
in stochastic simulations. Either way, someone who cares about that
ought to be basing their experiments on well-tested methods, and these
methods will most likely have been tested with a fixed-precision uniform
generator.
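
For reference, the two common constructions mentioned above might look like this for single floats. This is a sketch under the assumption of IEEE single-float layout; the function names are invented for the example, and Python's `struct` module stands in for direct bit manipulation.

```python
import random
import struct

def uniform_via_one_two(rng: random.Random) -> float:
    """U[1, 2) - 1: fill the 23 explicit significand bits of a single
    float whose exponent field encodes [1, 2), then subtract 1."""
    bits = 0x3F800000 | rng.getrandbits(23)
    one_to_two = struct.unpack("<f", struct.pack("<I", bits))[0]
    return one_to_two - 1.0

def uniform_via_reciprocal(rng: random.Random) -> float:
    """Scale an exactly represented 24-bit integer by 2^-24."""
    return float(rng.getrandbits(24)) * 2.0**-24

rng = random.Random(2012)
a = [uniform_via_one_two(rng) for _ in range(10_000)]
b = [uniform_via_reciprocal(rng) for _ in range(10_000)]

# The first lands on multiples of 2^-23, the second on multiples of
# 2^-24; neither can return 1.0 or anything in (0, 2^-24).
print(min(a), max(a), min(b), max(b))
```

Either way the output is a fixed-precision uniform, which is the kind of generator the well-tested methods are most likely to have been validated against.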
Oh, extra data point: none of the test suites that I know of (diehard,
dieharder, TestU01) attempts to detect that behaviour. Still, I should
be back in Montreal very soon, and I'll try to ask some stochastic
simulation people.
Paul Khuong