breaking hl.rand_unif via CSE
It shouldn’t be terribly surprising that rand_unif
has weird behavior, but here’s a case that is definitely The Wrong Thing:
Python 3.6.0 |Continuum Analytics, Inc.| (default, Dec 23 2016, 13:19:00)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.6.1 -- An enhanced Interactive Python. Type '?' for help.
In [1]: import hail as hl
In [2]: r = hl.rand_unif(0, 1)
In [3]: hl.eval(r)
Out[3]: 0.5387579341676381
In [4]: hl.eval(hl.tuple([r, r]))
Out[4]: (0.5387579341676381, 0.5387579341676381)
okay, this makes sense because they have the same seed:
In [5]: print(hl.tuple([r, r])._ir)
(MakeTuple (0 1) (ApplySeeded rand_unif 806694938962853089 Float64 (Apply toFloat64 Float64 (I32 0)) (Apply toFloat64 Float64 (I32 1))) (ApplySeeded rand_unif 806694938962853089 Float64 (Apply toFloat64 Float64 (I32 0)) (Apply toFloat64 Float64 (I32 1))))
how about this:
In [6]: hl.eval(hl.range(2).map(lambda x: r))
Out[6]: [0.5387579341676381, 0.9394799645512691]
odd. but maybe rand_unif inside an iteration has some semantics for advancing the RNG (like an aggregation).
In [7]: p = 1 - r
In [8]: hl.eval(hl.range(2).map(lambda x: p))
Out[8]: [0.46124206583236194, 0.06052003544873086]
ok…
In [9]: hl.eval((p, hl.range(2).map(lambda x: p)))
Out[9]: (0.46124206583236194, [0.46124206583236194, 0.46124206583236194])
wtf?
if you look in the logs, it’s explained by the fact that only the final IR triggers CSE:
(Let __cse_1
  (ApplyBinaryPrimOp Subtract
    (ApplyIR toFloat64 Float64
      (I32 1))
    (ApplySeeded rand_unif 806694938962853089 Float64
      (ApplyIR toFloat64 Float64
        (I32 0))
      (ApplyIR toFloat64 Float64
        (I32 1))))
  (MakeTuple (0 1)
    (Ref __cse_1)
    (ArrayMap __uid_5
      (ArrayRange
        (I32 0)
        (I32 2)
        (I32 1))
      (Ref __cse_1))))
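To make the mechanism concrete, here is a toy model in plain Python. It is a sketch, not Hail's implementation: the tuple-based IR, the node names, `evaluate`, and `cse` are all made up for illustration, and it assumes that each occurrence of a seeded node keeps its own RNG state, so evaluating the same node again inside a map advances it. Under those assumptions, hoisting a repeated subtree into a let changes what the map returns.

```python
import random

# Toy IR as nested tuples. Node names are hypothetical, not Hail's real IR:
#   ("const", v)  ("rand", seed)  ("sub", a, b)  ("tuple", a, ...)
#   ("map", n, body)  ("let", name, value, body)  ("ref", name)

def evaluate(node, env=None, rngs=None, path=()):
    """Evaluate a toy IR tree; each "rand" occurrence keeps its own seeded RNG."""
    env = {} if env is None else env
    rngs = {} if rngs is None else rngs
    op = node[0]
    if op == "const":
        return node[1]
    if op == "rand":
        # Assumption: RNG state is per node occurrence (keyed by tree position),
        # so re-evaluating the same node advances it.
        if path not in rngs:
            rngs[path] = random.Random(node[1])
        return rngs[path].random()
    if op == "sub":
        return (evaluate(node[1], env, rngs, path + (1,))
                - evaluate(node[2], env, rngs, path + (2,)))
    if op == "tuple":
        return tuple(evaluate(c, env, rngs, path + (i,))
                     for i, c in enumerate(node[1:], 1))
    if op == "map":
        # The body is evaluated once per element.
        return [evaluate(node[2], env, rngs, path + (2,)) for _ in range(node[1])]
    if op == "let":
        # The bound value is evaluated exactly once, then reused by every ref.
        value = evaluate(node[2], env, rngs, path + (2,))
        return evaluate(node[3], {**env, node[1]: value}, rngs, path + (3,))
    if op == "ref":
        return env[node[1]]
    raise ValueError(f"unknown node: {op}")

def size(node):
    """Number of nodes in a subtree."""
    return 1 + sum(size(c) for c in node[1:] if isinstance(c, tuple))

def cse(root):
    """Naive CSE: hoist the largest subtree that occurs more than once into a let."""
    counts = {}

    def count(node):
        if not isinstance(node, tuple):
            return
        counts[node] = counts.get(node, 0) + 1
        for child in node[1:]:
            count(child)

    count(root)
    repeated = [n for n, c in counts.items()
                if c > 1 and n[0] not in ("const", "ref")]
    if not repeated:
        return root  # nothing is shared, so the tree is left alone
    target = max(repeated, key=size)

    def replace(node):
        if node == target:
            return ("ref", "__cse_1")
        if not isinstance(node, tuple):
            return node
        return (node[0],) + tuple(replace(c) for c in node[1:])

    return ("let", "__cse_1", target, replace(root))

p = ("sub", ("const", 1.0), ("rand", 806694938962853089))
map_only = ("map", 2, p)
tuple_and_map = ("tuple", p, ("map", 2, p))

map_result = evaluate(cse(map_only))        # p occurs once: no rewrite
no_cse_result = evaluate(tuple_and_map)     # the map body still advances its RNG
cse_result = evaluate(cse(tuple_and_map))   # p occurs twice: hoisted into a let
```

In this model `map_result` holds two different values, while `cse_result` repeats a single value in all three positions, which is the same shape as Out[8] versus Out[9] above. The rewrite is only valid if the hoisted subtree is pure, and a seeded draw that advances on re-evaluation is not.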
Issue Analytics
- State:
- Created 4 years ago
- Comments:5 (1 by maintainers)
I agree. It’s extremely hard for us to provide these semantics though – we’d need to introspect on the Python AST of the lambda
so just as an aside, the behavior of all the random functions is actually documented in the hail docs:
https://hail.is/docs/0.2/functions/random.html
The one thing that I apparently didn’t write up is how it’s supposed to behave in an array context. The way it’s currently intended to work is as executed in:
since, as Tim points out, it’s kind of hard to differentiate the scope of
hl.range(2).map(lambda x: p)
and
hl.range(2).map(lambda x: hl.rand_unif(0, 1))
if p is only used once. (This behavior is also pretty consistent with treating iteration through an array value as the same as iteration through an axis of a table or a matrix table, since that’s exactly what happens there.)
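A small plain-Python sketch of why the two are hard to tell apart. This is an illustration, not Hail's code: `rand_unif`, `array_map`, and `shape` here are hypothetical stand-ins, and the sketch assumes the library builds the map body by calling the lambda once with a placeholder argument. Under that assumption, the library only ever sees the node the lambda returned, not where that node was created.

```python
import itertools

_seeds = itertools.count(1)  # stand-in for the seed picked on each call

def rand_unif():
    """Build a toy IR node for a seeded draw (hypothetical, not Hail's API)."""
    return ("rand", next(_seeds))

def array_map(n, f):
    """Trace f once with a placeholder argument and keep only the body it returns."""
    body = f(("ref", "x"))
    return ("map", n, body)

def shape(node):
    """Erase seeds so that only the tree structure remains."""
    if node[0] == "rand":
        return ("rand",)
    return (node[0],) + tuple(shape(c) if isinstance(c, tuple) else c
                              for c in node[1:])

p = rand_unif()                                # built once, outside the lambda
captured = array_map(2, lambda x: p)           # lambda closes over an existing node
inline = array_map(2, lambda x: rand_unif())   # lambda builds a fresh node
```

Both `captured` and `inline` end up as a map over a single seeded node, differing only in the seed value. Telling them apart would take something beyond the returned expression, such as inspecting the lambda itself.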
The way you’d use the same value of a random number in an array map, previously, was with bind:
I’m not sure what a bind-free answer to this would be short of implementing a way of differentiating between:
and
which would be nice, but also definitely a breaking change.
Edit: I’m really annoyed with myself because I remember writing something up about the array semantics and bind, but I don’t remember where I put it, because it apparently never made it into the docs.