breaking hl.rand_unif via CSE

It shouldn’t be terribly surprising that rand_unif has weird behavior, but here’s a case that is definitely The Wrong Thing:

Python 3.6.0 |Continuum Analytics, Inc.| (default, Dec 23 2016, 13:19:00)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.6.1 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import hail as hl

In [2]: r = hl.rand_unif(0, 1)

In [3]: hl.eval(r)
Out[3]: 0.5387579341676381

In [4]: hl.eval(hl.tuple([r, r]))
Out[4]: (0.5387579341676381, 0.5387579341676381)

okay, this makes sense because they have the same seed:

In [5]: print(hl.tuple([r, r])._ir)
(MakeTuple (0 1)  (ApplySeeded rand_unif 806694938962853089 Float64  (Apply toFloat64 Float64  (I32 0)) (Apply toFloat64 Float64  (I32 1))) (ApplySeeded rand_unif 806694938962853089 Float64  (Apply toFloat64 Float64  (I32 0)) (Apply toFloat64 Float64  (I32 1))))

how about this:

In [6]: hl.eval(hl.range(2).map(lambda x: r))
Out[6]: [0.5387579341676381, 0.9394799645512691]

odd. but maybe rand_unif inside an iteration has some semantics for advancing the RNG (like an aggregation).

In [7]: p = 1 - r

In [8]: hl.eval(hl.range(2).map(lambda x: p))
Out[8]: [0.46124206583236194, 0.06052003544873086]

ok…

In [9]: hl.eval((p, hl.range(2).map(lambda x: p)))
Out[9]: (0.46124206583236194, [0.46124206583236194, 0.46124206583236194])

wtf?

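if you look in the logs, it’s explained by the fact that only the final IR triggers CSE (common subexpression elimination), which hoists the single `1 - rand_unif` node into a Let that is evaluated once and then referenced from both the tuple and the array map: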

(Let __cse_1
  (ApplyBinaryPrimOp Subtract
    (ApplyIR toFloat64 Float64
      (I32 1))
    (ApplySeeded rand_unif 806694938962853089 Float64
      (ApplyIR toFloat64 Float64
        (I32 0))
      (ApplyIR toFloat64 Float64
        (I32 1))))
  (MakeTuple (0 1)
    (Ref __cse_1)
    (ArrayMap __uid_5
      (ArrayRange
        (I32 0)
        (I32 2)
        (I32 1))
      (Ref __cse_1))))
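
To make the effect concrete, here is a toy model in plain Python. It is not Hail's implementation: the node classes are hypothetical, and the rule "each occurrence of a seeded node owns its own RNG stream, and each evaluation advances it" is only an assumption inferred from Out[4] and Out[6] above. The numbers come from Python's `random` module, so they will not match Hail's output; only the pattern matters.

```python
import random

SEED = 806694938962853089  # seed copied from the IR printed above


class Rand:
    """Toy seeded node: each occurrence in the tree owns its own RNG stream."""
    def __init__(self, seed):
        self.rng = random.Random(seed)

    def eval(self, env):
        return self.rng.random()  # every evaluation advances this occurrence


class OneMinus:
    def __init__(self, child):
        self.child = child

    def eval(self, env):
        return 1 - self.child.eval(env)


class MapRange:
    """Evaluates `body` once per element of range(n)."""
    def __init__(self, n, body):
        self.n, self.body = n, body

    def eval(self, env):
        return [self.body.eval(env) for _ in range(self.n)]


class MakeTuple:
    def __init__(self, *items):
        self.items = items

    def eval(self, env):
        return tuple(item.eval(env) for item in self.items)


class Let:
    """Evaluates `value` exactly once and binds the result to `name`."""
    def __init__(self, name, value, body):
        self.name, self.value, self.body = name, value, body

    def eval(self, env):
        return self.body.eval({**env, self.name: self.value.eval(env)})


class Ref:
    def __init__(self, name):
        self.name = name

    def eval(self, env):
        return env[self.name]


def p():
    # Builds a fresh occurrence of `1 - rand_unif` with the shared seed.
    return OneMinus(Rand(SEED))


# Tree as written by the user: two separate occurrences of p.
without_cse = MakeTuple(p(), MapRange(2, p())).eval({})

# Tree after the CSE-style rewrite: p is hoisted into a Let and referenced twice.
with_cse = Let(
    "__cse_1", p(),
    MakeTuple(Ref("__cse_1"), MapRange(2, Ref("__cse_1"))),
).eval({})

print(without_cse)  # shape (a, [a, b]): the map advances its own stream
print(with_cse)     # shape (a, [a, a]): the Let froze the value
```

In this sketch the rewrite changes the result because the random node is not a pure expression: evaluating it once and reusing the value is not equivalent to evaluating it per element.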

Issue Analytics

Created 4 years ago
Comments: 5 (1 by maintainers)

Top GitHub Comments

tpoterba commented, Nov 26, 2019

I agree. It’s extremely hard for us to provide these semantics though – we’d need to introspect on the Python AST of the lambda

catoverdrive commented, Dec 5, 2019

so just as an aside, the behavior of all the random functions is actually documented in the hail docs:

https://hail.is/docs/0.2/functions/random.html

The one thing that I apparently didn’t write up is how it’s supposed to behave in an array context. The way it’s currently intended to work is as executed in:

In [8]: hl.eval(hl.range(2).map(lambda x: p))
Out[8]: [0.46124206583236194, 0.06052003544873086]

since, as Tim points out, it’s kind of hard to differentiate the scope of hl.range(2).map(lambda x: p) and hl.range(2).map(lambda x: hl.rand_unif(0, 1)) if p is only used once.

(This behavior is also pretty consistent with treating iteration through an array value as the same as iteration through an axis of a table or a matrix table, since that’s exactly what happens there.)

The way you’d use the same value of a random number in an array map, previously, was with bind:

hl.bind(lambda p: hl.range(2).map(lambda x: p), hl.rand_unif(0, 1))

I’m not sure what a bind-free answer to this would be short of implementing a way of differentiating between:

p = hl.rand_unif(0, 1)
hl.range(2).map(lambda x: p)

and

hl.range(2).map(lambda x: hl.rand_unif(0, 1))

which would be nice, but also definitely a breaking change.
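
A rough sketch of why those two forms are hard to tell apart, using hypothetical stand-in classes rather than anything from Hail. It assumes the expression-building `map` calls the lambda once with a symbolic placeholder and keeps whatever node comes back; `RandNode`, `Placeholder`, and `trace_map` are invented names for illustration only.

```python
import itertools

_seeds = itertools.count(1)  # hypothetical seed source: one new seed per node


class RandNode:
    """Stand-in for a seeded random expression node."""
    def __init__(self):
        self.seed = next(_seeds)


class Placeholder:
    """Symbolic element handed to the lambda in place of real data."""


def trace_map(f):
    # The builder calls the lambda once and only sees the returned node.
    return f(Placeholder())


p = RandNode()
captured = trace_map(lambda x: p)          # node created before the map
inline = trace_map(lambda x: RandNode())   # node created inside the lambda

# From the returned value alone, both bodies are a bare RandNode that never
# mentions the placeholder; nothing in the node says where it was created.
print(type(captured).__name__, type(inline).__name__)
```

In this toy, the builder gets the same kind of object back in both cases, so telling them apart would need extra information, such as looking at the lambda's source, which is the difficulty described above.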

Edit: I’m really annoyed with myself because I remember writing something up with the array semantics and bind, but I don’t remember where I put them because they apparently never made it into the docs.