Different results for NUTS on different systems

Bug Description

Dear Rémi,

A colleague and I noticed that the BlackJAX NUTS kernel returns numerically different results on our two systems. The differences look tiny at each step, but we found that they add up over a long chain. He's on macOS, and I'm using Fedora. For the sake of reproducibility, shouldn't the results be exactly the same? Do you have any idea what might cause the differences?

Best, Hannes

Steps/Code to Reproduce

import blackjax.nuts as nuts
import jax
import jax.numpy as jnp
import numpy as np

n = 30
rng = np.random.default_rng(1337)
x = rng.normal(0.0, 1.0, size=n)


def log_prob(loc):
    return jnp.sum(jax.scipy.stats.norm.logpdf(x, loc, 1.0))


state = nuts.new_state(0.0, log_prob)
kernel = nuts.kernel(log_prob, step_size=0.2, inverse_mass_matrix=jnp.array([1.0]))
kernel = jax.jit(kernel)

m = 10
key = jax.random.PRNGKey(1337)
key = jax.random.split(key, m)

if __name__ == "__main__":
    for i in range(m):
        state, info = kernel(key[i], state)
        print(f"[iter {i}] position = {state.position.item()}")
        print(f"[iter {i}] acc prob = {info.acceptance_probability.item()}")

Expected Results

Exactly the same results on both systems.

Actual Results

Differences: https://www.diffchecker.com/dj3ZKduJ

Versions

His system: MacBook Pro (Retina, 13 inch, late 2013), macOS 11.6.4, Python 3.10.1, JAX 0.3.1, jaxlib 0.3.0, BlackJAX 0.3.0

My system: Dell Latitude 5420, Fedora 35, Python 3.10.2, JAX 0.3.1, jaxlib 0.3.0, BlackJAX 0.3.0

Top GitHub Comments

wiep commented, Feb 23, 2022

Using Python 3.10.2 on the MacBook Pro, I see the same results as before. I also tried running the snippet on another more recent MacBook Pro. Same results.

However, when I run the snippet on Colab, the results differ from all the others. On Colab, CPU and GPU results also differ from each other.

Using double precision, I can still observe differences, although they are smaller (see the screenshots below).
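For readers who want to repeat the double-precision run: the thread does not show how it was enabled, so the following is only a minimal sketch, assuming the standard JAX `jax_enable_x64` configuration flag was used. The flag should be set at startup, before any arrays are created.

```python
import jax

# Assumption: double precision is switched on via JAX's global x64 flag.
# Set it before creating any arrays, ideally at the top of the script.
jax.config.update("jax_enable_x64", True)

import jax.numpy as jnp

# With the flag set, Python floats become 64-bit arrays instead of 32-bit.
x = jnp.array(0.0)
print(x.dtype)
```

With this in place, the reproduction script above can be rerun unchanged; as reported in the comment, the cross-machine differences get smaller but do not disappear.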

It might be just a coincidence that the two MacBooks produce the same results; in general, one cannot rely on bitwise reproducibility across different machines.
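The thread does not pin down the cause. One common source of such differences (an assumption here, not something confirmed for this issue) is that floating-point addition is not associative, so different hardware or compiler backends that reorder the same arithmetic can produce slightly different results. A plain-Python illustration, unrelated to BlackJAX itself:

```python
# Floating-point addition is not associative: grouping changes the result.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # one evaluation order
right = a + (b + c)  # the same sum, grouped differently

print(left == right)      # False
print(abs(left - right))  # a gap on the order of 1e-16
```

A gap of this size is harmless in a single operation, but a sampler feeds each state into the next step, which is consistent with the report that small differences accumulate over a long chain.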

[Screenshots: colab_gpu_cpu_64, mac_colab_64]

rlouf commented, Mar 28, 2022

Yes, that’s a shame, but you can also see it as the ultimate test for the quality of your samples: if your colleague can’t reproduce the results, you are probably doing it wrong 😃

Closing as it seems there is nothing we can do about it.