Use consistent random number generation across hardware
Is your feature request related to a problem? Please describe.
torch.randn is not consistent across hardware devices (see https://github.com/pytorch/pytorch/issues/84234). diffusers calls torch.randn on the device the computation runs on (typically ‘cuda’). As a result, results produced with the exact same parameters will differ across machines.
Describe the solution you’d like
Until the issue is resolved in pytorch itself, diffusers should use a deterministic RNG so results can be consistent across hardware. One possible workaround is to keep using torch.randn while forcing generation to happen on the cpu, which currently seems consistent no matter the hardware. Here is an example solution:
import torch


def randn(size, generator=None, device=None, **kwargs):
    """
    Wrapper around torch.randn providing proper reproducibility.

    Generation is done on the given generator's device, then the result is
    moved to the given ``device``.

    Args:
        size: tensor size
        generator (torch.Generator): RNG generator
        device (torch.device): target device for the resulting tensor
    """
    # FIXME: the generator's RNG device is ignored by torch.randn, so it has
    # to be passed explicitly as the ``device`` argument (torch issue #62451)
    rng_device = generator.device if generator is not None else device
    image = torch.randn(size, generator=generator, device=rng_device, **kwargs)
    image = image.to(device=device)
    return image


def randn_like(tensor, generator=None, **kwargs):
    # Match the input tensor's shape, layout and dtype, but draw the values on
    # the generator's (typically cpu) device before moving them to tensor.device.
    return randn(tensor.shape, layout=tensor.layout, dtype=tensor.dtype, generator=generator, device=tensor.device, **kwargs)
Calling these functions instead of the torch ones, with a generator whose device is cpu, gives deterministic results and still allows the rest of the computation to run on cuda. This would also simplify and speed up all the tests, which can simply use cpu-bound generators and leave device set to cuda even for those relying on RNG. A usage example is sketched below.
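For illustration, usage in a pipeline could look roughly like this (the seed and tensor shape are arbitrary, and a CUDA device is assumed to be available):

# The generator is CPU-bound, so the noise values are identical on every
# machine, while the resulting tensors live on cuda for the actual compute.
generator = torch.Generator(device="cpu").manual_seed(0)
latents = randn((1, 4, 64, 64), generator=generator, device=torch.device("cuda"))
noise = randn_like(latents, generator=generator)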
Describe alternatives you’ve considered
It’s also possible to switch to numpy’s RNG, which is deterministic across hardware. The above solution is more torch-native.
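For completeness, the numpy-based alternative would look roughly like this sketch (shape and seed are arbitrary; the final .to() assumes a CUDA device is available):

import numpy as np
import torch

# Draw the noise with numpy's Generator, which is bit-exact across machines,
# then hand it to torch and move it to the compute device afterwards.
rng = np.random.default_rng(seed=0)
noise = torch.from_numpy(rng.standard_normal(size=(1, 4, 64, 64), dtype=np.float32))
noise = noise.to(device="cuda")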
Top GitHub Comments
Reproducibility/determinism is not just a "nice thing", it’s vitally important for multiple reasons.
Agree with Patrick here, IMO it’s better to add a method which can enable reproducibility. I’m usually hesitant to create such wrappers around framework functions as it might make it a bit harder to go through the code. For example, if we replace torch.randn with a custom randn, users going through the code might wonder whether something different is happening here.
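For illustration only, a minimal sketch of the "explicit method" idea mentioned above; the name enable_reproducible_rng is hypothetical and not an existing diffusers API:

import torch

def enable_reproducible_rng(seed: int) -> torch.Generator:
    # Hypothetical helper: return a CPU-bound generator so sampling is
    # identical across hardware; callers pass it explicitly to the pipeline
    # instead of diffusers wrapping torch.randn internally.
    return torch.Generator(device="cpu").manual_seed(seed)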