random_number helper method does not behave as advertised

Faker version: Faker 4.0.2
OS: macOS

Docstring for random_number says:

Returns a random number with 1 digit (default, when digits==None), a random number with 0 to given number of digits, or a random number with given number to given number of digits (when fix_len==True).

…but returns an int with many digits (rather than 1) by default, and the fix_len=False default does not seem to work either - it always returns exactly the number of digits requested.

Steps to reproduce

In [1]: from faker import Faker

In [2]: fake = Faker()

In [3]: fake.random_number()
Out[3]: 546872285

In [43]: fake.random_number(digits=30)
Out[43]: 390106228857719308559558465502

In [44]: fake.random_number(digits=30)
Out[44]: 185723852041585261753662702190

In [45]: fake.random_number(digits=30)
Out[45]: 823216888040689890638644594788

etc

Expected behavior

“a random number with one digit” by default (digits=None)

or “a random number with 0 to given number of digits” (fix_len=False)

Actual behavior

See above

Issue Analytics

Created 3 years ago
Comments: 7 (3 by maintainers)

Top GitHub Comments

malefice commented, Apr 14, 2020

@anentropic Yes, no, sort of, but not exactly. It depends on how you expect the results to be distributed. To simplify this discussion, let's only consider the case where fix_len is False. Off the top of my head, here are two possible implementations of this method.

1. You first randomly select the number of digits, and then randomly select a number that has exactly that many digits.
2. You compute the maximum value a number can have based on the given number of digits, and then randomly select a number within that range.
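The two approaches above can be sketched roughly as follows. This is only an illustration using the standard library `random` module, not Faker's actual code; the function names `pick_digits_first` and `pick_from_full_range` are hypothetical, and the second function assumes the approach draws uniformly from 0 up to the largest number with the given number of digits.

```python
import random


def pick_digits_first(max_digits, rng=random):
    """Approach 1: pick a digit count uniformly, then a number with that many digits."""
    n = rng.randint(1, max_digits)
    # 0 is treated as a one-digit number; longer numbers cannot start with 0.
    low = 0 if n == 1 else 10 ** (n - 1)
    return rng.randint(low, 10 ** n - 1)


def pick_from_full_range(max_digits, rng=random):
    """Approach 2: pick uniformly from 0 up to the largest max_digits-digit number."""
    return rng.randint(0, 10 ** max_digits - 1)


if __name__ == "__main__":
    print([pick_digits_first(5) for _ in range(10)])
    print([pick_from_full_range(5) for _ in range(10)])
```

Running it a few times should show the first list mixing short and long numbers, while the second is dominated by five digit numbers.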

Using the 1st implementation, if digits has an integer value X, then for any N <= X, the probability that an N digit number will be generated is 1 / X. However, each individual N digit number only has a (1 / X) * (1 / (9 * 10 ^ (N - 1) + (1 if N == 1 else 0))) chance of being generated. In non-math speak, numbers are fairly represented according to “class”, i.e. the number of digits, but individual numbers themselves are not. As such, individual numbers with fewer digits are over-represented and will appear much more often using this implementation.

Using the 2nd implementation, if the value of digits is X, each number has a (1 / 10 ^ X) chance of being generated regardless of how many digits it has, but the probability that an N digit number (where N <= X) will be generated is (9 * 10 ^ (N - 1) + (1 if N == 1 else 0)) / (10 ^ X). In non-math speak, each number is actually fairly represented, but since there are far more 30 digit numbers than there are 3 digit numbers, by many orders of magnitude, it gives you the illusion that 3 digit numbers are not represented at all.
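To make the formulas above concrete, here is a small sketch that evaluates them exactly with `fractions.Fraction`. The helper names are made up for this illustration, and 0 is counted as a one digit number, matching the `(1 if N == 1 else 0)` term.

```python
from fractions import Fraction


def count_with_digits(n):
    """How many non-negative integers have exactly n digits (0 counts as one digit)."""
    return 9 * 10 ** (n - 1) + (1 if n == 1 else 0)


def single_number_probability_first(n, x):
    """1st implementation: chance of one specific n digit number when digits == x."""
    return Fraction(1, x) * Fraction(1, count_with_digits(n))


def class_probability_second(n, x):
    """2nd implementation: chance that the result has exactly n digits when digits == x."""
    return Fraction(count_with_digits(n), 10 ** x)


if __name__ == "__main__":
    x = 3
    for n in range(1, x + 1):
        print(n, single_number_probability_first(n, x), class_probability_second(n, x))
```

For x = 3, `class_probability_second(3, 3)` comes out as 9/10, while under the 1st implementation a specific one digit number (1/30) is far more likely than a specific three digit number (1/2700).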

The random_number method uses the 2nd implementation I mentioned. If digits == 30, then 90% of the time it will return a 30 digit number, because there are far more 30 digit numbers than all numbers with fewer digits combined. This is what you are experiencing, and you can verify it by generating a ton of samples or by lowering the value of digits to, say, 2. Personally, I am not a fan of this method. For one, the name of the digits argument is misleading; max_digits would have been more apt.
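One way to check the 90% figure without installing anything is to simulate the uniform draw with the standard library. This sketch assumes the method behaves like `randint(0, 10 ** digits - 1)`, as described above, and does not call Faker itself; `share_with_full_length` is a hypothetical helper name.

```python
import random


def share_with_full_length(digits, samples=100_000, seed=0):
    """Estimate how often a uniform draw from 0..10**digits - 1 has exactly `digits` digits.

    Intended for digits >= 2 (for digits == 1, 0 would be wrongly excluded).
    """
    rng = random.Random(seed)
    upper = 10 ** digits - 1
    threshold = 10 ** (digits - 1)  # smallest number with exactly `digits` digits
    hits = sum(1 for _ in range(samples) if rng.randint(0, upper) >= threshold)
    return hits / samples


if __name__ == "__main__":
    for digits in (2, 3, 30):
        print(digits, share_with_full_length(digits))
```

Each printed share should land close to 0.9, whether digits is 2 or 30.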

malefice commented, Apr 14, 2020

“sorry, my mistake!”

No worries. I also thought there was a bug back then when I was rewriting the docstring for that method.

“I had also tried with digits=3 and got a bunch of three digit numbers”

This is to be expected, because there are 900 three digit numbers, 90 two digit numbers, and 10 single digit numbers. With the current implementation, the method will always have a 90% chance of returning an N digit number if digits has a value of N, N > 1, and fix_len is False.
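The counts quoted above are easy to confirm by brute force for small values. A minimal sketch, where `digit_length_counts` is a hypothetical helper and 0 is counted as a single digit number:

```python
from collections import Counter


def digit_length_counts(max_digits):
    """Count the integers in 0..10**max_digits - 1 by their number of digits."""
    return Counter(len(str(n)) for n in range(10 ** max_digits))


if __name__ == "__main__":
    counts = digit_length_counts(3)
    total = sum(counts.values())
    for length in sorted(counts):
        print(length, counts[length], counts[length] / total)
```

For max_digits=3 this gives 10, 90, and 900 numbers with one, two, and three digits, so 900 out of 1000 possible results are three digits long.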

“so it works the second way, as you say”

Like I said, you can also increase the number of samples you generate. If you generate 20 samples, you will have around an 88% chance of generating at least one number that has fewer than N digits. 50 samples will bring this up to around 99.5%.
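Those percentages follow from treating each sample as an independent draw with a 90% chance of having the full N digits, so the chance of seeing at least one shorter number in k samples is 1 - 0.9 ^ k. A quick sketch, with a function name invented for this example:

```python
def chance_of_shorter_number(samples, p_full_length=0.9):
    """Probability that at least one of `samples` independent draws has fewer than N digits."""
    return 1 - p_full_length ** samples


if __name__ == "__main__":
    for k in (1, 20, 50):
        print(k, round(chance_of_shorter_number(k) * 100, 1))
```

This gives about 87.8% for 20 samples and about 99.5% for 50 samples.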
