random_number helper method does not behave as advertised
- Faker version: Faker 4.0.2
- OS: macOS
The docstring for `random_number` says:
“Returns a random number with 1 digit (default, when digits==None), a random number with 0 to given number of digits, or a random number with given number to given number of digits (when fix_len==True).”
…but it returns an int with many digits (rather than 1) by default, and the `fix_len=False` default does not seem to work either: it always returns exactly the number of digits requested.
Steps to reproduce
In [1]: from faker import Faker
In [2]: fake = Faker()
In [3]: fake.random_number()
Out[3]: 546872285
In [43]: fake.random_number(digits=30)
Out[43]: 390106228857719308559558465502
In [44]: fake.random_number(digits=30)
Out[44]: 185723852041585261753662702190
In [45]: fake.random_number(digits=30)
Out[45]: 823216888040689890638644594788
etc
Expected behavior
“a random number with one digit” by default (`digits=None`)
or “a random number with 0 to given number of digits” (`fix_len=False`)
Actual behavior
See above
Issue Analytics
- State:
- Created 3 years ago
- Comments: 7 (3 by maintainers)
Top GitHub Comments
@anentropic yes, no, sort of, but not exactly. It depends on your expectation of how results should be distributed. To simplify this discussion, we only tackle the case where `fix_len` is `False`. Off the top of my head, here are two possible implementations of this method.

1. Randomly select a number of digits up to `digits`, and then randomly select a number that matches the required number of digits.
2. Given `digits`, compute the maximum value a number can be based on the number of digits selected, and then randomly select a number within that range.

Using the 1st implementation, if `digits` has an integer value `X`, then if `N <= X`, the probability that an `N` digit number will be generated is `1 / X`. However, each `N` digit number only has a `(1 / X) * (1 / (9 * 10 ^ (N - 1) + (1 if N == 1 else 0)))` chance of being generated. In non-math speak, numbers are fairly represented according to “class”, i.e. the number of digits, but individual numbers themselves are not. As such, numbers with fewer digits are over-represented and will appear much more often using this implementation.

Using the 2nd implementation, if the value of `digits` is `X`, each number has a `(1 / 10 ^ X)` chance of being generated regardless of the number of digits, but the probability that an `N` digit number (where `N <= X`) will be generated is `(9 * 10 ^ (N - 1) + (1 if N == 1 else 0)) / (10 ^ X)`. In non-math speak, each number is actually fairly represented, but since there are far more 30 digit numbers than there are 3 digit numbers, by several orders of magnitude, it gives you the illusion that 3 digit numbers are not represented at all.

The `random_number` method uses the 2nd implementation I mentioned. If `digits == 30`, then 90% of the time it will return a 30 digit number, by virtue of there being far more 30 digit numbers than all numbers with fewer digits combined. This is what you are experiencing, and you can verify it by generating a crap ton of samples or by lowering the value of `digits` to, say, `2`. Personally, I am not a fan of this method. One reason: the `digits` argument is misleading, and `max_digits` would have been apt.

No worries. I also thought there was a bug back then when I was rewriting the docstring for that method.
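As a rough illustration of the two implementations compared in the comment above, here is a minimal sketch using only the standard library. It is not Faker's source code: the function names `pick_length_first`, `pick_from_range`, and `length_shares` are hypothetical, and the sketch assumes that 0 counts as a one digit number, as the formulas above do.

```python
import random
from collections import Counter


def pick_length_first(digits, rng):
    """Hypothetical implementation 1: pick a digit count, then a number of that length."""
    n = rng.randint(1, digits)            # each length 1..digits is equally likely
    low = 0 if n == 1 else 10 ** (n - 1)  # 0 is treated as a one digit number
    return rng.randint(low, 10 ** n - 1)


def pick_from_range(digits, rng):
    """Hypothetical implementation 2: pick uniformly from 0 .. 10**digits - 1."""
    return rng.randint(0, 10 ** digits - 1)


def length_shares(func, digits, samples=100_000, seed=0):
    """Return the observed share of results for each digit length."""
    rng = random.Random(seed)
    counts = Counter(len(str(func(digits, rng))) for _ in range(samples))
    return {n: counts[n] / samples for n in range(1, digits + 1)}


shares_1 = length_shares(pick_length_first, 3)
shares_2 = length_shares(pick_from_range, 3)
print("length first:", shares_1)
print("range first: ", shares_2)
```

With `digits=3`, the first sketch should give each length about a third of the results, while the second should give 3 digit numbers about 90% of them, which is the behaviour reported in this issue.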
This is to be expected, because there are 900 3 digit numbers, 90 2 digit numbers, and 10 single digit numbers. With the current implementation, the method will always have a 90% chance of returning an `N` digit number if `digits` has a value of `N`, `N > 1`, and `fix_len` is `False`.

Like I said, you can also increase the number of samples you generate. If you generate 20 samples, you will have around an 88% chance of generating at least one number that has fewer than `N` digits. 50 samples will bring this up to around 99.5%.
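The figures quoted above can be checked with a few lines of exact arithmetic. This is only a sketch: the helper names `count_with_digits` and `chance_of_shorter` are made up for illustration, and it assumes each draw is uniform over 0 .. 10**digits - 1 and independent of the others.

```python
from fractions import Fraction


def count_with_digits(n):
    """How many non-negative integers have exactly n digits (0 counts as one digit)."""
    return 9 * 10 ** (n - 1) + (1 if n == 1 else 0)


def chance_of_shorter(digits, samples):
    """Chance that at least one of `samples` draws has fewer than `digits` digits."""
    p_full = Fraction(count_with_digits(digits), 10 ** digits)  # 9/10 when digits > 1
    return 1 - p_full ** samples


print(count_with_digits(1), count_with_digits(2), count_with_digits(3))
print(float(chance_of_shorter(3, 20)))
print(float(chance_of_shorter(3, 50)))
```

The last two values come out at roughly 0.878 and 0.995, and they are the same for `digits=30` as for `digits=3`, because the share of full length numbers is 9/10 either way.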