Biased random ".NextString(...)"
Using the remainder modulo the length will generate biased results when (uint)allowedChars.Length is not a power of two. The version based on Random might not be terribly important, but if someone is deliberately using the cryptographic RandomNumberGenerator version, then one should probably make every effort to be as accurate as possible.
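To make the bias concrete, here is a small sketch that counts how often each index comes up when every possible byte value is reduced with the remainder operator. The class and method names are hypothetical and not taken from the library. It uses 8-bit inputs only to keep the numbers small; the same effect applies to 32-bit inputs, just with a smaller relative skew.

```csharp
using System;

public static class ModuloBiasDemo
{
    // Counts how often each index is produced when every possible byte value
    // (0..255) is reduced with `% length`. A uniform mapping would give
    // identical counts for every index.
    public static int[] CountByteRemainders(int length)
    {
        if (length <= 0) throw new ArgumentOutOfRangeException(nameof(length));

        var counts = new int[length];
        for (int value = 0; value <= byte.MaxValue; value++)
        {
            counts[value % length]++;
        }
        return counts;
    }
}
```

For a 62-character alphabet, 256 = 4 * 62 + 8, so indices 0 through 7 are hit 5 times each and indices 8 through 61 only 4 times each. For a power-of-two length such as 64, every index is hit exactly 4 times.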
I haven’t looked at what System.Random uses for its bounded Next implementations, but perhaps there is already an unbiased version there?
Here’s Daniel Lemire’s fast, unbiased version, albeit with an example implementation for 64-bit integers: Nearly Divisionless Random Integer Generation On Various Systems. The ACM paper linked in his blog post goes into more detail.
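For reference, a 32-bit adaptation of that technique might look roughly like the sketch below. This is my own transcription of the idea, not the code from the blog post or from this library; UnbiasedIndex, NextUInt32 and Next are hypothetical names, and drawing one 32-bit value per GetBytes call is a simplification for readability.

```csharp
using System;
using System.Security.Cryptography;

public static class UnbiasedIndex
{
    // Hypothetical helper: draws one 32-bit value from the cryptographic generator.
    private static uint NextUInt32(RandomNumberGenerator rng)
    {
        Span<byte> buffer = stackalloc byte[sizeof(uint)];
        rng.GetBytes(buffer);
        return BitConverter.ToUInt32(buffer);
    }

    // Returns a uniformly distributed value in [0, range).
    public static uint Next(RandomNumberGenerator rng, uint range)
    {
        if (range == 0) throw new ArgumentOutOfRangeException(nameof(range));

        // Multiply-shift: the high 32 bits of the product are the candidate result,
        // the low 32 bits decide whether the candidate has to be rejected.
        ulong product = (ulong)NextUInt32(rng) * range;
        uint low = (uint)product;
        if (low < range)
        {
            // threshold = 2^32 mod range, computed in 32-bit arithmetic.
            // The division is only reached on this rarely taken path.
            uint threshold = unchecked(0u - range) % range;
            while (low < threshold)
            {
                product = (ulong)NextUInt32(rng) * range;
                low = (uint)product;
            }
        }
        return (uint)(product >> 32);
    }
}
```

When range is a power of two, the threshold is zero and nothing is ever rejected, so the loop only matters for the other lengths.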
Granted, the execution time will almost certainly be dominated by calls to RandomNumberGenerator, which would be aggravated by an unbiased algorithm needing to discard some of the output from rng.GetBytes(...). If that is a problem, one might want to handle the case where allowedChars.Length is less than 256, since that could be handled with 8-bit integers from the generator instead of the general case’s 32-bit integers.
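A rough sketch of what such a small-alphabet path could look like, using plain rejection sampling on bytes rather than Lemire's method. SmallAlphabetSampler and its NextString signature are hypothetical, and the 64-byte buffer size is an arbitrary choice; the point is only that one GetBytes call can serve many characters and that each discarded sample costs one byte instead of four.

```csharp
using System;
using System.Security.Cryptography;

public static class SmallAlphabetSampler
{
    // Builds a random string from an alphabet of at most 256 characters,
    // consuming one random byte per candidate character.
    public static string NextString(RandomNumberGenerator rng, ReadOnlySpan<char> allowedChars, int length)
    {
        if (allowedChars.IsEmpty || allowedChars.Length > 256)
            throw new ArgumentException("Alphabet must contain 1 to 256 characters.", nameof(allowedChars));
        if (length < 0)
            throw new ArgumentOutOfRangeException(nameof(length));

        // Largest multiple of the alphabet size that fits in a byte's range.
        // Bytes at or above this limit would favor the first few characters, so they are discarded.
        int limit = 256 - (256 % allowedChars.Length);

        var result = new char[length];
        Span<byte> buffer = stackalloc byte[64];
        int written = 0;
        while (written < length)
        {
            rng.GetBytes(buffer); // one generator call yields up to 64 candidates
            foreach (byte b in buffer)
            {
                if (b >= limit) continue; // rejected sample
                result[written++] = allowedChars[b % allowedChars.Length];
                if (written == length) break;
            }
        }
        return new string(result);
    }
}
```

With a 62-character alphabet the limit is 248, so 8 of the 256 byte values (about 3%) are discarded.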
Done. After all optimizations, including removal of the nuint <-> uint roundtrip inside of the loop:
Yes, 30% performance boost.
.NET doesn’t have AVX-512 support yet, but it is planned for version 8. Anyway, I see no way to use AVX to speed up random string generation. Lemire’s algorithm, currently implemented for the general case (when allowedInput.Length is not a power of 2), cannot be vectorized because of its loop.