Biased random ".NextString(...)"

https://github.com/dotnet/dotNext/blob/24b6f07def13e008a91c4f6455ba2b686d93cf77/src/DotNext/RandomExtensions.cs#L45

Using the remainder modulo the length will generate biased results when (uint)allowedChars.Length is not a power of two. The version based on Random might not be terribly important, but if someone is deliberately using the cryptographic RandomNumberGenerator version, then one should probably make every effort to be as accurate as possible.
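To make the bias concrete, here is a small sketch in Python (chosen only for brevity; `modulo_counts` is a hypothetical helper, not part of dotNext). It counts how often each index comes out when every possible generator value is reduced modulo the alphabet length. An 8-bit range keeps the counts readable; with 32-bit values the same effect exists, only smaller.

```python
from collections import Counter

def modulo_counts(range_size: int, n: int) -> Counter:
    """Count how many of the values 0..range_size-1 map to each index under `% n`."""
    return Counter(value % n for value in range(range_size))

# 256 possible byte values, alphabet of 10 characters: 256 = 25 * 10 + 6,
# so the first 6 indices are hit 26 times and the remaining 4 only 25 times.
print(modulo_counts(256, 10))

# With a power-of-two alphabet length the range divides evenly, so no bias.
print(modulo_counts(256, 16))
```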

I haven’t looked at what System.Random uses for its bounded Next implementations, but perhaps there is already an unbiased version there?

Here’s Daniel Lemire’s fast, unbiased version, albeit with an example implementation for 64-bit integers: “Nearly Divisionless Random Integer Generation On Various Systems”. The ACM paper linked in his blog post goes into more detail.
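As a rough illustration, the multiply-and-reject idea from that post can be adapted to 32-bit words along these lines. This is a Python sketch under assumptions, not the dotNext implementation: the names `bounded` and `random_string` are hypothetical, and `secrets.randbits` stands in for the cryptographic generator. The `bits` parameter exists only so the logic can be checked exhaustively with small words.

```python
import secrets

def bounded(s: int, next_word=None, bits: int = 32) -> int:
    """Return an unbiased integer in [0, s) built from uniform `bits`-bit words."""
    if not 0 < s <= (1 << bits):
        raise ValueError("s must be in 1..2**bits")
    if next_word is None:
        next_word = lambda: secrets.randbits(bits)  # assumed stand-in for the RNG
    mask = (1 << bits) - 1
    m = next_word() * s              # product occupies up to 2*bits bits
    low = m & mask                   # low half decides accept or reject
    if low < s:                      # slow path, rarely taken when s is small
        threshold = (1 << bits) % s  # the only division; equals (2**bits - s) % s
        while low < threshold:       # reject and redraw to remove the bias
            m = next_word() * s
            low = m & mask
    return m >> bits                 # high half is the result

def random_string(allowed_chars: str, length: int) -> str:
    """Pick `length` characters from `allowed_chars` using `bounded`."""
    return "".join(allowed_chars[bounded(len(allowed_chars))] for _ in range(length))
```

For an alphabet of 62 characters and 32-bit words, the slow path is entered with probability 62 / 2^32, so the division is almost never executed.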

Granted, the execution time will almost certainly be dominated by calls to RandomNumberGenerator, which would be aggravated by an unbiased algorithm needing to discard some of the output from rng.GetBytes(...). If that is a problem, one might want to handle the case where allowedChars.Length is less than 256, since that could be handled with 8-bit integers from the generator instead of the general case’s 32-bit integers.
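The small-alphabet idea could look roughly like the following Python sketch, which uses plain rejection sampling on single bytes. The helper names are hypothetical, and `secrets.token_bytes` is only a stand-in for `rng.GetBytes(...)`. Bytes at or above the largest multiple of the alphabet length are discarded, so every index is equally likely.

```python
import secrets

def byte_to_index(b: int, n: int):
    """Map a byte to an index in [0, n), or None if the byte must be discarded."""
    limit = 256 - (256 % n)          # largest multiple of n that fits in a byte
    return b % n if b < limit else None

def random_string_bytes(allowed_chars: str, length: int) -> str:
    """Build a string from 8-bit draws; valid for alphabets of 1..256 characters."""
    n = len(allowed_chars)
    if not 0 < n <= 256:
        raise ValueError("alphabet must have between 1 and 256 characters")
    out = []
    while len(out) < length:
        # Request only as many bytes as are still missing; rejected bytes
        # are made up for on the next pass of the outer loop.
        for b in secrets.token_bytes(length - len(out)):
            index = byte_to_index(b, n)
            if index is not None:
                out.append(allowed_chars[index])
    return "".join(out)
```

At most half of the bytes are ever rejected (the worst case is an alphabet of 129 characters), and for typical alphabets such as 62 characters only 8 of the 256 byte values are discarded.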

Issue Analytics

  • State: closed
  • Created 9 months ago
  • Comments: 20

Top GitHub Comments

sakno commented, Dec 12, 2022

You’ve been busy; it looks nice.

Only one question springs to mind: how would using an array of uint perform compared to the current byte array approach? That would remove the need for Unsafe.ReadUnaligned(). (It might matter more on ARM64.)

Done. After all optimizations including removal of nuint <-> uint roundtrip inside of the loop:

| Method          |        Mean |     Error |    StdDev |
|---------------- |------------:|----------:|----------:|
| RandomString    |    90.97 ns |  0.407 ns |  0.361 ns |
| GuidString      | 1,504.66 ns | 29.490 ns | 33.961 ns |
| CryptoRngString | 1,825.52 ns |  9.935 ns | 11.042 ns |

sakno commented, Dec 14, 2022

Did the inlined version make a noticeable impact on performance?

Yes, 30% performance boost.

BTW, have you seen Lemire’s SIMD JSON talk? Since then, they’ve added AVX-512 support.

.NET doesn’t have AVX-512 support yet, but it is planned for version 8. In any case, I see no way to use AVX to speed up random string generation. Lemire’s algorithm, currently implemented for the general case (when allowedInput.Length is not a power of 2), cannot be vectorized because of its loop.
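For contrast, the power-of-two case needs no rejection loop at all, because a bit mask already gives a uniform index. The Python sketch below shows that special case under assumptions: the names are hypothetical, it is not dotNext's code, and it handles only alphabets of up to 256 characters so that one byte per character suffices.

```python
import secrets

def is_power_of_two(n: int) -> bool:
    """True for 1, 2, 4, 8, ..."""
    return n > 0 and (n & (n - 1)) == 0

def masked_index(word: int, n: int) -> int:
    """Reduce a uniform word to [0, n) with a mask; n must be a power of two."""
    if not is_power_of_two(n):
        raise ValueError("n must be a power of two")
    return word & (n - 1)

def random_string_pow2(allowed_chars: str, length: int) -> str:
    """One random byte per character, no retries."""
    n = len(allowed_chars)
    if n > 256:
        raise ValueError("this sketch handles at most 256 characters")
    return "".join(allowed_chars[masked_index(b, n)]
                   for b in secrets.token_bytes(length))
```

Each output character depends only on its own input byte, with no data-dependent retry, which is the property the rejection loop in the general case lacks.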
