Option to make Faker return unique values

See original GitHub issue

I see random test failures because, for example, factory.Faker('company') returns duplicate values (usually after a few hundred calls, but sometimes as early as the second call). To remedy this, I wrote a subclass of Faker that keeps track of the values it has returned so it can ensure uniqueness. The code is fairly trivial:

from factory.faker import Faker


class UniqueFaker(Faker):
    """
    A Faker that keeps track of returned values so it can ensure uniqueness.
    """
    def __init__(self, *args, **kwargs):
        super(UniqueFaker, self).__init__(*args, **kwargs)
        self._values = {None}

    def generate(self, extra_kwargs):
        value = None
        while value in self._values:
            value = super(UniqueFaker, self).generate(extra_kwargs)
        self._values.add(value)
        return value

Is there any interest in either adding this subclass to factoryboy, or integrating the functionality into Faker itself?
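One caveat with the loop above is that it never terminates once the provider has run out of new values. The following is a minimal sketch of the same value-tracking idea with a retry budget, written against the standard library only so it does not depend on any factory_boy version. The names `UniqueValues`, `UniquenessExhausted`, and the three-name `pool` are hypothetical stand-ins, not part of factory_boy or Faker.

```python
import random


class UniquenessExhausted(RuntimeError):
    """Raised when no unseen value turns up within the retry budget."""


class UniqueValues:
    """Wrap a zero-argument generator function and never repeat a value."""

    def __init__(self, generate, max_attempts=1000):
        self._generate = generate
        self._max_attempts = max_attempts
        self._seen = set()

    def __call__(self):
        # Retry a bounded number of times instead of looping forever.
        for _ in range(self._max_attempts):
            value = self._generate()
            if value not in self._seen:
                self._seen.add(value)
                return value
        raise UniquenessExhausted(
            f"no new value after {self._max_attempts} attempts"
        )


# Hypothetical stand-in for a provider with a small pool of possible values.
rng = random.Random(42)
pool = ["Acme", "Globex", "Initech"]
next_company = UniqueValues(lambda: rng.choice(pool), max_attempts=200)

# Three calls use up the whole pool; a fourth call would raise.
companies = [next_company() for _ in range(len(pool))]
```

Raising an error on exhaustion makes the failure visible in the test run instead of hanging it.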

Issue Analytics

  • State: closed
  • Created 7 years ago
  • Reactions: 12
  • Comments: 11 (3 by maintainers)

Top GitHub Comments

5 reactions
gregbrowndev commented, Oct 11, 2019

Has anyone managed to work around this yet? I’ve tried using a sequence, but that doesn’t work:

name = factory.Sequence(lambda n: factory.Faker("company") + f" {n}")

>   name = factory.Sequence(lambda n: factory.Faker("company") + f" {n}")
E   TypeError: unsupported operand type(s) for +: 'Faker' and 'str'

@danihodovic you can use generate to imperatively create a value in your sequence:

name = factory.Sequence(lambda n: factory.Faker("company").generate() + f" {n}")

2 reactions
x-yuri commented, Feb 10, 2021

Oh, that (the OP's code) is almost what I did:

# app/tests/__init__.py
class UniqueFaker(factory.Faker):
    # based on factory.faker.Faker.generate
    def generate(self, params):
        locale = params.pop('locale')
        subfaker = self._get_faker(locale)
        return subfaker.unique.format(self.provider, **params)

class MyTestCase(TestCase):
    def tearDown(self):
        # Reset the uniqueness tracking of every registered locale's faker.
        for locale in factory.Faker._FAKER_REGISTRY:
            factory.Faker._get_faker(locale=locale).unique.clear()

# app/tests/factories.py
class ProductFactory(factory.django.DjangoModelFactory):
    ...
    size = t.UniqueFaker('size')  # S, M, L, ...

The idea is to override the method that calls the faker’s format method, copy its body, and route the call through the faker’s unique proxy.
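To illustrate why the tearDown reset matters, here is a rough sketch of how a `unique`-style proxy can work in general. It is not Faker's actual implementation, and `UniqueProxy` and `SizeSource` are hypothetical names. The proxy remembers the values returned per method, and `clear()` forgets them so the next test starts with a full pool.

```python
import random


class UniqueProxy:
    """Forward method calls to `target`, retrying until the result is new."""

    def __init__(self, target, max_attempts=1000):
        self._target = target
        self._max_attempts = max_attempts
        self._seen = {}  # method name -> set of values already returned

    def __getattr__(self, name):
        # Only reached for names not defined on the proxy itself.
        method = getattr(self._target, name)
        seen = self._seen.setdefault(name, set())

        def unique_call(*args, **kwargs):
            for _ in range(self._max_attempts):
                value = method(*args, **kwargs)
                if value not in seen:
                    seen.add(value)
                    return value
            raise RuntimeError(f"no unused value left for {name!r}")

        return unique_call

    def clear(self):
        """Forget every value seen so far, e.g. between test cases."""
        self._seen.clear()


class SizeSource:
    """Hypothetical provider with only three possible values."""

    def __init__(self, seed):
        self._rng = random.Random(seed)

    def size(self):
        return self._rng.choice(["S", "M", "L"])


unique = UniqueProxy(SizeSource(seed=1))
first_test = {unique.size() for _ in range(3)}   # uses up S, M and L
unique.clear()                                   # the reset a tearDown would do
second_test = {unique.size() for _ in range(3)}  # works again after the reset
```

Without the `clear()` call, the fourth request for a size would raise, because a three-value pool cannot supply a fourth unique value.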

Wrapping faker in a subclass feels like it’d be non-performant and although fine for an individual project, probably not the right way we should solve this at the library level.

Non-performant? Meaning, slow? There seems to be nothing that suggests that. I’d say subclassing is the best workaround I could find. Not that it means factory-boy should follow that path.

For example, when I needed a ton of unique words, I had to hack the words provider by appending a random character to the end.

Personally, I think this might be a case where the best solution is to call faker directly through something like:

title = factory.Sequence(lambda n: fake.text(random.randint(5, 58))[:-1] + str(n))

Sounds like a different use case. I wonder if one’d want to reproduce test failures for such tests…

Other providers never hit that problem, so if we’re ignoring non-unique values, we might accidentally silently swallow an error somewhere.

AFAICS, nobody’s suggesting enforcing uniqueness globally. Usually that’s needed for database fields with a unique constraint.

name = factory.Sequence(lambda n: factory.Faker("company").generate() + f" {n}")

Doesn’t work since factory-boy==3.1.0, and in factory-boy==3.2.0 generate() was altogether removed (?).
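The sequence-suffix idea itself does not depend on generate(). As a minimal sketch outside factory_boy, assuming a counter that plays the role of `n` in factory.Sequence and a hypothetical two-name pool standing in for the 'company' provider, appending a fresh number after a space keeps every result distinct even when the random base repeats:

```python
import itertools
import random

_sequence = itertools.count()      # plays the role of `n` in factory.Sequence
_rng = random.Random(7)
_BASE_NAMES = ["Acme", "Globex"]   # deliberately tiny pool, so bases repeat


def company_name():
    """Return a random base name followed by a unique sequence number."""
    return f"{_rng.choice(_BASE_NAMES)} {next(_sequence)}"


names = [company_name() for _ in range(10)]
```

The base names still collide, but the last token of each string is a different number, so the full strings never do.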
