TypeError: descriptor 'union' of 'set' object needs an argument

I was trying to do record linking with csvlink, but I keep getting this error during the block training step:

Traceback (most recent call last):
  File "/Users/hhp21/Documents/sandbox/csvlink/venv/bin/csvlink", line 11, in <module>
    sys.exit(launch_new_instance())
  File "/Users/hhp21/Documents/sandbox/csvlink/venv/lib/python3.5/site-packages/csvdedupe/csvlink.py", line 208, in launch_new_instance
    d.main()
  File "/Users/hhp21/Documents/sandbox/csvlink/venv/lib/python3.5/site-packages/csvdedupe/csvlink.py", line 135, in main
    self.dedupe_training(deduper)
  File "/Users/hhp21/Documents/sandbox/csvlink/venv/lib/python3.5/site-packages/csvdedupe/csvhelpers.py", line 267, in dedupe_training
    deduper.train()
  File "/Users/hhp21/Documents/sandbox/csvlink/venv/lib/python3.5/site-packages/dedupe/api.py", line 655, in train
    self._trainBlocker(recall, index_predicates)
  File "/Users/hhp21/Documents/sandbox/csvlink/venv/lib/python3.5/site-packages/dedupe/api.py", line 666, in _trainBlocker
    recall)
  File "/Users/hhp21/Documents/sandbox/csvlink/venv/lib/python3.5/site-packages/dedupe/training.py", line 47, in learn
    final_predicates = searcher.search(dupe_cover)
  File "/Users/hhp21/Documents/sandbox/csvlink/venv/lib/python3.5/site-packages/dedupe/training.py", line 257, in search
    self.search(remaining, partial + (best_predicate,))
  File "/Users/hhp21/Documents/sandbox/csvlink/venv/lib/python3.5/site-packages/dedupe/training.py", line 257, in search
    self.search(remaining, partial + (best_predicate,))
  File "/Users/hhp21/Documents/sandbox/csvlink/venv/lib/python3.5/site-packages/dedupe/training.py", line 257, in search
    self.search(remaining, partial + (best_predicate,))
  File "/Users/hhp21/Documents/sandbox/csvlink/venv/lib/python3.5/site-packages/dedupe/training.py", line 257, in search
    self.search(remaining, partial + (best_predicate,))
  File "/Users/hhp21/Documents/sandbox/csvlink/venv/lib/python3.5/site-packages/dedupe/training.py", line 260, in search
    reachable = covered + len(set.union(*reduced.values()))
TypeError: descriptor 'union' of 'set' object needs an argument

My config file looks like this:

{
    "field_names": ["name","edu","exp"],
    "field_definitions": [{
        "field": "name",
        "type": "Name"
    }, {
        "field": "edu",
        "type": "Text"
    }, {
        "field": "exp",
        "type": "Text"
    }],
    "output_file": "output.csv",
    "skip_training": false,
    "training_file": "training.json",
    "inner_join": true,
    "sample_size": 1500,
    "recall_weight": 0.5
}

I have about 1300 rows in one file and 16000 rows in another. I have 10 positive and 10 negative training samples in training.json. I’m using dedupe 1.6.3, csvdedupe 0.1.15. Any ideas what it’s complaining about?

Update: I should add that I tried with dedupe 1.6.2 and it worked fine, so I suspect something is up with the new version.
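For readers wondering what the message itself means: the last frame of the traceback calls `set.union(*reduced.values())`, and unpacking an empty collection leaves the unbound `set.union` with no arguments at all, which Python rejects. The snippet below is a minimal sketch of that mechanism only. That `reduced` is empty at that point in dedupe is an inference from the traceback, and `safe_union` is a hypothetical helper for illustration, not dedupe's actual fix.

```python
def safe_union(sets):
    """Union any number of sets; an empty input yields an empty set.

    Hypothetical helper: calling .union on a fresh set() instance means
    the method always has a receiver, even when `sets` is empty.
    """
    return set().union(*sets)


# Assumption: stands in for the `reduced` mapping from the traceback
# when no candidates remain.
reduced = {}

error_message = None
try:
    # Same call shape as training.py line 260 in the traceback.
    set.union(*reduced.values())
except TypeError as exc:
    # The exact wording differs between Python versions.
    error_message = str(exc)

# The guarded form returns an empty set instead of raising.
reachable = len(safe_union(reduced.values()))
```

The same guard works unchanged when there are sets to combine, for example `safe_union([{1}, {2, 3}])`.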

Issue Analytics

  • State: closed
  • Created 7 years ago
  • Comments: 9 (2 by maintainers)

Top GitHub Comments

1 reaction
likealostcause commented, Jun 5, 2019

I was getting this error when executing dedupe.consoleLabel() with Dedupe 1.9.7, and it persisted even after rolling back to Dedupe 1.6.2. Turns out the issue was that I was only labeling a couple of examples each for positive/negative matches in the console, and once I labeled at least 10 examples in each category (positive and negative matches) my script proceeded without error. Not sure if that’s the issue for anyone else, but maybe it’ll help somebody

0 reactions
Erik-Schafer-PSI commented, Sep 10, 2019

Had this issue in the same setting as @likealostcause – same resolution: minimum 10 pos and 10 neg manual labels
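If you want to check label counts before training, something like the following may help. It is a rough sketch under two assumptions: that the training file is JSON with top-level `match` and `distinct` lists, and that 10 per category is a useful threshold. The threshold comes from the comments above and is not a documented dedupe limit; the original poster hit the error on dedupe 1.6.3 even with 10 of each. All function names here are hypothetical.

```python
import json

# Threshold reported by commenters in this thread; an assumption,
# not a limit documented by dedupe.
MIN_LABELS = 10


def count_labels(training):
    """Return (number of match pairs, number of distinct pairs)."""
    return len(training.get("match", [])), len(training.get("distinct", []))


def has_enough_labels(training, minimum=MIN_LABELS):
    """True if both categories have at least `minimum` labeled pairs."""
    matches, distinct = count_labels(training)
    return matches >= minimum and distinct >= minimum


def load_training(path):
    """Load a training file; assumes plain JSON with the keys above."""
    with open(path) as f:
        return json.load(f)
```

For the config above, `has_enough_labels(load_training("training.json"))` would report whether both categories reach the threshold. A `True` result does not guarantee that training succeeds.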
