Identical keywords in build_kwargs and config_kwargs lead to TypeError in load_dataset_builder()


Describe the bug

In load_dataset_builder(), builder_kwargs and config_kwargs can contain the same keywords. Both dictionaries are unpacked into the same builder call, so a shared key raises TypeError: type object got multiple values for keyword argument 'xyz'.
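The mechanism can be reproduced without the datasets library. The sketch below is only an illustration: Builder is a hypothetical stand-in class, and the two dictionaries are made-up values, not what load_dataset_builder() actually computes. The exact wording of the message varies between Python versions, so the example only prints it.

```python
class Builder:
    """Hypothetical stand-in for a dataset builder class (not the real API)."""

    def __init__(self, base_path=None, **config_kwargs):
        self.base_path = base_path
        self.config_kwargs = config_kwargs


# Assumed values: one dict filled in by the library, one by the caller.
builder_kwargs = {"base_path": "/path/chosen/by/the/library"}
config_kwargs = {"base_path": "./sample_data"}

message = None
try:
    # Unpacking two mappings that share a key into one call raises TypeError.
    Builder(**builder_kwargs, **config_kwargs)
except TypeError as err:
    message = str(err)
    print(message)
```

The error is raised by Python while the call's arguments are assembled, before Builder.__init__ runs, which is why it surfaces for any keyword that appears in both dictionaries.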

I ran into this problem with the keyword base_path. It might happen with other kwargs as well. I think a quick fix would be:

builder_cls = import_main_class(dataset_module.module_path)
builder_kwargs = dataset_module.builder_kwargs
data_files = builder_kwargs.pop("data_files", data_files)
config_name = builder_kwargs.pop("config_name", name)
hash = builder_kwargs.pop("hash")
base_path = builder_kwargs.pop("base_path")

and then pass base_path into builder_cls.

Steps to reproduce the bug

from datasets import load_dataset
load_dataset("rotten_tomatoes", base_path="./sample_data")

Expected results

The docs state: **config_kwargs — Keyword arguments to be passed to the BuilderConfig and used in the DatasetBuilder.

So I would expect to be able to pass the base_path into load_dataset().

Actual results

TypeError: type object got multiple values for keyword argument 'base_path'

Environment info

  • datasets version: 2.4.0
  • Platform: macOS-12.5-arm64-arm-64bit
  • Python version: 3.8.9
  • PyArrow version: 9.0.0

Issue Analytics

  • State: open
  • Created: a year ago
  • Reactions: 1
  • Comments: 7 (3 by maintainers)

Top GitHub Comments

1 reaction
lhoestq commented, Sep 13, 2022

Ok I see - maybe we should check the values of builder_kwargs and raise an error if any key in config_kwargs tries to overwrite them? The builder kwargs are determined from the builder's type and location (in some cases this forces the base_path, data_files and config name, for example).
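A minimal sketch of the kind of check suggested above, written as a standalone helper. The function name check_no_overlap and the choice of ValueError are assumptions for illustration; this is not code from the datasets library, and it treats any shared key as a conflict regardless of the values.

```python
def check_no_overlap(builder_kwargs, config_kwargs):
    """Raise a clear error if the caller's kwargs collide with builder kwargs.

    Hypothetical helper: name, signature, and error type are assumptions.
    """
    overlap = sorted(set(builder_kwargs) & set(config_kwargs))
    if overlap:
        raise ValueError(
            "These keyword arguments are set by the builder and cannot be "
            f"overridden through config_kwargs: {overlap}"
        )


# No shared keys: the check passes silently.
check_no_overlap({"base_path": "/resolved"}, {"name": "default"})

# A shared key is reported up front instead of failing inside the builder call.
try:
    check_no_overlap({"base_path": "/resolved"}, {"base_path": "./sample_data"})
    raised = False
except ValueError as err:
    raised = True
    error_text = str(err)
```

Running such a check before the builder is instantiated would replace the generic TypeError with a message that names the offending keys.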

1 reaction
thepurpleowl commented, Aug 29, 2022

I am getting a similar error, TypeError: type object got multiple values for keyword argument 'name', while following this tutorial. I get this error with the datasets-cli test command.

datasets version: 2.4.0


