__init__() got an unexpected keyword argument 'cache_dir'
See original GitHub issue.

I used the command:
!python /content/transformers/examples/language-modeling/run_language_modeling.py \
  --output_dir=/content/output \
  --model_type=gpt2 \
  --model_name_or_path=gpt2 \
  --do_train \
  --train_data_file=/content/input.txt \
  --do_eval \
  --eval_data_file=/content/dev.txt
and the following error occurs:
2020-09-08 06:02:43.113931: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
09/08/2020 06:02:45 - WARNING - __main__ - Process rank: -1, device: cuda:0, n_gpu: 1, distributed training: False, 16-bits training: False
09/08/2020 06:02:45 - INFO - __main__ - Training/evaluation parameters TrainingArguments(output_dir='/content/output', overwrite_output_dir=False, do_train=True, do_eval=True, do_predict=False, evaluate_during_training=False, prediction_loss_only=False, per_device_train_batch_size=8, per_device_eval_batch_size=8, per_gpu_train_batch_size=None, per_gpu_eval_batch_size=None, gradient_accumulation_steps=1, learning_rate=5e-05, weight_decay=0.0, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, max_grad_norm=1.0, num_train_epochs=3.0, max_steps=-1, warmup_steps=0, logging_dir='runs/Sep08_06-02-45_58d9f15c989e', logging_first_step=False, logging_steps=500, save_steps=500, save_total_limit=None, no_cuda=False, seed=42, fp16=False, fp16_opt_level='O1', local_rank=-1, tpu_num_cores=None, tpu_metrics_debug=False, debug=False, dataloader_drop_last=False, eval_steps=1000, past_index=-1, run_name=None, disable_tqdm=False, remove_unused_columns=True)
09/08/2020 06:02:45 - INFO - filelock - Lock 140608954702032 acquired on /root/.cache/torch/transformers/4be02c5697d91738003fb1685c9872f284166aa32e061576bbe6aaeb95649fcf.db13c9bc9c7bdd738ec89e069621d88e05dc670366092d809a9cbcac6798e24e.lock
Downloading: 100% 665/665 [00:00<00:00, 556kB/s]
09/08/2020 06:02:46 - INFO - filelock - Lock 140608954702032 released on /root/.cache/torch/transformers/4be02c5697d91738003fb1685c9872f284166aa32e061576bbe6aaeb95649fcf.db13c9bc9c7bdd738ec89e069621d88e05dc670366092d809a9cbcac6798e24e.lock
09/08/2020 06:02:46 - INFO - filelock - Lock 140608954701528 acquired on /root/.cache/torch/transformers/f2808208f9bec2320371a9f5f891c184ae0b674ef866b79c58177067d15732dd.1512018be4ba4e8726e41b9145129dc30651ea4fec86aa61f4b9f40bf94eac71.lock
Downloading: 100% 1.04M/1.04M [00:00<00:00, 2.47MB/s]
09/08/2020 06:02:47 - INFO - filelock - Lock 140608954701528 released on /root/.cache/torch/transformers/f2808208f9bec2320371a9f5f891c184ae0b674ef866b79c58177067d15732dd.1512018be4ba4e8726e41b9145129dc30651ea4fec86aa61f4b9f40bf94eac71.lock
09/08/2020 06:02:48 - INFO - filelock - Lock 140608954701640 acquired on /root/.cache/torch/transformers/d629f792e430b3c76a1291bb2766b0a047e36fae0588f9dbc1ae51decdff691b.70bec105b4158ed9a1747fea67a43f5dee97855c64d62b6ec3742f4cfdb5feda.lock
Downloading: 100% 456k/456k [00:00<00:00, 1.37MB/s]
09/08/2020 06:02:48 - INFO - filelock - Lock 140608954701640 released on /root/.cache/torch/transformers/d629f792e430b3c76a1291bb2766b0a047e36fae0588f9dbc1ae51decdff691b.70bec105b4158ed9a1747fea67a43f5dee97855c64d62b6ec3742f4cfdb5feda.lock
/usr/local/lib/python3.6/dist-packages/transformers/modeling_auto.py:821: FutureWarning: The class AutoModelWithLMHead is deprecated and will be removed in a future version. Please use AutoModelForCausalLM for causal language models, AutoModelForMaskedLM for masked language models and AutoModelForSeq2SeqLM for encoder-decoder models.
  FutureWarning,
09/08/2020 06:02:48 - INFO - filelock - Lock 140608954702312 acquired on /root/.cache/torch/transformers/d71fd633e58263bd5e91dd3bde9f658bafd81e11ece622be6a3c2e4d42d8fd89.778cf36f5c4e5d94c8cd9cefcf2a580c8643570eb327f0d4a1f007fab2acbdf1.lock
Downloading: 100% 548M/548M [00:16<00:00, 33.1MB/s]
09/08/2020 06:03:06 - INFO - filelock - Lock 140608954702312 released on /root/.cache/torch/transformers/d71fd633e58263bd5e91dd3bde9f658bafd81e11ece622be6a3c2e4d42d8fd89.778cf36f5c4e5d94c8cd9cefcf2a580c8643570eb327f0d4a1f007fab2acbdf1.lock
/usr/local/lib/python3.6/dist-packages/transformers/tokenization_utils_base.py:1321: FutureWarning: The max_len attribute has been deprecated and will be removed in a future version, use model_max_length instead.
  FutureWarning,
Traceback (most recent call last):
  File "/content/transformers/examples/language-modeling/run_language_modeling.py", line 313, in <module>
    main()
  File "/content/transformers/examples/language-modeling/run_language_modeling.py", line 242, in main
    get_dataset(data_args, tokenizer=tokenizer, cache_dir=model_args.cache_dir) if training_args.do_train else None
  File "/content/transformers/examples/language-modeling/run_language_modeling.py", line 143, in get_dataset
    cache_dir=cache_dir,
TypeError: __init__() got an unexpected keyword argument 'cache_dir'
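The failure pattern is plain Python: the example script forwards `cache_dir` to a dataset class whose `__init__` does not declare that parameter, so the call is rejected before any training starts. A minimal sketch of the same pattern (`OldTextDataset` is a hypothetical stand-in for an older `TextDataset`, not code from the script):

```python
# Minimal reproduction of the failure pattern: the installed library's class
# predates the `cache_dir` parameter, so the keyword is rejected at call time.
class OldTextDataset:
    def __init__(self, file_path, block_size=512):  # note: no cache_dir
        self.file_path = file_path
        self.block_size = block_size

try:
    OldTextDataset(file_path="input.txt", cache_dir="/tmp/cache")
    msg = ""
except TypeError as err:
    msg = str(err)  # "... got an unexpected keyword argument 'cache_dir'"

print(msg)
```

This is why the error points at a version mismatch: the script on `master` expects a `TextDataset` that already grew a `cache_dir` parameter, while the installed release still has the older signature.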
I'm working in the Google Colab environment.
Issue Analytics
- State:
- Created: 3 years ago
- Reactions: 1
- Comments: 8 (2 by maintainers)
You need to install from source to use the current examples (as stated in their README). In Colab you can do so by executing a cell with:
Alternatively, you can find the version of the example that works with 3.1.0 here.
I still get the error TypeError: __init__() got an unexpected keyword argument 'cache_dir' when running the latest version of transformers (3.1.0). I'm also running in the Colab environment. Command: !pip3 install transformers (also tried #! pip3 install git+git://github.com/huggingface/transformers/)
!wget https://raw.githubusercontent.com/huggingface/transformers/master/examples/language-modeling/run_language_modeling.py
%%bash
export TRAIN_FILE=train_path
export TEST_FILE=valid_path
export MODEL_NAME=gpt2
export OUTPUT_DIR=output
python run_language_modeling.py \
  --output_dir=output \
  --model_type=gpt2 \
  --model_name_or_path=gpt2 \
  --do_train \
  --train_data_file=$TRAIN_FILE \
  --do_eval \
  --eval_data_file=$TEST_FILE \
  --cache_dir=None
Output:
Traceback (most recent call last):
  File "run_language_modeling.py", line 313, in <module>
    main()
  File "run_language_modeling.py", line 242, in main
    get_dataset(data_args, tokenizer=tokenizer, cache_dir=model_args.cache_dir) if training_args.do_train else None
  File "run_language_modeling.py", line 143, in get_dataset
    cache_dir=cache_dir,
TypeError: __init__() got an unexpected keyword argument 'cache_dir'
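Until the installed release and the example script agree on the `TextDataset` signature, one defensive pattern is to forward `cache_dir` only when the target `__init__` actually declares it. This is a sketch of that idea, not what the script does; `LegacyTextDataset` and `build_dataset` are hypothetical names for illustration:

```python
import inspect

class LegacyTextDataset:
    """Hypothetical stand-in for a TextDataset whose __init__ lacks cache_dir."""
    def __init__(self, file_path, block_size=512):
        self.file_path = file_path
        self.block_size = block_size

def build_dataset(dataset_cls, file_path, cache_dir=None):
    # Forward cache_dir only when the target __init__ declares it, so an
    # older release of the class is never handed an unknown keyword.
    kwargs = {"file_path": file_path}
    params = inspect.signature(dataset_cls.__init__).parameters
    if cache_dir is not None and "cache_dir" in params:
        kwargs["cache_dir"] = cache_dir
    return dataset_cls(**kwargs)

ds = build_dataset(LegacyTextDataset, "input.txt", cache_dir="/tmp/cache")
print(ds.file_path)
```

With a newer class that does declare `cache_dir`, the same helper would pass the argument through unchanged, which is why keeping the script and library versions in sync (install from source, or use the 3.1.0 example) remains the real fix.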