
Having problems pre-training GPT models

See original GitHub issue

Environment info

  • transformers version: 4.9.2
  • Platform: Linux-5.4.104+-x86_64-with-Ubuntu-18.04-bionic
  • Python version: 3.7.11
  • PyTorch version (GPU?): 1.9.0+cu102 (True)
  • Tensorflow version (GPU?): 2.6.0 (True)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?: <fill in>
  • Using distributed or parallel set-up in script?: <fill in>

Who can help

Information

Model I am using (Bert, XLNet …): EleutherAI/gpt-neo-2.7B

The problem arises when using:

  • [x] the official example scripts: (give details below)
  • [ ] my own modified scripts: (give details below)

The task I am working on is:

  • [ ] an official GLUE/SQuAD task: (give the name)
  • [x] my own task or dataset: (give details below)

To reproduce

I used a CSV file in which each line holds one training example.
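
The issue doesn't show the file's structure, so as an assumption, here is a minimal sketch of a CSV that run_clm.py's csv loader accepts: one example per row under a single "text" column (the script falls back to the first column when no "text" column exists).

# Hypothetical sketch of preparing df.csv: one training example per row.
import csv

samples = [
    "First training example as a single line of text.",
    "Second training example.",
]

with open("df.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["text"])  # header row; the loader reads row 1 as column names
    for sample in samples:
        writer.writerow([sample])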

Steps to reproduce the behavior:

My input: !python /content/transformers/examples/pytorch/language-modeling/run_clm.py --model_name_or_path EleutherAI/gpt-neo-2.7B --train_file /content/df.csv --output_dir /tmp/test-clm

I also tried the no-trainer version of the script, but it still doesn't work. What am I doing wrong?

What I got back:


Traceback (most recent call last):
  File "/content/transformers/examples/pytorch/language-modeling/run_clm.py", line 520, in <module>
    main()
  File "/content/transformers/examples/pytorch/language-modeling/run_clm.py", line 291, in main
    cache_dir=model_args.cache_dir,
  File "/usr/local/lib/python3.7/dist-packages/datasets/load.py", line 830, in load_dataset
    **config_kwargs,
  File "/usr/local/lib/python3.7/dist-packages/datasets/load.py", line 710, in load_dataset_builder
    **config_kwargs,
  File "/usr/local/lib/python3.7/dist-packages/datasets/builder.py", line 271, in __init__
    **config_kwargs,
  File "/usr/local/lib/python3.7/dist-packages/datasets/builder.py", line 370, in _create_builder_config
    builder_config = self.BUILDER_CONFIG_CLASS(**config_kwargs)
TypeError: __init__() got an unexpected keyword argument 'keep_linebreaks'
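
A plausible reading of the traceback, hedged since the thread never spells it out: load_dataset is being handed a keep_linebreaks keyword, but that option belongs to the plain-text loader, and the csv builder's config class does not accept it. A sketch of the guarded call, with the variable names assumed from run_clm.py's conventions rather than copied from it:

# Hedged sketch: only forward keep_linebreaks to the plain-text loader;
# the csv/json builder configs do not accept that keyword.
from datasets import load_dataset

data_files = {"train": "/content/df.csv"}
extension = "csv"  # derived from the train_file suffix, as run_clm.py does

dataset_args = {}
if extension == "text":
    dataset_args["keep_linebreaks"] = True  # assumption: default behavior wanted

raw_datasets = load_dataset(extension, data_files=data_files, **dataset_args)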

Expected behavior

I just want to continue pre-training the GPT model on my own data.
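
For illustration, a minimal sketch of the loading step run_clm.py performs before training; the model name comes from the issue, everything else is schematic:

# Hedged sketch: load GPT-Neo 2.7B and its tokenizer for continued
# causal-LM training. run_clm.py wraps this in a full Trainer pipeline.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-2.7B")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neo-2.7B")
model.train()  # switch to training mode; the Trainer handles the rest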

notebook: https://colab.research.google.com/drive/1bk8teH0Egu-gAmBC_zlvUifMHS7y_SyM?usp=sharing

Any help is much appreciated

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments:8 (1 by maintainers)

Top GitHub Comments

1 reaction
stefan-it commented, Aug 28, 2021

I’m looking into it right now 😃

0 reactions
mosh98 commented, Aug 28, 2021

Thank you @stefan-it, the script works now. I'm running out of CUDA memory, but I think that's unrelated to the script and more to do with my device.

Thanks Again!
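
For anyone who hits the same CUDA out-of-memory error with a 2.7B-parameter model, a hedged sketch of the usual mitigations. The flags below are standard run_clm.py arguments, but the values are illustrative, not tuned:

# Hedged sketch: rerun with a smaller per-device batch plus gradient
# accumulation, and cap the sequence length, to reduce peak GPU memory.
import subprocess

subprocess.run(
    [
        "python", "/content/transformers/examples/pytorch/language-modeling/run_clm.py",
        "--model_name_or_path", "EleutherAI/gpt-neo-2.7B",
        "--train_file", "/content/df.csv",
        "--output_dir", "/tmp/test-clm",
        "--per_device_train_batch_size", "1",
        "--gradient_accumulation_steps", "8",
        "--block_size", "512",
    ],
    check=True,
)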

Read more comments on GitHub >
