
Generate text with `model.generate` on TPU does not work


Environment info

  • transformers version: 4.7.0
  • Platform: Linux-5.4.0-1043-gcp-x86_64-with-glibc2.29 (Ubuntu 20.04.2 LTS)
  • Python version: 3.8.5
  • PyTorch version (GPU?): 1.8.1+cu102 (False)
  • PyTorch XLA version: 1.8.1
  • Using GPU in script?: No, using TPU
  • Using distributed or parallel set-up in script?: No, using a single TPU core

Who can help

@patrickvonplaten

Information

Model I am using (Bert, XLNet …): facebook/m2m100_1.2B, but other text-generating models have the same problem.

The problem arises when using:

  • the official example scripts: (give details below)
  • my own modified scripts: (give details below)

The task I am working on is:

  • an official GLUE/SQuAD task: (give the name)
  • my own task or dataset: (give details below)

To reproduce

On a machine with a TPU, run:

import torch_xla.core.xla_model as xm
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

model_name = 'facebook/m2m100_1.2B'
source_lang = 'en'
target_lang = 'de'

docs = [
    "This is some document to translate.",
    "And another document to translate."
]

# Acquire the XLA (TPU) device and move the model onto it
device = xm.xla_device()
model = M2M100ForConditionalGeneration.from_pretrained(model_name).to(device)

tokenizer = M2M100Tokenizer.from_pretrained(model_name, src_lang=source_lang)
encoded_docs = tokenizer(docs, return_tensors='pt', padding=True).to(device)

# This call hangs on TPU (it returns normally on CPU and GPU)
generated_tokens = model.generate(**encoded_docs, forced_bos_token_id=tokenizer.get_lang_id(target_lang))

The call to model.generate() never returns; it appears to hang somewhere in the beam search.
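My best guess is that XLA keeps recompiling the graph on every decoding step, since beam search produces tensors whose shapes change as decoding progresses. One way to check this (a diagnostic sketch using torch_xla's built-in metrics report, not something I have confirmed on this exact setup):

import torch_xla.debug.metrics as met

# Run generate() with a small max_length (or interrupt the stuck call),
# then dump the XLA metrics; a CompileTime count that keeps growing with
# each decoding step points at shape-driven recompilation.
print(met.metrics_report())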

The same code runs perfectly fine on CPUs and GPUs.
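As a stopgap, generation can be moved to the CPU while the rest of the pipeline stays on the TPU (a workaround sketch, assuming the model fits in host memory and CPU throughput is acceptable):

# Move the model (in place) to CPU and mirror the encoded inputs there
model_cpu = model.to('cpu')
encoded_cpu = {k: v.to('cpu') for k, v in encoded_docs.items()}

generated_tokens = model_cpu.generate(
    **encoded_cpu,
    forced_bos_token_id=tokenizer.get_lang_id(target_lang),
)
translations = tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)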

Expected behavior

I’d expect text generation on a TPU to work the same way it does on CPU and GPU.

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Reactions: 2
  • Comments: 13 (7 by maintainers)

Top GitHub Comments

gante commented on Aug 5, 2022 (2 reactions)

@mikcnt @divyanshuaggarwal The previous TF generate function was almost a (reduced) copy of the current PT generate function. We had to do a major rework of the TF generate function to make it compatible with XLA, so yeah… PT needs the same treatment if we want to use it with XLA 😄

I shared a Twitter thread about the subject today: https://twitter.com/joao_gante/status/1555527603716444160
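(For reference, the XLA-compatible TF path described above looks roughly like this; a minimal sketch based on the linked material, with t5-small as a placeholder model and illustrative lengths:)

import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained('t5-small')
model = TFAutoModelForSeq2SeqLM.from_pretrained('t5-small')

# Compile generate with XLA; padding to a fixed length avoids retracing
# on every new input shape.
xla_generate = tf.function(model.generate, jit_compile=True)

inputs = tokenizer(
    ["translate English to German: This is some document to translate."],
    return_tensors='tf', padding='max_length', max_length=64,
)
generated = xla_generate(**inputs, max_new_tokens=32)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))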

divyanshuaggarwal commented on Aug 4, 2022 (2 reactions)

Is there any update on this?

I had an exchange with @gante about it and it seems like the code will need major refactoring for this. https://huggingface.co/spaces/joaogante/tf_xla_generate_benchmarks/discussions/1#62eb9350985a691200cf2921
