
Cannot load large XlmRoBertaForTokenClassification model in scala 2.12

See original GitHub issue

Steps to Reproduce

  1. XlmRoBertaForTokenClassification.loadSavedModel(<large model path>)

Stack Trace

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
	at com.johnsnowlabs.ml.tensorflow.io.ChunkBytes$.readFileInByteChunks(ChunkBytes.scala:44)
	at com.johnsnowlabs.ml.tensorflow.TensorflowWrapper$.read(TensorflowWrapper.scala:436)
	at com.johnsnowlabs.nlp.annotators.classifier.dl.ReadXlmRoBertaForTokenTensorflowModel.loadSavedModel(XlmRoBertaForTokenClassification.scala:311)
	at com.johnsnowlabs.nlp.annotators.classifier.dl.ReadXlmRoBertaForTokenTensorflowModel.loadSavedModel$(XlmRoBertaForTokenClassification.scala:292)
	at com.johnsnowlabs.nlp.annotators.classifier.dl.XlmRoBertaForTokenClassification$.loadSavedModel(XlmRoBertaForTokenClassification.scala:330)
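The top frame (`ChunkBytes$.readFileInByteChunks`) reads the saved model file into on-heap byte arrays, so the JVM heap must be able to hold the whole model (several GB for `xlm-roberta-large`) during loading. A minimal, hypothetical illustration of that pattern — not the actual Spark NLP implementation, just the standard-library idiom it resembles:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class ChunkRead {
    // Read a file into a list of fixed-size byte chunks, all kept on the heap.
    // If the file is larger than the available heap, this throws OutOfMemoryError,
    // which is what happens here when loading a large saved model.
    static List<byte[]> readFileInByteChunks(Path file, int chunkSize) throws IOException {
        List<byte[]> chunks = new ArrayList<>();
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[chunkSize];
            int n;
            while ((n = in.read(buf)) != -1) {
                byte[] chunk = new byte[n];            // copy the bytes actually read
                System.arraycopy(buf, 0, chunk, 0, n);
                chunks.add(chunk);                     // every chunk stays resident on the heap
            }
        }
        return chunks;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("model", ".bin");
        Files.write(tmp, new byte[10_000]); // small stand-in for a saved model file
        List<byte[]> chunks = readFileInByteChunks(tmp, 4096);
        long total = chunks.stream().mapToLong(c -> c.length).sum();
        System.out.println("totalBytes=" + total);
        Files.delete(tmp);
    }
}
```

Because the chunks are accumulated rather than streamed through, peak heap usage is at least the size of the file being read.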

Your Environment

  • Spark NLP version: 3.3.1
  • Apache Spark version: 3.0.1
  • Java version: 1.8.0
  • Setup and installation (Pypi, Conda, Maven, etc.): SBT + Scala
  • Operating System and version: Ubuntu + macOS

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

1 reaction
Pcosmin commented, Oct 27, 2021

You are right. I added -Xmx15g to the Java process and it's working. Thank you!
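For an SBT project, the flag has to reach the JVM that actually runs the application. A sketch of one way to wire this up in `build.sbt` (`fork` and `javaOptions` are standard sbt settings; the 15g value is the one that worked here):

```scala
// build.sbt — run the application in a forked JVM with a larger heap
fork := true
javaOptions ++= Seq("-Xmx15g")
```

Without `fork := true`, the code runs inside sbt's own JVM, in which case the heap would instead need to be raised via `SBT_OPTS` or a `.jvmopts` file.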

0 reactions
maziyarpanahi commented, Oct 27, 2021

Thanks. The error clearly indicates there is not enough memory to serialize the XLM-RoBERTa large model in Java. That seems strange, though: 15G should be enough for that model. There might be a Java heap setting in your classpath overriding it, or you may not actually have 15G of free memory.
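To rule out a stray heap setting, it can help to print what the JVM actually received; a small check using only the standard library (nothing Spark NLP-specific assumed):

```java
public class HeapCheck {
    public static void main(String[] args) {
        // maxMemory() reports the effective heap limit, i.e. what -Xmx resolved to
        long maxHeapBytes = Runtime.getRuntime().maxMemory();
        System.out.printf("Max heap: %.1f GiB%n", maxHeapBytes / (1024.0 * 1024 * 1024));
    }
}
```

If this prints far less than 15 GiB, some other setting (or a wrapper script) is overriding the `-Xmx` flag.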

I just tested xlm-roberta-large on Google Colab, which has only 12G of memory (2G-3G of which was already in use by previous operations), and it worked: https://colab.research.google.com/drive/1p5jFqxMuCnfcWFDGy_JS7yLJDeYGF00f?usp=sharing

Unfortunately, there is not much left to try other than freeing up more memory on that machine.
