
Unknown environment issue with BioBert

See original GitHub issue

I am using the nlu BioBERT mapper to improve upon an existing tool called text2term. A few weeks ago I was able to get the tool working on a personal computer (Mac), but shortly after, when I switched to my new work computer (also a Mac on the same OS, but with an Apple chip instead of Intel), the program no longer worked, even with the same source code and the same Python and Java versions.

A coworker reproduced the issue on an Apple-chip computer with Python 3.9.5 and Java 17. If you have any insights, please let me know.

Here are the requirements, along with the versions I tried and the resulting error (a minimal sketch of the failing call follows the list): Python 3.10.6 (also tried 3.9.13), Java version “1.8.0_341” (also tried Java 16). requirements.txt:

Owlready2==0.36
argparse==1.4.0
pandas==1.4.1
numpy==1.23.2
gensim==4.1.2
scipy==1.8.0
scikit-learn==1.0.2
setuptools==60.9.3
requests==2.27.1
tqdm==4.62.3
sparse_dot_topn==0.3.1
bioregistry==0.4.63
nltk==3.7
rapidfuzz==2.0.5
shortuuid==1.0.9
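
For reference, here is a minimal sketch of the failing call, distilled from the traceback below (nlu.load and .predict are the standard NLU entry points, and the model reference is the one text2term's BioBertMapper loads):

import nlu

# The model reference that text2term's BioBertMapper loads; on the
# affected Apple-chip machines this call fails with the Py4J errors below.
biobert = nlu.load('en.embed_sentence.biobert.pmc_base_cased')
print(biobert.predict('myocardial infarction'))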

Error:

ERROR:root:Exception while sending command.
Traceback (most recent call last):
  File "/Users/jason/.pyenv/versions/3.10.6/lib/python3.10/site-packages/py4j/clientserver.py", line 516, in send_command
    raise Py4JNetworkError("Answer from Java side is empty")
py4j.protocol.Py4JNetworkError: Answer from Java side is empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/jason/.pyenv/versions/3.10.6/lib/python3.10/site-packages/py4j/java_gateway.py", line 1038, in send_command
    response = connection.send_command(command)
  File "/Users/jason/.pyenv/versions/3.10.6/lib/python3.10/site-packages/py4j/clientserver.py", line 539, in send_command
    raise Py4JNetworkError(
py4j.protocol.Py4JNetworkError: Error while sending or receiving
[OK!]
Traceback (most recent call last):
  File "/Users/jason/.pyenv/versions/3.10.6/lib/python3.10/site-packages/nlu/pipe/component_resolution.py", line 276, in get_trained_component_for_nlp_model_ref
    component.get_pretrained_model(nlp_ref, lang, model_bucket),
  File "/Users/jason/.pyenv/versions/3.10.6/lib/python3.10/site-packages/nlu/components/embeddings/sentence_bert/BertSentenceEmbedding.py", line 13, in get_pretrained_model
    return BertSentenceEmbeddings.pretrained(name,language,bucket) \
  File "/Users/jason/.pyenv/versions/3.10.6/lib/python3.10/site-packages/sparknlp/annotator/embeddings/bert_sentence_embeddings.py", line 231, in pretrained
    return ResourceDownloader.downloadModel(BertSentenceEmbeddings, name, lang, remote_loc)
  File "/Users/jason/.pyenv/versions/3.10.6/lib/python3.10/site-packages/sparknlp/pretrained/resource_downloader.py", line 40, in downloadModel
    j_obj = _internal._DownloadModel(reader.name, name, language, remote_loc, j_dwn).apply()
  File "/Users/jason/.pyenv/versions/3.10.6/lib/python3.10/site-packages/sparknlp/internal/__init__.py", line 317, in __init__
    super(_DownloadModel, self).__init__("com.johnsnowlabs.nlp.pretrained." + validator + ".downloadModel", reader,
  File "/Users/jason/.pyenv/versions/3.10.6/lib/python3.10/site-packages/sparknlp/internal/extended_java_wrapper.py", line 26, in __init__
    self._java_obj = self.new_java_obj(java_obj, *args)
  File "/Users/jason/.pyenv/versions/3.10.6/lib/python3.10/site-packages/sparknlp/internal/extended_java_wrapper.py", line 36, in new_java_obj
    return self._new_java_obj(java_class, *args)
  File "/Users/jason/.pyenv/versions/3.10.6/lib/python3.10/site-packages/pyspark/ml/wrapper.py", line 86, in _new_java_obj
    return java_obj(*java_args)
  File "/Users/jason/.pyenv/versions/3.10.6/lib/python3.10/site-packages/py4j/java_gateway.py", line 1321, in __call__
    return_value = get_return_value(
  File "/Users/jason/.pyenv/versions/3.10.6/lib/python3.10/site-packages/pyspark/sql/utils.py", line 190, in deco
    return f(*a, **kw)
  File "/Users/jason/.pyenv/versions/3.10.6/lib/python3.10/site-packages/py4j/protocol.py", line 334, in get_return_value
    raise Py4JError(
py4j.protocol.Py4JError: An error occurred while calling z:com.johnsnowlabs.nlp.pretrained.PythonResourceDownloader.downloadModel

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/jason/.pyenv/versions/3.10.6/lib/python3.10/site-packages/nlu/__init__.py", line 234, in load
    nlu_component = nlu_ref_to_component(nlu_ref)
  File "/Users/jason/.pyenv/versions/3.10.6/lib/python3.10/site-packages/nlu/pipe/component_resolution.py", line 160, in nlu_ref_to_component
    resolved_component = get_trained_component_for_nlp_model_ref(lang, nlu_ref, nlp_ref, license_type, model_params)
  File "/Users/jason/.pyenv/versions/3.10.6/lib/python3.10/site-packages/nlu/pipe/component_resolution.py", line 287, in get_trained_component_for_nlp_model_ref
    raise ValueError(f'Failure making component, nlp_ref={nlp_ref}, nlu_ref={nlu_ref}, lang={lang}, \n err={e}')
ValueError: Failure making component, nlp_ref=sent_biobert_pmc_base_cased, nlu_ref=en.embed_sentence.biobert.pmc_base_cased, lang=en, 
 err=An error occurred while calling z:com.johnsnowlabs.nlp.pretrained.PythonResourceDownloader.downloadModel

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/jason/.pyenv/versions/3.10.6/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/Users/jason/.pyenv/versions/3.10.6/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/Users/jason/Documents/GitHub/ontology-mapper/text2term/__main__.py", line 48, in <module>
    Text2Term().map_file(arguments.source, arguments.target, output_file=arguments.output, csv_columns=csv_columns,
  File "/Users/jason/Documents/GitHub/ontology-mapper/text2term/t2t.py", line 63, in map_file
    return self.map(source_terms, target_ontology, source_terms_ids=source_terms_ids, base_iris=base_iris,
  File "/Users/jason/Documents/GitHub/ontology-mapper/text2term/t2t.py", line 115, in map
    self._do_biobert_mapping(source_terms, target_terms, biobert_file)
  File "/Users/jason/Documents/GitHub/ontology-mapper/text2term/t2t.py", line 161, in _do_biobert_mapping
    biobert = BioBertMapper(ontology_terms)
  File "/Users/jason/Documents/GitHub/ontology-mapper/text2term/biobert_mapper.py", line 28, in __init__
    self.biobert = self.load_biobert()
  File "/Users/jason/Documents/GitHub/ontology-mapper/text2term/biobert_mapper.py", line 34, in load_biobert
    biobert = nlu.load('en.embed_sentence.biobert.pmc_base_cased')
  File "/Users/jason/.pyenv/versions/3.10.6/lib/python3.10/site-packages/nlu/__init__.py", line 249, in load
    raise Exception(
Exception: Something went wrong during creating the Spark NLP model_anno_obj for your request =  en.embed_sentence.biobert.pmc_base_cased Did you use a NLU Spell?

Issue Analytics

  • State: closed
  • Created: a year ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

2 reactions
maziyarpanahi commented, Aug 26, 2022

Thanks @C-K-Loan

I am sure @DevinTDHa has more to say about this. However, it seems that the M1 is different from the M1 Pro and M1 Max. These are very new, proprietary chips with little known about them. We also don’t have any CI/CD support for them at the moment, so we cannot automate this and test everything thoroughly. We also had to build everything ourselves; there is no M1 support in TensorFlow for Java or the other dependencies we need.

Maybe we missed something, and maybe it is possible to build/release for M1 in a way that also supports the Pro and Max (I have no idea about the M2 family), but at the moment it seems only the base Apple M1 is supported by our build.
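
(For anyone hitting this: a quick way to check which chip variant you have and which architecture your Python process runs under, using standard macOS and Python tooling; this sketch is not from the thread itself.)

import platform
import subprocess

# Chip variant, e.g. "Apple M1" vs "Apple M1 Pro" / "Apple M1 Max".
brand = subprocess.check_output(
    ['sysctl', '-n', 'machdep.cpu.brand_string'], text=True).strip()
print('Chip:', brand)

# 'arm64' means native Apple silicon; 'x86_64' means the interpreter is
# running under Rosetta 2 emulation.
print('Python arch:', platform.machine())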

1 reaction
DevinTDHa commented, Aug 26, 2022

Hi @C-K-Loan, @paynejason

As @maziyarpanahi said, the current build of Spark NLP indeed only supports the base M1 architecture. The other chip variants seem to have a differing instruction set, hence the error.

We are hoping to broaden support once M1-based CI/CD is more generally available, but in the meantime we recommend running the pipelines remotely, for example on Google Colab or Databricks (if that’s possible for your use case). A minimal sketch of the Colab route follows.
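
(A minimal sketch of that workaround, assuming a Google Colab notebook; the install command and version pins here are assumptions, so check the current NLU docs for the recommended setup.)

# In a Colab cell, install NLU and PySpark first (assumed command;
# consult the NLU documentation for the recommended versions):
#   !pip install nlu pyspark
import nlu

# The same model reference that fails locally should load on Colab's
# x86_64 runtime.
pipe = nlu.load('en.embed_sentence.biobert.pmc_base_cased')
print(pipe.predict('aspirin reduces the risk of stroke'))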

Read more comments on GitHub >

