Why does LanguageModelFeaturizer only download models that have tf_model.h5 but not pytorch_model.bin?
See original GitHub issue

Rasa Open Source version
2.2.5
Rasa SDK version
No response
Rasa X version
No response
Python version
3.7
What operating system are you using?
Other
What happened?
I was training a chatbot for the Spanish language and found some pretrained, fine-tuned models on Hugging Face that I wanted to test as featurizers. After trying several such models, I found that Rasa only downloads the weights if tf_model.h5 is present in the huggingface/models/… directory. If only pytorch_model.bin is present, it throws an error that no model was found at huggingface/models/given_name, and the error message says it could not find tf_model.h5 or pytorch_model.bin.
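For what it's worth, the behaviour can be reproduced outside Rasa with transformers alone. A minimal sketch (assuming transformers is installed and the repository still ships only PyTorch weights):

from transformers import TFAutoModel

# The TensorFlow loader looks for tf_model.h5; for a repo that only publishes
# pytorch_model.bin this raises the same OSError as in the log output below.
try:
    TFAutoModel.from_pretrained("Recognai/bert-base-spanish-wwm-cased-xnli")
except OSError as err:
    print(err)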
Command / Request
import time

import nest_asyncio
import rasa

nest_asyncio.apply()
print("event loop ready")
start_time = time.time()
domain = '/content/drive/MyDrive/spanish/small/domain_small.yml'
config = '/content/drive/MyDrive/spanish/small/config/config_bert-base-spanish-wwm-cased-xnli.yml'
training_files = '/content/drive/MyDrive/spanish/small/data/small_spanish.yml'
rasa.train(domain, config, training_files)
print(f"Time Taken to train the model : {time.time()-start_time}")
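The referenced config file is not included in the issue; its featurizer entry was presumably something along these lines. This is a sketch only, written as the equivalent Python component config rather than the actual YAML, and the exact values are assumptions based on the model identifier in the log output:

# Hypothetical LanguageModelFeaturizer entry from the config file referenced above
language_model_featurizer = {
    "name": "LanguageModelFeaturizer",
    "model_name": "bert",
    "model_weights": "Recognai/bert-base-spanish-wwm-cased-xnli",
}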
Relevant log output
404 Client Error: Not Found for url: https://huggingface.co/Recognai/bert-base-spanish-wwm-cased-xnli/resolve/main/tf_model.h5
---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/transformers/modeling_tf_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
   1247                 use_auth_token=use_auth_token,
-> 1248                 user_agent=user_agent,
   1249             )

20 frames
/usr/local/lib/python3.7/dist-packages/transformers/file_utils.py in cached_path(url_or_filename, cache_dir, force_download, proxies, resume_download, user_agent, extract_compressed_file, force_extract, use_auth_token, local_files_only)
   1336             use_auth_token=use_auth_token,
-> 1337             local_files_only=local_files_only,
   1338         )

/usr/local/lib/python3.7/dist-packages/transformers/file_utils.py in get_from_cache(url, cache_dir, force_download, proxies, etag_timeout, resume_download, user_agent, use_auth_token, local_files_only)
   1499     r = requests.head(url, headers=headers, allow_redirects=False, proxies=proxies, timeout=etag_timeout)
-> 1500     r.raise_for_status()
   1501     etag = r.headers.get("X-Linked-Etag") or r.headers.get("ETag")

/usr/local/lib/python3.7/dist-packages/requests/models.py in raise_for_status(self)
    942         if http_error_msg:
--> 943             raise HTTPError(http_error_msg, response=self)
    944

HTTPError: 404 Client Error: Not Found for url: https://huggingface.co/Recognai/bert-base-spanish-wwm-cased-xnli/resolve/main/tf_model.h5

During handling of the above exception, another exception occurred:

OSError                                   Traceback (most recent call last)
<ipython-input-13-ae58ff173a41> in <module>
      6 config='/content/drive/MyDrive/spanish/small/config/config_bert-base-spanish-wwm-cased-xnli.yml'
      7 training_files='/content/drive/MyDrive/spanish/small/data/small_spanish.yml'
----> 8 rasa.train(domain,config,training_files)
      9 print(f"Time Taken to train the model : {time.time()-start_time} ")

/usr/local/lib/python3.7/dist-packages/rasa/train.py in train(domain, config, training_files, output, dry_run, force_training, fixed_model_name, persist_nlu_training_data, core_additional_arguments, nlu_additional_arguments, loop, model_to_finetune, finetuning_epoch_fraction)
    107             finetuning_epoch_fraction=finetuning_epoch_fraction,
    108         ),
--> 109         loop,
    110     )
    111

/usr/local/lib/python3.7/dist-packages/rasa/utils/common.py in run_in_loop(f, loop)
    306         loop = asyncio.new_event_loop()
    307         asyncio.set_event_loop(loop)
--> 308     result = loop.run_until_complete(f)
    309
    310     # Let's also finish all running tasks:

/usr/local/lib/python3.7/dist-packages/nest_asyncio.py in run_until_complete(self, future)
     68                 raise RuntimeError(
     69                     'Event loop stopped before Future completed.')
---> 70             return f.result()
     71
     72     def _run_once(self):

/usr/lib/python3.7/asyncio/futures.py in result(self)
    179         self.__log_traceback = False
    180         if self._exception is not None:
--> 181             raise self._exception
    182         return self._result
    183

/usr/lib/python3.7/asyncio/tasks.py in __step(***failed resolving arguments***)
    247                 # We use the `send` method directly, because coroutines
    248                 # don't have `__iter__` and `__next__` methods.
--> 249                 result = coro.send(None)
    250             else:
    251                 result = coro.throw(exc)

/usr/local/lib/python3.7/dist-packages/rasa/train.py in train_async(domain, config, training_files, output, dry_run, force_training, fixed_model_name, persist_nlu_training_data, core_additional_arguments, nlu_additional_arguments, model_to_finetune, finetuning_epoch_fraction)
    172         nlu_additional_arguments=nlu_additional_arguments,
    173         model_to_finetune=model_to_finetune,
--> 174         finetuning_epoch_fraction=finetuning_epoch_fraction,
    175     )
    176

/usr/local/lib/python3.7/dist-packages/rasa/train.py in _train_async_internal(file_importer, train_path, output_path, dry_run, force_training, fixed_model_name, persist_nlu_training_data, core_additional_arguments, nlu_additional_arguments, model_to_finetune, finetuning_epoch_fraction)
    303             additional_arguments=nlu_additional_arguments,
    304             model_to_finetune=model_to_finetune,
--> 305             finetuning_epoch_fraction=finetuning_epoch_fraction,
    306         )
    307         return TrainingResult(model=trained_model)

/usr/local/lib/python3.7/dist-packages/rasa/train.py in _train_nlu_with_validated_data(file_importer, output, train_path, fixed_model_name, persist_nlu_training_data, additional_arguments, model_to_finetune, finetuning_epoch_fraction)
    816             persist_nlu_training_data=persist_nlu_training_data,
    817             model_to_finetune=model_to_finetune,
--> 818             **additional_arguments,
    819         )
    820         rasa.shared.utils.cli.print_color(

/usr/local/lib/python3.7/dist-packages/rasa/nlu/train.py in train(nlu_config, data, path, fixed_model_name, storage, component_builder, training_data_endpoint, persist_nlu_training_data, model_to_finetune, **kwargs)
     96     # trained in another subprocess
     97     trainer = Trainer(
---> 98         nlu_config, component_builder, model_to_finetune=model_to_finetune
     99     )
    100     persistor = create_persistor(storage)

/usr/local/lib/python3.7/dist-packages/rasa/nlu/model.py in __init__(self, cfg, component_builder, skip_validation, model_to_finetune)
    161             self.pipeline = model_to_finetune.pipeline
    162         else:
--> 163             self.pipeline = self._build_pipeline(cfg, component_builder)
    164
    165     def _build_pipeline(

/usr/local/lib/python3.7/dist-packages/rasa/nlu/model.py in _build_pipeline(self, cfg, component_builder)
    172         for index, pipeline_component in enumerate(cfg.pipeline):
    173             component_cfg = cfg.for_component(index)
--> 174             component = component_builder.create_component(component_cfg, cfg)
    175             components.validate_component_keys(component, pipeline_component)
    176             pipeline.append(component)

/usr/local/lib/python3.7/dist-packages/rasa/nlu/components.py in create_component(self, component_config, cfg)
    850             )
    851             if component is None:
--> 852                 component = registry.create_component_by_config(component_config, cfg)
    853                 self.__add_to_cache(component, cache_key)
    854         return component

/usr/local/lib/python3.7/dist-packages/rasa/nlu/registry.py in create_component_by_config(component_config, config)
    191     component_name = component_config.get("class", component_config["name"])
    192     component_class = get_component_class(component_name)
--> 193     return component_class.create(component_config, config)

/usr/local/lib/python3.7/dist-packages/rasa/nlu/featurizers/dense_featurizer/lm_featurizer.py in create(cls, component_config, config)
    103         else:
    104             hf_transformers_loaded = "HFTransformersNLP" in config.component_names
--> 105         return cls(component_config, hf_transformers_loaded=hf_transformers_loaded)
    106
    107     @classmethod

/usr/local/lib/python3.7/dist-packages/rasa/nlu/featurizers/dense_featurizer/lm_featurizer.py in __init__(self, component_config, skip_model_load, hf_transformers_loaded)
     86             return
     87         self._load_model_metadata()
---> 88         self._load_model_instance(skip_model_load)
     89
     90     @classmethod

/usr/local/lib/python3.7/dist-packages/rasa/nlu/featurizers/dense_featurizer/lm_featurizer.py in _load_model_instance(self, skip_model_load)
    197         )
    198         self.model = model_class_dict[self.model_name].from_pretrained(
--> 199             self.model_weights, cache_dir=self.cache_dir
    200         )
    201

/usr/local/lib/python3.7/dist-packages/transformers/modeling_tf_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
   1255                 f"- or '{pretrained_model_name_or_path}' is the correct path to a directory containing a file named one of {TF2_WEIGHTS_NAME}, {WEIGHTS_NAME}.\n\n"
   1256             )
-> 1257             raise EnvironmentError(msg)
   1258         if resolved_archive_file == archive_file:
   1259             logger.info(f"loading weights file {archive_file}")

OSError: Can't load weights for 'Recognai/bert-base-spanish-wwm-cased-xnli'. Make sure that:

- 'Recognai/bert-base-spanish-wwm-cased-xnli' is a correct model identifier listed on 'https://huggingface.co/models'

- or 'Recognai/bert-base-spanish-wwm-cased-xnli' is the correct path to a directory containing a file named one of tf_model.h5, pytorch_model.bin.
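The failing frame is LanguageModelFeaturizer._load_model_instance, which calls the TensorFlow from_pretrained() without a from_pt flag. Below is a hedged sketch of the kind of one-line change discussed in the comments further down (it is not an official Rasa patch; the names model_class_dict, self.model_weights and self.cache_dir are simply those visible in the traceback). Note that converting pytorch_model.bin also requires torch to be installed alongside TensorFlow.

# Hypothetical sketch of the workaround applied inside
# LanguageModelFeaturizer._load_model_instance:
self.model = model_class_dict[self.model_name].from_pretrained(
    self.model_weights,
    cache_dir=self.cache_dir,
    from_pt=True,  # let transformers convert pytorch_model.bin to TF weights
)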
Issue Analytics
- State:
- Created 2 years ago
- Comments: 7 (3 by maintainers)
lhr0909 commented:
@JEM-Mosig Actually, I have tested this by adding the from_pt flag and also including the PyTorch dependency, and it works fine. I have a working example repository that contains this change: https://github.com/lhr0909/rasa-v2-nlu-bert-chinese/commit/2a431adde81095f705858925b02ecf07a932fda4
For the PyTorch dependency, I only had to manually include a version of transformers with the torch extras: https://github.com/lhr0909/rasa-v2-nlu-bert-chinese/blob/main/pyproject.toml#L10

Maxime Verger commented:
💡 Heads up! We’re moving issues to Jira: https://rasa-open-source.atlassian.net/browse/OSS.
From now on, this Jira board is the place where you can browse (without an account) and create issues (you’ll need a free Jira account for that). This GitHub issue has already been migrated to Jira and will be closed on January 9th, 2023. Do not forget to subscribe to the corresponding Jira issue!
➡️ More information in the forum: https://forum.rasa.com/t/migration-of-rasa-oss-issues-to-jira/56569.