Backtranslation demo failed
🐛 Bug
I was following the backtranslation README and trying to translate "hello world" to German, but got errors. I tried both master and v0.10.1.
Here are the commands I used:
```bash
git clone https://github.com/pytorch/fairseq
cd fairseq
pip install --editable ./
pip install subword_nmt sacremoses
```
```python
import torch

# List available models
torch.hub.list('pytorch/fairseq')  # [..., 'transformer.wmt18.en-de', ...]

# Load the WMT'18 En-De ensemble
en2de_ensemble = torch.hub.load(
    'pytorch/fairseq',
    'transformer.wmt18.en-de',
    checkpoint_file='wmt18.model1.pt:wmt18.model2.pt:wmt18.model3.pt:wmt18.model4.pt:wmt18.model5.pt',
    tokenizer='moses',
    bpe='subword_nmt')
```
This produced the following errors:
```
Error when composing. Overrides: ['common.no_progress_bar=True', 'common.log_interval=100', "common.log_format='simple'", 'common.tensorboard_logdir=null', 'common.wandb_project=null', 'common.azureml_logging=False', 'common.seed=1', 'common.cpu=False', 'common.tpu=False', 'common.bf16=False', 'common.memory_efficient_bf16=False', 'common.fp16=True', 'common.memory_efficient_fp16=False', 'common.fp16_no_flatten_grads=False', 'common.fp16_init_scale=128', 'common.fp16_scale_window=null', 'common.fp16_scale_tolerance=0.0', 'common.min_loss_scale=0.0001', 'common.threshold_loss_scale=null', 'common.user_dir=null', 'common.empty_cache_freq=0', 'common.all_gather_list_size=16384', 'common.model_parallel_size=1', 'common.quantization_config_path=null', 'common.profile=False', 'common.reset_logging=False', 'common.suppress_crashes=False', 'common_eval.path=null', 'common_eval.post_process=null', 'common_eval.quiet=False', "common_eval.model_overrides='{}'", 'common_eval.results_path=null', 'distributed_training.distributed_world_size=128', 'distributed_training.distributed_rank=0', "distributed_training.distributed_backend='nccl'", "distributed_training.distributed_init_method='tcp://learnfair0250:12597'", 'distributed_training.distributed_port=12597', 'distributed_training.device_id=0', 'distributed_training.distributed_no_spawn=False', "distributed_training.ddp_backend='pytorch_ddp'", 'distributed_training.bucket_cap_mb=25', 'distributed_training.fix_batches_to_gpus=False', 'distributed_training.find_unused_parameters=False', 'distributed_training.fast_stat_sync=False', 'distributed_training.heartbeat_timeout=-1', 'distributed_training.broadcast_buffers=False', 'distributed_training.slowmo_momentum=null', "distributed_training.slowmo_algorithm='LocalSGD'", 'distributed_training.localsgd_frequency=3', 'distributed_training.nprocs_per_node=1', 'distributed_training.pipeline_model_parallel=False', 'distributed_training.pipeline_balance=null', 'distributed_training.pipeline_devices=null', 'distributed_training.pipeline_chunks=0', 'distributed_training.pipeline_encoder_balance=null', 'distributed_training.pipeline_encoder_devices=null', 'distributed_training.pipeline_decoder_balance=null', 'distributed_training.pipeline_decoder_devices=null', "distributed_training.pipeline_checkpoint='never'", "distributed_training.zero_sharding='none'", 'distributed_training.tpu=True', 'dataset.num_workers=1', 'dataset.skip_invalid_size_inputs_valid_test=False', 'dataset.max_tokens=3584', 'dataset.batch_size=null', 'dataset.required_batch_size_multiple=8', 'dataset.required_seq_len_multiple=1', 'dataset.dataset_impl=null', 'dataset.data_buffer_size=10', "dataset.train_subset='train'", "dataset.valid_subset='valid'", 'dataset.validate_interval=1', 'dataset.validate_interval_updates=0', 'dataset.validate_after_updates=0', 'dataset.fixed_validation_seed=null', 'dataset.disable_validation=False', "dataset.max_tokens_valid='${dataset.max_tokens}'", "dataset.batch_size_valid='${dataset.batch_size}'", 'dataset.curriculum=0', "dataset.gen_subset='test'", 'dataset.num_shards=1', 'dataset.shard_id=0', 'optimization.max_epoch=0', 'optimization.max_update=150000', 'optimization.stop_time_hours=0.0', 'optimization.clip_norm=0.0', 'optimization.sentence_avg=False', 'optimization.update_freq=[1]', 'optimization.lr=[0.0005]', 'optimization.stop_min_lr=1e-09', 'optimization.use_bmuf=False',
"checkpoint.save_dir='/checkpoint/edunov/20180526/wmt18en2de.wmt18.transformer_vaswani_wmt_en_de_big.bsz3584_lr0.0005_dr0.3_size225668919_sample_attdr0.1_upsample16'", "checkpoint.restore_file='checkpoint_last.pt'", 'checkpoint.finetune_from_model=null', 'checkpoint.reset_dataloader=False', 'checkpoint.reset_lr_scheduler=False', 'checkpoint.reset_meters=False', 'checkpoint.reset_optimizer=False', "checkpoint.optimizer_overrides='{}'", 'checkpoint.save_interval=1', 'checkpoint.save_interval_updates=null', 'checkpoint.keep_interval_updates=0', 'checkpoint.keep_last_epochs=-1', 'checkpoint.keep_best_checkpoints=-1', 'checkpoint.no_save=False', 'checkpoint.no_epoch_checkpoints=False', 'checkpoint.no_last_checkpoints=False', 'checkpoint.no_save_optimizer_state=False', "checkpoint.best_checkpoint_metric='loss'", 'checkpoint.maximize_best_checkpoint_metric=False', 'checkpoint.patience=-1', "checkpoint.checkpoint_suffix=''", 'checkpoint.checkpoint_shard_count=1', 'checkpoint.load_checkpoint_on_all_dp_ranks=False', "checkpoint.model_parallel_size='${common.model_parallel_size}'", 'checkpoint.distributed_rank=0', 'bmuf.block_lr=1.0', 'bmuf.block_momentum=0.875', 'bmuf.global_sync_iter=50', 'bmuf.warmup_iterations=500', 'bmuf.use_nbm=False', 'bmuf.average_sync=False', 'bmuf.distributed_world_size=128', 'generation.beam=5', 'generation.nbest=1', 'generation.max_len_a=0.0', 'generation.max_len_b=200', 'generation.min_len=1', 'generation.match_source_len=False', 'generation.unnormalized=False', 'generation.no_early_stop=False', 'generation.no_beamable_mm=False', 'generation.lenpen=1.0', 'generation.unkpen=0.0', 'generation.replace_unk=null', 'generation.sacrebleu=False', 'generation.score_reference=False', 'generation.prefix_size=0', 'generation.no_repeat_ngram_size=0', 'generation.sampling=False', 'generation.sampling_topk=-1', 'generation.sampling_topp=-1.0', 'generation.constraints=null', 'generation.temperature=1.0', 'generation.diverse_beam_groups=-1', 'generation.diverse_beam_strength=0.5', 'generation.diversity_rate=-1.0', 'generation.print_alignment=null', 'generation.print_step=False', 'generation.lm_path=null', 'generation.lm_weight=0.0', 'generation.iter_decode_eos_penalty=0.0', 'generation.iter_decode_max_iter=10', 'generation.iter_decode_force_max_iter=False', 'generation.iter_decode_with_beam=1', 'generation.iter_decode_with_external_reranker=False', 'generation.retain_iter_history=False', 'generation.retain_dropout=False', 'generation.retain_dropout_modules=null', 'generation.decoding_format=null', 'generation.no_seed_provided=False', 'eval_lm.output_word_probs=False', 'eval_lm.output_word_stats=False', 'eval_lm.context_window=0', 'eval_lm.softmax_batch=9223372036854775807', 'interactive.buffer_size=0', "interactive.input='-'", 'task=translation', 'task._name=translation', "task.data='/home/jyt/.cache/torch/pytorch_fairseq/132307b1ed6b7c35ac7cc955bed54ec7fa87b612f07da9e5809fa49daedafb2d.09d4a0e2212ce4f65bf3baef404268c8ae0f4605f6f47246d489b600e0ed1b25'", "task.source_lang='en'", "task.target_lang='de'", 'task.load_alignments=False', 'task.left_pad_source=True', 'task.left_pad_target=False', 'task.max_source_positions=1024', 'task.max_target_positions=1024', 'task.upsample_primary=16', 'task.truncate_source=False', 'task.num_batch_buckets=0', "task.train_subset='train'", "task.dataset_impl='${dataset.dataset_impl}'", "task.required_seq_len_multiple='${dataset.required_seq_len_multiple}'", 'task.eval_bleu=False', "task.eval_bleu_args='{}'", "task.eval_bleu_detok='space'",
"task.eval_bleu_detok_args='{}'", 'task.eval_tokenized_bleu=False', 'task.eval_bleu_remove_bpe=null', 'task.eval_bleu_print_samples=False', 'criterion=label_smoothed_cross_entropy', 'criterion._name=label_smoothed_cross_entropy', 'criterion.label_smoothing=0.1', 'criterion.report_accuracy=False', 'criterion.ignore_prefix_size=0', 'criterion.sentence_avg=False', 'tokenizer=moses', 'tokenizer._name=moses', "tokenizer.source_lang='en'", "tokenizer.target_lang='de'", 'tokenizer.moses_no_dash_splits=False', 'tokenizer.moses_no_escape=False', 'bpe=subword_nmt', 'bpe._name=subword_nmt', "bpe.bpe_codes='/home/jyt/.cache/torch/pytorch_fairseq/132307b1ed6b7c35ac7cc955bed54ec7fa87b612f07da9e5809fa49daedafb2d.09d4a0e2212ce4f65bf3baef404268c8ae0f4605f6f47246d489b600e0ed1b25/code'", "bpe.bpe_separator='@@'", 'optimizer=adam', 'optimizer._name=adam', "optimizer.adam_betas='(0.9, 0.98)'", 'optimizer.adam_eps=1e-08', 'optimizer.weight_decay=0.0', 'optimizer.use_old_adam=False', 'optimizer.tpu=True', 'optimizer.lr=[0.0005]', 'lr_scheduler=inverse_sqrt', 'lr_scheduler._name=inverse_sqrt', 'lr_scheduler.warmup_updates=4000', 'lr_scheduler.warmup_init_lr=1e-07', 'lr_scheduler.lr=[0.0005]']
Traceback (most recent call last):
  File "/anaconda/envs/py37_default/lib/python3.7/site-packages/hydra/_internal/config_loader_impl.py", line 513, in _apply_overrides_to_config
    OmegaConf.update(cfg, key, value, merge=True)
  File "/anaconda/envs/py37_default/lib/python3.7/site-packages/omegaconf/omegaconf.py", line 613, in update
    root.__setattr__(last_key, value)
  File "/anaconda/envs/py37_default/lib/python3.7/site-packages/omegaconf/dictconfig.py", line 285, in __setattr__
    raise e
  File "/anaconda/envs/py37_default/lib/python3.7/site-packages/omegaconf/dictconfig.py", line 282, in __setattr__
    self.__set_impl(key, value)
  File "/anaconda/envs/py37_default/lib/python3.7/site-packages/omegaconf/dictconfig.py", line 266, in __set_impl
    self._set_item_impl(key, value)
  File "/anaconda/envs/py37_default/lib/python3.7/site-packages/omegaconf/basecontainer.py", line 398, in _set_item_impl
    self._validate_set(key, value)
  File "/anaconda/envs/py37_default/lib/python3.7/site-packages/omegaconf/dictconfig.py", line 143, in _validate_set
    self._validate_set_merge_impl(key, value, is_assign=True)
  File "/anaconda/envs/py37_default/lib/python3.7/site-packages/omegaconf/dictconfig.py", line 159, in _validate_set_merge_impl
    cause=ValidationError("child '$FULL_KEY' is not Optional"),
  File "/anaconda/envs/py37_default/lib/python3.7/site-packages/omegaconf/base.py", line 101, in _format_and_raise
    type_override=type_override,
  File "/anaconda/envs/py37_default/lib/python3.7/site-packages/omegaconf/_utils.py", line 694, in format_and_raise
    _raise(ex, cause)
  File "/anaconda/envs/py37_default/lib/python3.7/site-packages/omegaconf/_utils.py", line 610, in _raise
    raise ex  # set end OC_CAUSE=1 for full backtrace
omegaconf.errors.ValidationError: child 'checkpoint.save_interval_updates' is not Optional
    full_key: checkpoint.save_interval_updates
    reference_type=CheckpointConfig
    object_type=CheckpointConfig

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
  File "/anaconda/envs/py37_default/lib/python3.7/site-packages/torch/hub.py", line 370, in load
    model = _load_local(repo_or_dir, model, *args, **kwargs)
  File "/anaconda/envs/py37_default/lib/python3.7/site-packages/torch/hub.py", line 399, in _load_local
    model = entry(*args, **kwargs)
  File "/home/jyt/.cache/torch/hub/pytorch_fairseq_master/fairseq/models/fairseq_model.py", line 262, in from_pretrained
    **kwargs,
  File "/home/jyt/.cache/torch/hub/pytorch_fairseq_master/fairseq/hub_utils.py", line 75, in from_pretrained
    arg_overrides=kwargs,
  File "/home/jyt/.cache/torch/hub/pytorch_fairseq_master/fairseq/checkpoint_utils.py", line 339, in load_model_ensemble_and_task
    state = load_checkpoint_to_cpu(filename, arg_overrides)
  File "/home/jyt/.cache/torch/hub/pytorch_fairseq_master/fairseq/checkpoint_utils.py", line 273, in load_checkpoint_to_cpu
    state = _upgrade_state_dict(state)
  File "/home/jyt/.cache/torch/hub/pytorch_fairseq_master/fairseq/checkpoint_utils.py", line 558, in _upgrade_state_dict
    state["cfg"] = convert_namespace_to_omegaconf(state["args"])
  File "/home/jyt/.cache/torch/hub/pytorch_fairseq_master/fairseq/dataclass/utils.py", line 353, in convert_namespace_to_omegaconf
    composed_cfg = compose("config", overrides=overrides, strict=False)
  File "/anaconda/envs/py37_default/lib/python3.7/site-packages/hydra/experimental/compose.py", line 37, in compose
    with_log_configuration=False,
  File "/anaconda/envs/py37_default/lib/python3.7/site-packages/hydra/_internal/hydra.py", line 512, in compose_config
    from_shell=from_shell,
  File "/anaconda/envs/py37_default/lib/python3.7/site-packages/hydra/_internal/config_loader_impl.py", line 156, in load_configuration
    from_shell=from_shell,
  File "/anaconda/envs/py37_default/lib/python3.7/site-packages/hydra/_internal/config_loader_impl.py", line 277, in _load_configuration
    ConfigLoaderImpl._apply_overrides_to_config(config_overrides, cfg)
  File "/anaconda/envs/py37_default/lib/python3.7/site-packages/hydra/_internal/config_loader_impl.py", line 522, in _apply_overrides_to_config
    ) from ex
hydra.errors.ConfigCompositionException: Error merging override checkpoint.save_interval_updates=null
```
I’m wondering if anyone could tell me what’s wrong with it. Thanks in advance!
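The failing override is `checkpoint.save_interval_updates=null`: OmegaConf refuses to assign None (`null`) to a structured-config field that is not typed `Optional`. Below is a minimal standalone sketch that reproduces the same `ValidationError`; `SimpleCheckpointConfig` is a hypothetical stand-in, not fairseq's real `CheckpointConfig` dataclass.

```python
from dataclasses import dataclass
from omegaconf import OmegaConf
from omegaconf.errors import ValidationError

@dataclass
class SimpleCheckpointConfig:
    # Hypothetical stand-in: the field is a plain int, not Optional[int],
    # so None is not an allowed value for it.
    save_interval_updates: int = 0

cfg = OmegaConf.structured(SimpleCheckpointConfig)
try:
    # This is the kind of assignment hydra performs when merging the
    # 'checkpoint.save_interval_updates=null' override.
    cfg.save_interval_updates = None
except ValidationError as err:
    print(err)  # child 'save_interval_updates' is not Optional
```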
Top GitHub Comments
Hi Matt, it’s working now. Thank you for your suggestions!
@novelidea or someone else: what did you do to get this working? I have the same issue with the same error while running the following (same as @novelidea, but with `git pull` added as @lematt1991 suggested). Running with fairseq v0.12.2 and Python 3.8.
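For anyone else landing here: torch.hub runs the copy of fairseq cached under `~/.cache/torch/hub/pytorch_fairseq_master` (visible in the traceback paths above), so a `git pull` in a separate local clone does not by itself update the code that `hub.load` executes. Passing `force_reload=True` to `torch.hub.load` re-fetches the repo; a sketch using the same arguments as the original report:

```python
import torch

# force_reload=True discards the cached copy under ~/.cache/torch/hub/
# and re-downloads the repo, so hub.load runs up-to-date fairseq code.
en2de_ensemble = torch.hub.load(
    'pytorch/fairseq',
    'transformer.wmt18.en-de',
    checkpoint_file='wmt18.model1.pt:wmt18.model2.pt:wmt18.model3.pt:'
                    'wmt18.model4.pt:wmt18.model5.pt',
    tokenizer='moses',
    bpe='subword_nmt',
    force_reload=True,
)
```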