
Is it possible to use adaptive_loss with transformer?

See original GitHub issue

I would like to use adaptive_loss with the transformer architecture. Is this possible?

I tried the following command:

fairseq-train data/bin/ --save-dir nn_model --task translation --share-all-embeddings \
    --no-progress-bar --arch transformer --ddp-backend=no_c10d \
    --optimizer adam --adam-betas '(0.9, 0.98)' --update-freq=8 --max-tokens 1536 \
    --warmup-updates 1000 --criterion adaptive_loss --adaptive-softmax-cutoff 1000

but I got the following error

Namespace(activation_dropout=0.0, activation_fn='relu', adam_betas='(0.9, 0.98)', adam_eps=1e-08, adaptive_input=False, adaptive_softmax_cutoff='1000', adaptive_softmax_dropout=0, arch='transformer', attention_dropout=0.0, bucket_cap_mb=25, clip_norm=25, cpu=False, criterion='adaptive_loss', curriculum=0, data='data/bin/', dataset_impl='cached', ddp_backend='no_c10d', decoder_attention_heads=8, decoder_embed_dim=512, decoder_embed_path=None, decoder_ffn_embed_dim=2048, decoder_input_dim=512, decoder_layers=6, decoder_learned_pos=False, decoder_normalize_before=False, decoder_output_dim=512, device_id=0, disable_validation=False, distributed_backend='nccl', distributed_init_method=None, distributed_no_spawn=False, distributed_port=-1, distributed_rank=0, distributed_world_size=1, dropout=0.1, encoder_attention_heads=8, encoder_embed_dim=512, encoder_embed_path=None, encoder_ffn_embed_dim=2048, encoder_layers=6, encoder_learned_pos=False, encoder_normalize_before=False, find_unused_parameters=False, fix_batches_to_gpus=False, force_anneal=None, fp16=False, fp16_init_scale=128, fp16_scale_tolerance=0.0, fp16_scale_window=None, global_sync_iter=10, keep_interval_updates=-1, keep_last_epochs=-1, lazy_load=False, left_pad_source='True', left_pad_target='False', log_format=None, log_interval=1000, lr=[0.25], lr_scheduler='fixed', lr_shrink=0.1, max_epoch=0, max_sentences=None, max_sentences_valid=None, max_source_positions=1024, max_target_positions=1024, max_tokens=1536, max_update=0, memory_efficient_fp16=False, min_loss_scale=0.0001, min_lr=-1, no_epoch_checkpoints=False, no_progress_bar=True, no_save=False, no_token_positional_embeddings=False, num_workers=0, optimizer='adam', optimizer_overrides='{}', raw_text=False, required_batch_size_multiple=8, reset_dataloader=False, reset_lr_scheduler=False, reset_meters=False, reset_optimizer=False, restore_file='checkpoint_last.pt', save_dir='nn_model', save_interval=1, save_interval_updates=0, seed=1, sentence_avg=False, share_all_embeddings=True, share_decoder_input_output_embed=False, skip_invalid_size_inputs_valid_test=False, source_lang=None, target_lang=None, task='translation', tbmf_wrapper=False, tensorboard_logdir='', threshold_loss_scale=None, train_subset='train', update_freq=[8], upsample_primary=1, use_bmuf=False, user_dir=None, valid_subset='valid', validate_interval=1, warmup_updates=1000, weight_decay=0.0)
| [sl] dictionary: 30088 types
| [tl] dictionary: 30088 types
| data/bin/en__it_XX/ valid sl-tl 171248 examples
Traceback (most recent call last):
  File "/usr/local/bin/fairseq-train", line 11, in <module>
    sys.exit(cli_main())
  File "/usr/local/lib/python3.6/dist-packages/fairseq_cli/train.py", line 302, in cli_main
    main(args)
  File "/usr/local/lib/python3.6/dist-packages/fairseq_cli/train.py", line 49, in main
    model = task.build_model(args)
  File "/usr/local/lib/python3.6/dist-packages/fairseq/tasks/fairseq_task.py", line 169, in build_model
    return models.build_model(args, self)
  File "/usr/local/lib/python3.6/dist-packages/fairseq/models/__init__.py", line 50, in build_model
    return ARCH_MODEL_REGISTRY[args.arch].build_model(args, task)
  File "/usr/local/lib/python3.6/dist-packages/fairseq/models/transformer.py", line 165, in build_model
    decoder = cls.build_decoder(args, tgt_dict, decoder_embed_tokens)
  File "/usr/local/lib/python3.6/dist-packages/fairseq/models/transformer.py", line 174, in build_decoder
    return TransformerDecoder(args, tgt_dict, embed_tokens)
  File "/usr/local/lib/python3.6/dist-packages/fairseq/models/transformer.py", line 356, in __init__
    adaptive_inputs=embed_tokens if args.tie_adaptive_weights else None,
AttributeError: 'Namespace' object has no attribute 'tie_adaptive_weights'

What’s wrong?

What should the tie_adaptive_weights parameter be set to?
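
For context (this is not from the thread itself): the traceback shows that fairseq's TransformerDecoder reads args.tie_adaptive_weights as soon as --adaptive-softmax-cutoff is set, but the translation transformer architecture never registers that option, so the attribute is missing from the parsed Namespace. A minimal sketch of one possible workaround, assuming a fairseq version that supports --user-dir, is to register a small architecture variant that fills in the missing defaults. The file name, architecture name, and default values below are illustrative assumptions, not something confirmed in the issue.

# hypothetical my_adaptive_arch.py, placed in a directory passed via --user-dir
from fairseq.models import register_model_architecture
from fairseq.models.transformer import base_architecture


@register_model_architecture("transformer", "transformer_with_adaptive_softmax")
def transformer_with_adaptive_softmax(args):
    # Provide the attributes the adaptive-softmax code path in TransformerDecoder
    # reads but the translation transformer never defines. The defaults here are
    # assumptions; adjust them to your setup.
    args.tie_adaptive_weights = getattr(args, "tie_adaptive_weights", False)
    args.tie_adaptive_proj = getattr(args, "tie_adaptive_proj", False)
    args.adaptive_softmax_factor = getattr(args, "adaptive_softmax_factor", 4)
    base_architecture(args)

Training would then be launched with the original command plus --user-dir pointing at that directory and --arch transformer_with_adaptive_softmax. Note this only gets past the AttributeError; whether adaptive softmax is a sensible choice together with --share-all-embeddings is a separate question.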

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 8 (4 by maintainers)

Top GitHub Comments

1 reaction
nicolabertoldi commented, Sep 24, 2019

Thank you for this hint.

0 reactions
lematt1991 commented, Sep 24, 2019

I would try using the PyTorch profiler.
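
For illustration (again, not from the thread), here is a minimal sketch of profiling a single forward/backward pass with torch.autograd.profiler; the Linear module is just a stand-in for the actual fairseq model and training step:

import torch
from torch.autograd import profiler

# Stand-in module; in practice you would profile the real training step.
model = torch.nn.Linear(512, 512)
x = torch.randn(64, 512)

with profiler.profile() as prof:  # pass use_cuda=True when running on a GPU
    loss = model(x).sum()
    loss.backward()

# Print the operators that dominated the measured step.
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=15))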

Read more comments on GitHub >

Top Results From Across the Web

[D] Adaptive loss weight in VQGAN paper. : r/MachineLearning
The adaptive weight actually stays relatively small ( scale of 1e-5 ), removing it doesn't have much effects on the final results. You...
Read more >
When can we call a loss function "adaptive"?
I came across the phrase "adaptive loss function" in several research papers. ... It is the normal English meaning of "adaptive".
Read more >
Faster Depth-Adaptive Transformers
Depth-adaptive neural networks can dynamically adjust depths according to the hardness of input words, and thus improve efficiency.
Read more >
Text and Image Matching with Adaptive Loss for Cross-modal ...
Existing pre-trained models for vision-language often learn image-text semantic alignment using a multi-layer Transformer architecture, such as Bert [68], on ...
Read more >
FashionBERT: Text and Image Matching with Adaptive Loss ...
Pioneer approaches detect the region of interests (i.e., RoIs) from images and use the RoI embeddings as image representations.
Read more >
