How to evaluate a prompt-tuned model?
Hi, OFA team,
I trained a vqa-gen task with the prompt tuning method and evaluated the model directly via run_scripts/vqa/evaluate_vqa_beam.sh, but got the error below:
```
Traceback (most recent call last):
File "../../evaluate.py", line 160, in <module>
cli_main()
File "../../evaluate.py", line 154, in cli_main
distributed_utils.call_main(
File "/workspace/project/OFA/fairseq/fairseq/distributed/utils.py", line 376, in call_main
distributed_main(cfg.distributed_training.device_id, main, cfg, kwargs)
File "/workspace/project/OFA/fairseq/fairseq/distributed/utils.py", line 350, in distributed_main
main(cfg, **kwargs)
File "../../evaluate.py", line 138, in main
result, scores = eval_step(task, generator, models, sample, **kwargs)
File "/workspace/project/OFA/utils/eval_utils.py", line 306, in eval_step
return eval_vqa_gen(task, generator, models, sample, **kwargs)
File "/workspace/project/OFA/utils/eval_utils.py", line 47, in eval_vqa_gen
hypos = task.inference_step(generator, models, sample, prefix_tokens=sample['prefix_tokens'])
File "/workspace/project/OFA/fairseq/fairseq/tasks/fairseq_task.py", line 517, in inference_step
return generator.generate(
File "/opt/conda/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/workspace/project/OFA/models/sequence_generator.py", line 209, in generate
return self._generate(models, sample, **kwargs)
File "/workspace/project/OFA/models/sequence_generator.py", line 354, in _generate
lprobs, avg_attn_scores = model.forward_decoder(
File "/workspace/project/OFA/models/sequence_generator.py", line 824, in forward_decoder
decoder_out = model.decoder.forward(
File "/workspace/project/OFA/models/ofa/unify_transformer.py", line 1343, in forward
x, extra = self.extract_features(
File "/workspace/project/OFA/models/ofa/unify_transformer.py", line 1367, in extract_features
return self.extract_features_scriptable(
File "/workspace/project/OFA/models/ofa/unify_transformer.py", line 1532, in extract_features_scriptable
x, layer_attn, _ = layer(
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/workspace/project/OFA/models/ofa/unify_transformer_layer.py", line 500, in forward
x, attn = self.self_attn(
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/workspace/project/OFA/models/ofa/unify_multihead_attention.py", line 342, in forward
assert key_padding_mask.size(1) == k.size(1), "{} vs {}".format(
AssertionError: 101 vs 102
```
Could you please provide an example of how to evaluate prompt-tuned models? Thanks!
Top GitHub Comments
@yh351016 Thanks, I tried this and I can run evaluation now.
@flymark2010 If key_padding_mask is provided, padding elements specified in the key will be ignored by the attention. When given a binary mask, positions where the value is True are ignored by the attention layer; when given a byte mask, non-zero positions are ignored. You can try keeping the length of the key padding mask consistent with k by filling the extra positions with the original non-padding value.
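A minimal sketch of that suggestion (the helper name `pad_key_padding_mask`, the `prepend` flag, and the toy shapes below are my assumptions, not OFA's actual code): grow the mask to k's key length and mark the extra prompt positions as False, so the learned prompt tokens are attended to normally.

```python
import torch

def pad_key_padding_mask(key_padding_mask, k, prepend=True):
    """Hypothetical helper: grow key_padding_mask to match k's key length.

    Assumes the extra key positions come from learned prompt tokens that
    are never padding, so the new slots are False (attend normally).
    """
    if key_padding_mask is None:
        return None
    extra = k.size(1) - key_padding_mask.size(1)   # e.g. 102 - 101 = 1
    if extra <= 0:
        return key_padding_mask
    filler = key_padding_mask.new_zeros(key_padding_mask.size(0), extra)
    parts = [filler, key_padding_mask] if prepend else [key_padding_mask, filler]
    return torch.cat(parts, dim=1)

# Toy shapes mirroring the failing assertion (101 vs 102):
mask = torch.zeros(4, 101, dtype=torch.bool)  # (batch, key_len)
k = torch.randn(4, 102, 64)                   # keys grown by one prompt slot
mask = pad_key_padding_mask(mask, k)
assert mask.size(1) == k.size(1)              # 102 vs 102: assertion passes
```

Whether the filler belongs before or after the original mask depends on where the prompt tokens are concatenated to the key sequence in your checkpoint, so check that against your training setup.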