Commands for Seeker Training on Dialog GPT2 or T5?

Hey, I was wondering what the set of commands is to get ParlAI to train T5 or GPT2 with the SeeKeR parameters, where the search query is the copy task. There seem to be a lot of options; I was hoping there would be a way to combine the Hugging Face parameters and the SeeKeR training parameters. This is what I tried:

parlai train_model -m hugging_face/dialogpt --add-special-tokens True --delimiter '\n' --add-start-token True --gpt2-size medium -t projects.seeker.tasks.knowledge,projects.seeker.tasks.dialogue,projects.seeker.tasks.search_query -bs 2 -mf microsoft/DialoGPT-medium

I'm just wondering how it knows what to do in the search task, and how the number of documents it looks up is determined.

Issue Analytics

  • State: closed
  • Created: a year ago
  • Comments: 10 (4 by maintainers)

Top GitHub Comments

1 reaction
klshuster commented, Nov 3, 2022

sorry, the fix here is to edit projects/seeker/tasks/__init__.py to have the following line:

import projects.bb3.tasks.mutators
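
For concreteness, a minimal sketch (assumed, not taken from the repo) of what the edited projects/seeker/tasks/__init__.py could look like; the idea is that the mutator classes in projects.bb3.tasks.mutators presumably register themselves with ParlAI's mutator registry as a side effect of being imported, so the seeker teachers can find them by name:

# projects/seeker/tasks/__init__.py  (sketch -- any existing contents stay as they are)
# The import below is assumed to register the BB3 mutators as a side effect,
# so the seeker task teachers can resolve them by name.
import projects.bb3.tasks.mutators  # noqa: F401

A quick sanity check after the edit is to run parlai display_data -t projects.seeker.tasks.knowledge and confirm the task now loads.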
1 reaction
klshuster commented, Aug 16, 2022

“write a script”

Not necessary! To clarify: projects.seeker.tasks.dialogue is actually a multi-task wrapper over several of the dialogue tasks. Because you want to multi-task over both the dialogue and knowledge tasks, you can add the special syntax :mutators=my_mutator to a task to apply mutators (a way to “mutate” the data) to only that task. Since the dialogue and knowledge tasks require different mutators, you will need to specify each task manually; take a look at the respective DefaultTeachers in the projects/seeker/tasks/* files to see which teachers are being multi-tasked.
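
To make that syntax concrete, here is a rough Python sketch using ParlAI's scripted entry point (TrainModel.main). The mutator names my_knowledge_mutator and my_dialogue_mutator are placeholders rather than real mutators, and the wrapper task paths may need to be replaced by the individual teachers listed in projects/seeker/tasks/* as described above:

from parlai.scripts.train_model import TrainModel

# Sketch only: mutator names are placeholders; swap in the mutators actually used by
# the DefaultTeachers under projects/seeker/tasks/*, and break the wrapper tasks into
# individual teachers wherever different sub-tasks need different mutators.
TrainModel.main(
    model='hugging_face/dialogpt',
    add_special_tokens=True,
    gpt2_size='medium',
    # comma-separated multi-task spec; each task can carry its own ':mutators=' suffix
    task=(
        'projects.seeker.tasks.knowledge:mutators=my_knowledge_mutator,'
        'projects.seeker.tasks.dialogue:mutators=my_dialogue_mutator,'
        'projects.seeker.tasks.search_query'
    ),
    batchsize=2,
    model_file='/tmp/dialogpt_seeker/model',
)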

“decoder only”

This is just to separate these tasks from the standard tasks, which assume an encoder/decoder model; our BB3 3B model is a FiD-style model, which requires an encoder-decoder architecture, whereas in decoder-only models we treat retrieved documents as simply part of the context. The DecoderOnly variant just means that the documents are pre-loaded into the context.
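
As a rough, non-ParlAI illustration of that difference, the sketch below shows how retrieved documents would reach the model in each case: flattened into a single context string for a decoder-only model, versus encoded per document for a FiD-style encoder-decoder. Function names and the delimiter are assumptions for illustration only.

# Conceptual sketch only -- not ParlAI code.
from typing import List

def decoder_only_context(docs: List[str], dialogue_history: str, delimiter: str = '\n') -> str:
    # Decoder-only (e.g. GPT2/DialoGPT): documents are "pre-loaded" into the context,
    # i.e. flattened into one long string that the language model conditions on.
    return delimiter.join(docs + [dialogue_history])

def fid_encoder_inputs(docs: List[str], dialogue_history: str) -> List[str]:
    # FiD-style encoder-decoder (e.g. the BB3 3B model): each document is paired with
    # the history and encoded separately; the decoder attends over all encodings.
    return [f"{doc}\n{dialogue_history}" for doc in docs]

docs = ["retrieved document 1", "retrieved document 2"]
history = "user: hello\nbot: hi there"
print(decoder_only_context(docs, history))
print(fid_encoder_inputs(docs, history))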
