Training BB2 on custom dataset

I want to train the BB2 model on a custom dataset with a bot persona. I have the dataset in question-answer dialog format.

I prepared a ParlAI-format dataset using the guidance given here.

I am able to generate the task from the JSON file by following the instructions. However, how can I add persona context? The format described in the guide contains only input-response pairs.
For example: {"id": "partner1", "text": "hello how are you today?"}, {"id": "partner2", "text": "i'm great thanks! what are you doing?"}

PS. I want to add persona information because each bot is supposed to generate a different response to the same question based on its persona context.

Thanks.


Top GitHub Comments

geo47 commented, Nov 8, 2022

if you have suitable training data, I would guess it’ll learn something. Only one way to find out!

I tried this approach. The model seems to learn from the context and topics. However, it fails to produce a suitable response for a random or negative input query.

For example,

I have scripted data with dialogs between a PC and a BOT.
For a given input, there is a specific response based on the individual topics.
Multiple topics might have similar inputs but different responses.

The model seems to learn and recognize the topic contexts. However, whatever the PC input is (positive, negative, or random), it always generates the scripted response.

Since the data contains no negative samples, the model seems to learn only the positive responses.

klshuster commented, Nov 3, 2022

If you add specialized keys beyond text, or beyond the ones that BB2 normally uses, they will be ignored by the model. Your best bet is to either prepend the topic to the given text, or override the agent’s observe function to specially process any keys you add.
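
To illustrate the "prepend" option, here is a minimal sketch in plain Python (no ParlAI import). It assumes the persona lines use the "your persona: " prefix seen in persona-based ParlAI tasks such as ConvAI2, and that each line of the JSON file wraps its turn pairs in a "dialog" key, as the ParlAI guide for JSON-file tasks appears to describe. The helper names (prepend_context, fold_extra_keys, build_episode) and the "persona" and "topic" keys are hypothetical, chosen only for this example.

```python
import json

# Assumed prefix; persona-based ParlAI tasks such as ConvAI2 mark persona lines this way.
PERSONA_PREFIX = "your persona: "


def prepend_context(text, persona=(), topic=None):
    """Return `text` with persona lines (and an optional topic) placed in front of it."""
    lines = [PERSONA_PREFIX + p for p in persona]
    if topic:
        lines.append(topic)
    lines.append(text)
    return "\n".join(lines)


def fold_extra_keys(message):
    """Copy a message dict, moving the custom keys into its `text` field.

    An overridden observe() could apply something like this before
    delegating to the parent class; that wiring is not shown here.
    """
    msg = dict(message)
    persona = msg.pop("persona", [])  # hypothetical custom key
    topic = msg.pop("topic", None)    # hypothetical custom key
    msg["text"] = prepend_context(msg["text"], persona, topic)
    return msg


def build_episode(turns, persona):
    """Turn (user_text, bot_text) pairs into one JSON line, persona on the first turn."""
    dialog = []
    for i, (user_text, bot_text) in enumerate(turns):
        if i == 0:
            user_text = prepend_context(user_text, persona)
        dialog.append([
            {"id": "partner1", "text": user_text},
            {"id": "partner2", "text": bot_text},
        ])
    return json.dumps({"dialog": dialog})


line = build_episode(
    [("hello how are you today?", "i'm great thanks! what are you doing?")],
    persona=["i am a pirate.", "i love the sea."],
)
```

Writing one such line per conversation keeps the persona inside the text field, which is the only place the model is said to read from, so two bots with different personas see different inputs for the same question.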