Training BB2 on custom dataset

I want to train the BB2 model on a custom dataset with a bot persona. My dataset is in question-answer dialog format.

I prepared a ParlAI-format dataset using the guidance given here.

I am able to generate the task from the JSON file by following the instructions. However, how can I add persona context? The format described in the guide only contains input-response pairs.
For example: {"id": "partner1", "text": "hello how are you today?"}, {"id": "partner2", "text": "i'm great thanks! what are you doing?"}
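One way to carry persona context in this format is to fold it into the text of the first turn. The sketch below assumes the `your persona: ` line prefix used by ConvAI2-style data and the one-episode-per-line `{"dialog": [[...]]}` wrapper from the guide; the helper names and the sample persona are hypothetical.

```python
import json


def add_persona(episode, persona_lines, prefix="your persona: "):
    """Return a copy of `episode` with persona lines prepended to the first turn's text."""
    context = "\n".join(prefix + line for line in persona_lines)
    turns = [dict(turn) for turn in episode]  # copy so the input is not mutated
    if turns and context:
        turns[0]["text"] = context + "\n" + turns[0]["text"]
    return turns


def to_jsonl_line(episode):
    """Serialize one episode as a single JSON line (assumed wrapper: {"dialog": [[...]]})."""
    return json.dumps({"dialog": [episode]})


episode = [
    {"id": "partner1", "text": "hello how are you today?"},
    {"id": "partner2", "text": "i'm great thanks! what are you doing?"},
]
persona = ["i am a cheerful barista.", "i love jazz."]  # hypothetical persona

line = to_jsonl_line(add_persona(episode, persona))
print(line)
```

Because the persona lives inside `text`, the same question paired with a different persona becomes a different model input, which is what lets each bot answer differently.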

PS. I want to add persona information because each bot is supposed to generate a different response to the same question based on its persona context.

Thanks.

Issue Analytics

  • State: closed
  • Created: a year ago
  • Comments: 14 (6 by maintainers)

Top GitHub Comments

1 reaction
geo47 commented, Nov 8, 2022

“If you have suitable training data, I would guess it’ll learn something. Only one way to find out!”

I tried this approach. The model seems to be learning from the context and topics. However, it fails to handle random negative input queries.

For example,

  • I have script data with dialogs between PC and BOT.
  • For a given input, there is a specific response based on the individual topics.
  • Multiple topics might have similar inputs but different responses.

The model seems to learn and recognize the topic contexts. However, whatever the PC input is (positive, negative, or random), it always generates the scripted response.

Since the data has no negative samples, the model seems to learn only the positive responses.

1 reaction
klshuster commented, Nov 3, 2022

If you add specialized keys beyond text, or beyond the ones that BB2 normally uses, the model will ignore them. Your best bet is to either prepend the topic to the given text or override the agent’s observe function to specially process any keys you add.
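The second option might look roughly like the sketch below: a mixin whose `observe` folds a custom key into `text` before handing the observation to the parent class. The `topic` key name and the `EchoAgent` stand-in are hypothetical; in practice the mixin would be combined with the actual BB2 agent class, and whether that class accepts a plain `dict` here is an assumption to verify. The observation is copied first as a precaution, since ParlAI message objects can restrict overwriting existing keys.

```python
class TopicAwareMixin:
    """Fold a custom key into `text` before the wrapped agent observes it."""

    TOPIC_KEY = "topic"  # hypothetical custom key added to the dataset

    def observe(self, observation):
        obs = dict(observation)  # work on a copy, leave the caller's object alone
        topic = obs.pop(self.TOPIC_KEY, None)
        if topic and "text" in obs:
            obs["text"] = f"{topic}\n{obs['text']}"
        return super().observe(obs)


class EchoAgent:
    """Stand-in for the real agent class, used only to make this sketch runnable."""

    def observe(self, observation):
        self.observation = observation
        return observation


class TopicAwareEchoAgent(TopicAwareMixin, EchoAgent):
    pass


agent = TopicAwareEchoAgent()
seen = agent.observe({"topic": "weather", "text": "will it rain?"})
print(seen["text"])
```

Prepending the topic at data-preparation time is simpler and needs no custom agent; the override is mainly useful when the extra keys should stay separate in the dataset files.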
