
BB3 Response Time

See original GitHub issue

Hi, I’m trying to host a chat service with BlenderBot 3 (3B) across 5 NVIDIA GeForce RTX 2080 Ti GPUs. Here is my config file:

```yaml
tasks:
  default:
    onboard_world: MessengerBotChatOnboardWorld
    task_world: MessengerBotChatTaskWorld
    timeout: 1800
    agents_required: 1
task_name: chatbot
world_module: parlai.chat_service.tasks.chatbot.worlds
overworld: MessengerOverworld
max_workers: 1000
opt:
  debug: True
  models:
    blenderbot3_3B:
      model: projects.seeker.agents.seeker:ComboFidGoldDocumentAgent
      init_opt: gen/r2c2_bb3
      model_file: zoo:bb3/bb3_3B/model
      interactive_mode: True
      no_cuda: False
      override:
        init_opt: gen/r2c2_bb3
        search_server: https://www.google.com
        model_parallel: True
additional_args:
  page_id: 1 # Configure Your Own Page
```
I was wondering if there are any methods I could use to reduce the bot’s response time. Currently it takes the bot around 10–15 seconds to respond with my current setup, which seems slow, especially compared to the BB3 demo, which responds faster despite using a larger model. Let me know if there are any other details I can provide.
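For comparing setups, it helps to measure response latency with a simple wall-clock timer around the model call (a minimal sketch; `get_bot_reply` is a hypothetical stand-in for whatever entry point the chat service actually calls):

```python
import time

def get_bot_reply(message: str) -> str:
    # Hypothetical stand-in for the chat service's model call.
    # Replace with the real inference entry point.
    time.sleep(0.01)
    return "reply"

start = time.perf_counter()
reply = get_bot_reply("hello")
elapsed = time.perf_counter() - start
print(f"response took {elapsed:.2f} s")
```

Timing the bare model call in isolation, outside the Messenger world loop, also helps separate model inference time from chat-service overhead.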

Issue Analytics

  • State: closed
  • Created: a year ago
  • Comments: 9 (5 by maintainers)

Top GitHub Comments

1 reaction
mojtaba-komeili commented, Oct 31, 2022

3–4 seconds seems reasonable. And you are right that decoding takes most of the time during inference. There are some ongoing projects to improve decoding, so stay tuned for that. For serving multiple users, you may try a batching mechanism that keeps incoming requests in a queue and batches them together between inference passes. Have a look at the batching in this code for reference.
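The queue-and-batch idea described above can be sketched as follows (illustrative only; `model_respond` is a hypothetical stand-in for the real batched inference call, e.g. an agent acting on a batch of observations, and the batch size and wait window are made-up tuning knobs):

```python
import queue
import threading

MAX_BATCH = 8        # max requests sent to the model in one pass
BATCH_WAIT_S = 0.05  # how long to wait for more requests to accumulate

request_q: "queue.Queue[dict]" = queue.Queue()

def model_respond(texts):
    # Placeholder for a batched model call; replace with the real
    # inference entry point that accepts a list of inputs.
    return [f"echo: {t}" for t in texts]

def submit(text):
    """Called once per user request; blocks until the reply is ready."""
    slot = {"text": text, "reply": None, "done": threading.Event()}
    request_q.put(slot)
    slot["done"].wait()
    return slot["reply"]

def batch_worker():
    while True:
        batch = [request_q.get()]  # block until at least one request
        # Drain up to MAX_BATCH - 1 more requests within the wait window.
        try:
            while len(batch) < MAX_BATCH:
                batch.append(request_q.get(timeout=BATCH_WAIT_S))
        except queue.Empty:
            pass
        replies = model_respond([s["text"] for s in batch])
        for slot, reply in zip(batch, replies):
            slot["reply"] = reply
            slot["done"].set()

threading.Thread(target=batch_worker, daemon=True).start()
```

The worker amortizes one forward pass over several concurrent users, so per-user latency grows only slightly while throughput improves; the wait window trades a small added delay for larger batches.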

0 reactions
ryanshea10 commented, Nov 7, 2022

Great! Thanks for the help with this. I think I’ve managed to mostly resolve the issue so I will close it for now.
