How exactly can I use a custom model with nboost?

Related to #45, #49 and #35: I am struggling to get nboost working with a custom model, and I am not sure where to start. What exactly should the input and output of the model be? Which function is called?

I tried to use a model trained as a regression model with the code from https://github.com/ThilinaRajapakse/simpletransformers#minimal-start-for-sentence-pair-classification, but no luck. I wasn't able to set the --model argument: it keeps telling me that PtBertRerankModelPlugin is not in MODULE_MAP. The model loads and nboost starts, but it raises an exception on every query:

  File "/net/scratch/people/plgklasocki/transformers-env/lib/python3.6/site-packages/flask/app.py", line 1813, in full_dispatch_request
    rv = self.dispatch_request()
  File "/net/scratch/people/plgklasocki/transformers-env/lib/python3.6/site-packages/flask/app.py", line 1799, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/net/scratch/people/plgklasocki/transformers-env/lib/python3.6/site-packages/nboost/proxy.py", line 123, in proxy_through
    plugin.on_response(response, db_row)
  File "/net/scratch/people/plgklasocki/transformers-env/lib/python3.6/site-packages/nboost/plugins/rerank/base.py", line 34, in on_response
    filter_results=response.request.filter_results
  File "/net/scratch/people/plgklasocki/transformers-env/lib/python3.6/site-packages/nboost/plugins/rerank/base.py", line 65, in rank
    score = logit[1]
IndexError: index 1 is out of bounds for axis 0 with size 1

Models from simpletransformers take input as in model.predict([[query, text]]). Is that OK? Should it use .forward, or a different input? What should the output be: a single value between 0 and 1, or a tensor (and with what dimensions)? Do you recommend a particular way to train such models (sentence-transformers, vanilla huggingface/transformers)?

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 1
  • Comments: 12 (7 by maintainers)

Top GitHub Comments

1 reaction
pertschuk commented, Jun 1, 2020

@klasocki the transformers tokenizer.encode function accepts two text arguments and automatically adds the SEP token (add_special_tokens=True or something like that). The first should be the query, the second the passage.
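To make the query/passage layout concrete, here is a minimal sketch of what a BERT-style pair encoding looks like. It does not call the transformers library: `build_pair_input` is a hypothetical helper, and whitespace splitting stands in for real WordPiece tokenization. With the real library, the equivalent call would be along the lines of `tokenizer.encode(query, passage, add_special_tokens=True)`.

```python
def build_pair_input(query_tokens, passage_tokens):
    """Lay out a query/passage pair the way BERT-style tokenizers do:
    [CLS] query [SEP] passage [SEP].

    Hypothetical helper for illustration only; real tokenizers return
    integer ids rather than token strings.
    """
    tokens = ["[CLS]", *query_tokens, "[SEP]", *passage_tokens, "[SEP]"]
    # Segment ids: 0 covers [CLS], the query and the first [SEP];
    # 1 covers the passage and the final [SEP].
    segment_ids = [0] * (len(query_tokens) + 2) + [1] * (len(passage_tokens) + 1)
    return tokens, segment_ids


# Whitespace splitting is an assumption standing in for WordPiece.
tokens, segment_ids = build_pair_input(
    "who wrote hamlet".split(),
    "hamlet is a tragedy by shakespeare".split(),
)
print(tokens)
print(segment_ids)
```

The point is only the ordering: query first, passage second, with the separators inserted by the tokenizer rather than by hand.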

@apohllo BertConfig can be anything as long as the output size is 2 (binary classification).
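This also explains the IndexError in the traceback above: the rerank code reads `logit[1]`, so a regression head that emits a single value per pair has no index 1. A minimal sketch of that check, in plain Python; `positive_logit` is a hypothetical helper and not part of nboost:

```python
def positive_logit(logits):
    """Return the score for the 'relevant' class from a two-class output.

    Assumes the layout [not_relevant, relevant], which matches the
    `logit[1]` access shown in the traceback. Hypothetical helper.
    """
    if len(logits) < 2:
        raise ValueError(
            f"expected 2 output logits (binary classification), got {len(logits)}"
        )
    return logits[1]


# A binary classification head gives two logits per (query, passage) pair.
print(positive_logit([-1.2, 2.5]))

# A regression head gives one value, so there is nothing at index 1.
try:
    positive_logit([0.73])
except ValueError as err:
    print(err)
```

So a model trained as a regressor would need to be retrained (or re-headed) as a two-class classifier before it can be plugged in this way.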

0 reactions
klasocki commented, Jul 15, 2020

No, I actually mean worse 😆 But that could just be my data. Truncating the docs to 512 tokens worked quite well, since for Wikipedia search the most important information is at the beginning anyway.

Yes, nboost works that way. The usual approach is that you ask Elasticsearch (ES) for e.g. 100 documents, then nboost re-ranks them (based solely on the model, not weighted with ES) and returns e.g. 10 to you.
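The flow described above (truncate, over-fetch, re-rank by model score only, return the top few) can be sketched roughly as follows. This is not nboost's code: `rerank` and `overlap_score` are hypothetical names, whitespace tokens stand in for model tokens, and the word-overlap scorer is a toy replacement for a real relevance model.

```python
def rerank(query, candidates, score_fn, top_k=10, max_tokens=512):
    """Re-rank candidates purely by score_fn and keep the best top_k.

    The search engine's original ordering is ignored, mirroring the
    'based solely on the model' behaviour described above.
    """
    scored = []
    for doc in candidates:
        # Truncate to the first max_tokens tokens before scoring
        # (whitespace tokens here; a real model would count WordPieces).
        truncated = " ".join(doc.split()[:max_tokens])
        scored.append((score_fn(query, truncated), doc))
    # Highest score first; sort is stable, so ties keep retrieval order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def overlap_score(query, doc):
    """Toy scorer: number of query words that also occur in the doc."""
    return len(set(query.lower().split()) & set(doc.lower().split()))


candidates = [
    "cats sleep a lot",
    "hamlet was written by shakespeare",
    "who wrote hamlet is a common question",
]
top = rerank("who wrote hamlet", candidates, overlap_score, top_k=2)
print(top)
```

In practice `candidates` would be the ~100 hits returned by the search backend and `score_fn` would wrap the model's relevance output for each (query, passage) pair.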
