Possible to reuse features?

I have a binary classification task that I want to tackle with the ClassificationModel and ALBERT from this library.

The prediction results are not yet as good as I would like, so I want to label a lot more data, using the predictions to guide the labeling. Concretely, I want to calculate the confidence of each result as described here and then label the samples with low confidence.

I want to achieve that by training n ALBERT models (e.g. 10) on the same data. I then run prediction with all of them and can see for which samples “they have the same opinion” (low variance) and for which they disagree (high variance, i.e. low confidence).

As of now, I’m naively training 10 ClassificationModels, saving them, and then looping through all of them with model.predict. I then calculate the variance using the variance formula.
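
A minimal sketch of the variance step, for illustration only. It assumes that each model's raw outputs (the second value returned by model.predict) are logits with one row per sample, and that the population variance of the positive-class probability is the quantity of interest; the post does not say which variance formula is used. The helper names and the sample logits are hypothetical.

```python
import math
from statistics import mean, pvariance


def softmax(logits):
    """Convert one row of raw logits into probabilities."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_disagreement(raw_outputs_per_model, positive_class=1):
    """Return (mean, variance) of the positive-class probability per sample.

    raw_outputs_per_model[m][i] holds the logits model m produced for sample i.
    """
    n_samples = len(raw_outputs_per_model[0])
    stats = []
    for i in range(n_samples):
        probs = [
            softmax(outputs[i])[positive_class]
            for outputs in raw_outputs_per_model
        ]
        stats.append((mean(probs), pvariance(probs)))
    return stats


def most_uncertain(stats, k):
    """Indices of the k samples with the highest variance across models."""
    order = sorted(range(len(stats)), key=lambda i: stats[i][1], reverse=True)
    return order[:k]


# Hypothetical logits from three models for two samples.
raw = [
    [[2.0, -2.0], [0.5, -0.5]],
    [[2.1, -1.9], [-0.5, 0.5]],
    [[1.9, -2.1], [0.0, 0.0]],
]
stats = ensemble_disagreement(raw)
to_label = most_uncertain(stats, k=1)  # samples to send for manual labeling
```

Here the three models agree on the first sample and disagree on the second, so the second is the one selected for labeling. Collecting the raw outputs of every model for a whole batch of texts first, and only then computing the statistics, keeps this step cheap compared with the model forward passes.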

This works perfectly; however, it now takes 3 s per data sample to calculate this.

What are some possible ways to speed this up? If I understand correctly, the features probably cannot be reused: training affects all weights in ALBERT, so the layers and embeddings differ from model to model.

I was thinking about using ONNX to speed things up, but I’m not sure how much it would help or how to convert a ClassificationModel to ONNX.

I know I pointed to multiple things here, so let me know if I should split stuff up into multiple issues 😃

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 14 (7 by maintainers)

Top GitHub Comments

1 reaction
ThilinaRajapakse commented, Mar 15, 2020

No problem!

Google Colab might be an option in that case.

0 reactions
stale[bot] commented, May 15, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
