Support for pre-training the language model
Is your feature request related to a problem? Please describe.
In order to use the classifier on different languages or in specific domains, it would be useful to be able to pre-train the language model.
Describe the solution you’d like
Calling .fit on a corpus (i.e. without labels) should train the language model:
model.fit(corpus)
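One possible shape for such an API is a single fit method that dispatches on whether labels are passed. This is only an illustrative sketch; the class and method names below are hypothetical and not the library's current interface:

```python
class TextClassifier:
    """Illustrative sketch only, not the library's actual code.

    Idea: fit() with labels trains the classifier as today,
    while fit() without labels pre-trains the underlying language model.
    """

    def fit(self, texts, labels=None, **kwargs):
        if labels is None:
            # Unlabelled corpus -> pre-train / fine-tune the language model.
            return self._fit_language_model(texts, **kwargs)
        # Labelled data -> train the classifier head as before.
        return self._fit_classifier(texts, labels, **kwargs)

    # Hypothetical internals, stubbed out only to make the dispatch explicit.
    def _fit_language_model(self, texts, **kwargs):
        raise NotImplementedError

    def _fit_classifier(self, texts, labels, **kwargs):
        raise NotImplementedError
```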
Describe alternatives you’ve considered
Using the original repo, which doesn’t have a simple-to-use interface.
Issue Analytics
- State:
- Created 5 years ago
- Comments: 11 (7 by maintainers)

@xuy2 This code is merged into master now.
It means the latter: randomly choosing 512 contiguous tokens from an article, i.e. a random slice of text.
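For anyone wondering what that sampling looks like in practice, here is a minimal sketch (assuming the article is already tokenized into a list of token ids; the helper name and window default are illustrative, not the repo's actual code):

```python
import random

def random_slice(token_ids, window=512):
    """Pick a random contiguous window of `window` tokens from one article.

    If the article is shorter than the window, return it whole.
    """
    if len(token_ids) <= window:
        return token_ids
    start = random.randrange(len(token_ids) - window + 1)
    return token_ids[start:start + window]
```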