Adding vocab to field object

Hi Everyone,

Is there any way to add more vocabulary to a Field object that has already had its vocab built?

For example, suppose that at some point I have run code like this:

from torchtext.data import Field  # the torchtext API at the time of this issue (2018)

TEXT = Field()
TEXT.build_vocab(dataset_a)

Is it possible to add vocab to TEXT from another dataset without erasing the current vocab built from dataset_a?

Thanks!

Issue Analytics

  • State: open
  • Created 5 years ago
  • Reactions: 1
  • Comments: 12 (6 by maintainers)

Top GitHub Comments

3 reactions
mttk commented, Sep 17, 2018

Cool, thanks for the workflow. I’ll hopefully get around to making this easier sometime soon.

0 reactions
tu-artem commented, Feb 4, 2019

Some challenges I am currently facing:

  1. Currently there is an extend() method on Vocab which simply goes through the itos of its argument and adds those tokens to the current vocab. I want to make it more general so that it accepts a Counter object, as in Vocab.__init__(), and also respects the original max_size and min_freq arguments. But changing it this way may break code that uses the current extend() method.
  2. If the vocab is built with specials_first=False, it becomes unclear how to add new words, since the specials will no longer be at the end of itos/stoi.
  3. I am not sure how to treat a Vocab with existing vectors.

Any advice is highly appreciated!
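
To make the first two points concrete, here is a minimal sketch of what a Counter-based extend could look like. It is not torchtext's implementation: SimpleVocab and all of its methods are hypothetical stand-ins written in plain Python, and it assumes the specials sit at the front of itos so that appending never moves them. It also assumes frequencies accumulate across calls, which is one possible design among several, and it ignores pretrained vectors entirely.

```python
from collections import Counter


class SimpleVocab:
    """Hypothetical stand-in for a vocab with itos/stoi tables (not torchtext's Vocab)."""

    def __init__(self, counter, max_size=None, min_freq=1,
                 specials=("<unk>", "<pad>")):
        # Remember the original limits so later extend() calls respect them.
        self.max_size = max_size  # cap on the number of non-special tokens
        self.min_freq = min_freq
        self.specials = list(specials)
        self.freqs = Counter()
        # Specials go first, so appending new tokens never shifts them.
        self.itos = list(specials)
        self.stoi = {tok: i for i, tok in enumerate(self.itos)}
        self.extend(counter)

    def extend(self, counter):
        """Add tokens from `counter` without changing any existing index."""
        self.freqs.update(counter)
        # Most frequent first; ties broken alphabetically for a stable order.
        candidates = sorted(self.freqs.items(), key=lambda kv: (-kv[1], kv[0]))
        for token, freq in candidates:
            if freq < self.min_freq:
                break  # sorted by frequency, so the rest are too rare as well
            if (self.max_size is not None
                    and len(self.itos) - len(self.specials) >= self.max_size):
                break  # vocabulary is full
            if token not in self.stoi:
                self.stoi[token] = len(self.itos)
                self.itos.append(token)


# Build from one dataset, then extend with another.
vocab = SimpleVocab(Counter("the cat sat on the mat".split()))
before = dict(vocab.stoi)
vocab.extend(Counter("the dog sat".split()))
```

Because new tokens are only ever appended, every index assigned from the first dataset stays valid after the second call. Accumulating the counts means a token that was below min_freq in the first dataset can still enter the vocab later, once its combined frequency is high enough.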
