About programmatic usage

I’m trying to use the package programmatically. This is what I’m doing:

    import codecs

    from subword_nmt.apply_bpe import BPE, read_vocabulary

    # read/write files as UTF-8
    bpe_codes_fin = codecs.open(bpe_codes, encoding='utf-8')
    bpe_vocab_fin = codecs.open(bpe_vocab, encoding='utf-8')
    vocabulary = read_vocabulary(bpe_vocab_fin, vocabulary_threshold)

    bpe = BPE(bpe_codes_fin, merges=-1, separator='@@', vocab=vocabulary, glossaries=None)
    codes = bpe.process_line(line)

Is that correct? Also, I’m not sure what to pass for vocabulary_threshold, since I do not see a default value. Is there one?

Thank you.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 9 (3 by maintainers)

Top GitHub Comments

1 reaction
rsennrich commented, Apr 4, 2019

Try adding this as the first line of the BPE file:

#version: 0.2

The reason for this is explained in the README. It looks like fastBPE implements the new variant (v0.2) as well.
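A minimal sketch of that step, assuming the BPE codes file is plain UTF-8 text with one entry per line. `ensure_version_header` is a hypothetical helper written for illustration, not part of subword-nmt. It works on a list of lines, so it can be tried without touching any files.

```python
VERSION_HEADER = "#version: 0.2"


def ensure_version_header(lines, header=VERSION_HEADER):
    """Return the lines with a version header as the first line.

    Hypothetical helper: if the first line already starts with
    "#version:", the input is returned unchanged.
    """
    lines = list(lines)
    if lines and lines[0].startswith("#version:"):
        return lines
    return [header + "\n"] + lines


# Invented sample entries, for illustration only.
codes = ["t h\n", "th e\n"]
print("".join(ensure_version_header(codes)), end="")
```

To apply this to a real file, read it with `codecs.open(path, encoding='utf-8')`, pass the lines through the helper, and write the result to a new file, so the original stays intact.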

1 reaction
rsennrich commented, Apr 3, 2019

As to your first question, have a look at your vocabulary file - whether you set the threshold to 5 or 500 won’t make a big difference for you, since most rare tokens are single (non-Latin) characters that won’t be affected by this.
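One way to check this on your own vocabulary file is to count how many tokens survive at each threshold. The sketch below assumes each line of the file has the form `<token> <count>`. `tokens_kept` is a hypothetical helper, not the library's `read_vocabulary`, and the sample data is invented.

```python
def tokens_kept(vocab_lines, threshold):
    """Return the set of tokens whose count is at least `threshold`.

    Assumes each non-empty line looks like "<token> <count>".
    A threshold of None keeps every token.
    """
    kept = set()
    for line in vocab_lines:
        line = line.strip("\r\n ")
        if not line:
            continue
        token, freq = line.rsplit(" ", 1)
        if threshold is None or int(freq) >= threshold:
            kept.add(token)
    return kept


# Invented sample: two frequent subwords and one rare non-Latin character.
sample_vocab = ["the 900\n", "low@@ 40\n", "\u0436 3\n"]

for threshold in (None, 5, 500):
    print(threshold, len(tokens_kept(sample_vocab, threshold)))
```

If the counts barely change between the thresholds you are considering, the exact value is unlikely to matter much for your data.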

FAIR LASER uses a different BPE implementation (https://github.com/glample/fastBPE), which seems to store the BPE file in a different format. It might work if you simply remove the third item in each entry (the frequency), but I can’t guarantee there’s no other inconsistency, e.g. in how UTF-8 whitespace is handled.
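A rough sketch of that conversion, assuming each fastBPE entry is three space-separated fields with the frequency last. `strip_frequency` is a hypothetical helper and the sample entries are invented. As noted above, this may not cover every difference between the two formats.

```python
def strip_frequency(lines):
    """Drop the third space-separated field from each entry.

    Assumption: entries look like "<left> <right> <frequency>".
    Lines with fewer than three fields (for example a "#version:"
    header) are passed through unchanged.
    """
    converted = []
    for line in lines:
        fields = line.rstrip("\r\n").split(" ")
        if len(fields) < 3:
            converted.append(line if line.endswith("\n") else line + "\n")
        else:
            converted.append(" ".join(fields[:2]) + "\n")
    return converted


# Invented entries in the assumed "<left> <right> <frequency>" layout.
fastbpe_style = ["t h 1200\n", "th e 950\n"]
print("".join(strip_frequency(fastbpe_style)), end="")
```

Compare the segmentation of a few sample sentences from both tools before relying on the converted file.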
