Hello, and thank you very much for the new TNER version. 😃

It seems the old syntax and imports (e.g. “from tner import TrainTransformersNER”) have been discontinued in the new release? (By the way, the four Colab notebooks on your homepage still reference them and no longer work.)

I tried calling the new GridSearcher with my old local dataset in IOB format (train.txt and valid.txt), which worked fine with the previous TNER version.

This O
is O
the O
first O
Entity B-SOMETHING
. 0
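For reference, a file in this layout can be read with a few lines of plain Python. The sketch below assumes one whitespace-separated token/tag pair per line and a blank line between sentences (the CoNLL-style convention). `read_iob` and `invalid_tags` are hypothetical helper names, not part of TNER. It also flags tags that are not `O`, `B-...` or `I-...`, such as a digit `0` typed where the letter `O` was intended.

```python
from typing import List, Tuple

Sentence = List[Tuple[str, str]]


def read_iob(text: str) -> List[Sentence]:
    """Split IOB text into sentences of (token, tag) pairs."""
    sentences: List[Sentence] = []
    current: Sentence = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        parts = line.rsplit(maxsplit=1)
        if len(parts) != 2:
            raise ValueError(f"expected 'token tag', got: {line!r}")
        current.append((parts[0], parts[1]))
    if current:
        sentences.append(current)
    return sentences


def invalid_tags(sentences: List[Sentence]) -> List[Tuple[str, str]]:
    """Return (token, tag) pairs whose tag is not O, B-X or I-X."""
    bad = []
    for sentence in sentences:
        for token, tag in sentence:
            well_formed = tag[:2] in ("B-", "I-") and len(tag) > 2
            if tag != "O" and not well_formed:
                bad.append((token, tag))
    return bad


# Hypothetical sample mirroring the lines shown in the question.
sample = "This O\nis O\nthe O\nfirst O\nEntity B-SOMETHING\n. 0\n"
parsed = read_iob(sample)
print(invalid_tags(parsed))  # prints [('.', '0')]
```

Running a check like this over train.txt and valid.txt before training separates problems in the data from problems in how the files are passed to the library.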

The GridSearcher call crashes with the error message “JSONDecodeError: Expecting value: line 1 column 1 (char 0)” because the program is looking for the label file, which isn’t present. So is IOB (or BIO) no longer supported, and do I have to convert my data into your JSON format?
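That message is what Python's standard `json` module raises when the text it is given does not start with a JSON value, for example an empty string or a plain IOB line. The short sketch below reproduces the message outside TNER. It only illustrates the symptom and makes no claim about which file TNER tried to read; `json_error_message` is a hypothetical helper name.

```python
import json


def json_error_message(text: str) -> str:
    """Return the JSONDecodeError message for text, or '' if it parses."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return str(err)
    return ""


# An empty string and a plain IOB line both fail at the first character.
for candidate in ["", "This O"]:
    print(repr(candidate), "->", json_error_message(candidate))
```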

Thanks, Jan

searcher = GridSearcher(
    checkpoint_dir='./ckpt_tner',
    dataset="data/iob",  # either of dataset (huggingface dataset) or local_dataset (custom dataset) should be given
    model="roberta-large",  # language model to fine-tune
    epoch=10,  # the total epoch (L in the figure)
    epoch_partial=5,  # the number of epoch at 1st stage (M in the figure)
    n_max_config=3,  # the number of models to pass to 2nd stage (K in the figure)
    batch_size=16,
    gradient_accumulation_steps=[4, 8],
    crf=[True, False],
    lr=[1e-4, 1e-5],
    weight_decay=[None, 1e-7],
    random_seed=[42, 442],
    lr_warmup_step_ratio=[None, 0.1],
    max_grad_norm=[None, 10]
)
searcher.train()
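The inline comment in the call above says a custom dataset belongs in `local_dataset` rather than `dataset`. A minimal sketch of that variant follows, assuming `local_dataset` accepts a mapping from split name to IOB file path. That shape, the split names `train` and `validation`, and the helper names `build_local_dataset` and `run_search` are assumptions to check against the TNER README for the installed version. The search is wrapped in a function so nothing is trained on import.

```python
from pathlib import Path
from typing import Dict


def build_local_dataset(data_dir: str) -> Dict[str, str]:
    """Map split names to the IOB files from the question (train.txt, valid.txt)."""
    root = Path(data_dir)
    splits = {"train": root / "train.txt", "validation": root / "valid.txt"}
    missing = [str(path) for path in splits.values() if not path.is_file()]
    if missing:
        raise FileNotFoundError(f"missing IOB files: {missing}")
    return {name: str(path) for name, path in splits.items()}


def run_search(data_dir: str = "data/iob") -> None:
    # Imported lazily so the helper above works without TNER installed.
    from tner import GridSearcher

    searcher = GridSearcher(
        checkpoint_dir="./ckpt_tner",
        # Assumed shape: {"train": path, "validation": path}.
        local_dataset=build_local_dataset(data_dir),
        model="roberta-large",
        epoch=10,
        epoch_partial=5,
        n_max_config=3,
        batch_size=16,
        gradient_accumulation_steps=[4, 8],
        crf=[True, False],
        lr=[1e-4, 1e-5],
        weight_decay=[None, 1e-7],
        random_seed=[42, 442],
        lr_warmup_step_ratio=[None, 0.1],
        max_grad_norm=[None, 10],
    )
    searcher.train()
```

Checking that both files exist before constructing the searcher turns a missing path into an explicit `FileNotFoundError` instead of a later parsing error.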

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 12 (8 by maintainers)

Top GitHub Comments

1 reaction
asahi417 commented, Aug 15, 2022

I noticed that the stopword (and stoptag) handling was legacy code from the previous version and has nothing to do with the latest TNER, so I just removed it.

0 reactions
asahi417 commented, Sep 28, 2022

We confirmed that the IOB formatting issue was solved, but there is another issue that has nothing to do with the format, so I’ll close this.


