Steps to utilize NeuroNER for other languages
It appears that BRAT, at least, is fairly language-agnostic. As far as I can tell, the English-specific parts of NeuroNER are the recommended glove.6B.100d word vectors and all of the spaCy-related tokenizing code, which is used to translate BRAT format into CoNLL format (correct?).
Am I correct that if I:
- supply Korean word vectors in /data/word_vectors, and
- supply CoNLL-formatted train, valid, and test data, produced from BRAT-labeled Korean text that I run through my own tokenizer,
I will be able to train and use NeuroNER for Korean text?
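To make the second step concrete, here is a minimal sketch of turning already-tokenized, already-labeled sentences into CoNLL-style text. It assumes the common layout of one token per line with the label in the last column and a blank line between sentences; the function name `to_conll`, the BIO labels, and the sample sentences are illustrative and not taken from NeuroNER itself.

```python
from typing import Iterable, List, Tuple

# One sentence is a list of (token, label) pairs produced by your own tokenizer.
Sentence = List[Tuple[str, str]]


def to_conll(sentences: Iterable[Sentence]) -> str:
    """Serialize sentences as 'token label' lines, with a blank line after each sentence."""
    lines = []
    for sentence in sentences:
        for token, label in sentence:
            # A space inside a token would be read as an extra column.
            if not token or " " in token:
                raise ValueError(f"invalid token: {token!r}")
            lines.append(f"{token} {label}")
        lines.append("")  # sentence boundary
    return "\n".join(lines)


# Hypothetical sample data, for illustration only.
sample = [
    [("김수희", "B-PER"), ("는", "O")],
    [("서울", "B-LOC")],
]
conll_text = to_conll(sample)
```

The resulting string would then be written out with `encoding="utf-8"`, once for each of the train, valid, and test splits.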
Issue Analytics
- State:
- Created 6 years ago
- Reactions:1
- Comments:10 (3 by maintainers)
Hi (I’m the guy who uses NeuroNER in French)! Both steps are correct, but you also need spaCy (or NLTK) working in Korean. To explain a bit more for spaCy: you need a spaCy Korean model, which consists of a tokenizer and a POS-tagging model (someone asked exactly this question: https://github.com/explosion/spaCy/issues/929 ). Then you will have to change spacylanguage in parameters.ini. I hope that’s clear; if not, feel free to ask.
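The spacylanguage change described in this comment could be scripted roughly as follows. This is only a sketch: the sample file contents, the section name `[advanced]`, and the value `ko` are assumptions made for illustration, and the real file may be laid out differently. The helper looks for the `spacylanguage` key in whichever section contains it rather than assuming a section name.

```python
import configparser
import io

# Hypothetical excerpt of a parameters file; the real one may differ.
SAMPLE_INI = """\
[advanced]
tokenizer = spacy
spacylanguage = en
"""


def set_spacy_language(ini_text: str, language: str) -> str:
    """Return ini_text with every 'spacylanguage' option set to `language`."""
    # interpolation=None keeps literal '%' characters in values from raising errors.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)

    found = False
    for section in parser.sections():
        if parser.has_option(section, "spacylanguage"):
            parser.set(section, "spacylanguage", language)
            found = True
    if not found:
        raise KeyError("no 'spacylanguage' option found in any section")

    out = io.StringIO()
    parser.write(out)  # note: comments in the original file are not preserved
    return out.getvalue()


updated = set_spacy_language(SAMPLE_INI, "ko")
```

Because `configparser` drops comments when writing, editing the one line by hand may be preferable for a heavily commented file.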
Steps (for spaCy), language X:
Correct! Note that providing word vectors is optional (though results are typically better if you have some), and that I haven’t tested NeuroNER with languages other than English. I know someone used it successfully in French (after an encoding fix PR 😃), and someone was supposed to try it with Bangladeshi, but I haven’t heard back from him.
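If you do supply your own word vectors, a quick sanity check on the file can catch format problems before training. The sketch below assumes a GloVe-style plain-text layout (one word per line followed by space-separated floats, as in glove.6B.100d); the function name and the three-dimensional sample vectors are made up for illustration.

```python
from typing import Dict, Iterable, List


def load_text_vectors(lines: Iterable[str], dimension: int) -> Dict[str, List[float]]:
    """Parse 'word v1 v2 ... vN' lines and verify every vector has `dimension` values."""
    vectors: Dict[str, List[float]] = {}
    for line_number, line in enumerate(lines, start=1):
        parts = line.rstrip().split(" ")
        if len(parts) <= 1:
            continue  # skip blank lines
        word, values = parts[0], parts[1:]
        if len(values) != dimension:
            raise ValueError(
                f"line {line_number}: expected {dimension} values, got {len(values)}"
            )
        vectors[word] = [float(value) for value in values]
    return vectors


# Hypothetical three-dimensional vectors, for illustration only.
sample_lines = ["서울 0.1 0.2 0.3", "한국 0.4 0.5 0.6"]
vectors = load_text_vectors(sample_lines, dimension=3)
```

For a real file, the lines would come from `open(path, encoding="utf-8")`, and `dimension` should match the embedding dimension configured for the model.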
On Jul 3, 2017 9:49 PM, “Sooheon Kim” notifications@github.com wrote: