IndexError: list index out of range while training parser
Training pipeline: ['parser']
Starting with blank model 'ko'
Counting training words (limit=0)
Traceback (most recent call last):
File "/usr/local/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/local/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/ksjae/.local/lib/python3.7/site-packages/spacy/__main__.py", line 35, in <module>
plac.call(commands[command], sys.argv[1:])
File "/home/ksjae/.local/lib/python3.7/site-packages/plac_core.py", line 328, in call
cmd, result = parser.consume(arglist)
File "/home/ksjae/.local/lib/python3.7/site-packages/plac_core.py", line 207, in consume
return cmd, self.func(*(args + varargs + extraopts), **kwargs)
File "/home/ksjae/.local/lib/python3.7/site-packages/spacy/cli/train.py", line 213, in train
optimizer = nlp.begin_training(lambda: corpus.train_tuples, device=use_gpu)
File "/home/ksjae/.local/lib/python3.7/site-packages/spacy/language.py", line 583, in begin_training
**kwargs
File "nn_parser.pyx", line 576, in spacy.syntax.nn_parser.Parser.begin_training
File "arc_eager.pyx", line 346, in spacy.syntax.arc_eager.ArcEager.get_actions
File "nonproj.pyx", line 123, in spacy.syntax.nonproj.projectivize
File "nonproj.pyx", line 172, in spacy.syntax.nonproj._get_smallest_nonproj_arc
File "nonproj.pyx", line 58, in spacy.syntax.nonproj.is_nonproj_arc
File "nonproj.pyx", line 26, in ancestors
IndexError: list index out of range
How to reproduce the behaviour
Training command as-is from the documentation:
python3 -m spacy train ko model KNLI-spacy.json KNLI-spacy-dev.json -p parser
Use these JSON files. EDIT: These are faulty but have been left in place; use Corpus.zip for the newest ones.
https://1drv.ms/u/s!Aq0-1ykl7mZBqWCbqo6cq4X1amma?e=1eBExo for KNLI-spacy.json
https://1drv.ms/u/s!Aq0-1ykl7mZBqWGLptXC0Ba5nGFK?e=suX2RJ for KNLI-spacy-dev.json
Your Environment
- spaCy version: 2.1.9
- Platform: Linux-4.4.0-178-generic-x86_64-with-debian-stretch-sid
- Python version: 3.7.7
Issue Analytics
- State:
- Created 3 years ago
- Comments:20 (9 by maintainers)
Top GitHub Comments
Hi, the problem is that the heads haven't been converted correctly for spaCy's training format. The heads should be relative to the current token, not absolute IDs. The root should have head 0 and all other tokens should have heads relative to their position, so a head of -2 would mean the head is two words to the left, 1 would mean one word to the right, etc.

The data loader should fail with a more useful error in this case, though. I'll take a look to see how this could be improved.
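As a rough illustration of the conversion described above, here is a minimal sketch. It assumes the source data uses CoNLL-U-style heads (1-based absolute token IDs, with 0 marking the root); the layout of the original KNLI files is not shown in this thread, so that is an assumption. The function names absolute_to_relative_heads and check_relative_heads are hypothetical helpers, not part of spaCy.

```python
from typing import List


def absolute_to_relative_heads(abs_heads: List[int]) -> List[int]:
    """Convert absolute head IDs to offsets relative to each token.

    Assumption: abs_heads[i] is the 1-based ID of the head of token i + 1,
    and 0 marks the root (the CoNLL-U convention).
    """
    n = len(abs_heads)
    rel_heads = []
    for i, head in enumerate(abs_heads):
        token_id = i + 1
        if head == 0:
            # The root keeps head 0, i.e. it points at itself.
            rel_heads.append(0)
            continue
        if not 1 <= head <= n:
            raise ValueError(
                f"token {token_id}: head {head} is outside the sentence (1..{n})"
            )
        if head == token_id:
            raise ValueError(f"token {token_id}: non-root token is its own head")
        # Negative offset: head is to the left; positive: to the right.
        rel_heads.append(head - token_id)
    return rel_heads


def check_relative_heads(rel_heads: List[int]) -> None:
    """Raise a descriptive error if any relative head points outside the sentence."""
    n = len(rel_heads)
    for i, offset in enumerate(rel_heads):
        if not 0 <= i + offset < n:
            raise ValueError(
                f"token index {i}: offset {offset} points to index {i + offset}, "
                f"outside the sentence (0..{n - 1})"
            )


# "She ate the apple": She -> ate, ate = root, the -> apple, apple -> ate
abs_heads = [2, 0, 4, 2]
rel_heads = absolute_to_relative_heads(abs_heads)
check_relative_heads(rel_heads)
```

Running check_relative_heads on the unconverted list [2, 0, 4, 2] raises a ValueError at the third token, because an absolute ID of 4 read as an offset points past the end of the sentence. That is the same kind of out-of-range lookup that surfaces here as a bare IndexError.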
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.