OpenOnDemandZipFile fails to open punkt.zip on Python 3.x
An exception is raised when running a simple tutorial from nltk_book with the following command.
$ python3 -c 'from nltk import word_tokenize; text = word_tokenize("And now for something completely different")'
I think the problem is caused by the decorator @py3_data on ZipFilePathPointer.__init__ and OpenOnDemandZipFile.__init__. This decorator appends ‘/PY3’ to the first argument, which is a zip file path given as a str, such as “~/nltk_data/tokenizers/punkt.zip”. The resulting path, “~/nltk_data/tokenizers/punkt.zip/PY3”, cannot be opened as a zip file.
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/git/nltk/nltk/tokenize/__init__.py", line 109, in word_tokenize
    return [token for sent in sent_tokenize(text, language)
  File "/git/nltk/nltk/tokenize/__init__.py", line 93, in sent_tokenize
    tokenizer = load('tokenizers/punkt/{0}.pickle'.format(language))
  File "/git/nltk/nltk/data.py", line 808, in load
    opened_resource = _open(resource_url)
  File "/git/nltk/nltk/data.py", line 926, in _open
    return find(path_, path + ['']).open()
  File "/git/nltk/nltk/data.py", line 648, in find
    raise LookupError(resource_not_found)
LookupError:
**********************************************************************
Resource 'tokenizers/punkt/PY3/english.pickle' not found.
Please use the NLTK Downloader to obtain the resource: >>> nltk.download()
Searched in:
- '/home/joybin/nltk_data'
- '/usr/share/nltk_data'
- '/usr/local/share/nltk_data'
- '/usr/lib/nltk_data'
- '/usr/local/lib/nltk_data'
- ''
**********************************************************************
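The suspected failure mode, a decorator appending '/PY3' to a zip file path, can be reproduced in isolation. The sketch below is only an illustration under assumptions: `append_py3` is a hypothetical stand-in written for this example and is not NLTK's actual `py3_data` implementation, and `open_zip` stands in for a constructor that takes a zip file path as its first argument.

```python
import functools
import os
import tempfile
import zipfile


def append_py3(func):
    """Hypothetical stand-in: append '/PY3' to the first (path) argument."""
    @functools.wraps(func)
    def wrapper(path, *args, **kwargs):
        return func(path + "/PY3", *args, **kwargs)
    return wrapper


def open_zip(path):
    """Open a zip archive at the given file path."""
    return zipfile.ZipFile(path)


# Same function, but with the path rewritten by the decorator.
patched_open_zip = append_py3(open_zip)


def demo():
    """Return (plain_ok, patched_failed) for a throwaway punkt.zip."""
    with tempfile.TemporaryDirectory() as tmp:
        zip_path = os.path.join(tmp, "punkt.zip")
        with zipfile.ZipFile(zip_path, "w") as zf:
            zf.writestr("punkt/english.pickle", b"placeholder")

        # The unmodified path opens normally.
        with open_zip(zip_path) as zf:
            plain_ok = zf.namelist() == ["punkt/english.pickle"]

        # '<...>/punkt.zip/PY3' is not a file on disk, so opening it fails.
        try:
            patched_open_zip(zip_path).close()
            patched_failed = False
        except OSError:
            patched_failed = True

    return plain_ok, patched_failed
```

Calling `demo()` exercises both paths: the plain zip path opens, while the path with '/PY3' appended raises an `OSError` subclass because a regular file cannot be treated as a directory.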
Issue Analytics
- State:
- Created 6 years ago
- Comments:7 (5 by maintainers)
@alvations fyi in compat.py:
From memory these extra subdirectories were created manually.
@alvations I’m using Python 3.5 on Ubuntu 17.04. I’m sure I have downloaded all the zip files under ~/nltk_data. The example runs without any exception once ~/nltk_data/tokenizers/punkt.zip is unpacked.
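For anyone who wants to try the same workaround, a minimal sketch of unpacking the archive in place might look like this. The helper name `unpack_next_to` is made up for this example, and the path in the usage note is the one quoted above; adjust it if your nltk_data lives elsewhere.

```python
import os
import zipfile


def unpack_next_to(zip_path):
    """Extract a zip archive into the directory that contains it."""
    zip_path = os.path.expanduser(zip_path)
    target_dir = os.path.dirname(zip_path)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(target_dir)
    return target_dir
```

Usage would be `unpack_next_to("~/nltk_data/tokenizers/punkt.zip")`, which extracts the archive contents into ~/nltk_data/tokenizers. This only works around the symptom; it does not address the path handling in the decorator.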