Spacy convert not writing files larger than 2 GB.
How to reproduce the behaviour
Concatenate a .conllu file into a single file over and over until it is several gigabytes, then run "spacy convert" on it to produce JSON. The output file stops growing at about 2147479553 bytes (roughly 2 GB).
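A minimal repro sketch of the steps above, assuming the spaCy 2.x CLI; the file names (sample.conllu, big.conllu, output_dir) and the repeat count are illustrative, not taken from the report:

```python
# Repro sketch: build a CoNLL-U file comfortably larger than 2 GB by
# repeated concatenation, then convert it with the spaCy CLI.
import shutil
import subprocess
from pathlib import Path

with open("big.conllu", "wb") as out:
    for _ in range(500):  # repeat until big.conllu is well over 2 GB
        with open("sample.conllu", "rb") as src:
            shutil.copyfileobj(src, out)

Path("output_dir").mkdir(exist_ok=True)

# spaCy 2.x: convert CoNLL-U to the v2 JSON training format.
subprocess.run(
    ["python", "-m", "spacy", "convert", "big.conllu", "output_dir"],
    check=True,
)

# Inspect the size of the JSON file written to output_dir; in the report it
# plateaus at about 2147479553 bytes regardless of how large the input gets.
```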
Your Environment
- Operating System: Linux Ubuntu 18.04
- Python Version Used: 3.7 and 3.8
- spaCy Version Used: 2.3.2
- Environment Information:
Issue Analytics
- State: closed
- Created 3 years ago
- Comments: 10 (7 by maintainers)
Top GitHub Comments
Not sure how it truncates the file, but it appears to work correctly: a 650 MB conllu -> 2 GB json, a 1300 MB conllu -> 2.1 GB json, a 2600 MB conllu -> 2.1 GB json, and it trains correctly.
(I was checking how the resulting quality changes as the dataset size changes.)
Thanks, I’ll move to a list of smaller files, but this issue should at least be reported, and an error should appear when it happens.
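For reference, a sketch of the kind of check being asked for here; this is hypothetical code, not spaCy's actual implementation, and write_json_checked is an invented name:

```python
# Hypothetical post-write sanity check (not spaCy's code): serialize first,
# write, then confirm the on-disk size matches what was serialized, so a
# silent truncation raises an error instead of passing quietly.
import json
from pathlib import Path

def write_json_checked(path, data):
    payload = json.dumps(data, ensure_ascii=False)
    expected = len(payload.encode("utf-8"))
    Path(path).write_text(payload, encoding="utf-8")
    actual = Path(path).stat().st_size
    if actual != expected:
        raise IOError(
            f"Output truncated: wrote {actual} of {expected} bytes to {path}"
        )
```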
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.