tutorials: caught MemoryError when "Running in bulk" in deep/define-ml-pipeline#running-in-bulk
See original GitHub issuePlease provide information about your setup DVC version: 0.40.2 (installed by pip) OS: Ubuntu 18.04 RAM: 8GB
~~I am following a tutorial in https://dvc.org/doc/tutorial/define-ml-pipeline.~~ UPDATE: This refers to http://localhost:3000/doc/tutorials/deep/define-ml-pipeline#running-in-bulk now.
In “Running in bulk” section, I failed to run this command and caught an error.
$ dvc run -d code/featurization.py -d code/conf.py \
-d data/Posts-train.tsv -d data/Posts-test.tsv \
-o data/matrix-train.p -o data/matrix-test.p \
python code/featurization.py
Running command:
python code/featurization.py
The input data frame data/Posts-train.tsv size is (66999, 3)
Traceback (most recent call last):
File "code/featurization.py", line 48, in <module>
train_words = np.array(df_train.text.str.lower().values.astype('U'))
MemoryError
ERROR: failed to run command - stage 'matrix-train.p.dvc' cmd python code/featurization.py failed
Having any troubles?. Hit us up at https://dvc.org/support, we are always happy to help!
Issue Analytics
- State:
- Created 4 years ago
- Comments:30 (28 by maintainers)
Top Results From Across the Web
How to Handle the MemoryError in Python - Rollbar
A MemoryError is an error encountered in Python when there is no memory available for allocation. Learn two ways to solve this.
Read more >When to catch MemoryError in Python? - Stack Overflow
In Python list is a linked list so the memory Exception MemoryError will be raised only when the system can't allocate additional memory....
Read more >Fix out of Error Memory Error in Windows 10 - YouTube
Issues addressed in this tutorial : your computer is low on memory your ... desktops,and tablets running the Windows 10, Windows 8/8.1, ...
Read more >Out of memory error running batch job with Corticon Server 5.5.1
Out of memory error is captured in the event viewer log. ... This issue was resolved by increasing the timeout for in-house application...
Read more >Memory error checking in C and C++: Comparing Sanitizers ...
A program running under Valgrind could run 20 to 50 times slower ... Clang option to catch uninitialized memory reads: -fsanitize=memory .
Read more >
Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start Free
Top Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found

@Naba7 I am working on a new tutorial. It will be up soon. With a smaller dataset and fewer RAM requirements.
This issue affects me aswell.
Paste from my Discord message:
What’s wrong? While running featurization.py I get some kind buffer overflow. 16GB of RAM get consumed in seconds and the execution halts after a couple of seconds of system freeze.
I get a The input data frame data/Posts-train.tsv size is (66999, 3) output, so far the code is valid. But the next step most likely goes sideways, because a injected print(test) does not show up after train_words.
My setup includes 16GB of RAM. Despite the older statements I don’t get a memory error raised. I think dvc may not be verbose about python errors.