Big Data in CPU problem
Hi, I am trying to build Bangla sentence embeddings with Sentence-Transformers using multilingual.py.
- When I tried to run 300,000 parallel sentences in PyCharm, it ran for about 4 hours and then suddenly stopped with a 'SIGKILL'.
- Then I tried only 50,000 parallel sentences in Colab; it ran for 3 hours before Colab suddenly stopped.

I found that the time goes to the MSE evaluator. I pass English sentences as the src sentences and Bangla sentences as the trg sentences, and it gets stuck at the point shown in the log below (a sketch of my setup follows the log):
2020-12-04 03:53:04 - Load teacher model
2020-12-04 03:53:04 - Load pretrained SentenceTransformer: bert-base-nli-stsb-mean-tokens
2020-12-04 03:53:04 - Did not find folder bert-base-nli-stsb-mean-tokens
2020-12-04 03:53:04 - Try to download model from server: https://sbert.net/models/bert-base-nli-stsb-mean-tokens.zip
2020-12-04 03:53:04 - Downloading sentence transformer model from https://sbert.net/models/bert-base-nli-stsb-mean-tokens.zip and saving it at /root/.cache/torch/sentence_transformers/sbert.net_models_bert-base-nli-stsb-mean-tokens
100%|██████████| 405M/405M [00:21<00:00, 18.9MB/s]
2020-12-04 03:53:34 - Load SentenceTransformer from folder: /root/.cache/torch/sentence_transformers/sbert.net_models_bert-base-nli-stsb-mean-tokens
2020-12-04 03:53:41 - Use pytorch device: cpu
2020-12-04 03:53:41 - Create student model from scratch
2020-12-04 03:53:41 - Lock 139755517346088 acquired on /root/.cache/torch/transformers/87683eb92ea383b0475fecf99970e950a03c9ff5e51648d6eee56fb754612465.ab95cf27f9419a99cce4f19d09e655aba382a2bafe2fe26d0cc24c18cf1a1af6.lock
Downloading: 100%
512/512 [00:00<00:00, 1.30kB/s]
2020-12-04 03:53:41 - Lock 139755517346088 released on /root/.cache/torch/transformers/87683eb92ea383b0475fecf99970e950a03c9ff5e51648d6eee56fb754612465.ab95cf27f9419a99cce4f19d09e655aba382a2bafe2fe26d0cc24c18cf1a1af6.lock
2020-12-04 03:53:41 - Lock 139757055414056 acquired on /root/.cache/torch/transformers/97d0ea09f8074264957d062ec20ccb79af7b917d091add8261b26874daf51b5d.f42212747c1c27fcebaa0a89e2a83c38c6d3d4340f21922f892b88d882146ac2.lock
Downloading: 100%
1.12G/1.12G [01:17<00:00, 14.3MB/s]
2020-12-04 03:54:52 - Lock 139757055414056 released on /root/.cache/torch/transformers/97d0ea09f8074264957d062ec20ccb79af7b917d091add8261b26874daf51b5d.f42212747c1c27fcebaa0a89e2a83c38c6d3d4340f21922f892b88d882146ac2.lock
2020-12-04 03:55:01 - Lock 139754476641080 acquired on /root/.cache/torch/transformers/9df9ae4442348b73950203b63d1b8ed2d18eba68921872aee0c3a9d05b9673c6.00628a9eeb8baf4080d44a0abe9fe8057893de20c7cb6e6423cddbf452f7d4d8.lock
Downloading: 100%
5.07M/5.07M [00:01<00:00, 4.44MB/s]
2020-12-04 03:55:01 - Lock 139754476641080 released on /root/.cache/torch/transformers/9df9ae4442348b73950203b63d1b8ed2d18eba68921872aee0c3a9d05b9673c6.00628a9eeb8baf4080d44a0abe9fe8057893de20c7cb6e6423cddbf452f7d4d8.lock
2020-12-04 03:55:02 - Use pytorch device: cpu
2020-12-04 03:55:02 - Load temp.txt
sentence cnt 49999
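For reference, my setup follows multilingual.py. A minimal sketch of it (the `xlm-roberta-base` student, batch size, epoch count, and output path are placeholder assumptions; `temp.txt` holds one tab-separated English/Bangla pair per line):

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, models, losses, evaluation
from sentence_transformers.datasets import ParallelSentencesDataset

# Teacher: the English model whose embedding space the student learns to imitate
teacher_model = SentenceTransformer('bert-base-nli-stsb-mean-tokens')

# Student: a multilingual transformer trained to mimic the teacher
# ('xlm-roberta-base' is an assumption; any multilingual base model works)
word_embedding_model = models.Transformer('xlm-roberta-base', max_seq_length=128)
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension())
student_model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

# Parallel data: one tab-separated "english<TAB>bangla" pair per line
train_data = ParallelSentencesDataset(student_model=student_model, teacher_model=teacher_model)
train_data.load_data('temp.txt')
train_dataloader = DataLoader(train_data, shuffle=True, batch_size=64)
train_loss = losses.MSELoss(model=student_model)

# English as src sentences, Bangla as trg sentences for the MSE evaluator
src_sentences, trg_sentences = [], []
with open('temp.txt', encoding='utf8') as fIn:
    for line in fIn:
        parts = line.strip().split('\t')
        if len(parts) == 2:
            src_sentences.append(parts[0])
            trg_sentences.append(parts[1])

dev_mse = evaluation.MSEEvaluator(src_sentences, trg_sentences, teacher_model=teacher_model)

student_model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    evaluator=dev_mse,
    epochs=5,
    warmup_steps=1000,
    output_path='output/bangla-student',
)
```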
Have a look here: https://arxiv.org/abs/2004.09813, Table 6.
On a fast GPU, training should be quite quick for this amount of data (on a V100 GPU, maybe 4 hours).
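The repeated "Use pytorch device: cpu" lines in the log show the run never touched a GPU. A quick sanity check before starting a long run (assuming a CUDA-enabled PyTorch build):

```python
import torch
from sentence_transformers import SentenceTransformer

# Confirm that PyTorch actually sees a GPU before starting a long run;
# if this prints 'cpu', the runtime has no CUDA device attached.
device = 'cuda' if torch.cuda.is_available() else 'cpu'
print('Using device:', device)

# SentenceTransformer also accepts an explicit device argument
teacher_model = SentenceTransformer('bert-base-nli-stsb-mean-tokens', device=device)
```

In Colab this requires selecting a GPU runtime (Runtime → Change runtime type).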
Yes, you can use Google Colab.
For the evaluator, 1k sentences are sufficient. For training, more data is useful; I trained the multilingual models for 50+ languages on tens of millions of sentences.
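Following that advice, one way to cap the evaluator at 1k pairs, reusing the names from the sketch after the log above (`random.sample` is an illustrative choice, not part of multilingual.py):

```python
import random

# Keep only ~1k pairs for the MSE evaluator; evaluating on the full 50k set
# is what makes each evaluation pass so slow on CPU.
random.seed(42)
dev_pairs = random.sample(list(zip(src_sentences, trg_sentences)), k=1000)
dev_src = [src for src, trg in dev_pairs]
dev_trg = [trg for src, trg in dev_pairs]

dev_mse = evaluation.MSEEvaluator(dev_src, dev_trg, teacher_model=teacher_model)
```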