
Big Data in CPU problem

See original GitHub issue

Hi, I am trying to build Bangla sentence embeddings with Sentence-Transformers, using multilingual.py.

  1. When I tried to train on 300,000 parallel sentences in PyCharm, it ran for 4 hours and then suddenly stopped with a ‘SIGKILL’.

  2. I then tried only 50,000 parallel sentences in Colab; it ran for 3 hours before Colab suddenly stopped.

I found that the MSE evaluator is what takes so long. I pass English sentences as src sentences and Bangla sentences as trg sentences, and it gets stuck here:

2020-12-04 03:53:04 - Load teacher model
2020-12-04 03:53:04 - Load pretrained SentenceTransformer: bert-base-nli-stsb-mean-tokens
2020-12-04 03:53:04 - Did not find folder bert-base-nli-stsb-mean-tokens
2020-12-04 03:53:04 - Try to download model from server: https://sbert.net/models/bert-base-nli-stsb-mean-tokens.zip
2020-12-04 03:53:04 - Downloading sentence transformer model from https://sbert.net/models/bert-base-nli-stsb-mean-tokens.zip and saving it at /root/.cache/torch/sentence_transformers/sbert.net_models_bert-base-nli-stsb-mean-tokens
100%|██████████| 405M/405M [00:21<00:00, 18.9MB/s]
2020-12-04 03:53:34 - Load SentenceTransformer from folder: /root/.cache/torch/sentence_transformers/sbert.net_models_bert-base-nli-stsb-mean-tokens
2020-12-04 03:53:41 - Use pytorch device: cpu
2020-12-04 03:53:41 - Create student model from scratch
2020-12-04 03:53:41 - Lock 139755517346088 acquired on /root/.cache/torch/transformers/87683eb92ea383b0475fecf99970e950a03c9ff5e51648d6eee56fb754612465.ab95cf27f9419a99cce4f19d09e655aba382a2bafe2fe26d0cc24c18cf1a1af6.lock
Downloading: 100% 512/512 [00:00<00:00, 1.30kB/s]
2020-12-04 03:53:41 - Lock 139755517346088 released on /root/.cache/torch/transformers/87683eb92ea383b0475fecf99970e950a03c9ff5e51648d6eee56fb754612465.ab95cf27f9419a99cce4f19d09e655aba382a2bafe2fe26d0cc24c18cf1a1af6.lock
2020-12-04 03:53:41 - Lock 139757055414056 acquired on /root/.cache/torch/transformers/97d0ea09f8074264957d062ec20ccb79af7b917d091add8261b26874daf51b5d.f42212747c1c27fcebaa0a89e2a83c38c6d3d4340f21922f892b88d882146ac2.lock
Downloading: 100% 1.12G/1.12G [01:17<00:00, 14.3MB/s]
2020-12-04 03:54:52 - Lock 139757055414056 released on /root/.cache/torch/transformers/97d0ea09f8074264957d062ec20ccb79af7b917d091add8261b26874daf51b5d.f42212747c1c27fcebaa0a89e2a83c38c6d3d4340f21922f892b88d882146ac2.lock
2020-12-04 03:55:01 - Lock 139754476641080 acquired on /root/.cache/torch/transformers/9df9ae4442348b73950203b63d1b8ed2d18eba68921872aee0c3a9d05b9673c6.00628a9eeb8baf4080d44a0abe9fe8057893de20c7cb6e6423cddbf452f7d4d8.lock
Downloading: 100% 5.07M/5.07M [00:01<00:00, 4.44MB/s]
2020-12-04 03:55:01 - Lock 139754476641080 released on /root/.cache/torch/transformers/9df9ae4442348b73950203b63d1b8ed2d18eba68921872aee0c3a9d05b9673c6.00628a9eeb8baf4080d44a0abe9fe8057893de20c7cb6e6423cddbf452f7d4d8.lock
2020-12-04 03:55:02 - Use pytorch device: cpu
2020-12-04 03:55:02 - Load temp.txt
sentence cnt 49999
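
Note the line `Use pytorch device: cpu` in the log above: both runs were training on CPU, which explains the long runtimes. A minimal sketch of checking for a GPU and passing the device explicitly when loading the teacher (the model name comes from the log; the rest is illustrative):

```python
# Sketch: verify a GPU is visible and load the teacher on it explicitly.
import torch
from sentence_transformers import SentenceTransformer

device = "cuda" if torch.cuda.is_available() else "cpu"
print("pytorch device:", device)  # the log above showed "cpu", hence the slow runs

# Teacher model name taken from the log; device passed explicitly rather than inferred
teacher = SentenceTransformer("bert-base-nli-stsb-mean-tokens", device=device)
```

On Colab, a GPU runtime must be enabled (Runtime > Change runtime type), otherwise this still prints "cpu".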

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

1 reaction
nreimers commented, Dec 4, 2020

Have a look here: https://arxiv.org/abs/2004.09813

Table 6.

On a fast GPU, training should be quite quick for this size of data (on a V100 GPU, maybe 4 hours).
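
For context, the paper's approach is knowledge distillation: the multilingual student is trained to reproduce the teacher's embeddings on parallel sentences via an MSE loss. A rough sketch of that setup with the sentence-transformers building blocks, assuming an XLM-R student (the issue only says the student was created from scratch) and a hypothetical tab-separated parallel file:

```python
# Knowledge-distillation setup per https://arxiv.org/abs/2004.09813:
# the multilingual student learns to mimic the monolingual teacher's embeddings.
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, models, losses
from sentence_transformers.datasets import ParallelSentencesDataset

teacher = SentenceTransformer("bert-base-nli-stsb-mean-tokens")

# Assumed student backbone: XLM-R with mean pooling, built from scratch
word_emb = models.Transformer("xlm-roberta-base")
pooling = models.Pooling(word_emb.get_word_embedding_dimension())
student = SentenceTransformer(modules=[word_emb, pooling])

# "parallel-en-bn.txt" is a hypothetical file of english<TAB>bangla lines
train_data = ParallelSentencesDataset(student_model=student, teacher_model=teacher)
train_data.load_data("parallel-en-bn.txt")

train_dataloader = DataLoader(train_data, shuffle=True, batch_size=32)
train_loss = losses.MSELoss(model=student)  # student embeddings -> teacher embeddings

student.fit(train_objectives=[(train_dataloader, train_loss)],
            epochs=1, warmup_steps=1000)
```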

1 reaction
nreimers commented, Dec 4, 2020

Yes, you can use Google Colab.

For the evaluator, 1k sentence pairs are sufficient. For training, more data is useful. I trained the multilingual models for 50+ languages on tens of millions of sentences.
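
Following that advice, a possible way to cap the evaluator at ~1k pairs while keeping the full corpus for training (the sentence lists here are placeholders standing in for the full parallel data):

```python
# Sketch: subsample ~1k parallel pairs for the MSE evaluator, as suggested above.
import random
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import MSEEvaluator

teacher = SentenceTransformer("bert-base-nli-stsb-mean-tokens")
src_sentences = ["This is an example sentence."]  # placeholder English side
trg_sentences = ["এটি একটি উদাহরণ বাক্য।"]  # placeholder Bangla side

random.seed(42)
pairs = random.sample(list(zip(src_sentences, trg_sentences)),
                      k=min(1000, len(src_sentences)))
eval_src, eval_trg = map(list, zip(*pairs))

# Evaluation cost now stays roughly constant regardless of corpus size
evaluator = MSEEvaluator(eval_src, eval_trg, teacher_model=teacher, batch_size=32)
```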

Read more comments on GitHub >

Top Results From Across the Web

What to Do When Your Data Is Too Big for Your Memory?
There are different options for solving the problem of big data with limited memory. These solutions cost either time or money.
Read more >
Big data : The processing efficiency issue - UPMEM
Unfortunately, running basic operations on a lot of data is not convenient for CPUs, because of the cache architecture inefficiency in such ...
Read more >
Big Data Problem - an overview | ScienceDirect Topics
Big data problems have brought many changes in the way data is processed and managed over time. Today, data is not just posing...
Read more >
Chapter 4. Handling large data on a single computer
A large volume of data poses new challenges, such as overloaded memory and algorithms that never stop running. It forces you to adapt...
Read more >
Big CPU, Big Data: Solving the World's Toughest ...
BIG CPU, BIG DATA teaches you how to write parallel programs for multicore machines, compute clusters, GPU accelerators, and big data map-reduce jobs,...
Read more >
