MemoryError when generating data table

Hello, I’m running velocyto (version 0.17.11) on Drop-seq data (human, 2’000 cells, 30’000 genes) on a 1.5T cluster with the following command:

velocyto run og007_1_star_gene_exon_tagged_corrected.bam Homo_sapiens.GRCh38.84.chr.gtf -b og007_1_topBarcodes.tsv

(I also provide the cellsorted_og007_1_star_gene_exon_tagged_corrected.bam in the same folder.)

After a run time of 60h on this sample and 200h (!!) on another (it ended tonight 😦), I got the following MemoryError both times:

2018-09-08 03:31:40,661 - DEBUG - Counting for batch 268586, containing 17 cells and 17 reads
2018-09-08 03:31:40,816 - DEBUG - 0 reads not considered because fully enclosed in repeat masked regions
2018-09-08 03:31:40,817 - DEBUG - 0 reads were skipped because no apropiate cell or umi barcode was found
2018-09-08 03:31:40,817 - DEBUG - Counting done!
2018-09-08 03:31:59,173 - DEBUG - Generating output file og007_1_star_gene_exon_tagged_corrected_599QR.loom
2018-09-08 03:31:59,173 - DEBUG - Collecting row attributes
2018-09-08 03:31:59,358 - DEBUG - Generating data table

Traceback (most recent call last):
  File "/home/tpentim/anaconda3/bin/velocyto", line 11, in <module>
    sys.exit(cli())
  File "/home/tpentim/anaconda3/lib/python3.6/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/home/tpentim/anaconda3/lib/python3.6/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/home/tpentim/anaconda3/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/tpentim/anaconda3/lib/python3.6/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/tpentim/anaconda3/lib/python3.6/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/home/tpentim/anaconda3/lib/python3.6/site-packages/velocyto/commands/run.py", line 113, in run
    samtools_memory=samtools_memory, dump=dump, verbose=verbose, additional_ca=additional_ca)
  File "/home/tpentim/anaconda3/lib/python3.6/site-packages/velocyto/commands/_run.py", line 274, in _run
    layers[layer_name] = np.concatenate(dict_list_arrays[layer_name], axis=1)
MemoryError

I would really appreciate your help in running this amazing tool!

P.S. Do you have any tips for improving run time besides masking repeated reads?

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Comments: 7 (3 by maintainers)

Top GitHub Comments

tpentim commented, Sep 24, 2018

Thanks again, Gioele!

I finally managed to generate the loom file 😃 The problem was, as you pointed out, in the .bam file. More specifically, it was solved by re-installing samtools and letting velocyto cellsort the file 👍

gioelelm commented, Sep 14, 2018

Something is not right in your bam and/or valid barcodes file. Look at this line in the report:

DEBUG - Counting for batch 134923, containing 100 cells and 138 reads

This suggests that you have around one million cell barcodes, with just a couple of reads each.
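
One way to check this from the log itself is to total the cells and reads reported by every "Counting for batch" line and compare the result with the number of cells you expect. The sketch below is only an illustration and is not part of velocyto: the function names summarize_batches and dense_layer_gib are hypothetical, it assumes each cell barcode is counted in exactly one batch, and the 4 bytes per value is an assumed element size, not velocyto's documented dtype.

```python
import re
from typing import Iterable, Tuple

# Matches the DEBUG lines quoted in this thread, e.g.
# "Counting for batch 134923, containing 100 cells and 138 reads"
BATCH_RE = re.compile(
    r"Counting for batch (\d+), containing (\d+) cells and (\d+) reads"
)


def summarize_batches(lines: Iterable[str]) -> Tuple[int, int, int]:
    """Return (batch lines seen, total cells, total reads) for a log.

    Assumption: every cell barcode appears in exactly one batch, so the
    per-batch cell counts can simply be summed.
    """
    n_batches = total_cells = total_reads = 0
    for line in lines:
        match = BATCH_RE.search(line)
        if match is None:
            continue  # not a batch-counting line
        n_batches += 1
        total_cells += int(match.group(2))
        total_reads += int(match.group(3))
    return n_batches, total_cells, total_reads


def dense_layer_gib(n_genes: int, n_cells: int, bytes_per_value: int = 4) -> float:
    """Rough size in GiB of one dense genes x cells matrix.

    bytes_per_value=4 is an assumption made for illustration only.
    """
    return n_genes * n_cells * bytes_per_value / 2**30


if __name__ == "__main__":
    # Only the two log lines quoted in this thread; a real log has many more.
    sample = [
        "2018-09-08 03:31:40,661 - DEBUG - Counting for batch 268586, "
        "containing 17 cells and 17 reads",
        "DEBUG - Counting for batch 134923, containing 100 cells and 138 reads",
    ]
    batches, cells, reads = summarize_batches(sample)
    print(f"{batches} batch lines, {cells} cells, {reads} reads")
    print(f"mean reads per cell: {reads / cells:.2f}")
    # Hypothetical scenario: 30'000 genes and about one million barcodes.
    print(f"one dense layer: {dense_layer_gib(30_000, 1_000_000):.1f} GiB")
```

To run it on a full log, pass the open file instead of the sample list, for example summarize_batches(open("velocyto.log")), where velocyto.log is a placeholder name. If the total comes out near one million cells instead of the expected 2’000, a single dense layer at the assumed 4 bytes per value would already need on the order of 110 GiB, which would be consistent with the MemoryError raised in np.concatenate.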
