memory issues when getting tsv in derivatives

I can’t find this issue reported anywhere else.

I can use BIDSLayout.get() to get the nifti files, and it does so in seconds. But if I try to get the confounds tsv files instead, the Python session gets killed because it uses over 16GB of RAM.

layout.get(scope='fMRIPrep', desc='confounds', suffix='regressors', extension='tsv', return_type='file')

Getting the files for a single subject works, but uses around 10GB of RAM (and querying subject by subject does not feel ideal).

The following will work:

layout.get(scope='derivatives', desc='preproc', extension='nii.gz')

So the problem only occurs when getting tsv files. It doesn’t matter what return_type is set to (I put return_type='file' in the example because it’s not just a problem when returning BIDSDataFile objects). There are also no more confounds tsv files than preproc nifti files.
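To put numbers on the difference between the two queries, peak allocation can be measured with the standard library's tracemalloc module. This is a minimal sketch, not something from the original report: peak_memory_mib is a hypothetical helper name, and tracemalloc only sees memory allocated through Python's own allocators, so the figure may be lower than what the operating system reports.

```python
import tracemalloc


def peak_memory_mib(func, *args, **kwargs):
    """Call func(*args, **kwargs) and return (result, peak MiB traced).

    Hypothetical helper for illustration. Only allocations made through
    Python's memory allocators are counted.
    """
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        # get_traced_memory() returns (current, peak) in bytes
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak / (1024 ** 2)
```

With a layout built as in this report, peak_memory_mib(layout.get, scope='fMRIPrep', desc='preproc', extension='nii.gz') could then be compared against the same call for the confounds query. Tracing slows execution noticeably, so it is probably best tried on a single subject first.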

Is this a known/common problem, or could there be a specific reason why I am hitting it? It feels counter-intuitive that getting the tsv files takes up so much RAM. Is there a way around it that isn’t just looping over subjects?

Information about dataset

Not sure if any of these details are relevant, but they might be helpful.

I have a BIDS directory with about 250 runs in total. There are about 27 subjects and 2 sessions, with about 5-9 runs in one session and 1-4 runs in the other. len(layout.files) is 8240. The derivatives are from fMRIPrep. There are some other directories in derivatives, but none of them are added to the layout.

Full code of what I am doing below:

import bids
bids_dir = './' # pwd is in BIDS dataset
fmriprep_dir = 'derivatives/fmriprep-1.5.1/fmriprep/'
layout = bids.BIDSLayout(bids_dir)
layout.add_derivatives(bids_dir + fmriprep_dir)
# This will work
layout.get(scope='fMRIPrep', desc='preproc', extension='nii.gz')
# This will crash the session and use 100% memory on laptop with 16GB RAM
layout.get(scope='fMRIPrep', desc='confounds', suffix='regressors', extension='tsv', return_type='file')

pybids version: 0.9.4
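As a possible stopgap for the query that crashes, the confounds files can be listed directly from the filesystem without going through the layout. This is only a sketch under assumptions: find_confounds is a hypothetical helper, and it relies on the `_desc-confounds_regressors.tsv` file name ending that the query above targets. It returns plain paths, so none of the entity metadata that BIDSLayout.get() provides is available.

```python
from pathlib import Path


def find_confounds(fmriprep_dir, subject=None):
    """Return sorted paths of confounds TSVs found under fmriprep_dir.

    Hypothetical helper: assumes files end in
    `_desc-confounds_regressors.tsv`, matching the query in this report.
    Pass subject (e.g. "01") to restrict the search to one subject.
    """
    root = Path(fmriprep_dir)
    sub_dir = f"sub-{subject}" if subject else "sub-*"
    # `**` matches the subject directory itself and any depth below it,
    # so datasets with and without session folders are both covered.
    pattern = f"{sub_dir}/**/*_desc-confounds_regressors.tsv"
    return sorted(str(p) for p in root.glob(pattern))
```

For the dataset described here, a call such as find_confounds('derivatives/fmriprep-1.5.1/fmriprep/') would stand in for the return_type='file' query. It sidesteps the memory problem rather than explaining it.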

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 12

Top GitHub Comments

1 reaction
effigies commented, Jan 8, 2020

I’m happy to report that I have no idea how indexes work, and have made no progress on improving performance.

1 reaction
tyarkoni commented, Jan 6, 2020

I also note that we don’t currently have indexes (!). That might be a sensible place to start…
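For readers unfamiliar with the term, and assuming the indexes mentioned in these comments are database indexes on the layout's backing store (an assumption, not something this thread confirms), the general effect of an index can be sketched with the standard library's sqlite3 module. The table and column names below are hypothetical and are not the library's actual schema.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the detail strings that EXPLAIN QUERY PLAN reports for sql."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable description is the last column of each row.
    return [row[-1] for row in rows]


# Hypothetical schema: one row per (file, entity, value) tag.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tags (file_path TEXT, entity TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO tags VALUES (?, ?, ?)",
    [(f"sub-{i:02d}_bold.nii.gz", "subject", f"{i:02d}") for i in range(1, 28)],
)

sql = "SELECT file_path FROM tags WHERE entity = ? AND value = ?"
params = ("subject", "01")

# Without an index, SQLite has to scan every row of the table.
plan_before = query_plan(conn, sql, params)

# With an index on the filtered columns, it can look rows up directly.
conn.execute("CREATE INDEX idx_tags_entity_value ON tags (entity, value)")
plan_after = query_plan(conn, sql, params)

print(plan_before)
print(plan_after)
```

The exact plan wording varies between SQLite versions, but the first plan should describe a scan of tags and the second a search that names idx_tags_entity_value. Whether missing indexes account for the memory use reported above is not established in this thread.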
