
memory issues when getting tsv in derivatives


Can’t see this issue anywhere else.

I can use BIDSLayout.get() to get the NIfTI files, and it does so in seconds. But if I try to get the confound tsv files instead, the Python session gets killed because it uses over 16GB of RAM.

layout.get(scope='fMRIPrep', desc='confounds', suffix='regressors', extension='tsv', return_type='file')

Getting a single subject works, but uses around 10GB of RAM (and querying per subject does not feel ideal).

The following will work:

layout.get(scope='derivatives', desc='preproc', extension='nii.gz')

So the problem only occurs when getting tsv files. It doesn't matter if I set return_type to something else (I put return_type='file' in the example because it's not just a problem when returning BIDSDataFile objects). There are no more confounds tsv files than preproc NIfTI files either.
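To put numbers on the difference between the two queries, a small helper like the one below could be wrapped around each call. This is only a minimal sketch using the standard library's tracemalloc; `peak_memory_mb` is a hypothetical helper name, and tracemalloc only sees allocations made through Python's allocator, so the figure may undercount memory held by C extensions.

```python
import tracemalloc


def peak_memory_mb(func, *args, **kwargs):
    """Call func and return (result, peak traced memory in MB).

    Hypothetical helper for illustration; not part of pybids.
    """
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        # get_traced_memory() returns (current, peak) in bytes
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak / 1e6
```

For example, `peak_memory_mb(layout.get, scope='fMRIPrep', desc='preproc', extension='nii.gz')` could be compared against the same call with the confounds arguments on a single subject.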

Is this a known/common problem, or could there be a specific reason why I am hitting it? It feels counter-intuitive that getting the tsv files takes up so much RAM. Is there a way around it that isn't just looping over subjects?

Information about dataset

Not sure if any of these details are relevant, but they might be helpful.

I have a BIDS directory with about 250 runs in total. There are about 27 subjects and 2 sessions, with about 5-9 runs in one session and 1-4 runs in the other. len(layout.files) is 8240. The derivatives are in fMRIPrep. There are some other directories in the derivatives folder, but none of them are added to the layout.

Full code of what I am doing below:

import bids
bids_dir = './' # pwd is in BIDS dataset
fmriprep_dir = 'derivatives/fmriprep-1.5.1/fmriprep/'
layout = bids.BIDSLayout(bids_dir)
layout.add_derivatives(bids_dir + fmriprep_dir)
# This will work
layout.get(scope='fMRIPrep', desc='preproc', extension='nii.gz')
# This will crash the session and use 100% memory on laptop with 16GB RAM
layout.get(scope='fMRIPrep', desc='confounds', suffix='regressors', extension='tsv', return_type='file')

Bids version: 0.9.4
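One possible way around the query, sketched under the assumption that the confound files follow the `*_desc-confounds_regressors.tsv` naming implied by the entities in the call above, is to skip the layout and match on filenames directly. `find_confounds` is a hypothetical helper, not a pybids function; it never opens the files, so it returns paths only and gives up the entity-based filtering that `layout.get()` provides.

```python
from pathlib import Path


def find_confounds(fmriprep_dir):
    """Return sorted paths of confound TSVs under fmriprep_dir.

    Hypothetical helper: matches on filename only and does not read
    any file contents or build an index.
    """
    root = Path(fmriprep_dir)
    # rglob descends recursively through all subdirectories of root
    pattern = "*_desc-confounds_regressors.tsv"
    return sorted(str(p) for p in root.rglob(pattern))
```

With the paths from the code above this would be called as `find_confounds(bids_dir + fmriprep_dir)`.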

Issue Analytics

Created 4 years ago
Comments: 12

Top GitHub Comments

1 reaction

effigies commented, Jan 8, 2020

I’m happy to report that I have no idea how indexes work, and have made no progress on improving performance.

1 reaction

tyarkoni commented, Jan 6, 2020

I also note that we don’t currently have indexes (!). That might be a sensible place to start…


