Target suffix loading: use of `contains` may pick multiple targets

Issue description

Current behavior

The current implementation of target suffix loading uses the contains() string method to select target files, as shown in the lines below:

https://github.com/ivadomed/ivadomed/blob/76b36a0a0f7141feb2d5b00b33e4c3a06865fc2c/ivadomed/loader/bids_dataframe.py#L125-L128

This means that

if your derivatives contain multiple filenames that overlap (e.g. _lesion-manual.nii.gz and _lesion-manual2.nii.gz) and
if you use a "target_suffix" containing the overlap (e.g. _lesion-manual),

then the current loading process picks all of the filenames with the given overlap as potential targets.

This is problematic because the user might (reasonably) think that only _lesion-manual.nii.gz will be used as the ground-truth when the "target_suffix" is specified as _lesion-manual, whereas in reality both _lesion-manual.nii.gz and _lesion-manual2.nii.gz are used.
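
To make the overlap concrete, here is a minimal sketch using plain Python substring matching. It is not ivadomed's actual pandas code, and the filenames are hypothetical; it only illustrates how a contains-style test behaves.

```python
# Hypothetical derivative filenames for one subject (not from a real dataset).
filenames = [
    "sub-01_T1w_lesion-manual.nii.gz",
    "sub-01_T1w_lesion-manual2.nii.gz",
    "sub-01_T1w_seg-manual.nii.gz",
]

target_suffix = "_lesion-manual"

# Substring test, analogous to a contains()-style filter:
# any filename that includes the suffix anywhere is selected.
picked = [name for name in filenames if target_suffix in name]
print(picked)
```

Both lesion files are selected, even though only the first one matches the suffix exactly.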

I haven’t dug deeper into the codebase yet to see what happens when multiple targets are picked during loading, but based on my experiments with ivadomed --test I can confirm that the GT that ends up being used changes from run to run.

Expected behavior

I would expect the "target_suffix" to exactly match the target filename. Two options we have are:

Changing line 128 (linked above) to something like

& df_next['filename'].str.split(os.extsep).apply(lambda x: x[0]).str.endswith('|'.join(self.target_suffix)))]

which enforces an exact match of target suffix.
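
The idea behind this option can be sketched in plain Python, without pandas. The helper name matches_suffix and the filenames are hypothetical. One caveat worth checking before adopting the one-liner above: endswith-style matching treats its pattern as a literal string, so joining several suffixes with '|' would presumably only behave as intended when there is a single suffix. The sketch passes a tuple of suffixes instead.

```python
import os

def matches_suffix(filename, target_suffixes):
    """Return True if the filename, with its extensions removed,
    ends with one of the given suffixes (hypothetical helper)."""
    # Keep everything before the first extension separator,
    # e.g. "sub-01_lesion-manual.nii.gz" -> "sub-01_lesion-manual".
    stem = filename.split(os.extsep)[0]
    # str.endswith accepts a tuple and checks each entry literally.
    return stem.endswith(tuple(target_suffixes))

# Hypothetical derivative filenames.
filenames = [
    "sub-01_T1w_lesion-manual.nii.gz",
    "sub-01_T1w_lesion-manual2.nii.gz",
]

exact = [name for name in filenames if matches_suffix(name, ["_lesion-manual"])]
print(exact)
```

Splitting on the first separator assumes the filename contains no dots other than those in its extension; a filename with a dot elsewhere would need a different way of stripping the extension.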

In case the use of contains() is justified in some scenarios and it’s implemented that way for a specific reason, the user can instead give a full "target_suffix" including the file extension, such as _lesion-manual.nii.gz. I have tested this and can confirm it solves the problem.
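
A minimal sketch of why the workaround helps, again with plain substring matching and hypothetical filenames rather than ivadomed's own code: once the extension is part of the suffix, the character following _lesion-manual must be a dot, which rules out _lesion-manual2.

```python
# Hypothetical derivative filenames.
filenames = [
    "sub-01_T1w_lesion-manual.nii.gz",
    "sub-01_T1w_lesion-manual2.nii.gz",
]

# Full suffix, including the file extension.
target_suffix = "_lesion-manual.nii.gz"

# Same contains()-style substring test as before.
picked = [name for name in filenames if target_suffix in name]
print(picked)
```

This only disambiguates names that differ after the suffix. A substring test would still select a file whose name contains the full suffix preceded by something else.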

The latter option is notably a lot easier to implement as it doesn’t require a change in the codebase. However, I think this is an issue many people might face without knowing it in the future.

Steps to reproduce

1. Download the basel-mp2rage dataset as shown here.
2. Run preprocessing on the data as shown here.
3. Insert print(bids_df.get_deriv_fnames()) after line 370 in ivadomed/main.py, which initializes the BIDS data frame.
4. Run the following config file with ivadomed --train.
5. Take a look at the output of the print statement from step 3.

However, this is specific to one dataset, and it is understandable that not everybody will be willing to run preprocessing etc. on it. I would therefore like to point out that the issue is reproducible with any dataset that has multiple annotations per subject whose target suffixes overlap.
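
To check whether a given dataset is exposed to this, one could list its annotation suffixes and look for any suffix that is a substring of another. The sketch below is a hypothetical helper, not part of ivadomed, and assumes the suffixes have already been collected by hand.

```python
from itertools import permutations

def overlapping_suffixes(suffixes):
    """Return (short, long) pairs where `short` is a substring of a
    different suffix `long` (hypothetical helper)."""
    return [(a, b) for a, b in permutations(suffixes, 2) if a != b and a in b]

# Suffixes as they might appear in a derivatives folder (assumed values).
suffixes = ["_lesion-manual", "_lesion-manual2", "_seg-manual"]
overlaps = overlapping_suffixes(suffixes)
print(overlaps)
```

Any pair reported here means that using the shorter suffix as "target_suffix" with a contains-style match would also select files carrying the longer one.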

Top GitHub Comments

uzaymacar commented, Mar 10, 2022

Feel free to take it on @mariehbourget! I only created a branch so it’s OK.

mariehbourget commented, Mar 10, 2022

While debugging another issue (#1096) from an external user, I came across this exact problem. In this case, we have the following target_suffix values (coming from the segmentation in ADS): _seg-axon, _seg-axonmyelin and _seg-myelin. With _seg-axon in the config file, both _seg-axon and _seg-axonmyelin are picked up during indexing, and one or the other is used for training, which is not good at all.

I’ll inform the user for now and tag this issue as high priority.
