Update ADS datasets following BEP release

The Microscopy BEP was integrated in BIDS and can be previewed here.

It will be part of the BIDS 1.7.0 release coming soon (EDIT: released on Feb 15th), without changes ~except for a JSON file for photo files that we don’t use (AFAIK). This part is not confirmed yet but can be previewed for reference here.~ EDIT: the JSON photo file was not part of the 1.7.0 release, so it can be ignored for now.

Here are the updates needed for our ADS datasets to be fully compliant. They must be done before we can complete the BEP implementation in ivadomed (ref: https://github.com/ivadomed/ivadomed/pull/1025). Note: this will not affect the behavior of segmentation with the new ADS v4, but it is necessary for training new models in ivadomed.

SEM dataset

Up to date with version 1.7.0 and available here

BF dataset

Up to date with version 1.7.0 on duke
Needs to be uploaded to git-annex (ref: https://github.com/neuropoly/data-management/issues/152)

TEM dataset

The corrected “fuzzy mask” needs to be updated on git-annex (ref: https://github.com/neuropoly/data-management/issues/135#issuecomment-1020293016)
Update dataset to BIDS 1.7.0:
- In dataset_description.json, update "BIDSVersion" to "1.7.0".
- In derivatives/labels/dataset_description.json, update "BIDSVersion" to "1.7.0" and add the field "GeneratedBy": [{"Name": "Axon and myelin manual segmentation labels"}] (this is also related to pybids in https://github.com/ivadomed/ivadomed/pull/994).
- All microscopy folders must be renamed micr.
- In the JSON sidecar in each “sub-XX/micr” folder: remove the "FieldOfView" field and value (no longer part of the spec), add the field "PixelSizeUnits": "um" (now required), and replace "Environment": "exvivo" with "SampleEnvironment": "ex vivo" (both key and value changed).
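The JSON changes listed above could be scripted along these lines. This is only a minimal sketch, not a tested migration tool: the function names `update_dataset_description` and `update_sidecar` are hypothetical, and passing any "Environment" value other than "exvivo" through unchanged is an assumption, since the list above only covers the "exvivo" case. Renaming the microscopy folders to micr is not handled here.

```python
import json
from pathlib import Path


def update_dataset_description(path, generated_by=None):
    """Set "BIDSVersion" to "1.7.0" and optionally add "GeneratedBy"."""
    path = Path(path)
    desc = json.loads(path.read_text())
    desc["BIDSVersion"] = "1.7.0"
    if generated_by is not None:
        desc["GeneratedBy"] = generated_by
    path.write_text(json.dumps(desc, indent=4) + "\n")
    return desc


def update_sidecar(meta):
    """Return a copy of a micr sidecar dict with the 1.7.0 field changes."""
    meta = dict(meta)
    # "FieldOfView" is no longer part of the spec.
    meta.pop("FieldOfView", None)
    # "PixelSizeUnits" is now required.
    meta["PixelSizeUnits"] = "um"
    # Both key and value changed: "Environment": "exvivo" becomes
    # "SampleEnvironment": "ex vivo".
    env = meta.pop("Environment", None)
    if env is not None:
        # Assumption: values other than "exvivo" are kept as-is.
        meta["SampleEnvironment"] = "ex vivo" if env == "exvivo" else env
    return meta


def update_sidecars(dataset_root):
    """Rewrite every JSON sidecar found under sub-*/micr in place."""
    for sidecar in sorted(Path(dataset_root).glob("sub-*/micr/*.json")):
        meta = update_sidecar(json.loads(sidecar.read_text()))
        sidecar.write_text(json.dumps(meta, indent=4) + "\n")
```

For the derivatives, the call would be something like `update_dataset_description("derivatives/labels/dataset_description.json", [{"Name": "Axon and myelin manual segmentation labels"}])`, run from the dataset root.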

Wakehealth dataset (git-annex)

Update dataset to BIDS 1.7.0. I’m less familiar with this dataset, but it was curated at the same time as the TEM dataset, so the changes should be about the same. Refer to the spec if other metadata fields are present.

@hermancollin, let me know if you are still available to help with this issue. Thanks!

Top GitHub Comments

hermancollin commented, Feb 10, 2022

Ok. I have uploaded the training dataset on git-annex (datasets/data_axondeepseg_wakehealth_training). For now, it is consistent with the bf dataset (I used acq to label ROIs and specified the source chunk in the sourcedata folder). I will roll it back to the version I used for training in ivadomed (with desc entities) but I will keep the current version in a branch in case we decide to make both bright-field datasets more consistent.

Also, since I finally figured out how to create a git-annex repo, I could upload both data_axondeepseg_bf_source and its training counterpart if Nick doesn’t answer.

Also, yes it worked in ivadomed but the dataset was a derived dataset.

mariehbourget commented, Feb 10, 2022

Previously, I used the desc entity to name

Oh yeah, I remember discussing this last summer. Was desc working for the training in ivadomed?

If yes, it is because we force-index microscopy files (skipping the validator). This is a bit of a gray area: ideally we would use desc, but desc is only acceptable for derivatives (as per BIDS). The issue here is that we use the ROI data as “raw data” and not “derivatives” in the ivadomed pipeline (so we are kind of in a “corner” case).

We have an issue about the naming of derivatives datasets in ivadomed, but have not settled on a solution yet: https://github.com/ivadomed/ivadomed/issues/860

My recommendation for now would be to:

Keep the same name as the one used for your training (so we can reproduce it), IIUC with desc.
Document this in the README of the dataset.
Open a separate issue about the naming and the validator failure, and link it to the ivadomed issue so we can look further into this at a later time, i.e. if we integrate microscopy into pybids and no longer force-index the files, it will fail.

I feel like the problem is that the acq entity comes after chunk.

Yes, the order is important, if you put acq before chunk, it would work. See order here.
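To illustrate the ordering point, here is a minimal sketch of a helper that always emits entities in a fixed order. The order in `ENTITY_ORDER` is an assumption based on the BIDS 1.7.0 microscopy filename template and should be checked against the spec. The function name `build_micr_filename` and the labels in the example are hypothetical.

```python
# Assumed entity order for microscopy files (check against the BIDS 1.7.0 spec).
ENTITY_ORDER = ["sub", "ses", "sample", "acq", "stain", "run", "chunk"]


def build_micr_filename(entities, suffix, extension):
    """Build a filename with entities sorted into the assumed BIDS order."""
    unknown = set(entities) - set(ENTITY_ORDER)
    if unknown:
        raise ValueError(f"Unsupported entities: {sorted(unknown)}")
    parts = [f"{key}-{entities[key]}" for key in ENTITY_ORDER if key in entities]
    return "_".join(parts) + f"_{suffix}{extension}"


# acq ends up before chunk regardless of the order the entities are given in.
name = build_micr_filename(
    {"sub": "01", "chunk": "4", "acq": "roi1", "sample": "A"}, "BF", ".png"
)
print(name)  # sub-01_sample-A_acq-roi1_chunk-4_BF.png
```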

Also, in data_axondeepseg_bf_training, you used chunks to identify the ROIs but I can’t use it for this purpose because we would lose the information about the chunk it came from (chunk-4 in this example). Maybe I could simply specify what image it came from and use the chunk entity to enumerate the ROIs?

If we decide not to use desc, I feel it would be better to keep the “link” to the original chunk and use the acq label for the ROI. But we may want to revisit that as well to make bf and wakehealth more consistent.