Wrong Segmentation Slices Position
I’m writing a script to attach AI-generated lesion segmentations to existing DICOM files (it’s the first time I’ve dived this deep into DICOM, so I might have missed something).
I found your repository and used hd.seg.Segmentation, but unfortunately the order of the segmentation slices was incorrect.
First I tried to use just a dummy segmentation (simple 3D box over an MRI), and after encountering the issue I decided to try another approach:
I took an existing DICOM file (of a CT scan) that contains both a scan and a segmentation, converted the segmentation to np.array, and used it as the mask input, while the scan folder is the source_images.
From the original segmentation file (9 slices from -224.5 mm to -192.5 mm, 4 mm thickness):
(0020,0032) DS [-189.136\-320.136\-224.5] # 24, 3 ImagePositionPatient
(0020,0032) DS [-189.136\-320.136\-220.5] # 24, 3 ImagePositionPatient
:
(0020,0032) DS [-189.136\-320.136\-192.5] # 24, 3 ImagePositionPatient
Versions: Python 3.8.10 numpy 1.21.2 pydicom 2.2.1 highdicom 0.10.0
Code snippets:
:
# Collecting the scan images
image_datasets = [dcmread(str(f)) for f in image_files]
# Creating a segmentation mask from the existing segmentation. shape = (108, 512, 512)
mask = np.zeros(
    shape=(
        len(image_datasets),
        image_datasets[0].Rows,
        image_datasets[0].Columns
    ),
    dtype=bool
)
mask[86:95, :, :] = np.array(hd.seg.segread(
    '/home/ben/.../original_seg.dcm'
).pixel_array, dtype=bool)
# Printing to check if the order is correct
sl_num = -568.5
for slice in mask:
    print(sl_num, np.unique(slice))
    sl_num += 4
Output:
-568.5 [False]
-564.5 [False]
-560.5 [False]
:
-236.5 [False]
-232.5 [False]
-228.5 [False]
-224.5 [False True]
-220.5 [False True]
-216.5 [False True]
-212.5 [False True]
-208.5 [False True]
-204.5 [False True]
-200.5 [False True]
-196.5 [False True]
-192.5 [False True]
-188.5 [False]
-184.5 [False]
-180.5 [False]
:
-148.5 [False]
-144.5 [False]
-140.5 [False]
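The positions printed above rest on the assumption that mask index i corresponds to z = -568.5 + 4·i mm, which only holds if image_datasets is ordered by ascending slice position. A minimal sketch of that arithmetic (the helper name z_for_index is hypothetical; the first position and the spacing are the values quoted in this post):

```python
FIRST_Z_MM = -568.5   # z of the first slice, as assumed in the loop above
SPACING_MM = 4.0      # slice spacing quoted in this post


def z_for_index(index: int) -> float:
    """Return the z position (mm) assumed for a given mask index."""
    return FIRST_Z_MM + SPACING_MM * index


# The segmentation was written into mask[86:95], i.e. indices 86 to 94.
seg_positions = [z_for_index(i) for i in range(86, 95)]
print(seg_positions[0], seg_positions[-1], len(seg_positions))
```

If the source images are loaded in any other order, the same indices point at different physical slices, even though this printout still looks correct.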
Creating the segmentation:
# Get meta-data information from the existing series
series_instance_uid = image_datasets[0].SeriesInstanceUID
series_number = image_datasets[0].SeriesNumber
sop_instance_uid = image_datasets[0].SOPInstanceUID
instance_number = image_datasets[0].InstanceNumber
MANUFACTURER = 'Ben' # image_datasets[0].Manufacturer
MANUFACTURER_MODEL_NAME = "Prost" # image_datasets[0].ManufacturersModelName
SOFTWARE_VERSIONS = 'v1.0'
DEVICE_SERIAL_NUMBER = '0' # image_datasets[0].DeviceSerialNumber
# Describe the algorithm that created the segmentation family:
# http://dicom.nema.org/medical/dicom/current/output/chtml/part16/sect_CID_7162.html
algorithm_identification = hd.AlgorithmIdentificationSequence(
    name=MANUFACTURER_MODEL_NAME,
    version=SOFTWARE_VERSIONS,
    family=codes.cid7162.ArtificialIntelligence
)
# Describe the segment:
# https://highdicom.readthedocs.io/en/latest/package.html#highdicom.seg.SegmentDescription
# segmented_property_category:
# http://dicom.nema.org/medical/dicom/current/output/chtml/part16/sect_CID_7150.html
# segmented_property_type:
# http://dicom.nema.org/medical/dicom/current/output/chtml/part16/sect_CID_7160.html
description_segment = hd.seg.SegmentDescription(
    segment_number=1,
    segment_label='Lesions',
    segmented_property_category=codes.cid7150.AnatomicalStructure,
    segmented_property_type=codes.cid7160.Prostate,
    algorithm_type=hd.seg.SegmentAlgorithmTypeValues.AUTOMATIC,
    algorithm_identification=algorithm_identification,
    tracking_uid=hd.UID(),
    tracking_id='Lesion Segmentation of a Prostate MR Image',
    # anatomic_regions=Code("41216001", "SCT", "Prostate"), # BA - error, seems like the others are enough
)
# Create the Segmentation instance
# https://highdicom.readthedocs.io/en/latest/package.html#highdicom.seg.Segmentation
seg_dataset = hd.seg.Segmentation(
    source_images=image_datasets,
    pixel_array=mask,
    segmentation_type=hd.seg.SegmentationTypeValues.BINARY, # FRACTIONAL,
    segment_descriptions=[description_segment],
    series_instance_uid=series_instance_uid, # hd.UID(),
    series_number=series_number,
    sop_instance_uid=sop_instance_uid, # hd.UID(),
    instance_number=instance_number,
    manufacturer=MANUFACTURER,
    manufacturer_model_name=MANUFACTURER_MODEL_NAME,
    software_versions=SOFTWARE_VERSIONS,
    device_serial_number=DEVICE_SERIAL_NUMBER,
    omit_empty_frames=True,
    # content_creator_name=manufacturer,
)
# Compare generated and original segmentations
print('seg:')
for i, slice in enumerate(seg_dataset.PerFrameFunctionalGroupsSequence):
    print(float(str(slice['PlanePositionSequence']._value[0])[-7:-1]), np.unique(seg_dataset.pixel_array[i]))
ref_path = '/home/ben/.../original_seg.dcm'
ref_seg = hd.seg.segread(ref_path)
print('\noriginal:')
for i, slice in enumerate(ref_seg.PerFrameFunctionalGroupsSequence):
    print(float(str(slice['PlanePositionSequence']._value[0])[-7:-1]), np.unique(ref_seg.pixel_array[i]))
And the outputs:
seg:
-564.5 [0 1]
-552.5 [0 1]
-520.5 [0 1]
-488.5 [0 1]
-404.5 [0 1]
-328.5 [0 1]
-324.5 [0 1]
-244.5 [0 1]
-200.5 [0 1]
original:
-224.5 [0 1]
-220.5 [0 1]
-216.5 [0 1]
-212.5 [0 1]
-208.5 [0 1]
-204.5 [0 1]
-200.5 [0 1]
-196.5 [0 1]
-192.5 [0 1]
As seen, the newly generated segmentation got the wrong PlanePositionSequence. The same happens when I use omit_empty_frames=False.
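Slicing the string representation of PlanePositionSequence is fragile, since it depends on how many characters the z value takes up. A sketch of a more direct way to read the per-frame z position, assuming each per-frame item carries a PlanePositionSequence whose single item holds ImagePositionPatient (the helper name frame_z_positions is hypothetical):

```python
from typing import Any, List


def frame_z_positions(seg: Any) -> List[float]:
    """Return the z component of ImagePositionPatient for every frame."""
    positions = []
    for frame in seg.PerFrameFunctionalGroupsSequence:
        # ImagePositionPatient is (x, y, z) in patient coordinates
        ipp = frame.PlanePositionSequence[0].ImagePositionPatient
        positions.append(float(ipp[2]))
    return positions
```

With this helper the comparison becomes frame_z_positions(seg_dataset) against frame_z_positions(ref_seg). The z component alone is only meaningful for axial acquisitions like the one described here.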
I’m still reading through the library source code, found another bug and told @hackermd about it but it was not related.
I had a thought about adding a "reorder by original position" function, to make sure the segmentation is mapped correctly in case the original scan files are not in order.
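One way such a reordering could look is sketched below. This is not a highdicom function: the names slice_location and order_by_position are hypothetical, and the sketch assumes every dataset has ImagePositionPatient and ImageOrientationPatient and that all slices share the same orientation.

```python
from typing import Any, List, Sequence


def slice_location(ds: Any) -> float:
    """Project ImagePositionPatient onto the slice normal."""
    row = [float(v) for v in ds.ImageOrientationPatient[:3]]
    col = [float(v) for v in ds.ImageOrientationPatient[3:]]
    # Slice normal = row direction x column direction (cross product)
    normal = (
        row[1] * col[2] - row[2] * col[1],
        row[2] * col[0] - row[0] * col[2],
        row[0] * col[1] - row[1] * col[0],
    )
    pos = [float(v) for v in ds.ImagePositionPatient]
    return sum(n * p for n, p in zip(normal, pos))


def order_by_position(datasets: Sequence[Any]) -> List[Any]:
    """Return the datasets sorted by ascending position along the normal."""
    return sorted(datasets, key=slice_location)
```

Calling image_datasets = order_by_position(image_datasets) before building the mask would make the list order independent of how the files were collected. The mask frames still have to be stacked in that same order.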
Any other thoughts or comments?
Many thanks to all of you anyway, it’s a really impressive library and I’m glad you’ve released it exactly when I started working on this project 😃
So I managed to solve it. The fix is not ideal or foolproof, but it handles the source of the issue.
To collect the source files I used glob, which returns files in the arbitrary-looking order in which they appear in the filesystem (source). It was solved using sorted and reverse:
image_files = sorted(series_dir.glob("*"), reverse=True)
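A self-contained sketch of the same fix, assuming the filenames encode the slice order (as they apparently do for this series). The helper name list_series_files is hypothetical:

```python
from pathlib import Path
from typing import List


def list_series_files(series_dir: Path, reverse: bool = True) -> List[Path]:
    """Return the files in series_dir in a deterministic, name-based order."""
    files = [p for p in series_dir.glob("*") if p.is_file()]
    # Paths sort lexicographically, so ascending order puts "10.dcm" before "2.dcm"
    return sorted(files, reverse=reverse)
```

Because the order comes from the names only, it breaks as soon as the filenames are not zero-padded or do not follow the slice position. Sorting the loaded datasets by their position attributes does not have that limitation.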
When time allows I’ll try to implement this using the order_datasets function, but for now sorting the files is good enough. If I find other bugs I’ll create another issue or pull request.
You can close the issue, or first implement the ordering verification function for the next version.
I thank you all again, I really appreciate the time and effort!
Hi @benarnon8, thanks for using the library and for taking the time to write a nice bug report. I am open minded to there being a bug in the library but I want to eliminate a few other possibilities first. Unfortunately the order of frames within the segmentation images is much more complicated than you might initially think. This is because the standard also supports things like slide microscopy images, which are drawn from tiles of a 2D space.
I suspect your issue is due to a mismatch in the order of the planes between the input CT series and whatever the pydicom .pixel_array returns. Unfortunately there’s no reason at all to assume that these two would match. Instead of @hackermd's approach, can you please try the following two steps (at the same time), and let me know whether this fixes your issue?
Do not use the .pixel_array property to access the pixel data of a segmentation. Please use the highdicom method get_pixels_by_source_instance to have highdicom figure this out for you and ensure that the two lists match. This should return the full array of the mask, including all the empty frames (so you can remove the mask[86:95, :, :] line). I know that the documentation is lacking on this point currently and will hopefully improve it soon!
If you follow these two steps, the order of the frames in the mask and the source datasets should match and I think everything should work correctly. We’ll find out!
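The get_pixels_by_source_instance call suggested above could be sketched as follows. This is a minimal sketch and not necessarily the exact snippet from the thread: the wrapper name mask_in_source_order is hypothetical, and seg is assumed to be the object returned by hd.seg.segread for the original segmentation.

```python
from typing import Any, Sequence


def mask_in_source_order(seg: Any, image_datasets: Sequence[Any]) -> Any:
    """Return segmentation pixels with one frame per source image, in order.

    seg is expected to be a highdicom Segmentation, e.g. from hd.seg.segread().
    """
    source_uids = [ds.SOPInstanceUID for ds in image_datasets]
    return seg.get_pixels_by_source_instance(
        source_sop_instance_uids=source_uids,
        # Source images without a stored frame are returned as empty frames
        assert_missing_frames_are_empty=True,
    )
```

As far as I understand the method, the returned array has a trailing segments axis by default, so for a single segment something like mask = mask_in_source_order(seg, image_datasets)[..., 0] would give the (frames, rows, columns) mask used earlier in this thread.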
I realise that this may not be possible, but would you be able to provide me with the set of files you are using, assuming that they are anonymised? That would help debugging immensely. Also, do you know what software created the segmentation you are reading?