Exif metadata causes training to fail with SizeMismatchError

Instructions To Reproduce the Issue:

I’m following the Detectron2 Beginner’s Tutorial Colab notebook to train on my custom dataset. The images in my dataset carry an EXIF orientation tag. Detectron2 has had a PR (#262), which implies that it supports the EXIF orientation tag. I’m using VGG Image Annotator (VIA) to label my images. VIA loads images according to their orientation value, so the polygons drawn on them match the rotated images. (I open VIA in Safari/Chrome on OSX; I’ve read somewhere that this makes a difference.)

The only modification I’ve made to the Colab notebook’s code is to alter the get_balloon_dicts function so that it reads my classes from the region_attributes section of the via_region_data.json file, which the tutorial code leaves untouched. Detectron2 successfully prints the category distribution table, which suggests that my code works; however, training fails with the following error:

model_final_f10217.pkl: 178MB [00:06, 26.5MB/s]                           
'roi_heads.box_predictor.cls_score.weight' has shape (81, 1024) in the checkpoint but (25, 1024) in the model! Skipped.
'roi_heads.box_predictor.cls_score.bias' has shape (81,) in the checkpoint but (25,) in the model! Skipped.
'roi_heads.box_predictor.bbox_pred.weight' has shape (320, 1024) in the checkpoint but (96, 1024) in the model! Skipped.
'roi_heads.box_predictor.bbox_pred.bias' has shape (320,) in the checkpoint but (96,) in the model! Skipped.
'roi_heads.mask_head.predictor.weight' has shape (80, 256, 1, 1) in the checkpoint but (24, 256, 1, 1) in the model! Skipped.
'roi_heads.mask_head.predictor.bias' has shape (80,) in the checkpoint but (24,) in the model! Skipped.
[02/02 21:16:42 d2.engine.train_loop]: Starting training from iteration 0
/usr/local/lib/python3.6/dist-packages/PIL/TiffImagePlugin.py:603: UserWarning: Metadata Warning, tag 282 had too many entries: 2, expected 1
  % (tag, len(values))
/usr/local/lib/python3.6/dist-packages/PIL/TiffImagePlugin.py:603: UserWarning: Metadata Warning, tag 283 had too many entries: 2, expected 1
  % (tag, len(values))
/usr/local/lib/python3.6/dist-packages/PIL/TiffImagePlugin.py:603: UserWarning: Metadata Warning, tag 282 had too many entries: 2, expected 1
  % (tag, len(values))
/usr/local/lib/python3.6/dist-packages/PIL/TiffImagePlugin.py:603: UserWarning: Metadata Warning, tag 283 had too many entries: 2, expected 1
  % (tag, len(values))
ERROR [02/02 21:16:45 d2.engine.train_loop]: Exception during training:
Traceback (most recent call last):
  File "/content/detectron2_repo/detectron2/engine/train_loop.py", line 132, in train
    self.run_step()
  File "/content/detectron2_repo/detectron2/engine/train_loop.py", line 208, in run_step
    data = next(self._data_loader_iter)
  File "/content/detectron2_repo/detectron2/data/common.py", line 109, in __iter__
    for d in self.dataset:
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 838, in _next_data
    return self._process_data(data)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 881, in _process_data
    data.reraise()
  File "/usr/local/lib/python3.6/dist-packages/torch/_utils.py", line 394, in reraise
    raise self.exc_type(msg)
detectron2.data.detection_utils.SizeMismatchError: Caught SizeMismatchError in DataLoader worker process 1.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/content/detectron2_repo/detectron2/data/common.py", line 39, in __getitem__
    data = self._map_func(self._dataset[cur_idx])
  File "/content/detectron2_repo/detectron2/utils/serialize.py", line 23, in __call__
    return self._obj(*args, **kwargs)
  File "/content/detectron2_repo/detectron2/data/dataset_mapper.py", line 76, in __call__
    utils.check_image_size(dataset_dict, image)
  File "/content/detectron2_repo/detectron2/data/detection_utils.py", line 87, in check_image_size
    expected_wh,
detectron2.data.detection_utils.SizeMismatchError: Mismatched (W,H) for image dataset/train/img19.jpg, got (3024, 4032), expect (4032, 3024)

/usr/local/lib/python3.6/dist-packages/PIL/TiffImagePlugin.py:603: UserWarning: Metadata Warning, tag 34853 had too many entries: 9, expected 1
  % (tag, len(values))
[02/02 21:16:46 d2.engine.hooks]: Total training time: 0:00:00 (0:00:00 on hooks)

The SizeMismatchError reports that (4032, 3024) was expected but (3024, 4032) was read for img19.jpg, which makes me think it has something to do with the orientation tag, since this file has an orientation of 90 degrees. The interesting part is that some images with an orientation tag get past this error (or at least I think so)!

Note that in my get_balloon_dicts function I read the height and width of each image with the following code:
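
A width/height swap like the one in the error message is what a 90-degree orientation tag would produce: EXIF orientation values 5 through 8 describe a 90-degree rotation or transpose, so a reader that applies the tag reports the two dimensions swapped relative to a reader that ignores it. Here is a minimal sketch of that relationship. The helper displayed_size is hypothetical, and the orientation value 6 is chosen only for illustration, since the actual tag value of img19.jpg is not stated here.

```python
# EXIF orientation values 5-8 involve a 90-degree rotation or transpose,
# so a reader that applies the tag sees width and height swapped.
SWAPPING_ORIENTATIONS = {5, 6, 7, 8}


def displayed_size(stored_width, stored_height, orientation=1):
    """Return (width, height) as seen by a reader that honors EXIF orientation.

    Hypothetical helper for illustration; not part of Detectron2, OpenCV or Pillow.
    """
    if orientation in SWAPPING_ORIENTATIONS:
        return stored_height, stored_width
    return stored_width, stored_height


# Illustrative values only: pixel data stored as 4032x3024.
stored = (4032, 3024)
print(displayed_size(*stored, orientation=6))  # rotated by 90 degrees: dimensions swap
print(displayed_size(*stored, orientation=3))  # rotated by 180 degrees: dimensions unchanged
print(displayed_size(*stored, orientation=1))  # no rotation: dimensions unchanged
```

If two readers disagree on whether to apply the tag, only images with one of the swapping orientations would show a mismatch, which might explain why some tagged images pass.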

height, width = cv2.imread(filename).shape[:2]
record["height"] = height
record["width"] = width

I’ve also tried the following without any luck:

height, width = cv2.imread(filename, cv2.IMREAD_UNCHANGED).shape[:2]  # reason explained below
height, width = skimage.io.imread(filename).shape[:2]
height, width = skimage.io.imread(filename, plugin='pil').shape[:2]

Also note that in the section that verifies data loading:

dataset_dicts = get_balloon_dicts("balloon/train")
for d in random.sample(dataset_dicts, 3):
    img = cv2.imread(d["file_name"])
    visualizer = Visualizer(img[:, :, ::-1], metadata=balloon_metadata, scale=0.5)
    vis = visualizer.draw_dataset_dict(d)
    cv2_imshow(vis.get_image()[:, :, ::-1])

The images are shown without taking their orientation value into account, so their annotations are drawn in the wrong places. I fixed that by changing img = cv2.imread(d["file_name"]) to img = cv2.imread(d["file_name"], cv2.IMREAD_UNCHANGED). I therefore expected that applying the same change to the imread call in get_balloon_dicts would help, but it had no effect and the same error showed up.
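
Because the error is raised when the width and height stored in a dataset record differ from the size of the image as decoded during training, one way to find every affected file before training starts is to compare the two up front. The sketch below is only an illustration under stated assumptions and is not part of Detectron2. The names find_size_mismatches and read_size are hypothetical, and read_size must be supplied by the caller as a function that returns (width, height) using the same image reader that training uses.

```python
def find_size_mismatches(dataset_dicts, read_size):
    """Return (file_name, recorded_wh, actual_wh) for every inconsistent record.

    read_size is a caller-supplied (hypothetical) function that maps a file
    name to (width, height) as reported by the training-time image reader.
    """
    mismatches = []
    for record in dataset_dicts:
        recorded = (record["width"], record["height"])
        actual = tuple(read_size(record["file_name"]))
        if recorded != actual:
            mismatches.append((record["file_name"], recorded, actual))
    return mismatches


# Stand-in data for illustration only: a dict lookup plays the role of the reader.
fake_sizes = {"a.jpg": (3024, 4032), "b.jpg": (640, 480)}
records = [
    {"file_name": "a.jpg", "width": 4032, "height": 3024},
    {"file_name": "b.jpg", "width": 640, "height": 480},
]

for name, recorded, actual in find_size_mismatches(records, fake_sizes.__getitem__):
    print(f"{name}: recorded {recorded}, reader reports {actual}")
```

Running such a check over the output of get_balloon_dicts would list all files whose recorded size disagrees with the reader, rather than stopping at the first one as training does.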

Environment:

I’m running the code in a Google Colab notebook.

Top GitHub Comments

ppwwyyxx commented, Apr 20, 2020

We’re evaluating whether we should use OpenCV. The main issue is that OpenCV has led to a few system-level issues in the past because of the way it uses threads and exposes symbols.

You can change it to use cv2 by following https://detectron2.readthedocs.io/tutorials/data_loading.html#write-a-custom-dataloader

ppwwyyxx commented, Feb 2, 2020

We use Pillow to read images during training: https://github.com/facebookresearch/detectron2/blob/94d0f138e49c3d7e202025941b2d1890831f07fa/detectron2/data/detection_utils.py#L49-L55

If it does not read the correct size, this sounds like a bug you could report to Pillow.