Assertion error during kinetics400 validation
🐛 Describe the bug
Running on main:
torchrun --nproc_per_node=8 train.py --data-path /datasets01/kinetics/070618/400/ --train-dir=val --val-dir=val --batch-size=16 --sync-bn --test-only --pretrained --cache-dataset
throws the following error:
Test: [2200/3008] eta: 0:11:24 loss: 2.6703 (2.1475) acc1: 43.7500 (57.4938) acc5: 68.7500 (77.8623) time: 0.9043 data: 0.6405 max mem: 5888
Traceback (most recent call last):
  File "train.py", line 392, in <module>
    main(args)
  File "train.py", line 273, in main
    evaluate(model, criterion, data_loader_test, device=device)
  File "train.py", line 62, in evaluate
    for video, target in metric_logger.log_every(data_loader, 100, header):
  File "/private/home/vvryniotis/vision/references/video_classification/utils.py", line 128, in log_every
    for obj in iterable:
  File "/private/home/vvryniotis/.conda/envs/datumbox/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
    data = self._next_data()
  File "/private/home/vvryniotis/.conda/envs/datumbox/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1183, in _next_data
    return self._process_data(data)
  File "/private/home/vvryniotis/.conda/envs/datumbox/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1229, in _process_data
    data.reraise()
  File "/private/home/vvryniotis/.conda/envs/datumbox/lib/python3.8/site-packages/torch/_utils.py", line 438, in reraise
    raise exception
AssertionError: Caught AssertionError in DataLoader worker process 3.
Original Traceback (most recent call last):
  File "/private/home/vvryniotis/.conda/envs/datumbox/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/private/home/vvryniotis/.conda/envs/datumbox/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/private/home/vvryniotis/.conda/envs/datumbox/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/private/home/vvryniotis/vision/torchvision/datasets/kinetics.py", line 233, in __getitem__
    video, audio, info, video_idx = self.video_clips.get_clip(idx)
  File "/private/home/vvryniotis/vision/torchvision/datasets/video_utils.py", line 362, in get_clip
    assert len(video) == self.num_frames, f"{video.shape} x {self.num_frames}"
AssertionError: torch.Size([17, 288, 352, 3]) x 16
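The assertion fires because one decoded clip has 17 frames while 16 were requested. To find out how many clips are affected before deciding on a fix, something along these lines might help. This is only a sketch: `find_length_mismatches` and the sample frame counts are hypothetical, not part of torchvision, and the real lengths would have to be collected from the dataset.

```python
def find_length_mismatches(clip_lengths, num_frames):
    """Return (index, length) pairs for clips whose frame count differs from num_frames."""
    return [(i, n) for i, n in enumerate(clip_lengths) if n != num_frames]


# Hypothetical frame counts for five decoded clips; index 3 mirrors the
# failing 17-frame clip from the traceback.
lengths = [16, 16, 16, 17, 16]
mismatches = find_length_mismatches(lengths, num_frames=16)
print(mismatches)
```

Listing the offending indices instead of crashing on the first one makes it easier to tell whether this is a single bad video or a systematic off-by-one.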
If we apply the following patch:
$ git diff
diff --git a/torchvision/datasets/video_utils.py b/torchvision/datasets/video_utils.py
index f0f19e33..2254f8c5 100644
--- a/torchvision/datasets/video_utils.py
+++ b/torchvision/datasets/video_utils.py
@@ -359,8 +359,8 @@ class VideoClips:
resampling_idx = resampling_idx - resampling_idx[0]
video = video[resampling_idx]
info["video_fps"] = self.frame_rate
- assert len(video) == self.num_frames, f"{video.shape} x {self.num_frames}"
- return video, audio, info, video_idx
+ #assert len(video) == self.num_frames, f"{video.shape} x {self.num_frames}"
+ return video[:self.num_frames], audio[:self.num_frames], info, video_idx
def __getstate__(self):
video_pts_sizes = [len(v) for v in self.video_pts]
We get an accuracy that is roughly one percentage point below the expected one:
Result:
* Clip Acc@1 56.488 Clip Acc@5 77.773
Expected:
* Clip Acc@1 57.50 Clip Acc@5 78.81
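For reference, the size of the gap can be computed directly from the two sets of numbers above. The dictionary names are just for illustration.

```python
# Clip accuracies copied from the "Result" and "Expected" lines above.
result = {"acc1": 56.488, "acc5": 77.773}
expected = {"acc1": 57.50, "acc5": 78.81}

# Gap in percentage points, rounded to avoid floating-point noise.
gaps = {k: round(expected[k] - result[k], 3) for k in expected}
print(gaps)
```

Both metrics come out about one percentage point short.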
Questions:
- Is this the right dataset for validating the r2plus1d_18 model?
  - Possibly not; we might have used another version of the dataset. See https://github.com/pytorch/vision/issues/4839#issuecomment-959652146
- As far as I can see, the assertion has always existed. How did the model get trained without triggering it?
  - This is due to a recently introduced bug from audio-video sync. See https://github.com/pytorch/vision/issues/4839#issuecomment-958844706
- Are the accuracy numbers reported in the docs correct?
  - Unclear; more investigation is needed. See https://github.com/pytorch/vision/issues/4839#issuecomment-959843305
Versions
Latest main 0817f7f
Issue Analytics
- State:
- Created 2 years ago
- Comments:7 (5 by maintainers)
Top GitHub Comments
@datumbox I’ve been able to confirm that I don’t run into an error anymore. Could you double check on your end (I just used kinetics400 val set like in the example above)?
@bjuncek I confirm that this is solved on the latest main. Thanks!