Cannot obtain the accuracy stated in the docs for inception_v3 pretrained on ImageNet
Hi.
I'm trying to evaluate the `inception_v3` pretrained model from the hub on the ImageNet (ILSVRC 2012) validation set. I use the following evaluation code:
```python
import torch
from torchvision import transforms
from torchvision.datasets import ImageFolder
from tqdm import tqdm

def compute_accuracy(self):
    num_correct = 0
    num_images = 0
    # Standard ImageNet normalization statistics
    _IMAGE_MEAN_VALUE = [0.485, 0.456, 0.406]
    _IMAGE_STD_VALUE = [0.229, 0.224, 0.225]
    imgnet_loader = torch.utils.data.DataLoader(
        ImageFolder('/home/amin/dataset/ILSVRC/val',
                    transforms.Compose([
                        transforms.Resize(299),
                        transforms.CenterCrop(299),
                        transforms.ToTensor(),
                        transforms.Normalize(mean=_IMAGE_MEAN_VALUE, std=_IMAGE_STD_VALUE),
                    ])),
        batch_size=16, shuffle=True,
        num_workers=8, pin_memory=True)
    self.model = torch.hub.load('pytorch/vision:v0.10.0', 'inception_v3', pretrained=True)
    if torch.cuda.is_available():
        self.model = self.model.cuda()
    self.model.eval()
    with torch.no_grad():  # no gradients needed for evaluation
        for images, targets in tqdm(imgnet_loader, desc="Compute Accuracy",
                                    total=len(imgnet_loader)):
            if torch.cuda.is_available():
                images = images.cuda()
                targets = targets.cuda()
            # In eval mode, inception_v3 returns plain logits (no aux output)
            logits = self.model(images)
            pred = logits.argmax(dim=1)
            num_correct += (pred == targets).sum().item()
            num_images += images.size(0)
    classification_acc = num_correct / float(num_images) * 100
    return classification_acc
```
However, the accuracy I get is 77.216, while it should be 77.45 according to this page. I noticed that the model applies its own input preprocessing internally via its `transform_input` attribute. So if we normalize beforehand (as suggested in the example code), we should set `transform_input` to `False`. If I add `self.model.transform_input = False`, I get 77.472, which is closer to the expected value but still not exactly the same.
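For context, this is roughly what that flag does inside torchvision's `Inception3` (a paraphrase from memory of the source; the exact constants may differ across versions, so verify against your install). With `transform_input=True`, the model re-maps inputs normalized with the standard ImageNet statistics to the `(x - 0.5) / 0.5` range:

```python
import torch

# Paraphrase of torchvision's Inception3._transform_input (assumption: check
# your installed torchvision version for the exact code).
def _transform_input(x, transform_input=True):
    if transform_input:
        # Undo the per-channel ImageNet normalization, then re-normalize
        # each channel to mean=0.5, std=0.5.
        x_ch0 = torch.unsqueeze(x[:, 0], 1) * (0.229 / 0.5) + (0.485 - 0.5) / 0.5
        x_ch1 = torch.unsqueeze(x[:, 1], 1) * (0.224 / 0.5) + (0.456 - 0.5) / 0.5
        x_ch2 = torch.unsqueeze(x[:, 2], 1) * (0.225 / 0.5) + (0.406 - 0.5) / 0.5
        x = torch.cat((x_ch0, x_ch1, x_ch2), 1)
    return x
```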
Assuming the issue isn't in my code, I also found the thread on `mean`/`std` values (#1439), so I tested some of the values suggested there as well and got these results:
| mean | std | model's `transform_input` | Accuracy |
|---|---|---|---|
| [0.485, 0.456, 0.406] | [0.229, 0.224, 0.225] | disabled | 77.472 |
| [0.485, 0.456, 0.406] | [0.229, 0.224, 0.225] | enabled | 77.216 |
| [0.4803, 0.4569, 0.4083] | [0.2806, 0.2736, 0.2877] | disabled | 77.448 |
| [0.4803, 0.4569, 0.4083] | [0.2806, 0.2736, 0.2877] | enabled | 76.986 |
| [0.4845, 0.4541, 0.4025] | [0.2724, 0.2637, 0.2761] | disabled | 77.456 |
| [0.4845, 0.4541, 0.4025] | [0.2724, 0.2637, 0.2761] | enabled | 77.03 |
| [0.4701, 0.4340, 0.3832] | [0.2845, 0.2733, 0.2805] | disabled | 77.44 |
| [0.4701, 0.4340, 0.3832] | [0.2845, 0.2733, 0.2805] | enabled | 77.01 |
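The sweep itself was just the evaluation loop rerun with different `Normalize` statistics; a minimal sketch of such a harness, where `evaluate` is a hypothetical helper standing in for the loop in the first snippet:

```python
# Hypothetical harness for the table above: evaluate(model, mean, std) stands
# in for the evaluation loop, rebuilt with transforms.Normalize(mean, std).
candidate_stats = [
    ([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
    ([0.4803, 0.4569, 0.4083], [0.2806, 0.2736, 0.2877]),
    ([0.4845, 0.4541, 0.4025], [0.2724, 0.2637, 0.2761]),
    ([0.4701, 0.4340, 0.3832], [0.2845, 0.2733, 0.2805]),
]
for mean, std in candidate_stats:
    for transform_input in (False, True):
        model.transform_input = transform_input
        acc = evaluate(model, mean, std)  # hypothetical helper
        print(mean, std, transform_input, acc)
```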
I appreciate any input on this. Thanks.
Yes, this is tracked in https://github.com/pytorch/hub/issues/287.
@m-parchami I'd recommend using the torchvision training references to check the accuracy of a given model - that's what they use to report accuracies: https://github.com/pytorch/vision/tree/main/references/classification#inception-v3. Also please note that this isn't a torchhub-related issue, more a torchvision issue (you're in luck, we maintain both).
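For what it's worth, the evaluation-only invocation of those references looks roughly like the following (flag names have changed between torchvision versions, so treat this as a sketch rather than an exact command):

```
python train.py --model inception_v3 --pretrained --test-only --data-path /path/to/imagenet
```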
Various things can affect the results. Obviously slight differences in the code, but also batch size, shuffling, some samples potentially being dropped or duplicated depending on the number of GPUs, the use of CUDA determinism, etc…
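To illustrate the determinism point, a few settings that reduce run-to-run variation in an evaluation like the one above (a sketch; `shuffle=False` refers to the DataLoader in the first snippet):

```python
import torch

# Fix the seed and disable the non-deterministic cuDNN autotuner so repeated
# evaluation runs use the same kernels and data order.
torch.manual_seed(0)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False
# In the DataLoader above, shuffle=False also removes ordering effects;
# shuffling has no benefit for a pure accuracy computation.
```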