PatchCore results are much worse than reported
Describe the bug
To Reproduce
Steps to reproduce the behavior:
- Go to the Main directory
- Run `python tools/train.py --model patchcore`
Expected behavior
The image AUROC should be around 0.98 for the carpet category of the MVTec dataset, but it is very low. FastFlow works as expected, so the problem seems to be PatchCore.
Hardware and Software Configuration
- OS: [Ubuntu]
- NVIDIA Driver Version [470.141.03]
- CUDA Version [11.4]
- CUDNN Version [e.g. v11.4.120]
Log
WARNING: CPU random generator seem to be failing, disabling hardware random number generation
WARNING: RDRND generated: 0xffffffff 0xffffffff 0xffffffff 0xffffffff
----------------------------------/anomalib/config/config.py:166: UserWarning: config.project.unique_dir is set to False. This does not ensure that your results will be written in an empty directory and you may overwrite files.
warn(
2022-11-16 11:52:49,662 - anomalib.data - INFO - Loading the datamodule
2022-11-16 11:52:49,662 - anomalib.pre_processing.pre_process - WARNING - Transform configs has not been provided. Images will be normalized using ImageNet statistics.
2022-11-16 11:52:49,663 - anomalib.pre_processing.pre_process - WARNING - Transform configs has not been provided. Images will be normalized using ImageNet statistics.
2022-11-16 11:52:49,663 - anomalib.models - INFO - Loading the model.
2022-11-16 11:52:49,667 - torch.distributed.nn.jit.instantiator - INFO - Created a temporary directory at /tmp/tmpk5fh8j6r
2022-11-16 11:52:49,667 - torch.distributed.nn.jit.instantiator - INFO - Writing /tmp/tmpk5fh8j6r/_remote_module_non_scriptable.py
2022-11-16 11:52:49,674 - anomalib.models.components.base.anomaly_module - INFO - Initializing PatchcoreLightning model.
/home/-/code/anomalib/lib/python3.8/site-packages/torchmetrics/utilities/prints.py:36: UserWarning: Metric PrecisionRecallCurve will save all targets and predictions in buffer. For large datasets this may lead to large memory footprint.
  warnings.warn(*args, **kwargs)
2022-11-16 11:52:50,882 - timm.models.helpers - INFO - Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-weights/wide_resnet50_racm-8234f177.pth)
2022-11-16 11:52:51,009 - anomalib.utils.loggers - INFO - Loading the experiment logger(s)
2022-11-16 11:52:51,009 - anomalib.utils.callbacks - INFO - Loading the callbacks
/home/-/code/anomalib/src/anomalib/anomalib/utils/callbacks/__init__.py:141: UserWarning: Export option: None not found. Defaulting to no model export
warnings.warn(f"Export option: {config.optimization.export_mode} not found. Defaulting to no model export")
2022-11-16 11:52:51,012 - pytorch_lightning.utilities.rank_zero - INFO - GPU available: True, used: True
2022-11-16 11:52:51,012 - pytorch_lightning.utilities.rank_zero - INFO - TPU available: False, using: 0 TPU cores
2022-11-16 11:52:51,012 - pytorch_lightning.utilities.rank_zero - INFO - IPU available: False, using: 0 IPUs
2022-11-16 11:52:51,012 - pytorch_lightning.utilities.rank_zero - INFO - HPU available: False, using: 0 HPUs
2022-11-16 11:52:51,012 - pytorch_lightning.utilities.rank_zero - INFO - Trainer(limit_train_batches=1.0) was configured so 100% of the batches per epoch will be used…
2022-11-16 11:52:51,012 - pytorch_lightning.utilities.rank_zero - INFO - Trainer(limit_val_batches=1.0) was configured so 100% of the batches will be used…
2022-11-16 11:52:51,012 - pytorch_lightning.utilities.rank_zero - INFO - Trainer(limit_test_batches=1.0) was configured so 100% of the batches will be used…
2022-11-16 11:52:51,012 - pytorch_lightning.utilities.rank_zero - INFO - Trainer(limit_predict_batches=1.0) was configured so 100% of the batches will be used…
2022-11-16 11:52:51,012 - pytorch_lightning.utilities.rank_zero - INFO - Trainer(val_check_interval=1.0) was configured so validation will run at the end of the training epoch…
2022-11-16 11:52:51,012 - anomalib - INFO - Training the model.
2022-11-16 11:52:51,016 - anomalib.data.mvtec - INFO - Found the dataset.
2022-11-16 11:52:51,018 - anomalib.data.mvtec - INFO - Setting up train, validation, test and prediction datasets.
/-/-/code/anomalib/lib/python3.8/site-packages/torchmetrics/utilities/prints.py:36: UserWarning: Metric ROC will save all targets and predictions in buffer. For large datasets this may lead to large memory footprint.
  warnings.warn(*args, **kwargs)
2022-11-16 11:52:52,479 - pytorch_lightning.accelerators.gpu - INFO - LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1]
/-/-/code/anomalib/lib/python3.8/site-packages/pytorch_lightning/core/optimizer.py:183: UserWarning: `LightningModule.configure_optimizers` returned None, this fit will run with no optimizer
  rank_zero_warn(
2022-11-16 11:52:52,482 - pytorch_lightning.callbacks.model_summary - INFO -
| Name | Type | Params
0 | image_threshold | AnomalyScoreThreshold | 0
1 | pixel_threshold | AnomalyScoreThreshold | 0
2 | model | PatchcoreModel | 24.9 M
3 | image_metrics | AnomalibMetricCollection | 0
4 | pixel_metrics | AnomalibMetricCollection | 0
5 | normalization_metrics | MinMax | 0
24.9 M Trainable params
0 Non-trainable params
24.9 M Total params
99.450 Total estimated model params size (MB)
Epoch 0:   8%|▊         | 1/13 [00:01<00:16, 1.37s/it, loss=nan]/-/-/code/anomalib/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py:137: UserWarning: `training_step` returned None. If this was on purpose, ignore this warning…
  self.warning_cache.warn("`training_step` returned None. If this was on purpose, ignore this warning…")
Epoch 0:  69%|██████▉   | 9/13 [00:01<00:00, 4.67it/s, loss=nan]
Validation: 0it [00:00, ?it/s]2022-11-16 11:52:54,414 - anomalib.models.patchcore.lightning_model - INFO - Aggregating the embedding extracted from the training set.
2022-11-16 11:52:54,415 - anomalib.models.patchcore.lightning_model - INFO - Applying core-set subsampling to get the embedding.
Epoch 0:  69%|██████▉   | 9/13 [00:20<00:08, 2.22s/it, loss=nan]
Validation:   0%|          | 0/4 [00:00<?, ?it/s]
Validation DataLoader 0:   0%|          | 0/4 [00:00<?, ?it/s]
Validation DataLoader 0:  25%|██▌       | 1/4 [00:00<00:00, 4.13it/s]
Epoch 0:  77%|███████▋  | 10/13 [00:59<00:17, 5.94s/it, loss=nan]
Validation DataLoader 0:  50%|█████     | 2/4 [00:00<00:00, 4.04it/s]
Epoch 0:  85%|████████▍ | 11/13 [00:59<00:10, 5.42s/it, loss=nan]
Validation DataLoader 0:  75%|███████▌  | 3/4 [00:00<00:00, 4.02it/s]
Epoch 0:  92%|█████████▏| 12/13 [00:59<00:04, 4.99s/it, loss=nan]
Validation DataLoader 0: 100%|██████████| 4/4 [00:00<00:00, 4.63it/s]
Epoch 0: 100%|██████████| 13/13 [01:00<00:00, 4.67s/it, loss=nan, pixel_F1Score=0.548, pixel_AUROC=0.986]
Epoch 0: 100%|██████████| 13/13 [01:01<00:00, 4.69s/it, loss=nan, pixel_F1Score=0.548, pixel_AUROC=0.986]
2022-11-16 11:53:53,628 - anomalib.utils.callbacks.timer - INFO - Training took 61.15 seconds
2022-11-16 11:53:53,628 - anomalib - INFO - Loading the best model weights.
2022-11-16 11:53:53,628 - anomalib - INFO - Testing the model.
2022-11-16 11:53:53,632 - anomalib.data.mvtec - INFO - Found the dataset.
2022-11-16 11:53:53,633 - anomalib.data.mvtec - INFO - Setting up train, validation, test and prediction datasets.
/-/code/anomalib/lib/python3.8/site-packages/torchmetrics/utilities/prints.py:36: UserWarning: Metric ROC will save all targets and predictions in buffer. For large datasets this may lead to large memory footprint.
  warnings.warn(*args, **kwargs)
2022-11-16 11:53:53,716 - pytorch_lightning.accelerators.gpu - INFO - LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1]
2022-11-16 11:53:53,718 - anomalib.utils.callbacks.model_loader - INFO - Loading the model from /home/-/code/anomalib/src/anomalib/results/patchcore/mvtec/carpet/run/weights/model.ckpt
Testing DataLoader 0: 100%|██████████| 4/4 [00:19<00:00, 4.65s/it]2022-11-16 11:54:14,762 - anomalib.utils.callbacks.timer - INFO - Testing took 20.9255051612854 seconds
Throughput (batch_size=32) : 5.591262867883519 FPS
Testing DataLoader 0: 100%|██████████| 4/4 [00:19<00:00, 4.97s/it]
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
       Test metric             DataLoader 0
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
        image_AUROC         0.4036917984485626
       image_F1Score        0.8640776872634888
        pixel_AUROC         0.9860672950744629
       pixel_F1Score        0.5481611490249634
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Process finished with exit code 0
Issue Analytics
- Created 10 months ago
- Comments: 13 (8 by maintainers)
Top GitHub Comments
Thank you very much! I can confirm that the results are definitely better!
Just in case anyone is interested, I did a few experiments during the last few days which I wanted to share. But feel free to skip the rest of this comment.
I trained different settings twice over all categories with the seeds `0` and `42`. I had an average image AUROC of 0.944, whereas the paper states 0.990. Using the newest fix #791 and cropping, I get 0.987, so I'd say it's close enough. The settings are as follows:
- `main` branch a few days ago with my fix, but without the latest one
- `PatchcoreModel.generate_embedding` with the algorithm from the original PatchCore implementation (although I think it has two errors (or I messed up the parameters 🤷), it still returns good results)

We managed to narrow down the performance deterioration to a small change in the average pooling layer that was made some time ago, and we've reverted that commit for now. I ran a quick experiment on a few MVTec categories where I compared our numbers to those obtained by running the original implementation. These are some results:
(The seed was set to 0 in both implementations, and all other parameters were kept at their defaults, in case anyone would like to reproduce these numbers.)
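For anyone curious what the embedding step under discussion looks like, here is a minimal NumPy sketch of the general PatchCore-style aggregation: 3x3 average pooling (stride 1, zero padding, with the padded zeros counted in the mean, as in torch's `AvgPool2d` default), nearest-neighbour upsampling of the deeper feature map, and channel concatenation. This is an illustration, not the actual `PatchcoreModel.generate_embedding` code; the layer names and shapes are assumptions.

```python
import numpy as np

def avg_pool_3x3(fmap):
    """3x3 average pooling, stride 1, zero padding 1, on a CHW feature map.

    Zeros in the padding are included in the mean, matching torch
    AvgPool2d's default count_include_pad=True behaviour.
    """
    c, h, w = fmap.shape
    padded = np.zeros((c, h + 2, w + 2), dtype=float)
    padded[:, 1:-1, 1:-1] = fmap
    out = np.empty((c, h, w), dtype=float)
    for i in range(h):
        for j in range(w):
            out[:, i, j] = padded[:, i:i + 3, j:j + 3].mean(axis=(1, 2))
    return out

def upsample_nearest(fmap, size):
    """Nearest-neighbour upsampling of a CHW feature map to (H, W)."""
    c, h, w = fmap.shape
    target_h, target_w = size
    rows = np.arange(target_h) * h // target_h
    cols = np.arange(target_w) * w // target_w
    return fmap[:, rows][:, :, cols]

def generate_embedding(features):
    """Pool each layer locally, upsample the deeper one, concatenate channels."""
    embedding = avg_pool_3x3(features["layer2"])
    layer3 = upsample_nearest(avg_pool_3x3(features["layer3"]),
                              embedding.shape[1:])
    return np.concatenate([embedding, layer3], axis=0)
```

With WideResNet-50 features at a 224x224 input, `layer2` is (512, 28, 28) and `layer3` is (1024, 14, 14), so the concatenated embedding would be (1536, 28, 28). A subtle point the pooling sketch makes visible: with zero padding counted in the mean, border patch features are attenuated, which is exactly the kind of detail where a small change can shift scores.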
There is still a small difference in the `grid` category, but this could possibly be attributed to the absence of center-cropping in Anomalib. The original implementation first resizes the images to 256x256 and then center-crops to 224x224, while we directly resize to 224x224. When I increase the image_size to 256x256 in the Anomalib config, the numbers are already much closer:

So I believe that our PatchCore model is now on par with the original implementation 🙂
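To make the preprocessing difference concrete, here is a small sketch (plain NumPy, not the Anomalib transforms) of what the original pipeline's center crop does to the field of view:

```python
import numpy as np

def center_crop(img, size):
    """Crop the central size x size window from an HWC image array."""
    h, w = img.shape[:2]
    top = (h - size) // 2
    left = (w - size) // 2
    return img[top:top + size, left:left + size]

# Original PatchCore pipeline: resize to 256x256, then center-crop to 224x224.
# The crop discards a 16-pixel border on every side, keeping only
# 224*224 / (256*256) = 0.765625 (~76.6%) of the resized image area,
# whereas a direct resize to 224x224 keeps the full field of view at a
# coarser resolution.
resized = np.zeros((256, 256, 3))
cropped = center_crop(resized, 224)
print(cropped.shape)          # (224, 224, 3)
print(224 * 224 / 256 ** 2)  # 0.765625
```

This is why raising image_size to 256x256 in the config narrows the gap: it roughly restores the per-pixel resolution the original implementation sees inside its crop, even though the border region still differs.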