Different AP after training
Hi,
I’m facing a very strange behavior. I use the same evaluation method, the same parameters, and the same evaluation dataset both during training and afterwards, yet the average precision is different when I reload the model and evaluate it.
Training results:
cabbage 0.9206, colza 0.8249, greensalad 0.9800, leekonion 0.7145, redsalad 0.9764, mAP: 0.8833
Epoch 00033: mAP improved from 0.88021 to 0.88327, saving model to …/snapshots/resnet101_csv.h5
Evaluation results (with …/snapshots/resnet101_csv.h5):
cabbage 0.7723, colza 0.6389, greensalad 0.9561, leekonion 0.3722, redsalad 0.9596, mAP: 0.7398
I tested two different ways to evaluate the model, but both produce the same result.
from keras_retinanet import models
from keras_retinanet.models.retinanet import retinanet_bbox

# create object that stores backbone information
backbone = models.backbone('resnet101')
backbone_retinanet = backbone.retinanet

# build the training model and load the saved weights
# model = models.load_model(model_path, backbone_name='resnet101', convert=True)
model = backbone_retinanet(validation_generator.num_classes())
model.load_weights(model_path, by_name=True, skip_mismatch=True)

# make prediction model (adds bbox decoding and NMS on top of the training model)
prediction_model = retinanet_bbox(model=model)

# evaluate
average_precisions, recalls, precisions, infer_time = evaluate(
    validation_generator,
    prediction_model,
    score_threshold=0.3,
    max_detections=100,
    iou_threshold=0.5,
)
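A mismatch like this often comes down to evaluating with different parameters in the two runs, for example passing a different score_threshold to evaluate() than the one the training-time evaluation callback used. Purely as an illustration of how much the threshold alone can move AP, here is a minimal self-contained sketch with hypothetical detections and a simplified area-under-PR average precision (this is not keras-retinanet's own implementation):

```python
def average_precision(detections, num_gt, score_threshold):
    """Compute a simple AP over detections filtered by a score threshold.

    detections: list of (score, is_true_positive), sorted by score descending.
    num_gt: number of ground-truth objects.
    """
    kept = [d for d in detections if d[0] >= score_threshold]
    tp = fp = 0
    ap = 0.0
    prev_recall = 0.0
    for score, is_tp in kept:
        if is_tp:
            tp += 1
        else:
            fp += 1
        precision = tp / (tp + fp)
        recall = tp / num_gt
        # accumulate area under the precision-recall points
        ap += precision * (recall - prev_recall)
        prev_recall = recall
    return ap

# hypothetical detections for one class, 4 ground-truth objects
dets = [(0.95, True), (0.80, True), (0.40, True), (0.30, False), (0.10, True)]
ap_low = average_precision(dets, num_gt=4, score_threshold=0.05)
ap_high = average_precision(dets, num_gt=4, score_threshold=0.35)
print(ap_low, ap_high)  # the two thresholds give different AP values
```

The same detections against the same ground truth yield different AP purely because the filtering threshold differs. Translated back to keras-retinanet, that means making sure score_threshold, max_detections, and iou_threshold passed to evaluate() after training match whatever the evaluation callback used during training.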
Issue Analytics
- Created 5 years ago
- Comments: 7 (3 by maintainers)
Top GitHub Comments
Thank you very much hgaiser! I passed over that line several times… and it was so obvious. I feel worse than ever.
Excuse me, sorry to bother you, but I have a problem similar to this one. I split my own dataset into training, validation, and test data, and the mAP on the validation data during training is good. After I convert the model into an inference model, the mAP is very bad on the test data (and even on the validation data). I don’t know why. Reading this issue#549, I guess I also need to get the same detections during training and inference, but I can’t understand what you mean. Could you explain it more explicitly?