[suggestion] Check face preprocessing step for all the Keras face embedding models
In this repo, the face patch preprocessing step is the same for all the models:
```python
import cv2
import numpy as np
from tensorflow.keras.preprocessing import image

img = cv2.resize(img, target_size)  # `img`: detected face crop, `target_size`: model input size
img_pixels = image.img_to_array(img)
img_pixels = np.expand_dims(img_pixels, axis=0)
img_pixels /= 255  # normalize input to [0, 1]
```
Please look into whether this is valid for all of them, as the inference preprocessing step must match each model's training pipeline for accurate results. For example, the original Facenet repo has done it differently.
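For example, the original Facenet repo standardizes each face crop with its own mean and standard deviation ("prewhitening") instead of dividing by 255. A minimal sketch of that idea, paraphrased rather than copied from either repo:

```python
import numpy as np

def prewhiten(img):
    # Standardize a single face crop with its own statistics,
    # as the original Facenet repo does: (img - mean) / std
    mean = img.mean()
    std = img.std()
    std_adj = np.maximum(std, 1.0 / np.sqrt(img.size))  # guard against near-zero std
    return (img - mean) / std_adj
```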
deepface 0.0.66 is live!
I added a normalization argument to the verify, find and represent functions. Its default value is base, and it will work as-is in this case.
If you set normalization to Facenet, v1, v2, …, it will apply the logic @trevorgribble shared.
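For example, something along these lines should opt in to the Facenet-style preprocessing (a sketch based on this comment; the image paths are placeholders and the exact signature may differ slightly across versions):

```python
from deepface import DeepFace

# default: normalization="base" keeps the plain /255 scaling used so far
res = DeepFace.verify("img1.jpg", "img2.jpg", model_name="Facenet")

# opt in to the model-specific preprocessing discussed in this issue
res = DeepFace.verify(
    "img1.jpg", "img2.jpg",
    model_name="Facenet",
    normalization="Facenet",  # per-image mean/std standardization
)
```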
@trevorgribble buddy, I mentioned you in the source code as well. Thank you!
@iamrishab @serengil I am very interested in this topic and would like to discuss further both refining the pre-processing steps and the thresholds used for True/False verification decisions.
Specifically, I’ve been working on a face detection/recognition pipeline using MTCNN for detection and both VGG-Face and Facenet for computing embeddings.
@serengil, please correct me if I’m wrong, but I believe that the True/False thresholds you have hard-coded in distance.py (and discussed at length in this post: https://sefiks.com/2020/05/22/fine-tuning-the-threshold-in-face-recognition/ ) were all decided upon after normalizing the RGB pixel values by dividing by 255 (as @iamrishab mentions).
I ran some tests against 20 different people (each with between 6 and 30 images) using the default “/255” method for RGB input, with both the Euclidean distance and cosine options.
I found a lot of errors, both false positives and false negatives.
I tried to refine this and found that the Facenet model was originally trained on a dataset where all images were normalized by computing each image’s RGB mean and std and then applying img = (img - mean) / std.
When performing this preprocessing (rather than simply dividing by 255) before calculating embeddings, my TRUE/FALSE verification accuracy got much better, using Euclidean distance and the given threshold of 10 - awesome!
Then, on to VGG-Face: I found that simply dividing by 255 and using the given thresholds for cosine/Euclidean distance (0.40 cosine / 0.55 Euclidean) gave me tons of false positives, and all true positives seemed to score way below the given thresholds, typically at 0.2 or lower.
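For clarity, this is roughly the verification check I’m describing (a sketch; the two embeddings here are random stand-ins, and the thresholds are the ones quoted above):

```python
import numpy as np

def cosine_distance(a, b):
    a, b = np.asarray(a), np.asarray(b)
    return 1 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def euclidean_distance(a, b):
    a, b = np.asarray(a), np.asarray(b)
    return np.linalg.norm(a - b)

# stand-ins for the embeddings of two face crops from the same model
emb1, emb2 = np.random.rand(128), np.random.rand(128)

# thresholds quoted above: 0.40 (cosine) / 0.55 (Euclidean) for VGG-Face,
# and 10 with Euclidean distance for Facenet
same_person = cosine_distance(emb1, emb2) <= 0.40
```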
Again, I researched VGG-Face as much as I could to try to figure out how the model was trained, and found some posts suggesting that they subtracted individual R, G, and B channel means (R: 131.0912, G: 103.8827, B: 91.4953), with no division, and fed those values into the model: https://github.com/ox-vgg/vgg_face2/issues/17 . I also saw ambiguous information that the VGG-Face model might have taken images in as BGR rather than RGB: https://github.com/rcmalli/keras-vggface/issues/62
So I attempted 6 different permutations of preprocessing the RGB values before computing embeddings, to see if there was a clearer delineation for threshold calculation. Sadly, whether I used cosine or Euclidean comparisons, I couldn’t dial in the VGG-Face threshold the way I seemed to with Facenet.
@serengil, I do believe deepface shines brightest if I stick with “/255” (or any other normalization technique, for that matter), create embeddings for hundreds of thousands of people, eventually filling up a database with millions of embeddings, and then run a face recognition search of an unknown face against those millions of embeddings: the lowest cosine or Euclidean distance will indeed have a high likelihood of matching my unknown image.
However, I’d like to explore True/False (“head to head”) verification some more to try to dial in the “best embeddings” for each model.
I also noticed that in your earlier blogs you were using the keras preprocess_input function to normalize VGG faces before passing them in: https://sefiks.com/2018/08/06/deep-face-recognition-with-keras/ - referencing the preprocess_input function in this repo file: https://github.com/keras-team/keras-applications/blob/master/keras_applications/imagenet_utils.py . But currently the deepface repo doesn’t seem to be using that tool anymore? In functions.py we include “from tensorflow.keras.applications.imagenet_utils import preprocess_input”, but it doesn’t appear we call it anywhere?
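For reference, that keras helper in its default “caffe” mode flips RGB to BGR and subtracts the ImageNet channel means, which is yet another variant. A sketch of how it would be called, with a random array standing in for a detected face batch:

```python
import numpy as np
from tensorflow.keras.applications.imagenet_utils import preprocess_input

face_batch = np.random.rand(1, 224, 224, 3) * 255  # stand-in for a detected face batch
# default mode="caffe": converts RGB to BGR and subtracts the ImageNet channel
# means (103.939, 116.779, 123.68); note it does not scale to [0, 1]
face_batch = preprocess_input(face_batch)
```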
I’m sure there are still complex elements that I’m failing to fully understand (was this VGG-Face model trained as RESNET50, “vgg16”, “senet50”, or something else? Where did the vgg_face_weights.h5 file come from?), but yes, let’s open the conversation so we can fully understand it, because there seems to be confusion in many threads around the internet.
Thanks much, sorry for the long post.