[suggestion] Check face preprocessing step for all the Keras face embedding models
In this repo, the face patch preprocessing step is the same for all the models:
```python
import cv2
import numpy as np
from tensorflow.keras.preprocessing import image

img = cv2.resize(img, target_size)  # `img`: detected face crop, `target_size`: model input size
img_pixels = image.img_to_array(img)
img_pixels = np.expand_dims(img_pixels, axis=0)
img_pixels /= 255  # normalize input to [0, 1]
```
Please look into whether this is valid for all of them, as the inference preprocessing step must match each model's training pipeline for accurate results. For example, the original Facenet repo has done it differently.
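For example, the original Facenet repo standardizes each face crop with its own mean and standard deviation ("prewhitening") instead of dividing by 255. A minimal sketch of that idea, paraphrased rather than copied from either repo:

```python
import numpy as np

def prewhiten(img):
    # Standardize a single face crop with its own statistics,
    # as the original Facenet repo does: (img - mean) / std
    mean = img.mean()
    std = img.std()
    std_adj = np.maximum(std, 1.0 / np.sqrt(img.size))  # guard against near-zero std
    return (img - mean) / std_adj
```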
deepface 0.0.66 is live!
I added a normalization argument to the verify, find and represent functions. Its default value is base, and it will work as-is in this case.
If you set normalization to Facenet, v1, v2, …, it will apply the logic @trevorgribble shared.
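For example, something along these lines should opt in to the Facenet-style preprocessing (a sketch based on this comment; the image paths are placeholders and the exact signature may differ slightly across versions):

```python
from deepface import DeepFace

# default: normalization="base" keeps the plain /255 scaling used so far
res = DeepFace.verify("img1.jpg", "img2.jpg", model_name="Facenet")

# opt in to the model-specific preprocessing discussed in this issue
res = DeepFace.verify(
    "img1.jpg", "img2.jpg",
    model_name="Facenet",
    normalization="Facenet",  # per-image mean/std standardization
)
```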
@trevorgribble buddy, I mentioned you in the source code as well. Thank you!
@iamrishab @serengil I am very interested in this topic and would like to discuss further both refining the pre-processing steps and the thresholds used for True/False verification decisions.
Specifically, I’ve been working on a face detection/recognition pipeline using MTCNN for detection and both VGG-Face and Facenet for computing embeddings.
@serengil, please correct me if I’m wrong, but I believe that the True/False thresholds you have hard-coded in distance.py (and discussed at length in this post: https://sefiks.com/2020/05/22/fine-tuning-the-threshold-in-face-recognition/ ) were all decided upon after normalizing the RGB pixel values by dividing by 255 (as @iamrishab mentions).
I ran some tests against 20 different people (each with between 6 and 30 images) using the default “/255” method for RGB input, with both the Euclidean distance and cosine options.
I found a lot of errors, both false positives and false negatives.
I tried to refine this and found that the Facenet model was originally trained on a dataset where all images were normalized by computing each image’s RGB mean and std and then applying img = (img - mean) / std.
When performing this preprocessing (rather than simply dividing by 255) before calculating embeddings, my TRUE/FALSE verification accuracy got much better, using Euclidean distance and the given threshold of 10 - awesome!
Then, on to VGG-Face: I found that simply dividing by 255 and using the given thresholds for cosine/Euclidean distance (0.40 cosine / 0.55 Euclidean) gave me tons of false positives, and all true positives seemed to score way below the given thresholds, typically at 0.2 or lower.
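For clarity, this is roughly the verification check I’m describing (a sketch; the two embeddings here are random stand-ins, and the thresholds are the ones quoted above):

```python
import numpy as np

def cosine_distance(a, b):
    a, b = np.asarray(a), np.asarray(b)
    return 1 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def euclidean_distance(a, b):
    a, b = np.asarray(a), np.asarray(b)
    return np.linalg.norm(a - b)

# stand-ins for the embeddings of two face crops from the same model
emb1, emb2 = np.random.rand(128), np.random.rand(128)

# thresholds quoted above: 0.40 (cosine) / 0.55 (Euclidean) for VGG-Face,
# and 10 with Euclidean distance for Facenet
same_person = cosine_distance(emb1, emb2) <= 0.40
```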
Again, I researched VGG-Face as much as I could to try to figure out how the model was trained, and found some posts suggesting that they subtracted individual R, G, and B channel means (R: 131.0912, G: 103.8827, B: 91.4953), with no division, and fed those values into the model: https://github.com/ox-vgg/vgg_face2/issues/17 . I also saw ambiguous information that the VGG-Face model might have taken images in as BGR rather than RGB: https://github.com/rcmalli/keras-vggface/issues/62
So I attempted 6 different permutations of preprocessing the RGB values before computing embeddings, to see if there was a clearer delineation for threshold calculation. Sadly, whether I used cosine or Euclidean comparisons, I couldn’t dial in the VGG-Face threshold the way I seemed to with Facenet.
@serengil, I do believe deepface shines brightest if I stick with “/255” (or any other normalization technique, for that matter), create embeddings for hundreds of thousands of people, eventually filling up a database with millions of embeddings, and then run a face recognition search of an unknown face against those millions of embeddings: the lowest cosine or Euclidean distance will indeed have a high likelihood of matching my unknown image.
However, I’d like to explore True/False (“head to head”) verification some more to try to dial in the “best embeddings” for each model.
I also noticed that in your earlier blogs you were using the keras preprocess_input function to normalize VGG faces before passing them in: https://sefiks.com/2018/08/06/deep-face-recognition-with-keras/ - referencing the preprocess_input function in this repo file: https://github.com/keras-team/keras-applications/blob/master/keras_applications/imagenet_utils.py . But currently the deepface repo doesn’t seem to be using that tool anymore? In functions.py we include “from tensorflow.keras.applications.imagenet_utils import preprocess_input”, but it doesn’t appear we call it anywhere?
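For reference, that keras helper in its default “caffe” mode flips RGB to BGR and subtracts the ImageNet channel means, which is yet another variant. A sketch of how it would be called, with a random array standing in for a detected face batch:

```python
import numpy as np
from tensorflow.keras.applications.imagenet_utils import preprocess_input

face_batch = np.random.rand(1, 224, 224, 3) * 255  # stand-in for a detected face batch
# default mode="caffe": converts RGB to BGR and subtracts the ImageNet channel
# means (103.939, 116.779, 123.68); note it does not scale to [0, 1]
face_batch = preprocess_input(face_batch)
```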
I’m sure there are still complex elements that I’m failing to fully understand (was this VGG-Face model trained as RESNET50, “vgg16”, “senet50”, or something else? Where did the vgg_face_weights.h5 file come from?), but yes, let’s open the conversation so we can fully understand it, because there seems to be confusion in many threads around the internet.
Thanks much, sorry for the long post.