Getting completely different hashes of almost identical images
Hi! I’m trying to compare an unmodified image, as taken by the camera, with a photoshopped version of it (tweaked histogram and slightly changed white balance), and I get distances above 25
when using the code from the examples:
hash = imagehash.phash(Image.open(path))
But if I modify the code like this:
img = cv2.imread(path)
img = Image.fromarray(img)
hash = imagehash.phash(img)
I get a distance of 0.
It looks like this might be caused by different color spaces, or something like that.
I hope this info helps somebody.
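One difference between the two snippets is certain: cv2.imread returns channels in BGR order and, by default, drops any alpha channel, whereas Image.fromarray treats a three-channel array as RGB. Whether that explains the distances reported here is not established in this thread. The following is a minimal sketch, assuming a uint8 array of shape (height, width, 3) in BGR order; bgr_array_to_pil is a hypothetical helper name, not part of imagehash or OpenCV.

```python
import numpy as np
from PIL import Image


def bgr_array_to_pil(bgr):
    """Convert a BGR uint8 array (as returned by cv2.imread) to an RGB PIL image.

    Hypothetical helper for illustration; not part of imagehash or OpenCV.
    """
    bgr = np.asarray(bgr, dtype=np.uint8)
    if bgr.ndim != 3 or bgr.shape[2] != 3:
        raise ValueError("expected an array of shape (height, width, 3)")
    # Reverse the last axis (BGR -> RGB) and make the result contiguous
    # before handing it to Pillow.
    rgb = np.ascontiguousarray(bgr[..., ::-1])
    return Image.fromarray(rgb, mode="RGB")
```

Hashing the result of this helper and the result of Image.open(path).convert("RGB") would show whether the channel order or the dropped alpha channel accounts for the difference.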
Issue Analytics
- State:
- Created 6 years ago
- Comments:16 (9 by maintainers)
If you flip between them, one has several bright spots. That might have to do with the transparency. You could try putting the image on a white background, converting it to RGB (without the alpha channel), and seeing whether it makes a difference.
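The white-background suggestion could be sketched as follows with Pillow. This is only an illustration under assumptions: flatten_on_white is a hypothetical helper name, and white is just the background colour proposed above.

```python
from PIL import Image


def flatten_on_white(img):
    """Return an RGB copy of img with any transparency composited onto white.

    Hypothetical helper for illustration; not part of imagehash.
    """
    # Converting to RGBA first gives every image an explicit alpha channel.
    rgba = img.convert("RGBA")
    background = Image.new("RGBA", rgba.size, (255, 255, 255, 255))
    # Composite the image over the opaque white background, then drop alpha.
    return Image.alpha_composite(background, rgba).convert("RGB")
```

The flattened image can then be passed to imagehash.phash in place of the result of Image.open(path).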
With hash1 - hash2 you can compute the Hamming distance. For your application, you might want to accept a threshold > 0.
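For illustration, here is a rough sketch of what such a thresholded comparison amounts to, written against plain boolean arrays rather than imagehash objects. The function names and the cutoff value are assumptions made for the example; a suitable threshold depends on your images.

```python
import numpy as np

# Hypothetical cutoff; tune it on pairs you know are (or are not) duplicates.
MAX_DISTANCE = 10


def hamming_distance(bits_a, bits_b):
    """Count the positions at which two equally sized bit arrays differ."""
    a = np.asarray(bits_a, dtype=bool).ravel()
    b = np.asarray(bits_b, dtype=bool).ravel()
    if a.shape != b.shape:
        raise ValueError("hashes must have the same number of bits")
    return int(np.count_nonzero(a != b))


def is_similar(bits_a, bits_b, max_distance=MAX_DISTANCE):
    """Treat two hashes as a match when they differ in at most max_distance bits."""
    return hamming_distance(bits_a, bits_b) <= max_distance
```

With imagehash itself, the equivalent check is simply (hash1 - hash2) <= some threshold of your choosing.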
@JohannesBuchner As you suggested, I’ve tried printing out hash.hash to see the differences, but I am not able to proceed using that info. Even though the images look visually similar to the naked eye, they are getting hashed differently. Any way for me to use the hash.hash information to fix the detection?
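One way to make the hash.hash output easier to read is to list only the positions where the two bit grids disagree. The sketch below assumes that hash.hash is a 2-D boolean NumPy array, as it is for imagehash's ImageHash objects; differing_bits is a hypothetical helper name.

```python
import numpy as np


def differing_bits(bits_a, bits_b):
    """Return the (row, column) positions at which two bit grids differ.

    Hypothetical helper for illustration; not part of imagehash.
    """
    a = np.asarray(bits_a, dtype=bool)
    b = np.asarray(bits_b, dtype=bool)
    if a.shape != b.shape:
        raise ValueError("hashes must have the same shape")
    # argwhere yields one index pair per differing cell, in row-major order.
    return [tuple(int(i) for i in idx) for idx in np.argwhere(a != b)]
```

Usage would look like differing_bits(hash1.hash, hash2.hash). As far as I understand, for phash each cell corresponds to one low-frequency DCT coefficient, so the positions hint at which frequencies differ. They do not by themselves fix the detection; that still comes down to normalizing the inputs or choosing a distance threshold.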