Enhancement: attention crop strategy improvements
These are via @jcupitt, thanks John.
Rank regions by average rather than maximum value
sharp uses max on the sum image. This is very sensitive to noise, to the exact alignment of the masks, and takes no account of how “much” of something there is. A single pixel of skin colour is as significant as a whole face.
We could gaussblur before max, but the simplest thing is probably just to use avg rather than max.
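A minimal sketch of the difference, in plain Python rather than sharp's own code. The two regions and their score values are made up for illustration, and the helper names are hypothetical:

```python
# Two hypothetical 3x3 score regions (made-up values, not taken from sharp).
# region_a: one noisy pixel with a very high score.
# region_b: a consistently moderate score, e.g. a whole face.
region_a = [
    [0, 0, 0],
    [0, 255, 0],
    [0, 0, 0],
]
region_b = [
    [200, 200, 200],
    [200, 200, 200],
    [200, 200, 200],
]


def region_max(region):
    """Highest score of any single pixel in the region."""
    return max(max(row) for row in region)


def region_avg(region):
    """Mean score over every pixel in the region."""
    pixels = [value for row in region for value in row]
    return sum(pixels) / len(pixels)


# Ranking by max prefers the lone noisy pixel ...
print(region_max(region_a) > region_max(region_b))
# ... while ranking by average prefers the region with more of it.
print(region_avg(region_a) < region_avg(region_b))
```

With max, the single bright pixel in region_a outranks the uniformly interesting region_b; with avg, the ordering flips.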
No need to separate A and B channels when detecting skin tones
sharp does
a = lab[1]
b = lab[2]
skin = (a >= 3) & (a <= 22) & (b >= 4) & (b <= 31)
It would probably be slightly quicker to get ab as a two-band image, then do the tests against a two-element array, then AND the two bands together, something like:
ab = lab[1:3]  # bands 1 and 2 (a and b); a Python slice end is exclusive
skin = ((ab >= [3, 4]) & (ab <= [22, 31])).bandand()
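To make the intent of the two-band version concrete, here is a rough pure-Python emulation. It is not sharp or libvips code: the image is just a nested list of (a, b) tuples, `is_skin` and `skin_mask` are hypothetical names, and the sample pixels are made up. Only the thresholds come from the snippet above.

```python
# Thresholds from the snippet above: a in [3, 22], b in [4, 31].
LOW = (3, 4)
HIGH = (22, 31)


def is_skin(ab):
    """Test one (a, b) pixel against each band's range, then AND the bands."""
    per_band = [lo <= value <= hi for value, lo, hi in zip(ab, LOW, HIGH)]
    return all(per_band)  # plays the role of bandand()


def skin_mask(ab_image):
    """Apply the test to every pixel of a 2D list of (a, b) tuples."""
    return [[is_skin(ab) for ab in row] for row in ab_image]


# Made-up sample pixels for illustration.
sample = [[(10, 15), (0, 15)], [(10, 40), (22, 31)]]
print(skin_mask(sample))
```

Both bands are compared against a two-element threshold in one pass, and the per-band results are combined at the end, which is the shape of the proposed change.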
Investigate using L channel in skin tone detection
When sharp tests for skin colour, it just tests AB, it doesn’t test lightness. I found a few images where very deep shadows were mistakenly tagged as skin, throwing off the crop.
(The L channel values were discarded when the current thresholds were “trained”. Let’s add them back into the mix and see what it comes up with.)
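One way the experiment could look, as a sketch only. The `l_min` parameter and the value 20 used below are placeholders invented for illustration, not trained or proposed thresholds, and `is_skin_ab` / `is_skin_lab` are hypothetical names. The a and b ranges are the ones quoted earlier in this issue.

```python
def is_skin_ab(lab):
    """Current-style test: only the a and b channels are checked."""
    _, a, b = lab
    return 3 <= a <= 22 and 4 <= b <= 31


def is_skin_lab(lab, l_min):
    """Same test plus a lower bound on lightness.

    l_min is a hypothetical parameter; a real value would have to come
    from retraining the thresholds with L included.
    """
    lightness = lab[0]
    return lightness >= l_min and is_skin_ab(lab)


# Made-up pixels: a very dark shadow and a mid-lightness pixel that
# happen to share the same a and b values.
deep_shadow = (5, 10, 15)
mid_tone = (60, 10, 15)

print(is_skin_ab(deep_shadow))        # the AB-only test accepts the shadow
print(is_skin_lab(deep_shadow, 20))   # placeholder l_min=20 rejects it
print(is_skin_lab(mid_tone, 20))      # the brighter pixel still passes
```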
I had another thought – the cropper works at the moment by discarding boring areas, which is quick, but it means it won’t centre on interesting areas.
Instead, how about calculating a score image, as now, but doing it for the whole frame. Then shrink, blur, search for the peak, and centre the crop box on that. There’s some sample code here:
https://github.com/jcupitt/libvips/blob/master/libvips/conversion/smartcrop.c#L162
And a little discussion here:
https://github.com/jcupitt/libvips/issues/619
It’s a bit slower, but not catastrophically.
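The idea can be sketched in plain Python as below. This is not the code in smartcrop.c: the function names are hypothetical, the shrink factor of 2 and the 3x3 binomial kernel (a small stand-in for a Gaussian blur) are arbitrary choices, and the score image is assumed to be a nested list of numbers.

```python
KERNEL = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]  # weights sum to 16


def shrink(image, factor):
    """Block-average a 2D list by an integer factor (ragged edges dropped)."""
    height = len(image) // factor
    width = len(image[0]) // factor
    area = factor * factor
    return [
        [
            sum(
                image[y * factor + dy][x * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            ) / area
            for x in range(width)
        ]
        for y in range(height)
    ]


def blur(image):
    """3x3 binomial blur; pixels outside the image count as zero."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0
            for ky in range(3):
                for kx in range(3):
                    ny, nx = y + ky - 1, x + kx - 1
                    if 0 <= ny < height and 0 <= nx < width:
                        total += KERNEL[ky][kx] * image[ny][nx]
            out[y][x] = total / 16
    return out


def find_peak(image):
    """Return (x, y) of the largest value; the first one wins on ties."""
    best_x, best_y = 0, 0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if value > image[best_y][best_x]:
                best_x, best_y = x, y
    return best_x, best_y


def centred_crop(score, crop_w, crop_h, factor=2):
    """Return (left, top) of a crop box centred on the score peak."""
    peak_x, peak_y = find_peak(blur(shrink(score, factor)))
    # Map the peak back to full-resolution coordinates (middle of its block).
    centre_x = peak_x * factor + factor // 2
    centre_y = peak_y * factor + factor // 2
    # Centre the box on the peak, then clamp it so it stays inside the frame.
    left = min(max(centre_x - crop_w // 2, 0), len(score[0]) - crop_w)
    top = min(max(centre_y - crop_h // 2, 0), len(score) - crop_h)
    return left, top


# Made-up 8x8 score image with one interesting 2x2 patch.
score = [[0] * 8 for _ in range(8)]
for y in (4, 5):
    for x in (4, 5):
        score[y][x] = 100

print(centred_crop(score, 4, 4))
```

The extra cost over discarding boring edges is the full-frame score image plus the shrink and blur, which fits the "a bit slower" remark above. The clamp in `centred_crop` handles peaks near an edge, where the box cannot be fully centred.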
[update: jcupitt/libvips#619 has been improved a bit more, and now has a nip2 workspace for experimenting with the settings, plus some useful test images]
Commit 36078f9 on the ridge branch makes an internal switch to the smartcrop feature of libvips. The existing public API remains the same. This will be in v0.18.0.