Options for padding
Working with scattering for image analysis of weather forecasts, I was wondering what influence the type of padding has on the results.
The reflection padding implemented by default seems to produce high gradients on the boundaries. After a few subsamplings, the large values on the boundary gain significant weight in the spatial average, which could lead to estimation biases.
This is visible on orientation-averaged second-order coefficients.
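As a rough illustration of why the boundary gains weight, here is a back-of-the-envelope sketch (not taken from Kymatio; the 256-pixel image side is an assumed value and `border_fraction` is a hypothetical helper). It computes the share of pixels lying on the outermost ring of a square map, before and after subsampling by 2**J:

```python
def border_fraction(n):
    """Fraction of pixels on the outermost ring of an n x n map (n >= 2)."""
    interior = max(n - 2, 0) ** 2
    return 1.0 - interior / n ** 2


N = 256  # assumed image side, for illustration only
J = 4    # number of dyadic subsamplings

full_res = border_fraction(N)              # ring share at full resolution
subsampled = border_fraction(N // 2 ** J)  # ring share on the 16 x 16 output map
print(f"{full_res:.3f} -> {subsampled:.3f}")
```

Under these assumptions the outer ring goes from under 2% of the pixels to about 23% of them, so whatever happens at the boundary counts much more in a spatial average of the subsampled coefficients.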
Question 1: has this topic already been discussed?
Question 2: what about providing NumPy padding modes as an option to scattering?
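For reference, a minimal sketch of how two of the NumPy modes in question differ, using plain `np.pad` on a toy 1D array (this is independent of Kymatio):

```python
import numpy as np

x = np.array([1, 2, 3, 4])

# 'reflect' mirrors about the edge sample without repeating it
reflect = np.pad(x, 2, mode="reflect")      # [3 2 1 2 3 4 3 2]
# 'symmetric' mirrors about the array boundary, so the edge sample is repeated
symmetric = np.pad(x, 2, mode="symmetric")  # [2 1 1 2 3 4 4 3]

print(reflect)
print(symmetric)
```

The two modes only differ in whether the edge sample is duplicated, yet that is enough to change what a filter sees near the border.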
Now for a bit of a demo, if it adds to the discussion. After forking and making one or two additions to the numpy frontend/backend, one arrives at the following code:
from kymatio import Scattering2D
import numpy as np
import matplotlib.pyplot as plt

wind_norm = np.load('wind_norm.npy')
J, L = 4, 8

# Index of each order-2 coefficient within the order-2 block of the output,
# laid out as order2_ind[j1][j2 - j1 - 1][l1][l2]
order2_ind = [[[[L**2 * (j1*(J-1) - j1*(j1-1)//2)
                 + L * (j2-j1-1)
                 + L * (J-j1-1) * l1
                 + l2
                 for l2 in range(L)]
                for l1 in range(L)]
               for j2 in range(j1+1, J)]
              for j1 in range(J-1)]


def order2(data, pad_type):
    """
    Compute order-2 coefficients averaged over the l2 orientations.
    """
    # pad_type is the keyword added in the fork mentioned above
    scattering = Scattering2D(J, data.shape, L,
                              max_order=2, pre_pad=False,
                              frontend='numpy', backend='numpy',
                              out_type='array', pad_type=pad_type)
    # order-2 coefficients: skip the 1 order-0 and J*L order-1 coefficients
    results_order2 = scattering(data)[1 + J*L:]
    # averaging over l2 orientations
    S2_j1j2l1 = []
    for j1 in range(J-1):
        for j2 in range(J-j1-1):
            for l1 in range(L):
                S2_j1j2l1.append(
                    np.mean(results_order2[order2_ind[j1][j2][l1], :, :], axis=0)
                )
    return np.array(S2_j1j2l1)


S2_symm = order2(wind_norm, 'symmetric')
S2_reflect = order2(wind_norm, 'reflect')
# relative difference between the two padding modes, in percent
ratio = 100 * (S2_symm / S2_reflect - 1.0)
print(ratio.min(), ratio.max(), ratio.mean())
plt.imshow(ratio[-1][::-1, :])
plt.colorbar()
plt.show()
The data is downloadable in the zip attached. wind_norm.zip
Running this yields:
-19.169381905671035 13.210792343005728 -0.07116768707598231
Here both transforms coincide at the center, but not on the border. Overestimation by reflection padding is manifest here.
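A quick sanity check of the `order2_ind` formula used in the script, as a sketch that does not need Kymatio. It assumes the order-2 outputs are laid out with loops nested as j1, then l1, then j2 > j1, then l2 (this ordering is an assumption implied by the formula, not something verified against the library here), and `flat_index` is a hypothetical helper name:

```python
J, L = 4, 8


def flat_index(j1, j2, l1, l2):
    """Closed-form position used in order2_ind (relative to the order-2 block)."""
    return (L**2 * (j1 * (J - 1) - j1 * (j1 - 1) // 2)
            + L * (j2 - j1 - 1)
            + L * (J - j1 - 1) * l1
            + l2)


# Assumed output ordering: j1, then l1, then j2 > j1, then l2
counter = 0
mismatches = 0
for j1 in range(J - 1):
    for l1 in range(L):
        for j2 in range(j1 + 1, J):
            for l2 in range(L):
                if flat_index(j1, j2, l1, l2) != counter:
                    mismatches += 1
                counter += 1

n_order2 = L**2 * J * (J - 1) // 2  # expected number of order-2 coefficients
print(counter == n_order2, mismatches)
```

If the assumed ordering holds, the formula enumerates every order-2 coefficient exactly once, so the averaging over l2 picks the intended channels.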
Top GitHub Comments
OK, sorry for the long delay. Thank you both for pointing me to useful discussions.
Great proposal @lostanlen, having flexible padding options for the same scattering object is really more elegant (and lean). Looking at #727, the main issue for implementation seems to be the diversity of backends.
Agree that pad_mode seems better than pad_type.
As for the validation and correctness issue, @MuawizChaudhary:
A starting point could be something like https://arxiv.org/abs/2010.02178, showing what bias the different padding modes introduce, especially on intermediate activations (my modest plot above was a push in this direction).
Hoping for a “perfect” 2D padding that would let us forget boundary effects seems unreasonable, however.
Anyway, on texture datasets, it would be interesting to (statistically) compare
I might be able to give it a try on my weather dataset (which, strictly speaking, is not a texture but incorporates some). I will keep you updated on this issue 1) if you are interested and 2) if I can get it done in the coming weeks.
Hello @flyIchtus,
Right now, what you’re doing is indeed the only way to have custom padding/unpadding in Kymatio 2D: pass pre_pad=False and roll up your own implementation, order2_ind in your case.
But I agree that this is not satisfactory and ideally you should be able to pass a single keyword argument at runtime. Not:
But
(the keyword pad_type is up for debate. It’s not a “type” in the CS sense of the word)
In other words, the Scattering2D object shouldn’t need to know about what type of padding you’ll want to apply later. From a user’s perspective, this will allow you to have a single object computing scattering transforms with several kinds of padding. When that plan is ready, all your code will rewrite as
Thoughts?
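To make the "padding chosen at call time" idea concrete, here is a minimal sketch that does not use Kymatio at all. `RuntimePadded` and `box_blur` are hypothetical names, the toy periodic blur merely stands in for a transform computed with circular convolutions, and the `pad_mode` keyword is only the name floated in this thread, not an existing Kymatio argument:

```python
import numpy as np


def box_blur(x):
    """Toy stand-in for a transform: periodic 3x3 mean filter."""
    shifts = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    return sum(np.roll(x, s, axis=(0, 1)) for s in shifts) / len(shifts)


class RuntimePadded:
    """Hypothetical wrapper: the padding mode is chosen per call."""

    def __init__(self, transform, pad_width):
        self.transform = transform
        self.pad_width = pad_width

    def __call__(self, x, pad_mode="reflect"):
        p = self.pad_width
        padded = np.pad(x, p, mode=pad_mode)  # any mode accepted by np.pad
        out = self.transform(padded)
        return out[p:-p, p:-p]                # unpad back to the input shape


rng = np.random.default_rng(0)
image = rng.random((16, 16))

op = RuntimePadded(box_blur, pad_width=4)     # built once, no padding choice yet
out_reflect = op(image, pad_mode="reflect")
out_symmetric = op(image, pad_mode="symmetric")

# The two outputs agree away from the border and differ on it
same_interior = np.allclose(out_reflect[1:-1, 1:-1], out_symmetric[1:-1, 1:-1])
same_everywhere = np.allclose(out_reflect, out_symmetric)
print(same_interior, same_everywhere)
```

A single object then serves several padding modes, and in this toy case the outputs coincide in the interior while differing on the border.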