averaging rotations, why so many possibilities ?
See original GitHub issueHello, my apologies for this issue that is neither a bug nor a feature request. I just would like to better understand how to correctly compute the average of n rotation.
Reading a bit from publication I found on this topic, it seems my maths level is not enough to fully understand why there are so many ways to compute a mean of rotation.
I first to try to define what I expect for an average of rotation. let’s consider 2 rotation R1 and R1 that move n points P0 to P1 and P2. The average of those 2 rotation should be a rotation Rmean that moves the points P0 to Pmean, where Pmean is the mean position between P1 and P2… in the simple case where the 2 rotations are rotations around the same axis with an angle a1 and a2, then the mean rotation is rotation of angle = mean(a1,a2)
Now can I do it for any n rotations defines from euler angles on all three axes.
So here I will compare different way of computing the mean, from euler angle, from rotation matrix, or from quaternion. then I convert the resulting mean into euler angle, to see how different are those means.
euler_mean : it is simply the mean of each euler angles.
euler_exp : taken from this article “Linear Combination of Transformations” the average of a rotation matrix should be done with matrix log and exp
import scipy.linalg as scl
import numpy as np
import quaternion as nq
def exp_mean_affine(aff_list, weights=None):
if weights is None:
weights = np.ones(len(aff_list))
#normalize weigths
weights = weights / np.sum(weights)
Aff_mean = np.zeros((4, 4))
for aff, w in zip(aff_list, weights):
Aff_mean = Aff_mean + w*scl.logm(aff)
Aff_mean = scl.expm(Aff_mean)
return Aff_mean
euler_polar : As you suggest here is this issue #120 one can simply average the rotation matrix, and then take the polar decomposition
def polar_mean_affin(aff_lis, weights=None):
if weights is None:
weights = np.ones(len(aff_list))
#normalize weigths
weights = weights / np.sum(weights)
Aff_Euc_mean = np.zeros((3, 3))
for aff, w in zip(aff_list, weights):
Aff_Euc_mean = Aff_Euc_mean + w * aff[:3, :3]
Aff_mean = np.eye(4)
Aff_mean[:3,:3] = scl.polar(Aff_Euc_mean)[0]
return Aff_mean
It is well know that quaternion is the way to go, to get proper interpolation between rotation (or dual quaternion for the general case of rigid motion). For instance Skinning with Dual Quaternions demonstrate the better interpolation with dual quaternion compare to the exponential matrix propose by Alexa I then conclude that it may also be important for this topic. and this is the reason I start looking at quaternion
euler_qr_slerp : use the slerp interpolation to compute the mean. The trick here is to decompose a mean over n element to successive addition of 2 number with changing the weight. (and the weighted mean of 2 number is then compute with slerp)
def qr_slerp_mean(qr_list):
c_num, c_deno = 1, 2
for ii, qr in enumerate(qr_list):
if ii==0:
res_mean = qr
else:
t = 1 - c_num/c_deno
res_mean = nq.slerp(res_mean, qr, 0, 1, t)
c_num+=1; c_deno+=1
return res_mean
Just to check the method is correct, here is the same fonction, but for real number, this is indeed equivalent to the standard mean function
def my_mean(x_list):
#just decompose the mean as a recursiv interpolation between 2 number, (to be extend to slerp interpolation)
c_num, c_deno = 1, 2
for ii, x in enumerate(x_list):
if ii==0:
res_mean = x
else:
t = 1 - c_num/c_deno
res_mean = np.interp(t,[0,1],[res_mean,x])
#res_mean = (res_mean * c_num + x) / c_deno
c_num+=1; c_deno+=1
return res_mean
x = np.random.random(100)
np.mean(x)-my_mean(x)
euler_choral : an other way is to use your function mean_rotor_in_chordal_metric() (although I am not sure it is appropriate, since you told me here
that’s referring to a different type of mean
how different ? is what I trying to evaluate here, but which one to choose ? is not clear at all to me …
Here is the code to compare those 5 ways to compute a mean, I compute 50 times a means of 10 random rotation define from random euler angle:
import quaternion as nq
def get_affine_rot_from_euler(e_array):
aff_list=[]
for r in e_array:
r = np.deg2rad(r)
qr = nq.from_euler_angles(r)
aff = np.eye(4, 4);
aff[:3,:3] = nq.as_rotation_matrix(qr)
aff_list.append(aff)
return aff_list
def get_modulus_euler_in_degree(euler):
euler = np.rad2deg(euler)
for index, e in enumerate(euler):
if e < -180:
euler[index] = e + 360
if e > 180:
euler[index] = e -360
return euler
def get_euler_from_qr(qr):
return get_modulus_euler_in_degree(nq.as_euler_angles(qr))
def get_euler_from_affine(aff):
qr = nq.from_rotation_matrix(aff)
return get_modulus_euler_in_degree(nq.as_euler_angles(qr))
np.random.seed(4)
nb_mean=50
euler_mean, euler_choral, euler_exp, euler_pol, euler_slerp, euler_qr_slerp = np.zeros((nb_mean,3)), np.zeros((nb_mean,3)), np.zeros((nb_mean,3)), np.zeros((nb_mean,3)), np.zeros((nb_mean,3)), np.zeros((nb_mean,3))
for i in range(nb_mean):
rot_euler = np.random.uniform(2,10,size=(10, 3)) #in degree
print(f'min {np.min(rot_euler)} max {np.max(rot_euler)}')
aff_list = get_affine_rot_from_euler(rot_euler) #4*4 affine matrix
qr_list = [ nq.from_rotation_matrix(aff) for aff in aff_list] #unit quaternion
euler_mean[i,:] = np.mean(rot_euler,axis=0)
qr_cm = nq.mean_rotor_in_chordal_metric(qr_list); #print(get_info_from_quat(qr_cm)); print(get_euler_for_qr(qr_cm))
euler_choral[i,:] = get_euler_from_qr(qr_cm)
aff_exp_mean = exp_mean_affine(aff_list); #print(spm_imatrix(aff_exp_mean)[3:6])
euler_exp[i,:] = get_euler_from_affine(aff_exp_mean)
aff_polar_mean = polar_mean_affin(aff_list); #print(spm_imatrix(aff_exp_mean)[3:6])
euler_pol[i,:] = get_euler_from_affine(aff_polar_mean)
qr_mean = qr_slerp_mean(qr_list)
euler_qr_slerp[i,:] == get_euler_from_qr(qr_mean)
print(f'max diff euler between choral and euler mean {np.max(np.abs(euler_choral - euler_mean))}')
print(f'max diff euler between exp matrix and euler mean {np.max(np.abs(euler_exp - euler_mean))}')
print(f'max diff euler between polar matrix and euler mean {np.max(np.abs(euler_pol - euler_mean))}')
#print(f'max diff euler between dq slerp and euler mean {np.max(np.abs(euler_slerp - euler_mean))}')
print(f'max diff euler between slerp and euler mean {np.max(np.abs(euler_qr_slerp - euler_mean))}')
print(f'max diff euler between slerp and choral {np.max(np.abs(euler_qr_slerp - euler_choral))}')
print(f'max diff euler between slerp and exp {np.max(np.abs(euler_qr_slerp - euler_exp))}')
print(f'max diff euler between slerp and polar {np.max(np.abs(euler_qr_slerp - euler_pol))}')
print(f'max diff euler between exp matrix and choral {np.max(np.abs(euler_exp - euler_choral))}')
print(f'max diff euler between exp matrix and polar {np.max(np.abs(euler_exp - euler_pol))}')
print(f'max diff euler between choral and polar {np.max(np.abs(euler_choral - euler_pol))}')
when runing with relative small euler angles ([2 10], I get the following results
max diff euler between choral and euler mean 0.5771007165992952
max diff euler between exp matrix and euler mean 0.5792118663119004
max diff euler between polar matrix and euler mean 0.5770459422450864
max diff euler between slerp and euler mean 0.5773281912799781
max diff euler between slerp and choral 0.0006857539330056284
max diff euler between slerp and exp 0.0038490835044502347
max diff euler between slerp and polar 0.0017496106805383604
max diff euler between exp matrix and choral 0.003745340455438395
max diff euler between exp matrix and polar 0.0034720351873698263
max diff euler between choral and polar 0.0011792367475038645
So the basic mean of euler angle is not that bad ! maximum error found is 0.57 °, and the other methods, are much more similar, with less than 0.001° maximum error
If now i re-run but just changing the range of euler angle to 2 30 (degree)
rot_euler = np.random.uniform(2,30,size=(10, 3)) #in degree
then I get much differences compare to basic euler mean
max diff euler between choral and euler mean 3.4405882549275937
max diff euler between exp matrix and euler mean 3.477347445081607
max diff euler between polar matrix and euler mean 3.4494764238804567
max diff euler between slerp and euler mean 3.4314367559995063
and other method are still similar although euler_exp seem to start to sligthly diverge (max diff aroun 0.1)
max diff euler between slerp and choral 0.05032053106625689
max diff euler between slerp and exp 0.1275873270845338
max diff euler between slerp and polar 0.06344169651966247
max diff euler between exp matrix and choral 0.12007111035800833
max diff euler between exp matrix and polar 0.11048402024126425
max diff euler between choral and polar 0.04851378454809563
So here is my 'informatic" way to understand what is going on, but a more theoretical understanding would be much better, … any clues would be appreciate
Many thanks
Issue Analytics
- State:
- Created 2 years ago
- Comments:6 (2 by maintainers)
Remember that there is no single “correct” average. Instead, there are numerous ways — as you have found — to define various things that might roughly correspond to what you intend in a particular situation. The only way to decide if one definition is the best for some particular purpose is to investigate it for that particular purpose. You’ve discovered this for yourself, too. But I worry that the measure you’ve defined is not a very good one, just because Euler angles really are terrible.[1] In particular, your measure is not “invariant”, which means that your results will change if you just rotate your coordinates. I suggest switching to
quaternion.rotation_chordal_distance
orquaternion.rotation_intrinsic_distance
.I suggest taking a step back and thinking about the rotation group itself. Unit quaternions, rotation matrices, and Euler angles are all just ways of putting the rotation group on computers, so fundamentally they sort of represent the same thing. In particular, the logarithm and exponential of quaternions are very closely related to the logarithm and exponential of rotation matrices, and — in some ways — to the Euler angles and resulting rotations themselves. But there are lots of ways in which quaternions are the most “faithful” of these, so let’s stick with them for now.
The quaternions can be thought of as just a regular four-dimensional space, so the unit quaternions can be thought of as a three-dimensional sphere in that four-dimensional space. It may be a little hard to visualize four dimensions, but luckily you can get most of the intuition you need just by thinking of an ordinary two-dimensional sphere in three dimensions, and the standard spherical coordinates.[2]
So now we can think of a simple but useful analogy. Imagine trying to define the average location of a bunch of points on a sphere. You’ll probably want the average to be a point on the sphere that minimizes the average distance to all the other points. But your result will depend on how you define distance between two points.
The first of these depends on how the streets are arranged, so it isn’t geometrically meaningful. In the same way, Euler angles depend on the coordinates you chose when defining the Euler angles. In particular, your Euler measure will give a huge (possibly infinite!) amount of weight to points close to the poles, compared to points near the equator. If you had just decided to use different coordinates, that pole would be in a different place, so you would effectively give all your points completely different weights.
The second and third don’t depend on coordinates at all, which implies they are both geometrically meaningful, which is why I use them in the code. But they are different. The results are nearly equal for points that are close to each other, but as the points get farther away, the intrinsic distance grows more quickly than the chordal distance. So the intrinsic average will give more weight to points that are farther away from the average. In statistics, we would say that this is more sensitive to “outliers”. They’re different, but neither is “wrong”, so neither is “right”.
Now go back to the definition of the average. I suggested that you want to minimize the distances between the average and all the other points. But why not the square of the distances? There are examples of this in the literature. The results again give much more weight to “outliers”, but I can’t say that they’re wrong.
Or maybe you are less confident in your measurement of the locations of some points than others, so why not manually give different weights to different points?
Or maybe it’s easier for you to compute things with rotation matrices. Just as with quaternions, you can define different measures of distances between rotation matrices — including intrinsic and chordal measures — that won’t be precisely the same as the quaternion measures.
Or maybe you want to talk about “generators” of rotations (which are essentially the logarithms of quaternions or matrices).
Or… I could go on.
So the reason there are so many ways to average rotations is just because there are lots of ways to represent rotations, lots of ways to measure “distances” with each of those representations, and lots of ways (and weights) to combine the distances of various points.[3]
We’re still left with the question of which one to use. The “best” answer depends on your precise requirements. But for me, simplicity and speed are important requirements, and I haven’t seen any reason to think that any other definition is ever any better than just the simple and fast
mean_rotor_in_chordal_metric
.[1] The only time Euler angles are reliable is when you’re doing analytical integration over the rotation group. It doesn’t sound like you’re interested in that, so it really is best to just stay away from them entirely.
[2] There’s another complication due to the fact that the unit quaternions actually map projectively onto the rotation group (meaning that
q
and-q
represent the same rotation). But at least if we stick to one hemisphere, this doesn’t actually matter, so we can leave that aside for now.[3] For reference, here are some references that present various ways of computing averages: * http://arc.aiaa.org/doi/10.2514/1.28949 * http://doi.org/10.1145/566570.566592 * https://doi.org/10.1007/s00006-013-0432-2 * https://doi.org/10.1023/A:1011129215388 * http://epubs.siam.org/doi/abs/10.1137/S0895479801383877
This part:
You need to step back and think about exactly what you want. Ignore quaternions for the moment, and go back to your data, and think about what you actually want to do with it.
I’m not exactly clear on how your data are represented, but at a basic level I guess the quantity you measure is density (or something similar) as a function of position: ρ(𝒙). It sounds like you have multiple measurements of more-or-less the same object — and therefore roughly the same underlying density field — and you want to line them up with each other as well as possible. Obviously if they were identical objects, you could do this with just a translation and rotation. You can deal with the translation by just computing the center of mass of the density field: ∫ ρ(𝒙) 𝒙 d³𝒙. This is invariant under rotations about the center of mass, so you can just bring the center of mass to — say — the origin of coordinates. You can do that for each measurement you have, and just be done with translations. (If you want to use dual quaternions / conformal geometric algebra to do all of this at once, it is possible to do so, but nothing you’ve written suggests that’s necessary.)
Now, the game is just to rotate about the origin to match your different measurements to each other. There are lots of ways I can imagine doing this. The most obvious (though probably not easiest) way would be to optimize ∫ |ρ₁(𝒙) - ρ₂(𝑹𝒙𝑹̄)|² d³𝒙 over the quaternion 𝑹. Another way would be to measure the second moments of the density fields (the center of mass being the first moment), and then solve for the rotation that takes one moment onto the other. Another way would be to establish a minimal set of reference points that let you define the orientation of the object, and then just solve algebraically for the rotation that takes one onto the other. Another way would be to measure numerous reference points and minimize the average distance between corresponding points Σᵢ|𝒙ⁱ₁ - 𝑹𝒙ⁱ₂𝑹̄|².
My point is that, even in this simplified description, there are again many ways you could approach the problem. So the mathematics can’t tell you which one you want to do; you have to decide. If your chosen method leads you back to averaging quaternions, then I bet it will either indicate which type of average you want, or it won’t matter which type of average you use. (In the limit that your different measurements really are just rotated and translated, it shouldn’t matter which type of average you use.)