
[Feature Request][Lie] Support for SO(3) and SE(3) Lie groups

Created this to discuss implementation of rigid-body motion parameterization on SO(3) and SE(3).

Currently, several optimization schemes for sfm and related problems parameterize rotations and rototranslations as Lie group elements. Implementing these utils is also part of the kornia.sfm roadmap (see https://github.com/kornia/kornia/issues/480).

I think it makes sense for these ops to be in kornia.linalg (given all the other transformation stuff is in there). For the internal design, we could have SO(3) utils and SE(3) utils inside separate submodules and implement torch.nn.Modules for the ExpMap and LogMap for both manifolds.

Alternate propositions welcome!

Issue Analytics

State:
Created 4 years ago
Comments:11 (7 by maintainers)

Top GitHub Comments

2 reactions

versatran01 commented, Nov 25, 2020

I would start by proposing the following concepts and APIs. This is by no means complete, but serves as a good starting point.

Frame convention

Talking about transformations without frames is meaningless. We should therefore clarify our frame convention first. We should follow the convention used in Robot Modeling and Control - Spong, which is explained in detail by http://paulfurgale.info/news/2014/6/9/representing-robot-pose-the-good-the-bad-and-the-ugly, and only use the concept of passive rotations (transforming frames) and try to avoid active rotations (transforming points).

Basically:
R_0_1 represents a rotation that transforms a point (vector) from frame 1 to frame 0.
p_1 represents a point in frame 1.
t_1_10 represents a vector in frame 1 that starts from frame 1 and points to frame 0.

With these we have:
p_0 = R_0_1 @ p_1
R_0_2 = R_0_1 @ R_1_2
p_1 = R_0_1.T @ p_0 = R_1_0 @ p_0
t_0_10 = R_0_1 @ t_1_10
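The index rules above can be checked numerically. This is a minimal sketch in PyTorch; the helper rmat_z here is a hypothetical stand-in written for the example (not an existing kornia function), and the angles are arbitrary.

```python
import math
import torch

def rmat_z(psi: float) -> torch.Tensor:
    """Rotation about the z axis by psi radians (hypothetical helper for this example)."""
    c, s = math.cos(psi), math.sin(psi)
    return torch.tensor(
        [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]], dtype=torch.float64
    )

R_0_1 = rmat_z(math.pi / 2)  # maps frame-1 coordinates to frame-0 coordinates
R_1_2 = rmat_z(math.pi / 4)  # maps frame-2 coordinates to frame-1 coordinates

p_1 = torch.tensor([1.0, 0.0, 0.0], dtype=torch.float64)
p_0 = R_0_1 @ p_1        # the same point, expressed in frame 0
R_0_2 = R_0_1 @ R_1_2    # the inner indices (1 and 1) cancel
R_1_0 = R_0_1.T          # the inverse of a rotation is its transpose
p_1_back = R_1_0 @ p_0   # back to frame-1 coordinates
```

Reading the subscripts left to right, adjacent indices must match for a product to make sense, which is the main practical benefit of the naming scheme.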

To be a bit more specific, in SfM an extrinsic matrix is T_cam_world, whereas a pose is T_world_cam. In our API, we should be very clear about which one we accept.

Rotation

A new module (file) named rotation.py, which can live wherever you see fit, for example kornia.geometry.rotation. Inside we have the following concepts. The order within each bullet point is: full name, short name, type, shape, representation.

rotation_matrix, rmat, torch.tensor, (*,3,3), [[r11, r12, r13], [r21, r22, r23], [r31, r32, r33]]. The only issue is that it can become non-orthonormal, and it is hard to renormalize. So we only use it as an intermediate representation (to transform points).
unit_quaternion, quat, torch.tensor, (*,4), [qw, qx, qy, qz]. It has the same issue as the rotation matrix, in that it needs to be renormalized, but at least that is an easy operation. The question here is whether any function that takes a quat should provide an option to renormalize it. For example, with rmat_from_quat(quat, renorm=False), if the user knows that the quaternion is unit, then there’s no extra cost, but if not (the quat has been composed multiple times), then they should use renorm=True. This also uses the Hamilton convention [qw, qx, qy, qz] instead of JPL [qx, qy, qz, qw].
rotation_vector, rvec, torch.tensor, (*,3), [rx, ry, rz]
angle_axis, anax, torch.tensor, (*,4), [an, ax, ay, az]. The rotation vector is very similar to angle axis: [rx, ry, rz] = an * [ax, ay, az]. Since angle axis is indistinguishable from a quaternion by shape (they both have 4 elements), I think we should only keep the rotation vector in the public API (to prevent people from passing angle axis to quaternion functions and vice versa). We could still use angle axis internally for convenience. There are details to sort out when the magnitude of the rotation vector is very small.
euler_angles, erpy, torch.tensor, (*,3), [r, p, y]. We should NOT use this representation at all, as it just causes more confusion. Listed here only for completeness.
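To make the renorm question concrete, here is one way rmat_from_quat(quat, renorm=False) could look. It is a sketch under the conventions listed above (Hamilton order [qw, qx, qy, qz], shape (*, 4)); the body is an assumption for illustration, not an existing kornia implementation.

```python
import math
import torch

def rmat_from_quat(quat: torch.Tensor, renorm: bool = False) -> torch.Tensor:
    """Convert Hamilton quaternions [qw, qx, qy, qz], shape (*, 4), to matrices (*, 3, 3).

    With renorm=False the input is trusted to be unit norm (no extra cost).
    """
    if renorm:
        quat = quat / quat.norm(dim=-1, keepdim=True)
    w, x, y, z = quat.unbind(dim=-1)
    entries = [
        1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y),
        2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x),
        2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y),
    ]
    # Stack the 9 entries in row-major order and fold them into 3x3 blocks.
    return torch.stack(entries, dim=-1).reshape(*quat.shape[:-1], 3, 3)

# A 90 degree rotation about z, deliberately scaled so that it is NOT unit norm.
q = 2.0 * torch.tensor(
    [math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4)], dtype=torch.float64
)
R_good = rmat_from_quat(q, renorm=True)   # a proper rotation matrix
R_bad = rmat_from_quat(q, renorm=False)   # not orthonormal, since q is not unit
```

The same function accepts a stacked batch of shape (N, 4) and returns (N, 3, 3) without any special casing.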

Conversions

We then provide the following conversions (could use the full name, but I use the short name here). The conversion functions follow the form a_from_b, such that you can write a = a_from_b(b_from_c(c)) instead of a = b2a(c2b(c)), which is just not intuitive. For example, quat_from_rmat and rmat_from_quat, and all the other combinations. And, as one of the issues suggested, we make sure the conversion cycle is consistent, e.g. assert q == quat_from_rmat(rmat_from_rvec(rvec_from_quat(q)))
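A shorter version of such a cycle check, using only the rvec and quat pair, might look like the sketch below. The function bodies, the eps threshold, and the small-angle approximations are assumptions made for this example. Note that q and -q encode the same rotation, so a cycle can only be expected to return q up to sign.

```python
import torch

def quat_from_rvec(rvec: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Rotation vector (*, 3) -> Hamilton unit quaternion [qw, qx, qy, qz] (*, 4)."""
    theta = rvec.norm(dim=-1, keepdim=True)
    small = theta < eps
    safe_theta = torch.where(small, torch.ones_like(theta), theta)
    # sin(theta / 2) / theta, replaced by its Taylor expansion near zero.
    scale = torch.where(
        small, 0.5 - theta**2 / 48.0, torch.sin(0.5 * safe_theta) / safe_theta
    )
    return torch.cat([torch.cos(0.5 * theta), scale * rvec], dim=-1)

def rvec_from_quat(quat: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Hamilton unit quaternion (*, 4) -> rotation vector (*, 3), angle in [0, pi]."""
    # Pick the representative with qw >= 0, since q and -q are the same rotation.
    quat = torch.where(quat[..., :1] < 0, -quat, quat)
    w, v = quat[..., :1], quat[..., 1:]
    n = v.norm(dim=-1, keepdim=True)
    small = n < eps
    safe_n = torch.where(small, torch.ones_like(n), n)
    # theta / |v|; for a unit quaternion this tends to 2 as |v| goes to 0.
    scale = torch.where(small, torch.full_like(n, 2.0), 2.0 * torch.atan2(n, w) / safe_n)
    return scale * v

rvec = torch.tensor(
    [[0.3, -0.2, 0.5], [0.0, 0.0, 0.0], [1e-10, 0.0, 0.0]], dtype=torch.float64
)
rvec_back = rvec_from_quat(quat_from_rvec(rvec))
```

Including the zero and near-zero rows in the test input exercises the small-magnitude branch that the rotation vector discussion above flags as needing care.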

Utilities

Then for each representation, we provide some useful functions (could put them into individual files)

rotation_matrix
- R = rmat_normed(Q), solves for the best rotation matrix R that approximates a given matrix Q, where best is defined by argmin_{R} ||R - Q||^2_F, subject to R' R = I. Implemented via SVD.
- R = rmat_identity(n=0), generate identity rotation matrices, n is the batch size, n=0 or None means a single rotation matrix of 3x3.
- R_1_0 = rmat_inv(R_0_1) -> R_1_0 = R_0_1^T,
- R_0_2 = rmat_oplus(R_0_1, R_1_2) -> R_0_2 = R_0_1 @ R_1_2, basically the boxplus operator. It composes two rotations, and the user has to make sure the frames align. See the paper by Hertzberg (Integrating generic sensor fusion …)
- R_2_1 = rmat_ominus(R_0_1, R_0_2) -> R_0_2^{-1} @ R_0_1 = R_2_0 @ R_0_1 = R_2_1, the boxminus operator.
- p_0 = rmat_action(R_0_1, p_1) -> p_0 = R_0_1 @ p_1, action on 3d points.
- rmat_x(phi), rmat_y(theta), rmat_z(psi), active rotation around each axis by some amount, which is equivalent to rotating the frame the other way by the same amount.
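A possible sketch of rmat_normed, rmat_oplus and rmat_ominus as listed above. The SVD projection is the standard orthogonal Procrustes solution; forcing det(R) = +1 is an extra assumption added here so that the result is a proper rotation and not a reflection, since the constraint R' R = I alone does not rule reflections out.

```python
import torch

def rmat_normed(Q: torch.Tensor) -> torch.Tensor:
    """Closest rotation to Q (*, 3, 3) in Frobenius norm, computed via SVD."""
    U, _, Vh = torch.linalg.svd(Q)
    # Flip the direction of the smallest singular value if U @ Vh is a reflection.
    d = torch.linalg.det(U @ Vh)
    ones = torch.ones_like(d)
    D = torch.diag_embed(torch.stack([ones, ones, d], dim=-1))
    return U @ D @ Vh

def rmat_oplus(R_0_1: torch.Tensor, R_1_2: torch.Tensor) -> torch.Tensor:
    """Compose rotations: R_0_2 = R_0_1 @ R_1_2."""
    return R_0_1 @ R_1_2

def rmat_ominus(R_0_1: torch.Tensor, R_0_2: torch.Tensor) -> torch.Tensor:
    """Relative rotation: R_2_1 = R_0_2^T @ R_0_1."""
    return R_0_2.transpose(-1, -2) @ R_0_1

torch.manual_seed(0)
eye = torch.eye(3, dtype=torch.float64)
# Perturb the identity so that the matrix is no longer orthonormal, then project back.
Q = eye + 0.05 * torch.randn(3, 3, dtype=torch.float64)
R_0_1 = rmat_normed(Q)
R_1_2 = rmat_normed(eye + 0.05 * torch.randn(3, 3, dtype=torch.float64))
R_0_2 = rmat_oplus(R_0_1, R_1_2)
R_2_1 = rmat_ominus(R_0_1, R_0_2)  # should equal R_1_2 transposed
```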
quaternion
- q = quat_identity(n=0), generate identity unit quaternions.
- |q| = quat_norm(q), the norm of the quaternion
- uq = quat_normalized(q) -> uq = q / |q|, return the normalized quaternion
- q = quat_mul(q1, q2) -> q = q1 * q2, quaternion multiplication. Note that there is no frame convention here; this is a purely mathematical operation defined on quaternions.
- quat_conj(q), conjugate of the quaternion
- quat_inv(q), inverse of quaternion
- maybe some more, but for now these will suffice
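The quaternion helpers could be sketched as follows, assuming the Hamilton order [qw, qx, qy, qz] and shape (*, 4) from the Rotation section. These bodies are illustrative assumptions, not existing kornia code.

```python
import torch

def quat_mul(q1: torch.Tensor, q2: torch.Tensor) -> torch.Tensor:
    """Hamilton product q1 * q2 for quaternions stored as [qw, qx, qy, qz]."""
    w1, x1, y1, z1 = q1.unbind(dim=-1)
    w2, x2, y2, z2 = q2.unbind(dim=-1)
    return torch.stack(
        [
            w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
            w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
            w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
            w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
        ],
        dim=-1,
    )

def quat_conj(q: torch.Tensor) -> torch.Tensor:
    """Conjugate: negate the vector part."""
    return q * q.new_tensor([1.0, -1.0, -1.0, -1.0])

def quat_inv(q: torch.Tensor) -> torch.Tensor:
    """Inverse: conjugate divided by the squared norm (valid for non-unit q too)."""
    return quat_conj(q) / (q * q).sum(dim=-1, keepdim=True)

i = torch.tensor([0.0, 1.0, 0.0, 0.0], dtype=torch.float64)
j = torch.tensor([0.0, 0.0, 1.0, 0.0], dtype=torch.float64)
k = quat_mul(i, j)            # i * j = k
minus_k = quat_mul(j, i)      # j * i = -k, so the product does not commute

q = torch.tensor([0.5, -1.0, 2.0, 0.25], dtype=torch.float64)
identity = quat_mul(q, quat_inv(q))
```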
random: mostly used for testing, where we could generate random rotations. For now I think uniform sampling will suffice.
- rand_quat, see "Effective sampling and distance metrics for 3D rigid body path planning", J. Kuffner
- rand_rmat, uniform sampling of rotation matrices
- rand_rvec or rand_anax, uniform sampling of rotation vector
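For rand_quat, one sketch is the three-uniform-variable sampler usually attributed to Shoemake, which, as far as I recall, is the scheme the Kuffner paper presents. The signature, the float64 default, and the optional generator argument are assumptions made for this example.

```python
import math
from typing import Optional

import torch

def rand_quat(n: int, generator: Optional[torch.Generator] = None) -> torch.Tensor:
    """Sample n unit quaternions [qw, qx, qy, qz], uniformly over rotations."""
    u = torch.rand(n, 3, dtype=torch.float64, generator=generator)
    s = u[:, 0]
    theta1 = 2.0 * math.pi * u[:, 1]
    theta2 = 2.0 * math.pi * u[:, 2]
    sigma1, sigma2 = torch.sqrt(1.0 - s), torch.sqrt(s)
    # sigma1^2 + sigma2^2 = 1, so every sample has unit norm by construction.
    return torch.stack(
        [
            sigma2 * torch.cos(theta2),
            sigma1 * torch.sin(theta1),
            sigma1 * torch.cos(theta1),
            sigma2 * torch.sin(theta2),
        ],
        dim=-1,
    )

gen = torch.Generator().manual_seed(0)
quats = rand_quat(1000, generator=gen)
```

Passing a seeded generator keeps test inputs reproducible without touching the global random state.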

Transformation

Following on, there can also be a transformation.py, where we have:

transformation_matrix, tmat, torch.tensor, (*, 4, 4), [[R, t], [0, 1]]
- T = tmat_from_rmat_tvec(R, t)
- R, t = rmat_tvec_from_tmat(T)
- p_0 = tmat_act(T_0_1, p_1)
- T_0_2 = tmat_oplus(T_0_1, T_1_2)
- T_2_1 = tmat_ominus(T_0_1, T_0_2)
- T = T_identity(n=0)
- T_1_0 = T_inv(T_0_1)
pose_quat_vec, pose, torch.tensor, (*, 7), [qw, qx, qy, qz, tx, ty, tz]. This feels less useful if we already have the above, so maybe just provide conversions to and from it.
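A sketch of three of the tmat helpers, also showing the extrinsic versus pose distinction from the frame convention section (T_cam_world is the inverse of T_world_cam). The function bodies are assumptions for illustration, and tmat_inv is used here as a hypothetical name for the inverse listed above as T_inv.

```python
import torch

def tmat_from_rmat_tvec(R: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
    """Build [[R, t], [0, 1]] of shape (*, 4, 4) from R (*, 3, 3) and t (*, 3)."""
    T = torch.zeros(*R.shape[:-2], 4, 4, dtype=R.dtype, device=R.device)
    T[..., :3, :3] = R
    T[..., :3, 3] = t
    T[..., 3, 3] = 1.0
    return T

def tmat_inv(T_0_1: torch.Tensor) -> torch.Tensor:
    """Closed-form inverse: T_1_0 = [[R^T, -R^T t], [0, 1]]."""
    R_inv = T_0_1[..., :3, :3].transpose(-1, -2)
    t = T_0_1[..., :3, 3]
    t_inv = -(R_inv @ t.unsqueeze(-1)).squeeze(-1)
    return tmat_from_rmat_tvec(R_inv, t_inv)

def tmat_act(T_0_1: torch.Tensor, p_1: torch.Tensor) -> torch.Tensor:
    """Apply the transform to 3D points: p_0 = R_0_1 @ p_1 + t."""
    R, t = T_0_1[..., :3, :3], T_0_1[..., :3, 3]
    return (R @ p_1.unsqueeze(-1)).squeeze(-1) + t

# Camera pose in the world: rotated 90 degrees about z and shifted.
R_world_cam = torch.tensor(
    [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]], dtype=torch.float64
)
t_world_cam = torch.tensor([1.0, 2.0, 3.0], dtype=torch.float64)
T_world_cam = tmat_from_rmat_tvec(R_world_cam, t_world_cam)  # pose
T_cam_world = tmat_inv(T_world_cam)                          # extrinsic matrix

p_cam = torch.tensor([1.0, 0.0, 0.0], dtype=torch.float64)
p_world = tmat_act(T_world_cam, p_cam)
p_cam_back = tmat_act(T_cam_world, p_world)
```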

Batch dimensions, Numpy Interop, and other concerns

Ideally, no function should care about batch dimensions, so that we can pass a single 3x3 matrix or an Nx3x3 tensor and get the correct dimensions back. I think this is already the case in kornia.
All functions that don’t use torch-specific stuff could have the type Union[torch.Tensor, numpy.ndarray]. This is debatable, but I think it would be useful to have, instead of having fun(torch.from_numpy(R)).numpy() everywhere. https://eagerpy.jonasrauber.de/ might be useful for this.
Another question is whether we want to append the dimension to some of these concepts, for example rmat3 vs rmat2 and pose3 vs pose2. Or we could just use the plain names for 3D constructs and append 2 for the 2D versions (since people don’t use those very often).
I also agree with you regarding the functional vs OO API. I think for tensors a functional API is better, whereas an OO one seems more suitable for a single object. We should start with the functional one, and the OO one can be built on top of it. This is also related to the pinhole camera class in kornia. Applying the same logic here, we should provide functions operating on (*,4) tensors of the form [fx, fy, cx, cy], and build PinholeCamera using these functions. A rectified stereo camera can be represented by a left one [fx, fy, cx, cy, b] and a right one [fx, fy, cx, cy, -b]. And a function that scales the intrinsics can take both the mono and the stereo representation and only operate on the part fxycxy[..., :4].
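The intrinsics idea could look like the sketch below. The function name scale_intrinsics and the plain multiplicative scaling (which ignores any half-pixel offset convention for the principal point) are assumptions for this example.

```python
import torch

def scale_intrinsics(fxycxy: torch.Tensor, scale: float) -> torch.Tensor:
    """Scale [fx, fy, cx, cy] in a (*, 4) mono or (*, 5) stereo intrinsics tensor.

    Only the first four entries are touched, so a trailing baseline b is preserved.
    """
    out = fxycxy.clone()
    out[..., :4] = out[..., :4] * scale
    return out

mono = torch.tensor([500.0, 500.0, 320.0, 240.0])
stereo_left = torch.tensor([500.0, 500.0, 320.0, 240.0, 0.1])

mono_half = scale_intrinsics(mono, 0.5)
stereo_half = scale_intrinsics(stereo_left, 0.5)  # baseline 0.1 is left unchanged
```

Because the function indexes with fxycxy[..., :4], the same code path serves both layouts and any number of leading batch dimensions.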

Lie groups

Once we have these building blocks in place, we can then move on to the more sophisticated Lie group stuff. But I think the proposed rotation and transformation modules should cover most of the use cases.

I have most of this implemented in my own code base with tests (although it will take time to carefully convert it to torch). Any feedback would be appreciated. Also @edgarriba, can you point me to the workflow for developing within kornia?

1 reaction

krrish94 commented, Mar 21, 2020

LGTM, I’d personally shorten the module name to kornia.geometry.linalg.lie
