Faster U3 composition
What should we add?
Circuits with large numbers of U / U3 gates have recently become important. While looking into how Optimize1qGates composes U3 gates, it was noted that (if you use U rather than U3, because of overheads in the latter) Optimize1qGates.compose_u3, and in particular its calls to Quaternion, becomes the bottleneck. It would be nice to push this bit of code down into Rust to get an easy 100x improvement in compose_u3. Here is some example Cython code that is a drop-in replacement for the current code and is hopefully easy for someone to port over:
import numpy as np
cimport numpy as np
cimport cython
from libc.stdlib cimport malloc, free
from libc.math cimport cos, sin, atan2, acos, M_PI, fabs

@cython.boundscheck(False)
@cython.cdivision(True)
def cy_compose_u3(double theta1, double phi1, double lambda1, double theta2, double phi2, double lambda2):
    cdef double[::1] out_angles = np.zeros(3, dtype=float)
    cdef size_t kk
    # temp arrays; each quaternion holds 4 components (w, x, y, z)
    cdef double * q = <double *>malloc(4*sizeof(double))
    cdef double * r = <double *>malloc(4*sizeof(double))
    cdef double * s = <double *>malloc(4*sizeof(double))
    cdef double * temp = <double *>malloc(4*sizeof(double))
    cdef double * out = <double *>malloc(4*sizeof(double))
    cdef double * mat = <double *>malloc(9*sizeof(double))
    cdef double * euler = <double *>malloc(3*sizeof(double))

    # malloc does not zero memory; the degenerate branches below
    # only set some of the angles, so start from zero
    for kk in range(3):
        euler[kk] = 0.0

    # q: rotation about Y by theta1
    q[0] = cos(theta1/2.0)
    q[1] = 0
    q[2] = sin(theta1/2.0)
    q[3] = 0

    # r: rotation about Z by lambda1 + phi2
    r[0] = cos((lambda1 + phi2)/2.0)
    r[1] = 0
    r[2] = 0
    r[3] = sin((lambda1 + phi2)/2.0)

    # s: rotation about Y by theta2
    s[0] = cos(theta2/2.0)
    s[1] = 0
    s[2] = sin(theta2/2.0)
    s[3] = 0

    # Compute YZY decomp (q.r.s in variable names)
    temp[0] = r[0] * q[0] - r[1] * q[1] - r[2] * q[2] - r[3] * q[3]
    temp[1] = r[0] * q[1] + r[1] * q[0] - r[2] * q[3] + r[3] * q[2]
    temp[2] = r[0] * q[2] + r[1] * q[3] + r[2] * q[0] - r[3] * q[1]
    temp[3] = r[0] * q[3] - r[1] * q[2] + r[2] * q[1] + r[3] * q[0]

    out[0] = s[0] * temp[0] - s[1] * temp[1] - s[2] * temp[2] - s[3] * temp[3]
    out[1] = s[0] * temp[1] + s[1] * temp[0] - s[2] * temp[3] + s[3] * temp[2]
    out[2] = s[0] * temp[2] + s[1] * temp[3] + s[2] * temp[0] - s[3] * temp[1]
    out[3] = s[0] * temp[3] - s[1] * temp[2] + s[2] * temp[1] + s[3] * temp[0]

    # out is now in YZY decomp, make into ZYZ
    # (row-major 3x3 rotation matrix of the quaternion)
    mat[0] = 1 - 2 * out[2] * out[2] - 2 * out[3] * out[3]
    mat[1] = 2 * out[1] * out[2] - 2 * out[3] * out[0]
    mat[2] = 2 * out[1] * out[3] + 2 * out[2] * out[0]
    mat[3] = 2 * out[1] * out[2] + 2 * out[3] * out[0]
    mat[4] = 1 - 2 * out[1] * out[1] - 2 * out[3] * out[3]
    mat[5] = 2 * out[2] * out[3] - 2 * out[1] * out[0]
    mat[6] = 2 * out[1] * out[3] - 2 * out[2] * out[0]
    mat[7] = 2 * out[2] * out[3] + 2 * out[1] * out[0]
    mat[8] = 1 - 2 * out[1] * out[1] - 2 * out[2] * out[2]

    # Grab the euler angles
    if mat[3*2+2] < 1.0:
        if mat[3*2+2] > -1.0:
            euler[0] = atan2(mat[3*1+2], mat[3*0+2])
            euler[1] = acos(mat[3*2+2])
            euler[2] = atan2(mat[3*2+1], -mat[3*2+0])
        else:
            euler[0] = -atan2(mat[3*1+0], mat[3*1+1])
            euler[1] = M_PI
    else:
        euler[0] = atan2(mat[3*1+0], mat[3*1+1])

    # Snap numerical noise to exactly zero
    for kk in range(3):
        if fabs(euler[kk]) < 1e-15:
            euler[kk] = 0.0

    out_angles[0] = euler[1]
    out_angles[1] = phi1+euler[0]
    out_angles[2] = lambda2+euler[2]

    # free pointers
    free(q)
    free(r)
    free(s)
    free(temp)
    free(out)
    free(mat)
    free(euler)
    return np.asarray(out_angles)
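For anyone porting this, it may help to have a slow reference to compare against. Below is a minimal pure-Python sketch of the same steps as the Cython snippet above (build the Y, Z, Y quaternions, multiply them, read off ZYZ Euler angles), plus a check against direct 2x2 matrix multiplication up to a global phase. All names here (quat_mul, compose_u3_py, u3_matrix, matmul2, equal_up_to_phase) are made up for illustration and are not part of Qiskit. The U3 matrix is assumed to follow the usual convention [[cos(θ/2), -e^{iλ} sin(θ/2)], [e^{iφ} sin(θ/2), e^{i(φ+λ)} cos(θ/2)]].

```python
import cmath
import math


def quat_mul(p, q):
    """Hamilton product p*q for quaternions stored as (w, x, y, z)."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    )


def compose_u3_py(theta1, phi1, lambda1, theta2, phi2, lambda2):
    """Hypothetical pure-Python mirror of the Cython snippet (illustration only)."""
    xi = lambda1 + phi2
    q = (math.cos(theta1 / 2), 0.0, math.sin(theta1 / 2), 0.0)  # Y rotation
    r = (math.cos(xi / 2), 0.0, 0.0, math.sin(xi / 2))          # Z rotation
    s = (math.cos(theta2 / 2), 0.0, math.sin(theta2 / 2), 0.0)  # Y rotation
    w, x, y, z = quat_mul(quat_mul(q, r), s)

    # Only the rotation-matrix entries that the ZYZ extraction reads.
    m02 = 2 * (x * z + y * w)
    m10 = 2 * (x * y + z * w)
    m11 = 1 - 2 * (x * x + z * z)
    m12 = 2 * (y * z - x * w)
    m20 = 2 * (x * z - y * w)
    m21 = 2 * (y * z + x * w)
    m22 = 1 - 2 * (x * x + y * y)

    alpha = beta = gamma = 0.0
    if m22 < 1.0:
        if m22 > -1.0:
            alpha = math.atan2(m12, m02)
            beta = math.acos(m22)
            gamma = math.atan2(m21, -m20)
        else:
            alpha = -math.atan2(m10, m11)
            beta = math.pi
    else:
        alpha = math.atan2(m10, m11)

    # Snap numerical noise to zero, as the Cython version does.
    alpha, beta, gamma = (0.0 if abs(a) < 1e-15 else a for a in (alpha, beta, gamma))
    return (beta, phi1 + alpha, lambda2 + gamma)


def u3_matrix(theta, phi, lam):
    """2x2 U3 matrix as nested tuples (assumed standard convention)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return (
        (complex(c), -cmath.exp(1j * lam) * s),
        (cmath.exp(1j * phi) * s, cmath.exp(1j * (phi + lam)) * c),
    )


def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def equal_up_to_phase(a, b, tol=1e-9):
    """True if b == exp(i*phase) * a entrywise, within tol."""
    overlap = sum(a[i][j].conjugate() * b[i][j] for i in range(2) for j in range(2))
    if abs(overlap) < tol:
        return False
    phase = overlap / abs(overlap)
    return all(abs(b[i][j] - phase * a[i][j]) < tol for i in range(2) for j in range(2))


if __name__ == "__main__":
    first = (0.3, 0.5, -0.7)
    second = (1.1, -0.4, 0.9)
    composed = compose_u3_py(*first, *second)
    product = matmul2(u3_matrix(*first), u3_matrix(*second))
    print(composed, equal_up_to_phase(product, u3_matrix(*composed)))
```

The same equal_up_to_phase check could be pointed at a Rust or Cython implementation, since it only needs the three returned angles. Comparing angles directly is less robust, because different angle triples can describe the same gate.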
Issue Analytics
- State:
- Created a year ago
- Comments:9 (6 by maintainers)
Top GitHub Comments
I have some interest in this improvement and could potentially try porting to Rust (at least the Cython snippet, not sure about all of Quaternion). Just to check first:

The original Cython code gave around 60x, so seems about right.