Single-precision `sum()` over certain axes sometimes deviates from NumPy's outcome
When using single-precision floating-point numbers (`float32` and `complex64`) and summing over a particular set of axes (`axis=(0, 1)` for `ndim=3`; other axis combinations work fine), the result of `sum(axis=axis)` in CuPy is not the same (within tolerance) as that of NumPy.
I'm not sure how this can be fixed. Most likely it is due to a combination of insufficient numerical precision and the order of operations on the GPU, as it only happens randomly (depending on the input). I believe CuPy's ufunc implementation is correct, because this also happened with my experimental CUB support for `sum(axis=axis)` (see https://github.com/cupy/cupy/issues/2519#issuecomment-544531651), which does not use ufunc at all.
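To illustrate the suspected cause (float32 addition is not associative, so the same addends summed in a different order can round differently), here is a minimal NumPy-only sketch. It is not the GPU code path: the smaller trailing axis, the seed, and the variable names are assumptions made for illustration, and the reordered reduction is only a stand-in for a different accumulation order.

```python
import numpy as np

rng = np.random.default_rng(0)
# Same reduction length as in the report (400 * 326 addends per output
# element), but a much smaller trailing axis to keep the example cheap.
x = rng.random((400, 326, 8), dtype=np.float32)

# Reference: accumulate in float64.
ref = x.sum(axis=(0, 1), dtype=np.float64)

# float32 accumulation in the order NumPy picks for this memory layout.
y32 = x.sum(axis=(0, 1))

# Same addends, different order: move the reduced axes last so the
# reduction runs along the fast axis in memory, where NumPy's docs say
# partial pairwise summation is used.
xt = np.ascontiguousarray(x.transpose(2, 0, 1)).reshape(8, -1)
y32_reordered = xt.sum(axis=1)

err_a = np.abs(y32 - ref).max()
err_b = np.abs(y32_reordered - ref).max()
tol = 1e-5 * np.abs(ref).max()  # roughly what allclose's default rtol permits
print(err_a, err_b, tol)
```

If the two float32 errors differ noticeably, or either approaches `tol`, that would be consistent with an ordering effect rather than a bug in either library. Whether a given input crosses the tolerance depends on the data.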
The test below is verified and reproducible on two different GPU machines with different environment setups.
- Conditions (output of `python -c 'import cupy; cupy.show_config()'`):
CuPy Version : 7.0.0b4
CUDA Root : /usr/local/cuda
CUDA Build Version : 9010
CUDA Driver Version : 9010
CUDA Runtime Version : 9010
cuDNN Build Version : None
cuDNN Version : None
NCCL Build Version : 2115
NCCL Runtime Version : (unknown)
Code to reproduce:
import cupy as cp
from contextlib import contextmanager


@contextmanager
def sync_time(name):
    start = cp.cuda.Event()
    end = cp.cuda.Event()
    start.record()
    #start.synchronize()
    yield
    end.record()
    end.synchronize()
    t = cp.cuda.get_elapsed_time(start, end)
    print("{} : {} ms".format(name, t/N))


N = 100
shape = (400, 326, 287)
axis = (0, 1)
for dtype in (cp.float32, cp.complex64):
    if dtype is cp.float32:
        x = cp.random.random(shape, dtype=dtype)
    else:
        x = cp.random.random(shape).astype(dtype) + 1j * cp.random.random(shape).astype(dtype)
    x_np = cp.asnumpy(x)  # move to CPU
    for func in ('sum',):  #, 'max', 'min'):
        for keepdims in (False, True):
            print("testing", axis, "+", str(dtype), "+", "keepdims=", keepdims, "+", func, "...")
            y = None
            with sync_time("cupy"):
                for i in range(N):
                    y = getattr(x, func)(axis=axis, keepdims=keepdims)
            z = None
            with sync_time("numpy"):
                for i in range(N):
                    z = getattr(x_np, func)(axis=axis, keepdims=keepdims)
            try:
                assert cp.allclose(y, z)
            except AssertionError:
                print(y, z)
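Printing both full arrays on failure makes it hard to see how large the mismatch is. As a sketch only, a small helper could summarize it instead. `report_mismatch` is a hypothetical name, not part of CuPy or NumPy; it works on host arrays (so a CuPy result would first need `cp.asnumpy`) and applies the same criterion as `allclose`, `|a - b| <= atol + rtol * |b|`, with the default tolerances. The three sample values are copied from the float32 output of this report.

```python
import numpy as np


def report_mismatch(a, b, rtol=1e-5, atol=1e-8):
    """Summarize elements of `a` and `b` that fail the allclose criterion.

    Hypothetical helper for illustration; expects host (NumPy) arrays.
    """
    a = np.asarray(a)
    b = np.asarray(b)
    diff = np.abs(a - b)
    allowed = atol + rtol * np.abs(b)
    bad = diff > allowed
    return {
        "n_bad": int(bad.sum()),
        "n_total": int(bad.size),
        "max_abs": float(diff.max()),
        "max_rel": float((diff / np.abs(b)).max()),
    }


# Three adjacent values from the float32 results in this report
# (first array vs. second array in the printed output).
y_sample = [65282.848, 65228.38, 65221.734]
z_sample = [65282.434, 65229.223, 65221.227]
summary = report_mismatch(y_sample, z_sample)
print(summary)
```

For values near 65000 the default `rtol` allows a difference of about 0.65, so only a deviation larger than that trips the assertion.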
output:
testing (0, 1) + <class 'numpy.float32'> + keepdims= False + sum ...
cupy : 3.833934326171875 ms
numpy : 42.49509765625 ms
[65107.914 65282.875 65074.527 65245.703 65266.44 65286.04 65184.78
65211.15 65307.42 65170.098 65158.305 65215.14 65141.066 65230.62
65214.22 65119.094 65190.766 65161.977 65225.023 65134.875 65121.
65294.008 65219.695 65240.64 65227.52 65444.656 65345.46 65153.312
65024.04 65165.46 65223.586 65127.492 65289.99 65319.332 65268.336
65325.74 65189.03 65193.254 65242.39 65374.28 65318.742 65159.24
65151.15 65243.89 65171. 65236.367 65152.68 65242. 65187.816
65183.47 65174.17 65066.973 65166.8 65240.254 65223.773 65232.934
65282.848 65228.38 65221.734 65076.96 65212.434 64956.46 65161.156
65248.61 64932.22 65230.9 65164.832 65301.64 65013.23 65292.457
65153.93 65129.562 65159.97 65012.523 65105.11 65071.92 65223.633
65228.176 65221.07 65147.34 65090.316 65007.17 65244.207 65090.633
65204.258 65141.383 65109.605 65303.438 65145.023 65118.5 65162.703
65051.19 65089.703 65132.79 65378.133 65325.844 65010.87 65214.14
65181.883 65265. 65044.297 65202.914 65191.93 65093.992 65233.688
65149.426 65034.61 65178.35 65255.484 65171.543 65185.45 65369.926
65241.508 64994.617 65123.93 65361.902 65289.72 65063.234 65058.094
65194.844 65137.633 65108.223 65169.094 65072.94 65245.15 65164.867
65404.664 65034.87 65113.027 65195.656 65082.4 65249.93 65176.285
65166.11 65156.22 65109.6 65119.69 65287.516 65305.855 65116.85
65218.93 65180.117 65183.926 65069.03 65383.156 65127.586 65154.977
65192.246 65270.613 65190.4 65234.477 65188.23 65082.664 65173.89
65179.63 65081.54 65258.586 65239.27 65104.59 65188.19 65377.164
65340.33 65277.566 65273.84 65360.4 64913.676 65230.035 65189.375
65176.19 65145.055 65183.42 65287.6 65153.863 65253.086 65128.656
65256.68 65142.11 65116.05 65289.883 65257.78 65202.5 65010.203
65319.85 65172.47 65108.16 65154.11 65435.27 65155.46 65141.5
65285.22 65269.426 65190.023 65073.77 65000.266 65048.54 65464.414
65240.625 65199.78 65292.383 65203.21 65205.5 65174.68 65245.395
65301.176 65377.043 65446.344 65074.4 65330.777 65270.45 65111.32
65142.703 65242.203 65285.64 65023.3 65301.727 65062.633 65216.176
65172.117 65168.15 65238.742 65212.816 65256.156 65153.105 65220.883
65219.938 65346.453 65209.484 65186.78 65160.312 65279.64 65152.035
65138.812 65313.457 65147.242 65368.098 65161.125 65293.492 65032.4
65404.8 65326.9 65241.727 65219.797 64880.03 64901.67 65310.965
65246.652 65217.465 65131.83 65206.484 65153.176 65250.65 65174.85
65114.11 65147.35 65187.47 65149.758 65234.58 65211.24 65109.348
65302.438 65206.617 65236.13 65377.47 65212.79 65493.26 65375.695
65239.36 65135.438 65318.344 65170.797 65088.64 65204.184 65294.945
65204.234 65475.164 65355.15 65273.21 65380.145 65257.37 65063.953
65370.555 65265.113 65043.758 65127.53 65282.516 65243.234 65364.777] [65107.883 65282.35 65074.676 65245.32 65266.36 65285.984 65184.547
65210.523 65307.855 65169.52 65158.074 65215.074 65141.355 65230.176
65213.965 65118.953 65190.574 65161.617 65224.723 65135.094 65120.688
65294.02 65220.082 65240.156 65227.957 65444.742 65345.457 65152.65
65023.844 65165.258 65223.34 65127.523 65290.13 65319.445 65268.605
65325.27 65189.066 65192.637 65242.312 65374.68 65319.023 65159.426
65151.062 65243.945 65171.39 65236.027 65152.574 65242.484 65187.957
65183.805 65174.535 65067.7 65166.684 65239.785 65223.03 65233.3
65282.434 65229.223 65221.227 65076.523 65212.336 64956.53 65160.83
65249.09 64931.848 65230.945 65164.914 65301.65 65012.566 65292.695
65153.793 65129.68 65160.086 65012.527 65104.84 65071.867 65223.832
65228.16 65221.77 65147.316 65090.176 65007.08 65243.695 65091.16
65204.414 65141.598 65109.734 65303.777 65144.87 65118.312 65162.863
65051.41 65089.12 65133.07 65378.387 65325.887 65010.406 65213.996
65181.633 65265.008 65043.87 65202.887 65192.117 65094.37 65234.113
65149.832 65034.445 65178.18 65256.18 65171.2 65185.23 65370.367
65241.44 64995. 65124. 65361.87 65289.363 65063.07 65058.508
65194.895 65137.32 65107.72 65168.406 65072.844 65245.445 65165.215
65404.582 65035.41 65113.227 65194.934 65082.617 65249.785 65176.043
65165.84 65156.496 65109.457 65119.06 65287.992 65305.613 65116.598
65219.16 65180.02 65184.242 65069.168 65382.902 65127.887 65155.1
65192.094 65270.625 65190.67 65234.94 65187.965 65082.875 65173.992
65179.65 65081.84 65258.31 65239.793 65104.5 65188.242 65377.598
65340.195 65278.08 65273.973 65360.434 64913.477 65230.684 65189.582
65176.19 65145.11 65183.605 65287.656 65153.957 65252.758 65128.547
65256.523 65141.965 65116.035 65289.574 65258.137 65202.562 65010.098
65319.91 65172.297 65107.92 65154.008 65435.32 65155.812 65141.53
65284.86 65269.76 65190.688 65074.094 65000.547 65048.11 65464.383
65241.01 65200.184 65292.582 65203.15 65205.684 65175.008 65245.504
65301.74 65377.395 65446.73 65074.125 65331.113 65270.33 65111.53
65142.87 65242.605 65286.273 65023.305 65301.617 65062.543 65216.664
65172.35 65167.938 65239.844 65212.58 65256.137 65152.88 65220.562
65220.293 65346.58 65209.445 65186.72 65160.38 65279.92 65152.734
65138.57 65312.75 65147.273 65367.734 65160.633 65293.766 65032.246
65404.46 65326.59 65241.44 65219.594 64880.023 64901.617 65310.84
65246.285 65217.363 65131.59 65206.41 65153.504 65250.973 65175.016
65113.754 65146.875 65188.035 65149.98 65234.703 65211.105 65109.586
65302.21 65206.832 65235.777 65377.594 65212.49 65493.37 65376.066
65239.453 65135.81 65318.477 65171.047 65088.65 65204.316 65294.867
65204.11 65474.746 65354.977 65273.64 65380.465 65257.703 65063.902
65370.445 65265.04 65043.586 65127.348 65282.49 65242.777 65365.195]
testing (0, 1) + <class 'numpy.float32'> + keepdims= True + sum ...
cupy : 3.8184237670898438 ms
numpy : 42.4822802734375 ms
[[[65107.914 65282.875 65074.527 65245.703 65266.44 65286.04 65184.78
65211.15 65307.42 65170.098 65158.305 65215.14 65141.066 65230.62
65214.22 65119.094 65190.766 65161.977 65225.023 65134.875 65121.
65294.008 65219.695 65240.64 65227.52 65444.656 65345.46 65153.312
65024.04 65165.46 65223.586 65127.492 65289.99 65319.332 65268.336
65325.74 65189.03 65193.254 65242.39 65374.28 65318.742 65159.24
65151.15 65243.89 65171. 65236.367 65152.68 65242. 65187.816
65183.47 65174.17 65066.973 65166.8 65240.254 65223.773 65232.934
65282.848 65228.38 65221.734 65076.96 65212.434 64956.46 65161.156
65248.61 64932.22 65230.9 65164.832 65301.64 65013.23 65292.457
65153.93 65129.562 65159.97 65012.523 65105.11 65071.92 65223.633
65228.176 65221.07 65147.34 65090.316 65007.17 65244.207 65090.633
65204.258 65141.383 65109.605 65303.438 65145.023 65118.5 65162.703
65051.19 65089.703 65132.79 65378.133 65325.844 65010.87 65214.14
65181.883 65265. 65044.297 65202.914 65191.93 65093.992 65233.688
65149.426 65034.61 65178.35 65255.484 65171.543 65185.45 65369.926
65241.508 64994.617 65123.93 65361.902 65289.72 65063.234 65058.094
65194.844 65137.633 65108.223 65169.094 65072.94 65245.15 65164.867
65404.664 65034.87 65113.027 65195.656 65082.4 65249.93 65176.285
65166.11 65156.22 65109.6 65119.69 65287.516 65305.855 65116.85
65218.93 65180.117 65183.926 65069.03 65383.156 65127.586 65154.977
65192.246 65270.613 65190.4 65234.477 65188.23 65082.664 65173.89
65179.63 65081.54 65258.586 65239.27 65104.59 65188.19 65377.164
65340.33 65277.566 65273.84 65360.4 64913.676 65230.035 65189.375
65176.19 65145.055 65183.42 65287.6 65153.863 65253.086 65128.656
65256.68 65142.11 65116.05 65289.883 65257.78 65202.5 65010.203
65319.85 65172.47 65108.16 65154.11 65435.27 65155.46 65141.5
65285.22 65269.426 65190.023 65073.77 65000.266 65048.54 65464.414
65240.625 65199.78 65292.383 65203.21 65205.5 65174.68 65245.395
65301.176 65377.043 65446.344 65074.4 65330.777 65270.45 65111.32
65142.703 65242.203 65285.64 65023.3 65301.727 65062.633 65216.176
65172.117 65168.15 65238.742 65212.816 65256.156 65153.105 65220.883
65219.938 65346.453 65209.484 65186.78 65160.312 65279.64 65152.035
65138.812 65313.457 65147.242 65368.098 65161.125 65293.492 65032.4
65404.8 65326.9 65241.727 65219.797 64880.03 64901.67 65310.965
65246.652 65217.465 65131.83 65206.484 65153.176 65250.65 65174.85
65114.11 65147.35 65187.47 65149.758 65234.58 65211.24 65109.348
65302.438 65206.617 65236.13 65377.47 65212.79 65493.26 65375.695
65239.36 65135.438 65318.344 65170.797 65088.64 65204.184 65294.945
65204.234 65475.164 65355.15 65273.21 65380.145 65257.37 65063.953
65370.555 65265.113 65043.758 65127.53 65282.516 65243.234 65364.777]]] [[[65107.883 65282.35 65074.676 65245.32 65266.36 65285.984 65184.547
65210.523 65307.855 65169.52 65158.074 65215.074 65141.355 65230.176
65213.965 65118.953 65190.574 65161.617 65224.723 65135.094 65120.688
65294.02 65220.082 65240.156 65227.957 65444.742 65345.457 65152.65
65023.844 65165.258 65223.34 65127.523 65290.13 65319.445 65268.605
65325.27 65189.066 65192.637 65242.312 65374.68 65319.023 65159.426
65151.062 65243.945 65171.39 65236.027 65152.574 65242.484 65187.957
65183.805 65174.535 65067.7 65166.684 65239.785 65223.03 65233.3
65282.434 65229.223 65221.227 65076.523 65212.336 64956.53 65160.83
65249.09 64931.848 65230.945 65164.914 65301.65 65012.566 65292.695
65153.793 65129.68 65160.086 65012.527 65104.84 65071.867 65223.832
65228.16 65221.77 65147.316 65090.176 65007.08 65243.695 65091.16
65204.414 65141.598 65109.734 65303.777 65144.87 65118.312 65162.863
65051.41 65089.12 65133.07 65378.387 65325.887 65010.406 65213.996
65181.633 65265.008 65043.87 65202.887 65192.117 65094.37 65234.113
65149.832 65034.445 65178.18 65256.18 65171.2 65185.23 65370.367
65241.44 64995. 65124. 65361.87 65289.363 65063.07 65058.508
65194.895 65137.32 65107.72 65168.406 65072.844 65245.445 65165.215
65404.582 65035.41 65113.227 65194.934 65082.617 65249.785 65176.043
65165.84 65156.496 65109.457 65119.06 65287.992 65305.613 65116.598
65219.16 65180.02 65184.242 65069.168 65382.902 65127.887 65155.1
65192.094 65270.625 65190.67 65234.94 65187.965 65082.875 65173.992
65179.65 65081.84 65258.31 65239.793 65104.5 65188.242 65377.598
65340.195 65278.08 65273.973 65360.434 64913.477 65230.684 65189.582
65176.19 65145.11 65183.605 65287.656 65153.957 65252.758 65128.547
65256.523 65141.965 65116.035 65289.574 65258.137 65202.562 65010.098
65319.91 65172.297 65107.92 65154.008 65435.32 65155.812 65141.53
65284.86 65269.76 65190.688 65074.094 65000.547 65048.11 65464.383
65241.01 65200.184 65292.582 65203.15 65205.684 65175.008 65245.504
65301.74 65377.395 65446.73 65074.125 65331.113 65270.33 65111.53
65142.87 65242.605 65286.273 65023.305 65301.617 65062.543 65216.664
65172.35 65167.938 65239.844 65212.58 65256.137 65152.88 65220.562
65220.293 65346.58 65209.445 65186.72 65160.38 65279.92 65152.734
65138.57 65312.75 65147.273 65367.734 65160.633 65293.766 65032.246
65404.46 65326.59 65241.44 65219.594 64880.023 64901.617 65310.84
65246.285 65217.363 65131.59 65206.41 65153.504 65250.973 65175.016
65113.754 65146.875 65188.035 65149.98 65234.703 65211.105 65109.586
65302.21 65206.832 65235.777 65377.594 65212.49 65493.37 65376.066
65239.453 65135.81 65318.477 65171.047 65088.65 65204.316 65294.867
65204.11 65474.746 65354.977 65273.64 65380.465 65257.703 65063.902
65370.445 65265.04 65043.586 65127.348 65282.49 65242.777 65365.195]]]
testing (0, 1) + <class 'numpy.complex64'> + keepdims= False + sum ...
cupy : 3.869806213378906 ms
numpy : 63.1233447265625 ms
[65186.426+65182.477j 65242.88 +65049.57j 65206.117+65214.55j
65048.14 +65278.664j 65197.906+65209.71j 65135.867+65215.363j
65327.36 +65208.02j 65108.22 +65141.016j 65333.676+65136.883j
65342. +65213.28j 65342.848+65298.93j 65177.773+65312.83j
65148.21 +65119.227j 65293.832+65267.695j 65199.023+65202.207j
65269.957+65003.875j 65027.14 +65102.934j 65293.555+65337.57j
65036.09 +65326.45j 65069.555+65122.21j 65170.188+65182.14j
65086.773+65204.25j 65245.074+65350.133j 65278.77 +65176.047j
65383.12 +65328.25j 65372.785+65076.875j 65340.812+65295.15j
65258.914+65019.492j 65179.637+65094.973j 65114.742+65139.965j
65228.574+65214.1j 65028.324+65035.492j 65158.71 +65270.234j
65299.96 +65310.32j 65314.164+65216.145j 65152.445+65163.82j
64984.176+65205.04j 65325.242+65244.406j 65308.74 +65173.617j
65165.5 +65330.65j 65126.445+65090.18j 65136.22 +65150.49j
65072.883+65090.805j 65156.95 +65377.67j 65243.934+65084.496j
65335.4 +65139.293j 65354.574+65371.266j 65277.297+65176.633j
65213.723+65281.656j 65096.094+65255.336j 65214.305+65097.867j
65304.68 +65323.668j 65336.617+65143.016j 65109.906+65269.316j
65259.906+65267.74j 65062.758+65238.457j 65106.82 +65017.633j
65121.54 +65096.45j 65187.688+65272.53j 65071.33 +65308.246j
64950.973+65187.082j 65391.445+65345.008j 64989.47 +65234.906j
65107.72 +65159.03j 65278.68 +65234.58j 65301.086+65031.836j
65155.4 +65310.01j 65247.79 +65235.55j 65272.406+65041.395j
65176.75 +65342.47j 65235.73 +65273.438j 65082.28 +65235.61j
65384.97 +65267.312j 65182.273+65253.41j 64940.703+65254.39j
65218.484+65367.008j 65279.703+65307.156j 65180.89 +65182.312j
65345.543+65134.254j 65168.86 +65003.234j 65209.777+65194.227j
65188.15 +65257.156j 65191.836+65144.566j 65330.88 +65215.676j
65241.883+65340.89j 65280.305+65152.39j 65297.39 +65284.293j
65348.71 +65193.9j 64999.133+65260.25j 65156.867+65223.32j
65131.63 +65199.j 65172.418+64953.39j 65135.727+65222.703j
65380.312+65251.58j 65325.17 +64980.1j 65282.47 +65105.555j
65455.387+65341.54j 65183.01 +65014.965j 65267.117+65139.87j
65088.547+65313.78j 65362.477+65227.875j 65148.18 +65164.008j
65254.21 +65241.027j 65168.984+65253.984j 65015.96 +65285.254j
65027.53 +64951.78j 65211.297+65116.21j 65158.47 +65244.29j
65213.37 +65143.49j 65220.664+65484.926j 64716.637+65213.066j
65112.074+65131.67j 65283.23 +65231.72j 65198.812+65132.902j
65300.664+65246.043j 65212.797+65223.035j 65311.69 +65423.25j
65248.95 +65294.04j 65215.062+65093.79j 65100.785+65147.414j
65112.848+65067.598j 65221.78 +65091.434j 65397.516+65159.254j
65271.918+65186.727j 65224.195+65424.516j 65287.332+65154.57j
65157.953+65352.48j 65062.152+65004.3j 65320.164+65241.26j
65189.066+65011.434j 65096.348+65292.695j 65366.41 +65339.62j
65203.574+65313.918j 65164.688+65089.72j 65363.906+65105.21j
65210.285+65177.57j 65130.637+65383.867j 65177.312+64996.492j
65224.613+65187.57j 65314.15 +65338.6j 65147.656+65344.566j
65170.39 +65166.43j 65284.086+65347.312j 65140.418+65056.22j
65270.562+65298.984j 65264.14 +65223.04j 65176.613+65201.1j
65215.47 +65253.18j 65356.883+65136.64j 65340.508+65317.445j
65285.664+65130.13j 65281.53 +65134.17j 65449.203+65235.1j
65069.51 +65198.688j 65286.984+65305.812j 65128.43 +65151.52j
65267.33 +65226.414j 65227.348+65078.125j 65158.664+65170.664j
65072.117+65127.188j 65256.84 +65116.508j 65306.344+65215.44j
65110.727+65133.85j 65126.82 +65289.82j 65181.977+65372.43j
65108.867+65202.812j 65083.97 +65196.074j 64985.957+65257.027j
65164.42 +65163.656j 65127.656+65188.01j 65121.15 +65099.04j
65224.57 +65233.22j 65273.414+65070.812j 65091.195+65293.54j
65122.22 +65185.887j 65043.633+65203.965j 65235.75 +65386.07j
65361.81 +65140.242j 65233.28 +65293.64j 65242.613+65347.305j
65235.4 +65294.477j 65232. +65352.223j 65089.562+65282.727j
65269.953+65192.168j 65116.46 +65109.34j 65126.63 +65325.508j
65098.48 +65027.402j 65180.312+65130.406j 65115.176+65095.074j
65349.234+65217.395j 65058.203+65263.215j 65230.21 +65226.797j
65397.12 +65272.043j 65447.234+65210.668j 65101.125+65398.227j
65140.82 +65309.32j 65289.61 +65348.215j 65219.492+65221.88j
65204.883+65104.527j 65321.914+65057.203j 65170.71 +65395.547j
65308.098+65301.617j 65251.414+65135.688j 65381.1 +65052.81j
65120.51 +65038.195j 65165.11 +65226.37j 65270.75 +65313.02j
65135.336+65280.863j 65315.26 +65280.418j 65017.05 +65338.773j
65404.258+65147.223j 65368.113+64879.72j 65126.57 +65152.5j
65205.844+65197.344j 65061.72 +65145.69j 65192.465+65223.29j
65285.633+65066.71j 65358.51 +65147.688j 65196.67 +65307.04j
65150.723+65298.555j 65131.4 +65077.805j 65100.883+65366.96j
65188.19 +65150.31j 65165.426+65262.816j 64952.125+65060.832j
65307.516+65211.21j 65107.035+65252.156j 65233.336+65139.34j
65104.086+65164.47j 65075.176+65326.38j 65282.25 +65212.67j
65005.17 +65282.598j 65294.805+65261.j 65083.047+65010.754j
65160.125+65091.93j 65052.54 +65043.266j 65311.547+65014.227j
65153.305+65039.86j 65328.71 +65063.04j 65255.258+65066.69j
65312.72 +65021.793j 65321.87 +65167.09j 65261.46 +65262.727j
65058.53 +65185.89j 65286.6 +65111.156j 65375.47 +65120.33j
64960.203+65133.453j 65278.54 +65225.324j 65232.58 +65241.8j
65199.484+65308.027j 65161.344+65087.562j 65181.836+65214.45j
65308.438+65397.33j 65242.66 +65344.504j 65178.836+64996.21j
65286.844+65174.64j 65308.1 +65146.703j 65355.15 +65279.137j
65120.03 +65279.45j 65195.45 +65307.027j 65070.617+65111.973j
65256.766+65073.06j 65313.32 +65010.457j 65162.406+65336.848j
65297.605+65196.777j 65276.484+65301.305j 65329.418+65249.47j
65368.273+65068.59j 65038.383+65442.312j 65181.723+65163.555j
65313.766+65182.96j 64997.31 +65328.15j 65221.523+65137.26j
65140.906+65131.824j 65274.016+65217.67j 65066.566+65309.312j
65173.82 +65126.816j 65184.883+65290.258j 65211.13 +65217.j
65131.348+65142.703j 65014.523+65148.055j 65311.63 +65348.3j
65231.09 +65018.53j 65197.867+65253.258j 65238.953+65378.49j
65218.242+65297.258j 65131.027+65239.32j ] [65186.664+65182.293j 65242.582+65049.9j 65205.8 +65214.957j
65047.76 +65278.89j 65198.117+65209.44j 65135.95 +65215.414j
65327.387+65207.84j 65108.16 +65140.83j 65333.125+65136.79j
65341.938+65213.203j 65343.348+65298.938j 65178.03 +65312.938j
65148.645+65119.004j 65294.133+65268.46j 65198.96 +65202.04j
65269.89 +65004.008j 65027.01 +65103.047j 65293.65 +65337.6j
65036.19 +65326.832j 65069.965+65122.598j 65170.03 +65182.074j
65087.344+65204.465j 65245.266+65350.035j 65278.414+65176.305j
65383.016+65328.375j 65373.43 +65076.684j 65340.67 +65295.527j
65259. +65019.723j 65179.645+65094.766j 65114.746+65139.6j
65228.836+65214.13j 65028.023+65035.53j 65158.55 +65270.76j
65299.91 +65310.258j 65313.76 +65215.97j 65152.2 +65164.15j
64984.76 +65205.156j 65324.7 +65243.957j 65309.062+65173.52j
65165.12 +65330.707j 65126.715+65090.336j 65136.426+65150.02j
65072.566+65090.72j 65156.85 +65377.05j 65243.69 +65084.406j
65335.32 +65138.906j 65354.33 +65371.133j 65277.465+65176.61j
65213.746+65281.777j 65096.402+65255.492j 65214.42 +65097.895j
65305.008+65323.586j 65336.6 +65143.145j 65110.023+65269.555j
65259.684+65268.332j 65063.008+65238.53j 65106.668+65017.973j
65121.74 +65097.12j 65187.633+65272.68j 65071.395+65308.08j
64950.992+65187.008j 65391.656+65344.75j 64989.65 +65235.098j
65107.703+65159.242j 65278.266+65234.137j 65300.723+65031.758j
65155.63 +65309.785j 65247.406+65235.44j 65272.594+65041.453j
65176.863+65342.516j 65235.523+65273.426j 65082.04 +65236.004j
65384.824+65267.285j 65181.914+65253.754j 64941.266+65254.5j
65218.625+65367.098j 65280.523+65306.793j 65180.46 +65182.31j
65346.062+65134.51j 65168.895+65002.758j 65210.26 +65194.312j
65188.24 +65256.844j 65191.652+65144.344j 65331.09 +65215.293j
65241.67 +65341.055j 65280.055+65152.94j 65297.836+65283.86j
65348.57 +65193.625j 64998.93 +65260.33j 65157.074+65223.074j
65131.66 +65198.867j 65172.5 +64953.434j 65136.234+65222.25j
65380.703+65251.027j 65325.188+64980.176j 65281.984+65105.902j
65454.78 +65341.68j 65183.65 +65015.29j 65266.707+65140.j
65088.645+65313.363j 65362.16 +65227.52j 65148.23 +65163.52j
65253.887+65241.293j 65169.01 +65253.586j 65015.98 +65285.26j
65027.035+64951.52j 65211.344+65116.01j 65158.07 +65244.57j
65213.96 +65142.83j 65221.113+65484.766j 64716.88 +65213.31j
65112.016+65131.637j 65283.305+65232.15j 65198.984+65133.113j
65301.02 +65245.38j 65212.87 +65222.82j 65311.69 +65423.48j
65248.707+65293.723j 65215.76 +65093.105j 65100.027+65147.387j
65112.9 +65067.348j 65221.918+65091.285j 65397.688+65159.39j
65272.047+65186.215j 65224.46 +65424.316j 65287.293+65154.42j
65157.67 +65352.21j 65062.04 +65004.652j 65319.77 +65241.01j
65189.234+65011.28j 65096.43 +65292.605j 65366.164+65340.08j
65204.02 +65313.88j 65164.77 +65089.98j 65364.34 +65104.895j
65210.246+65177.645j 65130.777+65384.023j 65177.535+64996.13j
65225.168+65187.734j 65313.97 +65338.477j 65147.49 +65344.17j
65170.797+65166.215j 65284.4 +65347.785j 65140.57 +65056.094j
65270.156+65298.664j 65264.152+65222.63j 65176.902+65200.996j
65215.39 +65253.887j 65356.484+65136.887j 65340.31 +65317.46j
65285.74 +65129.41j 65281.49 +65134.38j 65448.992+65235.008j
65069.63 +65199.1j 65287.055+65305.51j 65128.176+65151.844j
65267.273+65225.863j 65227.12 +65077.68j 65158.723+65170.797j
65072.26 +65127.195j 65257.086+65116.566j 65306.535+65215.52j
65111.49 +65133.59j 65126.69 +65289.844j 65182.35 +65372.855j
65108.535+65202.574j 65083.926+65196.133j 64985.72 +65256.83j
65164.42 +65163.65j 65128.11 +65188.49j 65121.094+65099.547j
65225.07 +65233.113j 65273.02 +65071.32j 65091.652+65293.81j
65122.56 +65185.633j 65043.355+65203.723j 65235.62 +65385.926j
65361.965+65139.914j 65233.855+65293.844j 65242.83 +65347.04j
65235.418+65294.39j 65231.715+65352.277j 65089.227+65282.79j
65270.2 +65192.484j 65116.113+65109.19j 65126.848+65325.945j
65098.14 +65027.234j 65180.434+65130.574j 65115.19 +65095.434j
65349.438+65217.96j 65057.887+65263.168j 65230.195+65226.83j
65397.56 +65272.01j 65446.703+65210.484j 65100.883+65398.22j
65140.59 +65309.32j 65289.555+65347.79j 65219.7 +65221.56j
65205.086+65104.023j 65322.035+65057.1j 65170.19 +65395.54j
65307.965+65300.742j 65251.168+65135.992j 65380.82 +65052.605j
65120.418+65038.52j 65164.625+65226.363j 65270.918+65312.81j
65135.195+65280.934j 65315.26 +65280.516j 65016.99 +65338.73j
65404.426+65148.023j 65368.33 +64879.938j 65126.477+65152.457j
65205.633+65197.01j 65061.52 +65146.04j 65192.785+65223.695j
65285.594+65066.82j 65358.523+65147.285j 65197.29 +65307.402j
65150.992+65298.633j 65131.47 +65077.71j 65100.273+65367.51j
65188.617+65149.75j 65165.38 +65263.45j 64952.477+65061.312j
65307.48 +65210.91j 65107.41 +65252.992j 65233.242+65139.305j
65104.312+65164.72j 65075.605+65326.2j 65282.16 +65212.58j
65005.188+65282.152j 65295.234+65260.926j 65082.79 +65010.73j
65160.4 +65092.027j 65052.523+65043.293j 65311.438+65014.258j
65153.266+65039.3j 65328.203+65062.445j 65255.258+65067.086j
65312.555+65021.754j 65321.914+65166.96j 65261.867+65262.285j
65058.324+65185.867j 65286.355+65111.082j 65375.37 +65120.355j
64960.34 +65133.445j 65278.29 +65225.168j 65232.746+65241.777j
65199.92 +65308.184j 65161.457+65088.j 65181.742+65214.035j
65308.707+65397.023j 65242.73 +65344.92j 65179.18 +64996.117j
65286.523+65175.07j 65308.207+65146.984j 65355.18 +65279.004j
65120.06 +65279.19j 65195.152+65307.133j 65070.89 +65112.17j
65256.79 +65073.36j 65313.33 +65010.973j 65162.688+65337.055j
65297.51 +65196.73j 65276.305+65300.71j 65329.098+65249.12j
65367.688+65068.473j 65038.86 +65442.4j 65181.68 +65163.61j
65313.508+65183.29j 64996.754+65328.06j 65221.57 +65136.816j
65141.156+65131.523j 65274.297+65218.03j 65066.8 +65309.203j
65173.91 +65126.945j 65184.523+65289.848j 65211.54 +65217.41j
65131.734+65142.906j 65014.73 +65147.723j 65311.312+65348.09j
65231.156+65018.02j 65198.21 +65252.832j 65239.105+65378.1j
65218.16 +65297.176j 65131.01 +65239.188j]
testing (0, 1) + <class 'numpy.complex64'> + keepdims= True + sum ...
cupy : 3.859105224609375 ms
numpy : 63.1188671875 ms
[[[65186.426+65182.477j 65242.88 +65049.57j 65206.117+65214.55j
65048.14 +65278.664j 65197.906+65209.71j 65135.867+65215.363j
65327.36 +65208.02j 65108.22 +65141.016j 65333.676+65136.883j
65342. +65213.28j 65342.848+65298.93j 65177.773+65312.83j
65148.21 +65119.227j 65293.832+65267.695j 65199.023+65202.207j
65269.957+65003.875j 65027.14 +65102.934j 65293.555+65337.57j
65036.09 +65326.45j 65069.555+65122.21j 65170.188+65182.14j
65086.773+65204.25j 65245.074+65350.133j 65278.77 +65176.047j
65383.12 +65328.25j 65372.785+65076.875j 65340.812+65295.15j
65258.914+65019.492j 65179.637+65094.973j 65114.742+65139.965j
65228.574+65214.1j 65028.324+65035.492j 65158.71 +65270.234j
65299.96 +65310.32j 65314.164+65216.145j 65152.445+65163.82j
64984.176+65205.04j 65325.242+65244.406j 65308.74 +65173.617j
65165.5 +65330.65j 65126.445+65090.18j 65136.22 +65150.49j
65072.883+65090.805j 65156.95 +65377.67j 65243.934+65084.496j
65335.4 +65139.293j 65354.574+65371.266j 65277.297+65176.633j
65213.723+65281.656j 65096.094+65255.336j 65214.305+65097.867j
65304.68 +65323.668j 65336.617+65143.016j 65109.906+65269.316j
65259.906+65267.74j 65062.758+65238.457j 65106.82 +65017.633j
65121.54 +65096.45j 65187.688+65272.53j 65071.33 +65308.246j
64950.973+65187.082j 65391.445+65345.008j 64989.47 +65234.906j
65107.72 +65159.03j 65278.68 +65234.58j 65301.086+65031.836j
65155.4 +65310.01j 65247.79 +65235.55j 65272.406+65041.395j
65176.75 +65342.47j 65235.73 +65273.438j 65082.28 +65235.61j
65384.97 +65267.312j 65182.273+65253.41j 64940.703+65254.39j
65218.484+65367.008j 65279.703+65307.156j 65180.89 +65182.312j
65345.543+65134.254j 65168.86 +65003.234j 65209.777+65194.227j
65188.15 +65257.156j 65191.836+65144.566j 65330.88 +65215.676j
65241.883+65340.89j 65280.305+65152.39j 65297.39 +65284.293j
65348.71 +65193.9j 64999.133+65260.25j 65156.867+65223.32j
65131.63 +65199.j 65172.418+64953.39j 65135.727+65222.703j
65380.312+65251.58j 65325.17 +64980.1j 65282.47 +65105.555j
65455.387+65341.54j 65183.01 +65014.965j 65267.117+65139.87j
65088.547+65313.78j 65362.477+65227.875j 65148.18 +65164.008j
65254.21 +65241.027j 65168.984+65253.984j 65015.96 +65285.254j
65027.53 +64951.78j 65211.297+65116.21j 65158.47 +65244.29j
65213.37 +65143.49j 65220.664+65484.926j 64716.637+65213.066j
65112.074+65131.67j 65283.23 +65231.72j 65198.812+65132.902j
65300.664+65246.043j 65212.797+65223.035j 65311.69 +65423.25j
65248.95 +65294.04j 65215.062+65093.79j 65100.785+65147.414j
65112.848+65067.598j 65221.78 +65091.434j 65397.516+65159.254j
65271.918+65186.727j 65224.195+65424.516j 65287.332+65154.57j
65157.953+65352.48j 65062.152+65004.3j 65320.164+65241.26j
65189.066+65011.434j 65096.348+65292.695j 65366.41 +65339.62j
65203.574+65313.918j 65164.688+65089.72j 65363.906+65105.21j
65210.285+65177.57j 65130.637+65383.867j 65177.312+64996.492j
65224.613+65187.57j 65314.15 +65338.6j 65147.656+65344.566j
65170.39 +65166.43j 65284.086+65347.312j 65140.418+65056.22j
65270.562+65298.984j 65264.14 +65223.04j 65176.613+65201.1j
65215.47 +65253.18j 65356.883+65136.64j 65340.508+65317.445j
65285.664+65130.13j 65281.53 +65134.17j 65449.203+65235.1j
65069.51 +65198.688j 65286.984+65305.812j 65128.43 +65151.52j
65267.33 +65226.414j 65227.348+65078.125j 65158.664+65170.664j
65072.117+65127.188j 65256.84 +65116.508j 65306.344+65215.44j
65110.727+65133.85j 65126.82 +65289.82j 65181.977+65372.43j
65108.867+65202.812j 65083.97 +65196.074j 64985.957+65257.027j
65164.42 +65163.656j 65127.656+65188.01j 65121.15 +65099.04j
65224.57 +65233.22j 65273.414+65070.812j 65091.195+65293.54j
65122.22 +65185.887j 65043.633+65203.965j 65235.75 +65386.07j
65361.81 +65140.242j 65233.28 +65293.64j 65242.613+65347.305j
65235.4 +65294.477j 65232. +65352.223j 65089.562+65282.727j
65269.953+65192.168j 65116.46 +65109.34j 65126.63 +65325.508j
65098.48 +65027.402j 65180.312+65130.406j 65115.176+65095.074j
65349.234+65217.395j 65058.203+65263.215j 65230.21 +65226.797j
65397.12 +65272.043j 65447.234+65210.668j 65101.125+65398.227j
65140.82 +65309.32j 65289.61 +65348.215j 65219.492+65221.88j
65204.883+65104.527j 65321.914+65057.203j 65170.71 +65395.547j
65308.098+65301.617j 65251.414+65135.688j 65381.1 +65052.81j
65120.51 +65038.195j 65165.11 +65226.37j 65270.75 +65313.02j
65135.336+65280.863j 65315.26 +65280.418j 65017.05 +65338.773j
65404.258+65147.223j 65368.113+64879.72j 65126.57 +65152.5j
65205.844+65197.344j 65061.72 +65145.69j 65192.465+65223.29j
65285.633+65066.71j 65358.51 +65147.688j 65196.67 +65307.04j
65150.723+65298.555j 65131.4 +65077.805j 65100.883+65366.96j
65188.19 +65150.31j 65165.426+65262.816j 64952.125+65060.832j
65307.516+65211.21j 65107.035+65252.156j 65233.336+65139.34j
65104.086+65164.47j 65075.176+65326.38j 65282.25 +65212.67j
65005.17 +65282.598j 65294.805+65261.j 65083.047+65010.754j
65160.125+65091.93j 65052.54 +65043.266j 65311.547+65014.227j
65153.305+65039.86j 65328.71 +65063.04j 65255.258+65066.69j
65312.72 +65021.793j 65321.87 +65167.09j 65261.46 +65262.727j
65058.53 +65185.89j 65286.6 +65111.156j 65375.47 +65120.33j
64960.203+65133.453j 65278.54 +65225.324j 65232.58 +65241.8j
65199.484+65308.027j 65161.344+65087.562j 65181.836+65214.45j
65308.438+65397.33j 65242.66 +65344.504j 65178.836+64996.21j
65286.844+65174.64j 65308.1 +65146.703j 65355.15 +65279.137j
65120.03 +65279.45j 65195.45 +65307.027j 65070.617+65111.973j
65256.766+65073.06j 65313.32 +65010.457j 65162.406+65336.848j
65297.605+65196.777j 65276.484+65301.305j 65329.418+65249.47j
65368.273+65068.59j 65038.383+65442.312j 65181.723+65163.555j
65313.766+65182.96j 64997.31 +65328.15j 65221.523+65137.26j
65140.906+65131.824j 65274.016+65217.67j 65066.566+65309.312j
65173.82 +65126.816j 65184.883+65290.258j 65211.13 +65217.j
65131.348+65142.703j 65014.523+65148.055j 65311.63 +65348.3j
65231.09 +65018.53j 65197.867+65253.258j 65238.953+65378.49j
65218.242+65297.258j 65131.027+65239.32j ]]] [[[65186.664+65182.293j 65242.582+65049.9j 65205.8 +65214.957j
65047.76 +65278.89j 65198.117+65209.44j 65135.95 +65215.414j
65327.387+65207.84j 65108.16 +65140.83j 65333.125+65136.79j
65341.938+65213.203j 65343.348+65298.938j 65178.03 +65312.938j
65148.645+65119.004j 65294.133+65268.46j 65198.96 +65202.04j
65269.89 +65004.008j 65027.01 +65103.047j 65293.65 +65337.6j
65036.19 +65326.832j 65069.965+65122.598j 65170.03 +65182.074j
65087.344+65204.465j 65245.266+65350.035j 65278.414+65176.305j
65383.016+65328.375j 65373.43 +65076.684j 65340.67 +65295.527j
65259. +65019.723j 65179.645+65094.766j 65114.746+65139.6j
65228.836+65214.13j 65028.023+65035.53j 65158.55 +65270.76j
65299.91 +65310.258j 65313.76 +65215.97j 65152.2 +65164.15j
64984.76 +65205.156j 65324.7 +65243.957j 65309.062+65173.52j
65165.12 +65330.707j 65126.715+65090.336j 65136.426+65150.02j
65072.566+65090.72j 65156.85 +65377.05j 65243.69 +65084.406j
65335.32 +65138.906j 65354.33 +65371.133j 65277.465+65176.61j
65213.746+65281.777j 65096.402+65255.492j 65214.42 +65097.895j
65305.008+65323.586j 65336.6 +65143.145j 65110.023+65269.555j
65259.684+65268.332j 65063.008+65238.53j 65106.668+65017.973j
65121.74 +65097.12j 65187.633+65272.68j 65071.395+65308.08j
64950.992+65187.008j 65391.656+65344.75j 64989.65 +65235.098j
65107.703+65159.242j 65278.266+65234.137j 65300.723+65031.758j
65155.63 +65309.785j 65247.406+65235.44j 65272.594+65041.453j
65176.863+65342.516j 65235.523+65273.426j 65082.04 +65236.004j
65384.824+65267.285j 65181.914+65253.754j 64941.266+65254.5j
65218.625+65367.098j 65280.523+65306.793j 65180.46 +65182.31j
65346.062+65134.51j 65168.895+65002.758j 65210.26 +65194.312j
65188.24 +65256.844j 65191.652+65144.344j 65331.09 +65215.293j
65241.67 +65341.055j 65280.055+65152.94j 65297.836+65283.86j
65348.57 +65193.625j 64998.93 +65260.33j 65157.074+65223.074j
65131.66 +65198.867j 65172.5 +64953.434j 65136.234+65222.25j
65380.703+65251.027j 65325.188+64980.176j 65281.984+65105.902j
65454.78 +65341.68j 65183.65 +65015.29j 65266.707+65140.j
65088.645+65313.363j 65362.16 +65227.52j 65148.23 +65163.52j
65253.887+65241.293j 65169.01 +65253.586j 65015.98 +65285.26j
65027.035+64951.52j 65211.344+65116.01j 65158.07 +65244.57j
65213.96 +65142.83j 65221.113+65484.766j 64716.88 +65213.31j
65112.016+65131.637j 65283.305+65232.15j 65198.984+65133.113j
65301.02 +65245.38j 65212.87 +65222.82j 65311.69 +65423.48j
65248.707+65293.723j 65215.76 +65093.105j 65100.027+65147.387j
65112.9 +65067.348j 65221.918+65091.285j 65397.688+65159.39j
65272.047+65186.215j 65224.46 +65424.316j 65287.293+65154.42j
65157.67 +65352.21j 65062.04 +65004.652j 65319.77 +65241.01j
65189.234+65011.28j 65096.43 +65292.605j 65366.164+65340.08j
65204.02 +65313.88j 65164.77 +65089.98j 65364.34 +65104.895j
65210.246+65177.645j 65130.777+65384.023j 65177.535+64996.13j
65225.168+65187.734j 65313.97 +65338.477j 65147.49 +65344.17j
65170.797+65166.215j 65284.4 +65347.785j 65140.57 +65056.094j
65270.156+65298.664j 65264.152+65222.63j 65176.902+65200.996j
65215.39 +65253.887j 65356.484+65136.887j 65340.31 +65317.46j
65285.74 +65129.41j 65281.49 +65134.38j 65448.992+65235.008j
65069.63 +65199.1j 65287.055+65305.51j 65128.176+65151.844j
65267.273+65225.863j 65227.12 +65077.68j 65158.723+65170.797j
65072.26 +65127.195j 65257.086+65116.566j 65306.535+65215.52j
65111.49 +65133.59j 65126.69 +65289.844j 65182.35 +65372.855j
65108.535+65202.574j 65083.926+65196.133j 64985.72 +65256.83j
65164.42 +65163.65j 65128.11 +65188.49j 65121.094+65099.547j
65225.07 +65233.113j 65273.02 +65071.32j 65091.652+65293.81j
65122.56 +65185.633j 65043.355+65203.723j 65235.62 +65385.926j
65361.965+65139.914j 65233.855+65293.844j 65242.83 +65347.04j
65235.418+65294.39j 65231.715+65352.277j 65089.227+65282.79j
65270.2 +65192.484j 65116.113+65109.19j 65126.848+65325.945j
65098.14 +65027.234j 65180.434+65130.574j 65115.19 +65095.434j
65349.438+65217.96j 65057.887+65263.168j 65230.195+65226.83j
65397.56 +65272.01j 65446.703+65210.484j 65100.883+65398.22j
65140.59 +65309.32j 65289.555+65347.79j 65219.7 +65221.56j
65205.086+65104.023j 65322.035+65057.1j 65170.19 +65395.54j
65307.965+65300.742j 65251.168+65135.992j 65380.82 +65052.605j
65120.418+65038.52j 65164.625+65226.363j 65270.918+65312.81j
65135.195+65280.934j 65315.26 +65280.516j 65016.99 +65338.73j
65404.426+65148.023j 65368.33 +64879.938j 65126.477+65152.457j
65205.633+65197.01j 65061.52 +65146.04j 65192.785+65223.695j
65285.594+65066.82j 65358.523+65147.285j 65197.29 +65307.402j
65150.992+65298.633j 65131.47 +65077.71j 65100.273+65367.51j
65188.617+65149.75j 65165.38 +65263.45j 64952.477+65061.312j
65307.48 +65210.91j 65107.41 +65252.992j 65233.242+65139.305j
65104.312+65164.72j 65075.605+65326.2j 65282.16 +65212.58j
65005.188+65282.152j 65295.234+65260.926j 65082.79 +65010.73j
65160.4 +65092.027j 65052.523+65043.293j 65311.438+65014.258j
65153.266+65039.3j 65328.203+65062.445j 65255.258+65067.086j
65312.555+65021.754j 65321.914+65166.96j 65261.867+65262.285j
65058.324+65185.867j 65286.355+65111.082j 65375.37 +65120.355j
64960.34 +65133.445j 65278.29 +65225.168j 65232.746+65241.777j
65199.92 +65308.184j 65161.457+65088.j 65181.742+65214.035j
65308.707+65397.023j 65242.73 +65344.92j 65179.18 +64996.117j
65286.523+65175.07j 65308.207+65146.984j 65355.18 +65279.004j
65120.06 +65279.19j 65195.152+65307.133j 65070.89 +65112.17j
65256.79 +65073.36j 65313.33 +65010.973j 65162.688+65337.055j
65297.51 +65196.73j 65276.305+65300.71j 65329.098+65249.12j
65367.688+65068.473j 65038.86 +65442.4j 65181.68 +65163.61j
65313.508+65183.29j 64996.754+65328.06j 65221.57 +65136.816j
65141.156+65131.523j 65274.297+65218.03j 65066.8 +65309.203j
65173.91 +65126.945j 65184.523+65289.848j 65211.54 +65217.41j
65131.734+65142.906j 65014.73 +65147.723j 65311.312+65348.09j
65231.156+65018.02j 65198.21 +65252.832j 65239.105+65378.1j
65218.16 +65297.176j 65131.01 +65239.188j]]]
Top GitHub Comments
I think NumPy uses pairwise summation to improve accuracy in some cases; this is described in the Notes section of the `numpy.sum` documentation. It sounds like your case, with `axis=(0, 1)`, would fall under one of the cases where it is NOT used, though. Unfortunately I won't have time to look into this in the immediate future, but I think it is a potential starting point for investigating the discrepancy.
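To make the summation-order effect concrete, here is a minimal CPU-only sketch in plain NumPy. `naive_sum` and `pairwise_sum` are hypothetical helpers written for illustration; they are not NumPy's or CuPy's actual reduction code, and the block size of 8 is an arbitrary assumption. The input length 400 * 326 matches the number of elements reduced per output entry in the reproducer above.

```python
import numpy as np


def naive_sum(values):
    """Left-to-right accumulation, rounding to float32 after every add."""
    acc = np.float32(0.0)
    for v in values:
        acc = np.float32(acc + v)
    return acc


def pairwise_sum(values, block=8):
    """Simplified recursive pairwise summation in float32.

    Illustrative only: the block size and recursion scheme are assumptions,
    not a copy of NumPy's implementation.
    """
    n = len(values)
    if n <= block:
        return naive_sum(values)
    half = n // 2
    left = pairwise_sum(values[:half], block)
    right = pairwise_sum(values[half:], block)
    return np.float32(left + right)


rng = np.random.default_rng(0)
x = rng.random(400 * 326, dtype=np.float32)  # same count as an axis=(0, 1) reduction

reference = float(x.sum(dtype=np.float64))  # float64 accumulation as a baseline
naive_result = naive_sum(x)
pairwise_result = pairwise_sum(x)

print("naive    error:", abs(float(naive_result) - reference))
print("pairwise error:", abs(float(pairwise_result) - reference))
```

Both functions add exactly the same float32 values; only the grouping of the additions differs. Any gap between the two printed errors therefore comes from rounding order alone, which is the kind of effect suspected in this issue.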
Yes, I think so. It's hard to match the behavior. Maybe the most we can do is document this observation?
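If the observation does get documented, a short example along these lines might accompany it. This is only a sketch under stated assumptions: it runs on the CPU with NumPy so that it needs no GPU, the array is a scaled-down version of the reproducer's shape, and the idea that the same `dtype=` argument on `cupy.sum` would narrow the gap in the same way is an assumption that has not been verified here.

```python
import numpy as np

rng = np.random.default_rng(0)
# Scaled-down version of the reproducer's (400, 326, 287) array.
x = rng.random((400, 326, 4), dtype=np.float32)
axis = (0, 1)

# Default: accumulate in the input precision (float32).
y32 = x.sum(axis=axis)

# Ask for a float64 accumulator through the documented `dtype` argument.
y64 = x.sum(axis=axis, dtype=np.float64)

# Compare using a relative measure, since each sum is on the order of 6.5e4
# and float32 carries only about 7 significant decimal digits.
max_rel_dev = float(np.max(np.abs(y32.astype(np.float64) - y64) / np.abs(y64)))
print("max relative deviation (float32 vs float64 accumulation):", max_rel_dev)
```

The point of such a note would be that single-precision sums of this size should be compared with a relative tolerance rather than expected to agree digit for digit, and that a wider accumulator is available when tighter agreement matters.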