Reflected-list and typed List produce slow concatenate for a list of equal-sized arrays
Hi, I am trying to take advantage of having N numpy arrays of shape=(1, 2) and implement a faster version of np.concatenate using Numba (for this particular case). I have tried this implementation:
import numpy as np
from numba import njit
from numba.typed import List

@njit
def _concat_equal1(arrays, out):
    # Copy the single row of each (1, 2) array into the preallocated output.
    for i in range(len(arrays)):
        out[i] = arrays[i][0]
    return out

def concat_equal1(arrays):
    out = np.empty(shape=(len(arrays), 2), dtype=float)
    return _concat_equal1(arrays, out)
The problem is that when arrays is a reflected list, the performance is really slow (especially when compared with numpy’s “optimized” concatenate function):
a = np.random.random(size=(1000, 2))
la = [ai[np.newaxis] for ai in a]
>>> %timeit np.concatenate(la)
361 µs ± 7.14 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
>>> %timeit concat_equal1(la)
10.7 ms ± 187 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
However, when I use a numba.typed.List I get much better performance:
>>> tla = List(la)
>>> %timeit concat_equal1(tla)
77.9 µs ± 1.23 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Which seems great! But the time it takes to construct a typed List is orders of magnitude worse:
>>> %timeit tla = List(la)
1.6 ms ± 23.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
This calls append for each item and performs really poorly (see the sketch below). Am I doing something wrong? Can this be achieved in other ways?
Thanks, Michael
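A minimal sketch of the constructor cost described above: converting the reflected list by appending items one at a time from the interpreter should take roughly as long as List(la) itself. The helper name build_typed_list is invented for illustration; this is not the actual Numba source.

from numba.typed import List

def build_typed_list(py_list):
    # Each append crosses from the interpreter into the typed-list
    # machinery one item at a time, which matches the slow construction
    # reported above.
    tl = List()
    for item in py_list:
        tl.append(item)
    return tl

# %timeit build_typed_list(la)   # expected to be comparable to %timeit List(la)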
@mishana great! Thanks for the updates and new benchmarks, I’ll hopefully update the reallocation strategy soon!
Nice to know, I’ll definitely take a look at them.
I kinda disagree, because I think the superior performance of numba.typed.List in the @njit compiled “habitat” has more to do with the LLVM optimizations (e.g., loop unrolling) than the append() itself. To showcase my theory, let us look at the following timings:

As you can see (and probably already know), JIT-compiling with LLVM does wonders for Python loop performance (in part through loop unrolling). It seems to me that the raw run time of the typed list’s append() (in an @njit compiled function) is still 50% slower than the cpython list’s implementation (in the interpreter).
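A small, self-contained sketch of the comparison described here (illustrative only: the function names are invented and the exact numbers will vary with machine and Numba version):

import numpy as np
from numba import njit
from numba.typed import List

@njit
def append_all_typed(rows):
    # typed-list append inside a compiled function
    out = List()
    for r in rows:
        out.append(r)
    return out

def append_all_python(rows):
    # plain CPython list append in the interpreter
    out = []
    for r in rows:
        out.append(r)
    return out

a = np.random.random(size=(1000, 2))
la = [ai[np.newaxis] for ai in a]
tla = List(la)
append_all_typed(tla)            # trigger compilation before timing
# %timeit append_all_typed(tla)
# %timeit append_all_python(la)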
Yes, I think the implementation has to be updated according to the cpython one. Take a look here for the description and rationale of this “List overallocation strategy” change.
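For context, a rough Python sketch of the growth rule being referred to (modelled on list_resize in CPython’s Objects/listobject.c for 3.9+; the exact constants differ between CPython versions, so treat the numbers as illustrative):

def cpython_new_allocated(newsize):
    # Over-allocate by about 1/8 plus a small constant, rounded down to a
    # multiple of 4, so repeated appends only reallocate occasionally.
    if newsize == 0:
        return 0
    return (newsize + (newsize >> 3) + 6) & ~3

# Simulate appending one item at a time and record each reallocation:
capacity, growth = 0, []
for newsize in range(1, 90):
    if newsize > capacity:
        capacity = cpython_new_allocated(newsize)
        growth.append(capacity)
# growth -> [4, 8, 16, 24, 32, 40, 52, 64, 76, 92]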