Cython and memviews creation
Not an issue, just a Cython-related PSA that we need to keep in mind when reviewing PRs:
We shouldn’t create a 1d view for each sample, because this is slow:
    cdef float[:, :] X = ...  # big 2d view
    for i in range(n_samples):  # same with prange, same with or without the GIL
        f(X[i])
Do this instead, or use pointers, at least for now:
    for i in range(n_samples):
        f(X, i)  # and work on X[i, :] in f's code
This applies to any pattern that generates lots of views, so looping over features might not be a good idea either if we expect many features. There might be a “fix” in https://github.com/cython/cython/issues/2227 / cython/cython#3617
The reason is that there’s a significant overhead when creating all these 1d views, which comes from Cython’s internal ref-counting (details at https://github.com/cython/cython/issues/2987). In the hist-GBDT prediction code, this overhead accounts for more than 30% of the runtime, so it’s not negligible.
Note that:
- Doing this with prange used to generate some additional Python interactions, but this was fixed in https://github.com/cython/cython/commit/794d21d929a60c0ff9f1aa92fc79cc79c1d4753d and backported to Cython 0.29.
- Now that no Python interactions are generated, we need to be extra careful with this because we won’t even see it in Cython annotated files.
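To make the two recommended alternatives concrete, here is a minimal sketch of both: indexing the 2d view inside the callee, and passing a raw pointer to the start of the row. The names `row_sum_indexed`, `row_sum_ptr`, `total` and `total_ptr` are hypothetical and chosen only for illustration; the pointer variant additionally assumes a C-contiguous array of doubles. It is not taken from the scikit-learn code base.

```cython
# cython: boundscheck=False, wraparound=False
# Illustrative sketch only; all function names here are hypothetical.

cdef double row_sum_indexed(double[:, :] X, Py_ssize_t i) nogil:
    # Work on row i by indexing the 2d view directly: no X[i] slice is built.
    cdef Py_ssize_t j
    cdef double acc = 0.0
    for j in range(X.shape[1]):
        acc += X[i, j]
    return acc

cdef double row_sum_ptr(const double* row, Py_ssize_t n_features) nogil:
    # Pointer variant: the caller passes the address of the row's first element.
    cdef Py_ssize_t j
    cdef double acc = 0.0
    for j in range(n_features):
        acc += row[j]
    return acc

def total(double[:, :] X):
    # Sum every element, passing (X, i) instead of creating X[i].
    cdef Py_ssize_t i
    cdef double acc = 0.0
    for i in range(X.shape[0]):
        acc += row_sum_indexed(X, i)
    return acc

def total_ptr(double[:, ::1] X):
    # Same computation with raw pointers; requires C-contiguous rows.
    cdef Py_ssize_t i
    cdef double acc = 0.0
    if X.shape[1] == 0:
        return acc
    for i in range(X.shape[0]):
        acc += row_sum_ptr(&X[i, 0], X.shape[1])
    return acc
```

Neither loop contains an `X[i]` expression, so no per-sample 1d view is created. Whether the pointer variant is worth the loss of generality (it rules out strided input) depends on the call site.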
Top GitHub Comments
FYI I don’t think https://github.com/cython/cython/pull/3617 will really help with speed here - it just makes
equivalent (speed-wise) to
You’d still be better off using your second version if you really need the best performance.
Shall we pin this issue (at least to easily come back to the discussion)?