Filling np.array very slow because of Carbontracker
Filling a pre-allocated array is slowed down by a factor of ~70 when using carbontracker. See the minimal code below.
Am I doing anything wrong? How can we avoid this?
import time
import numpy as np
from carbontracker.tracker import CarbonTracker


def load_data(length, data_shape):
    data = np.zeros((length, *data_shape))
    for i in range(length):
        data[i] = np.random.random(data_shape)
    return data


if __name__ == '__main__':
    l = 10000
    shape = (16000, )

    tt = time.time()
    data = load_data(l, shape)
    print(f'Without CT : {time.time() - tt} seconds')

    tracker = CarbonTracker(epochs=1, monitor_epochs=1, log_dir='./')

    tt = time.time()
    data = load_data(l, shape)
    print(f'With CT : {time.time() - tt} seconds')
Issue Analytics
- State:
- Created 2 years ago
- Comments:6 (3 by maintainers)
Top GitHub Comments
Reopening this issue as a reminder that instantiating CarbonTracker and not starting the tracker using tracker.epoch_start() will slow down other code. The issue will be closed once it is fixed.
Adding a time.sleep() with a small but sufficient delay in CarbonTrackerThread() (see below) solved the problem for me. The best value is still to be determined: I use a fairly long delay of 1 ms, but I think it can be shorter. This probably avoids clogging up the CPU with billions of accesses to the self.measuring attribute 😃 I'm not sure whether this solution is scalable and fault-proof for the whole tool, but it could be a hint!
edit: replaced screen capture with code
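The commenter's code is not reproduced in this copy of the thread. As a rough illustration of the idea only, here is a minimal sketch of a background thread that polls a flag in a tight loop, with an optional short sleep between polls. The class PollingThread, the helper count_polls, and the attributes sleep_seconds, running and polls are hypothetical names made up for this sketch; this is not carbontracker's actual implementation, and only the measuring attribute name is taken from the comment above.

```python
import threading
import time


class PollingThread(threading.Thread):
    """Hypothetical stand-in for a background monitor thread.

    This is NOT carbontracker's code; it only mimics a loop that keeps
    checking a `measuring` flag while no measurement is active.
    """

    def __init__(self, sleep_seconds=0.0):
        super().__init__(daemon=True)
        self.sleep_seconds = sleep_seconds  # 0.0 means busy-wait
        self.measuring = False              # flag polled by the loop
        self.running = True                 # set to False to stop the thread
        self.polls = 0                      # number of idle polls performed

    def run(self):
        while self.running:
            if self.measuring:
                pass  # measurement work would go here
            else:
                self.polls += 1
                if self.sleep_seconds > 0:
                    # Sleeping releases the GIL, so other Python threads
                    # (e.g. the one filling the array) are not starved.
                    time.sleep(self.sleep_seconds)


def count_polls(sleep_seconds, duration=0.2):
    """Run the idle polling loop for `duration` seconds and return the poll count."""
    thread = PollingThread(sleep_seconds)
    thread.start()
    time.sleep(duration)
    thread.running = False
    thread.join()
    return thread.polls


if __name__ == '__main__':
    print(f'Idle polls without sleep : {count_polls(0.0)}')
    print(f'Idle polls with 1 ms sleep : {count_polls(0.001)}')
```

Without the sleep, the idle loop competes with the main thread for the interpreter lock on every iteration, which is one plausible explanation for the slowdown reported in this issue. With a 1 ms sleep the loop wakes up at most about a thousand times per second, at the cost of reacting up to about 1 ms later when measuring is switched on.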