Filling np.array very slow because of Carbontracker

Filling a pre-allocated array is slowed down by a factor of ~70 when using carbontracker. See the minimal example below. Am I doing anything wrong? How can we avoid this?

import time
import numpy as np
from carbontracker.tracker import CarbonTracker


def load_data(length, data_shape):
    # Pre-allocate the array, then fill it row by row.
    data = np.zeros((length, *data_shape))
    for i in range(length):
        data[i] = np.random.random(data_shape)
    return data


if __name__ == '__main__':
    length = 10000
    shape = (16000,)

    # Baseline: no tracker has been instantiated yet.
    tt = time.time()
    data = load_data(length, shape)
    print(f'Without CT : {time.time() - tt} seconds')

    # The tracker is only instantiated; epoch_start() is never called.
    tracker = CarbonTracker(epochs=1, monitor_epochs=1, log_dir='./')
    tt = time.time()
    data = load_data(length, shape)
    print(f'With CT : {time.time() - tt} seconds')

Issue Analytics

Created 2 years ago
Comments: 6 (3 by maintainers)

Top GitHub Comments

lfwa commented, Jul 6, 2021

Reopening this issue as a reminder that instantiating CarbonTracker and not starting the tracker using tracker.epoch_start() will slow down other code.

The issue will be closed once it is fixed.
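One plausible mechanism for this slowdown, offered as an assumption rather than a confirmed diagnosis, is that a background thread polling a flag without ever sleeping competes with the main thread for Python's global interpreter lock. The sketch below uses only the standard library to time a pure-Python workload with no poller, with a busy poller, and with a poller that sleeps while idle. The names idle_poller, workload and timed_workload are hypothetical and are not part of carbontracker.

```python
import threading
import time


def idle_poller(state, idle_sleep):
    """Loop until state['running'] is cleared; 'measuring' is never set here."""
    while state["running"]:
        if not state["measuring"]:
            if idle_sleep:
                # Yield the CPU (and the GIL) while there is nothing to do.
                time.sleep(idle_sleep)
            continue


def workload(n=200_000):
    """Pure-Python loop standing in for the user's own code."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def timed_workload(idle_sleep=None, with_poller=True):
    """Return the seconds workload() takes, optionally next to a poller thread."""
    state = {"running": True, "measuring": False}
    thread = None
    if with_poller:
        thread = threading.Thread(
            target=idle_poller, args=(state, idle_sleep), daemon=True
        )
        thread.start()
    start = time.perf_counter()
    workload()
    elapsed = time.perf_counter() - start
    state["running"] = False
    if thread is not None:
        thread.join()
    return elapsed


if __name__ == "__main__":
    print(f"no poller       : {timed_workload(with_poller=False):.4f} s")
    print(f"busy poller     : {timed_workload(idle_sleep=None):.4f} s")
    print(f"sleeping poller : {timed_workload(idle_sleep=0.001):.4f} s")
```

The exact timings depend on the machine and the Python version, so no expected output is given; the point is only to compare the three numbers on your own setup.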

rhosch97 commented, Jun 9, 2022

Adding a time.sleep() with a short but sufficient duration in CarbonTrackerThread (see below) solved the problem for me. I use 1 ms, which is fairly long; the best value is still to be determined and can probably be shorter. This probably avoids clogging up the CPU with billions of accesses to the self.measuring attribute 😃 I'm not sure whether this solution is scalable and fault-proof for the whole tool, but it could be a hint!

def run(self):
    """Thread's activity."""
    try:
        self.begin()
        while self.running:
            if not self.measuring:
                # Sleep briefly instead of spinning while idle.
                time.sleep(0.001)
                continue
            self._collect_measurements()
            time.sleep(self.update_interval)

        # Shutdown in thread's activity instead of epoch_end() to ensure
        # that we only shutdown after last measurement.
        self._components_shutdown()
    except Exception as e:
        self._handle_error(e)

edit: replaced screen capture with code
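As an alternative to polling with a short sleep, a worker thread can block on a threading.Event until measuring is switched on, which avoids picking a sleep duration at all. This is a different technique from the patch above and is not how carbontracker is implemented; the class EventGatedWorker and its attributes are hypothetical, and the sample counter is only a placeholder for real measurement work.

```python
import threading
import time


class EventGatedWorker(threading.Thread):
    """Hypothetical worker that blocks, rather than spins, while idle."""

    def __init__(self, update_interval=0.01, idle_timeout=0.1):
        super().__init__(daemon=True)
        self.update_interval = update_interval
        self.idle_timeout = idle_timeout
        self.measuring = threading.Event()  # set to start measuring
        self.stopping = threading.Event()   # set to request shutdown
        self.samples = 0

    def run(self):
        while not self.stopping.is_set():
            # Block until measuring is enabled; the timeout only exists so
            # that a stop request is noticed while the worker is idle.
            if not self.measuring.wait(timeout=self.idle_timeout):
                continue
            self.samples += 1  # placeholder for collecting a measurement
            # Wait between samples, but wake up early if asked to stop.
            self.stopping.wait(self.update_interval)

    def stop(self):
        self.stopping.set()
        self.join()


if __name__ == "__main__":
    worker = EventGatedWorker()
    worker.start()
    time.sleep(0.05)        # idle: the worker is blocked, not spinning
    worker.measuring.set()  # start collecting samples
    time.sleep(0.1)
    worker.stop()
    print(f"collected {worker.samples} samples")
```

Calling measuring.clear() would pause sampling again; the trade-off is that the flag becomes an Event object instead of a plain boolean attribute.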