Unit tests use up to 20GB of memory on circleci

Problem

While debugging intermittent test failures on PR #1410, @christopherbunn and I measured memory usage of the unit tests on circleci and found a complete end-to-end run can use up to 20GB at peak.

That’s way more than I would have expected… the question is, why?

Observations

We ssh’ed into a circleci box running on main and ran the following using memory-profiler:

mprof run --include-children pytest evalml/ -n 8 --doctest-modules --cov=evalml --junitxml=test-reports/junit.xml --doctest-continue-on-failure -v

This created the following plot, which can be viewed with mprof plot: [image: mprof_4.png]

I ran this twice and got a similar plot, so the results appear to be consistent across runs.
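For a quick cross-check without memory-profiler, here is a minimal sketch using only the Python standard library. It is not what was run above and it measures something different: mprof with --include-children sums the memory of the process and its children over time, while ru_maxrss reports the peak of the single largest child that has been waited for, so with 8 workers it will read well below the combined total. The helper name peak_child_rss is made up for illustration, and the resource module is Unix-only.

```python
import resource
import subprocess
import sys


def peak_child_rss(cmd):
    """Run cmd to completion, then return ru_maxrss for waited-for children.

    The value is the largest resident set size of any single descendant
    that has been waited for so far in this process (KiB on Linux,
    bytes on macOS). It is not a sum across processes.
    """
    subprocess.run(cmd, check=True)
    return resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss


if __name__ == "__main__":
    # Stand-in command that allocates about 50 MB; swap in the pytest
    # invocation to measure the real test run.
    cmd = [sys.executable, "-c", "x = bytearray(50_000_000)"]
    print(peak_child_rss(cmd))
```

Because the counter only ever grows during the life of the calling process, run each measurement from a fresh interpreter if you want to compare commands.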

This is dangerously close to the max memory allowed on the circleci worker size we’re using. That’s why we started looking into this – on #1410, we saw that the memory usage went 5GB higher for some reason.

Issue Analytics

  • State:closed
  • Created 3 years ago
  • Reactions:2
  • Comments:11 (8 by maintainers)

Top GitHub Comments

1 reaction
freddyaboulton commented, Nov 24, 2020

We noticed that we can shave 1.5GB from just the automl tests (almost half!) by manually setting n_jobs=1 for all estimators used by automl (plots below). We verified that the value of n_jobs is a factor only in the few automl tests that don’t mock fit and score. Based on this, we have come up with the following plan:

  1. For each component which accepts n_jobs as a parameter (i.e. the sklearn-based estimators), make sure we have one unit test which sets n_jobs=-1, to verify that it works properly for that component.
  2. For all other unit tests which don’t mock the underlying fit, set n_jobs=1 for all components to avoid memory and threading issues.
  3. Make sure that in looking glass we’re running with n_jobs=-1, which I believe we currently are, since the default value of n_jobs for the relevant estimators is -1.

Hopefully once this is done, we’ll see some nice improvements to the overall memory footprint of the unit tests!
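One way the first two steps of the plan could look in test code is sketched below. Everything here is hypothetical: the two stand-in classes take the place of real components, and the helper names accepts_n_jobs and n_jobs_overrides, as well as the per-component dictionary shape, are assumptions for illustration rather than evalml's actual API.

```python
import inspect


class RandomForestStandIn:
    """Hypothetical stand-in for an sklearn-based estimator component."""

    def __init__(self, n_estimators=100, n_jobs=-1):
        self.n_estimators = n_estimators
        self.n_jobs = n_jobs


class ImputerStandIn:
    """Hypothetical stand-in for a component with no n_jobs parameter."""

    def __init__(self, strategy="mean"):
        self.strategy = strategy


def accepts_n_jobs(component_cls):
    """Return True if the component's constructor takes an n_jobs argument."""
    return "n_jobs" in inspect.signature(component_cls.__init__).parameters


def n_jobs_overrides(component_classes, n_jobs=1):
    """Build per-component overrides, skipping components without n_jobs."""
    return {
        cls.__name__: {"n_jobs": n_jobs}
        for cls in component_classes
        if accepts_n_jobs(cls)
    }


if __name__ == "__main__":
    components = [RandomForestStandIn, ImputerStandIn]
    # Step 2: tests that really call fit pin every estimator to one job.
    print(n_jobs_overrides(components))
    # Step 1: one dedicated test per component still exercises n_jobs=-1.
    print(n_jobs_overrides(components, n_jobs=-1))
```

Deriving the overrides from the constructor signature means a newly added estimator gets pinned to n_jobs=1 in tests without anyone having to remember to list it.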

[images: automl_tests_jobs-1, automl_tests_njobs_1]

1 reaction
gsheni commented, Nov 20, 2020

@thehomebrewnerd @freddyaboulton We could use an inline import for the sklearn import (so it only runs when you call the mutual info function). I have seen us explicitly do this in a few other libraries. We generally do it to avoid circular imports. It would feel weird to do it just to save memory…
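As a rough illustration of the inline-import idea, the sketch below moves an import inside the function that needs it. The standard-library statistics module stands in for the sklearn import so the example runs anywhere, and mean_score is a made-up name, not the actual mutual info function.

```python
def mean_score(values):
    """Average a list of scores, importing the dependency on first use."""
    # Function-local import: the module is loaded the first time this
    # function is called, not when the enclosing package is imported.
    # `statistics` is only a stand-in for the heavier sklearn import.
    import statistics

    return statistics.fmean(values)


if __name__ == "__main__":
    print(mean_score([1.0, 2.0, 3.0]))
```

After the first call the module is cached in sys.modules, so later calls pay only a dictionary lookup. The memory is still used once the function runs; the saving applies only to processes that never call it.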
