Performance speed-up options?
Hello Miles! Thank you for open-sourcing this powerful tool! I am working on including PySR in my own research, and running into some performance bottlenecks.
I found that regressing a simple equation (e.g. the quick-start example) takes roughly 2 minutes. Ideally, I am aiming to reduce that time to ~30 seconds. Would you give me some pointers on this? Meanwhile, I will try to break down the challenge into several pieces:
- Activating a new environment at each API call: I noticed that a new Julia (?) environment is created each time I call the `pysr()` API (see terminal output below). Could we keep the environment up so we can skip this process for subsequent calls?
Running on julia -O3 /tmp/tmpe5qmgemh/runfile.jl
Activating environment at `~/anaconda3/envs/rw/lib/python3.7/site-packages/Project.toml`
Updating registry at `~/.julia/registries/General`
No Changes to `~/anaconda3/envs/rw/lib/python3.7/site-packages/Project.toml`
No Changes to `~/anaconda3/envs/rw/lib/python3.7/site-packages/Manifest.toml`
Activating environment on workers.
Activating environment at `~/anaconda3/envs/rw/lib/python3.7/site-packages/Project.toml`
Activating environment at `~/anaconda3/envs/rw/lib/python3.7/site-packages/Project.toml`
Activating Activating environment at `~/anaconda3/envs/rw/lib/python3.7/site-packages/Project.toml`
environment at `~/anaconda3/envs/rw/lib/python3.7/site-packages/Project.toml`
Activating Activating environment at `~/anaconda3/envs/rw/lib/python3.7/site-packages/Project.toml`
environment at `~/anaconda3/envs/rw/lib/python3.7/site-packages/Project.toml`
Importing installed module on workers...Finished!
Started!
- If the above wouldn't work, then allowing `y` to be vector-valued (as mentioned in #35) would be a second-best option! Even better would be a "batched" version of the `pysr(X, y)` API, `pysr_batched(X, y)`, such that `X` and `y` are Python lists and we return the results in a list as well, so that we only generate one Julia script and call `os.system()` once to keep the Julia environment up.
- Multi-threading: I noticed that increasing `procs` from 4 to 8 resulted in a slightly longer running time. I am running on an 8-core, 16-thread CPU. Did I do something dumb?
- I went into `pysr/sr.py` and added the `runtests=false` flag on lines 438 and 440. That saved ~20 seconds.
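To make the batched request concrete, here is a minimal sketch of the calling convention only. `pysr_batched` does not exist in PySR; the `fit` parameter and the `dummy_fit` stand-in are hypothetical names introduced for illustration. A plain loop like this one would not save any start-up time by itself. The saving would only appear if the backend started Julia once for the whole batch.

```python
from typing import Callable, List, Sequence, TypeVar

R = TypeVar("R")


def pysr_batched(
    Xs: Sequence[Sequence[Sequence[float]]],
    ys: Sequence[Sequence[float]],
    fit: Callable[[Sequence[Sequence[float]], Sequence[float]], R],
) -> List[R]:
    """Hypothetical batched wrapper: one (X, y) pair in, one result out, in order.

    `fit` stands in for a single-dataset call such as pysr(X, y).
    """
    if len(Xs) != len(ys):
        raise ValueError("Xs and ys must contain the same number of datasets")
    return [fit(X, y) for X, y in zip(Xs, ys)]


def dummy_fit(X: Sequence[Sequence[float]], y: Sequence[float]) -> float:
    # Stand-in for the real regression call: just returns the mean of y.
    return sum(y) / len(y)


results = pysr_batched(
    Xs=[[[0.0], [1.0]], [[2.0], [3.0]]],
    ys=[[1.0, 3.0], [5.0, 7.0]],
    fit=dummy_fit,
)
print(results)
```

Passing the fitting function as an argument keeps the sketch independent of any particular PySR version.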
Issue Analytics
- Created 2 years ago
- Comments:20 (13 by maintainers)

It doesn’t say “Progress: 1 / …”, right? It’s stuck at “Progress: 0”? This is expected behaviour, although maybe I should wait for some equations before starting the printing.
By the way - on PySR 0.6.5, which will be up later today - I added a patch which boosts performance by nearly 2x. It turns out the optimization library I was using (main bottleneck) did not require a differentiable function, so I implemented a faster non-differentiable version.
More ideas, which would probably help quite a bit:
- Use `Threads` instead of `Distributed`. That would cut down on startup time quite a bit, since you are only using a single Julia process instead of one for each `procs`.
- Launch the worker processes with `julia -p {procs}`, rather than create them dynamically and copy in user definitions.
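As a rough illustration of the two launch styles, the sketch below builds the Julia command line in Python. `julia -p N` (start N worker processes) and `julia --threads N` (one process with N threads) are standard Julia command-line options, and `-O3` appears in the log above. The helper `build_julia_command` and its parameters are hypothetical and are not how `pysr/sr.py` constructs its command.

```python
from typing import List


def build_julia_command(runfile: str, workers: int, use_threads: bool = False) -> List[str]:
    """Return an argument list for launching Julia on `runfile`.

    use_threads=False -> `julia -O3 -p N runfile`        (N worker processes)
    use_threads=True  -> `julia -O3 --threads N runfile` (one process, N threads)
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    flag = "--threads" if use_threads else "-p"
    return ["julia", "-O3", flag, str(workers), runfile]


# Placeholder script name, used here for illustration only.
distributed_cmd = build_julia_command("runfile.jl", workers=4)
threaded_cmd = build_julia_command("runfile.jl", workers=4, use_threads=True)
print(" ".join(distributed_cmd))
print(" ".join(threaded_cmd))
```

Either list could be handed to `subprocess.run`. Whether the threaded variant is actually faster depends on the Julia code being written for threads, which is an assumption here.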