Investigate methods of making R builds faster

I recently spoke with @karthik, who mentioned that our R builds (with install_packages) seem to be going really slowly. There could be a couple of problems, which I’ll list here:

It’s possible to install some R packages in Ubuntu much faster by installing binaries. We could recommend this in the documentation for specifying R packages and such…

relevant blog post: http://dirk.eddelbuettel.com/blog/2017/12/13/
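
As a rough sketch of the binary route: Debian and Ubuntu ship many CRAN packages as `r-cran-<name>` binaries, with the name lowercased. The helper functions and the package list below are hypothetical, and not every CRAN package has a binary, so the generated names still need to be checked against the archive. The script only prints the install command; it does not run it.

```shell
#!/bin/sh
# Sketch: translate CRAN package names into Debian/Ubuntu binary package names.
# Assumption: the r-cran-<lowercased name> naming convention. A binary may not
# exist for every package, so treat the output as candidates, not guarantees.
cran_to_apt() {
  for pkg in "$@"; do
    printf 'r-cran-%s\n' "$(printf '%s' "$pkg" | tr '[:upper:]' '[:lower:]')"
  done
}

# Build (but do not run) the apt-get command for a list of CRAN packages.
apt_install_command() {
  printf 'apt-get install -y --no-install-recommends %s\n' \
    "$(cran_to_apt "$@" | paste -s -d ' ' -)"
}

# Hypothetical package list, for illustration only.
apt_install_command Rcpp data.table sf
```

Packages without a binary would still have to be installed from source in the usual way.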

old points:

1. mybinder.org may not have enough RAM, which is causing the build to be really slow for certain packages (like the tidyverse). Apparently many R packages have intermediate steps during install that use multiple gigs of RAM.
2. We aren’t using some binary packages even though they are available. repo2docker seems to be building everything from source, even though for some packages there are binaries out there. We could investigate to see if this is an option!

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Reactions: 1
  • Comments: 40 (18 by maintainers)

Top GitHub Comments

2 reactions
LennertSchepers commented, Jan 21, 2019

My example mentioned above built successfully (albeit without sf installed, since I hadn’t updated the MRAN date). Not sure how long it took to build (certainly over an hour), but this morning when I clicked the button it launched from the cached version nice and spiffy-like.

I just tried the @cboettig example (https://mybinder.org/v2/gh/cboettig/r/master?urlpath=rstudio), but it looks like the sf package is still not installed, maybe because sf depends on quite some spatial libraries (e.g. GDAL, GEOS,…)?

Nevertheless, having never used docker, I found it very easy to copy this small Dockerfile (linking to rocker/binder) to my repository: https://github.com/rocker-org/binder/blob/master/binder/Dockerfile. Now sf and all other packages work like a charm. I’m impressed by Binder, thank you!

1 reaction
cboettig commented, Dec 1, 2018

@choldgraf “typical” will vary widely of course, but it’s realistic or even small for large spatial analysis.

  • One strategy would be to improve the binary support: Does the apt.txt or whatever it is support users adding PPAs? Otherwise, just adding the https://launchpad.net/~marutter/+archive/ubuntu/c2d4u PPA to the base image seems like a good start so that most packages can be installed from binary. https://launchpad.net/~ubuntugis/+archive/ubuntu/ppa is another popular PPA for folks doing any spatial data. (this is the route we have taken with the r-apt stack in rocker, https://github.com/rocker-org/rocker/tree/master/r-apt)

  • Another option is to pre-install more common things on the base image (though this might need more documentation to avoid having pre-installed packages just get re-installed. If users write a DESCRIPTION file for dependencies and use devtools::install() in install.r this isn’t an issue, but if they write direct calls to install.packages(), the default behavior will re-install packages explicitly requested). Of course pre-building means identifying such a ‘common stack’ and then doing more maintenance on the binder end. (as you know, this is the route we’ve taken with the ‘versioned’ stack in rocker)

  • Maybe I’m just out of step with the general thinking here, but really so long as builds are cached, a one-time 1 hr wait doesn’t seem so bad to me. I tickle the build the first time I put binder up, and check back later and it’s built. If the repo is getting much traffic at all, there’s almost always a cached image there. Really, I think your current system works remarkably well, and if it ain’t broke… but yeah, maybe I’m in the minority on that
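
For the PPA suggestion above, a minimal sketch of what enabling the two archives could look like. It assumes the standard `ppa:<owner>/<name>` shorthand accepted by `add-apt-repository`; the helper functions are hypothetical, and the script prints the commands rather than running them, since whether repo2docker's base image permits this is exactly the open question.

```shell
#!/bin/sh
# Sketch: derive the ppa:<owner>/<name> shorthand from a Launchpad archive URL.
# Prints nothing if the URL does not match the expected layout.
launchpad_url_to_ppa() {
  printf '%s\n' "$1" |
    sed -n 's#^https://launchpad\.net/~\([^/]*\)/+archive/ubuntu/\([^/]*\)/*$#ppa:\1/\2#p'
}

# Print (rather than run) the commands that would enable each PPA.
# Running them for real needs root and the add-apt-repository tool.
print_ppa_setup() {
  for url in "$@"; do
    printf 'add-apt-repository -y %s\n' "$(launchpad_url_to_ppa "$url")"
  done
  printf 'apt-get update\n'
}

print_ppa_setup \
  "https://launchpad.net/~marutter/+archive/ubuntu/c2d4u" \
  "https://launchpad.net/~ubuntugis/+archive/ubuntu/ppa"
```

Once the archives are enabled, binary packages from them can be installed with the usual `apt-get install`.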

