Make `buck fetch` smarter by keeping previously fetched artifacts
One important feature still missing from the current `buck fetch` implementation is preserving previously downloaded artifacts. This is needed to support builds across multiple branches. Suppose we have 100 libraries, lib-1-v1.jar, ..., lib-100-v1.jar, on the stable-v1 branch, and after switching to the stable-v2 branch we need lib-1-v2.jar, ..., lib-100-v2.jar. After fetching those and building, we switch back to stable-v1. The expectation is that, because of the earlier download, exactly 0 bytes are re-fetched.
The Gerrit Code Review Maven machinery implements this by maintaining its own cache of downloaded artifacts and linking them when they are requested:
https://github.com/gerrit-review/gerrit/blob/master/tools/download_file.py#L192
That way, we get a 0-byte re-fetch when we switch between branches, e.g.:
$ ls -all /home/davido/.gerritcodereview/buck-cache/downloaded-artifacts/closure-compiler-v*
-rw-r--r-- 1 davido users 3148980 Oct 21 20:30 /home/davido/.gerritcodereview/buck-cache/downloaded-artifacts/closure-compiler-v20141120.jar-369618bf5a96f73e32655dc48919c0f97558d3b1
-rw-r--r-- 1 davido users 1890852 Oct 21 20:30 /home/davido/.gerritcodereview/buck-cache/downloaded-artifacts/closure-compiler-v20141120-src.jar-a36e3c2823d1a09f458c858f7d4cac7759e05079
-rw-r--r-- 1 davido users 3110607 Nov 3 08:39 /home/davido/.gerritcodereview/buck-cache/downloaded-artifacts/closure-compiler-v20150505.jar-4078b23fd07d9c40b99a3996519666e935905103
-rw-r--r-- 1 davido users 1876499 Nov 3 08:39 /home/davido/.gerritcodereview/buck-cache/downloaded-artifacts/closure-compiler-v20150505-src.jar-f23318a8bef213d277da40a295f256f00b535d6f
-rw-r--r-- 2 davido users 6536925 Jan 5 08:24 /home/davido/.gerritcodereview/buck-cache/downloaded-artifacts/closure-compiler-v20151216.jar-b5cd14a356cd9079791ba10b6d9623ef4ae4df6e
-rw-r--r-- 2 davido users 2047215 Jan 5 08:24 /home/davido/.gerritcodereview/buck-cache/downloaded-artifacts/closure-compiler-v20151216-src.jar-670cac7a41bcdbc563de0a1d450000036f802f5c
And the actual artifact that Buck sees is a hard link:
davido@linux-ucwl:~/projects/gerrit/buck-out/gen/lib/codemirror/compiler-jar__download_bin/compiler-jar (master>)$ ls -l
total 6384
-rw-r--r-- 2 davido users 6536925 Jan 5 08:24 closure-compiler-v20151216.jar
Because Windows does not reliably support this kind of linking, it would need an alternative implementation there: copying the artifacts instead of linking them. Gerrit Code Review currently has no build option on this platform.
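The cache-then-link approach described above could look roughly like the following. This is only a minimal sketch, not Buck's or Gerrit's actual code: the function names `fetch_cached` and `sha1_of`, their parameters, and the `.tmp` suffix are hypothetical. The `<name>-<sha1>` cache file naming follows the directory listing above, and the copy fallback covers platforms where hard links are unavailable.

```python
import hashlib
import os
import shutil
import urllib.request


def sha1_of(path):
    """Return the hex SHA-1 digest of the file at `path`."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_cached(url, sha1, dest, cache_dir):
    """Place the artifact at `dest`, downloading only on a cache miss.

    Hypothetical helper: the cache lives outside the build output, so it
    survives branch switches and invalidation of the build cache.
    """
    os.makedirs(cache_dir, exist_ok=True)
    # Cache entries are keyed by file name and checksum: <name>-<sha1>.
    cached = os.path.join(cache_dir, "%s-%s" % (os.path.basename(dest), sha1))

    if not os.path.exists(cached):
        # Cache miss: download to a temporary file and verify it before
        # publishing it under its final cache name.
        tmp = cached + ".tmp"
        with urllib.request.urlopen(url) as resp, open(tmp, "wb") as out:
            shutil.copyfileobj(resp, out)
        if sha1_of(tmp) != sha1:
            os.remove(tmp)
            raise ValueError("SHA-1 mismatch for %s" % url)
        os.replace(tmp, cached)

    dest_dir = os.path.dirname(dest)
    if dest_dir:
        os.makedirs(dest_dir, exist_ok=True)
    if os.path.lexists(dest):
        os.remove(dest)
    try:
        # Hard link: no bytes are copied and nothing is re-fetched.
        os.link(cached, dest)
    except OSError:
        # Linking failed (unsupported or cross-filesystem): copy instead.
        shutil.copyfile(cached, dest)
    return cached
```

Because the cache key depends only on the artifact name and its checksum, switching back to a branch that needs an already cached version resolves from the cache without touching the network.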
Issue Analytics
- Created 8 years ago
- Comments:7 (7 by maintainers)
Top GitHub Comments
Given that we regularly upgrade the Buck version, every couple of weeks, it wouldn't work for us. To save bandwidth, storage of downloaded artifacts should be decoupled from the Buck cache. Even after cache invalidation, previously downloaded artifacts must not be re-fetched, but preserved and reused on a new `buck fetch` invocation. The argument for that is simple: if you re-fetch the same artifact, it will be the same no matter which version of Buck fetched it. Bandwidth is too valuable a resource to throw away previously downloaded artifacts and re-fetch gigabytes of data.
Correct, but if you downloaded it at that version, it'd be saved too.