Build context scan feels slow when directory contains large file tree
Environment: IntelliJ IDEA 2020.2 EAP Build #IU-202.5103.13 with bundled Docker plugin 202.5103.13. The Docker plugin apparently uses docker-java and connects to Docker Desktop for Windows via npipe. I could not find out the exact version of docker-java that is used.
Given the following minimal project setup:
- node_modules with 67k+ files
- Dockerfile with the single line FROM hello-world
- .dockerignore with the single line node_modules
Then run docker build . in the root directory.
Expected behaviour: The image is built almost instantly, because the build context contains only the Dockerfile (the large node_modules directory is ignored) and the Dockerfile itself has almost no instructions.
Observed behaviour: For more than 30 seconds, the screen shows “Building image…”, and after that only one file is added to the build context archive.
I assume the build traverses all 67k+ files in the node_modules directory, only to find that none of them has to be included. This would be very inefficient. When there are no exception rules that allow a file below node_modules, the scan could skip the whole directory.
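To make the suggestion concrete, here is a minimal sketch of a scan that prunes an ignored directory instead of descending into it. It is not docker-java's implementation: the class `PruningContextScanner` and its `scan` method are hypothetical names, ignore lines are treated as literal relative paths (no glob support), and the sketch refuses to run when any exception rule starting with `!` is present, because pruning is only safe without them.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not part of docker-java.
class PruningContextScanner {

    // Returns the files below root that belong to the build context.
    // Assumption: every ignore line is a literal path relative to root.
    static List<Path> scan(Path root, List<String> ignoreLines) throws IOException {
        Set<String> ignored = new HashSet<>();
        for (String line : ignoreLines) {
            String rule = line.trim();
            if (rule.startsWith("!")) {
                // With exception rules, a file below an ignored directory may
                // still be included, so skipping the subtree would be wrong.
                throw new IllegalArgumentException("exception rules are not handled by this sketch");
            }
            if (!rule.isEmpty() && !rule.startsWith("#")) {
                ignored.add(rule);
            }
        }

        List<Path> filesToAdd = new ArrayList<>();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {
                // Prune: none of the entries below an ignored directory are visited.
                if (ignored.contains(relative(root, dir))) {
                    return FileVisitResult.SKIP_SUBTREE;
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                if (!ignored.contains(relative(root, file))) {
                    filesToAdd.add(file);
                }
                return FileVisitResult.CONTINUE;
            }
        });
        return filesToAdd;
    }

    // Relative path with forward slashes, so rules behave the same on Windows.
    private static String relative(Path root, Path path) {
        return root.relativize(path).toString().replace('\\', '/');
    }
}
```

With the project layout from this report, `PruningContextScanner.scan(root, Arrays.asList("node_modules"))` would return only the Dockerfile, and the cost of the walk would no longer depend on the number of files below node_modules.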
Update: A unit test in docker-java-core, where testdir has the large folder as a child and .dockerignore is absent:
    @Test
    public void can_parse_dockerfile() throws IOException {
        File dockerFile = new File(".\\testdir\\Dockerfile");
        File baseDirectory = new File(".\\testdir");
        Dockerfile dockerfile = new Dockerfile(dockerFile, baseDirectory);
        // Measure only the build context scan
        Stopwatch stopwatch = Stopwatch.createStarted();
        Dockerfile.ScannedResult parse = dockerfile.parse();
        System.out.println("Parse took " + stopwatch.elapsed(TimeUnit.MILLISECONDS) + "ms");
        System.out.println(parse.filesToAdd.size() + " files added");
    }
Output:
    Parse took 28008ms
    4 files added
Issue Analytics
- Created 3 years ago
- Comments: 7 (4 by maintainers)
I did a bit of research on this.
At first glance, the idea of @viatra2 in https://github.com/docker-java/docker-java/issues/1409#issuecomment-640615130 sounds correct and promising.
Following that, I implemented a file walk with java.nio.file.Files.walkFileTree, but it did not traverse the tree considerably faster. Then I added a short circuit that prevents node_modules from being traversed further if it is ignored. But the exception rules (lines starting with !) bit me. Many combinations of exclude and include lines are possible, and their order in .dockerignore matters. It may thus be inevitable to visit each file and check its effective inclusion or exclusion. Nevertheless, one possible improvement I found is #1412.
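The ordering problem can be illustrated with a small sketch. It assumes the rule documented for .dockerignore that the last line matching a path decides whether the path is excluded, and it simplifies patterns to literal path prefixes (no globs). The class `IgnoreRuleOrderDemo` and its method `isExcluded` are hypothetical names and do not reflect docker-java's matcher.

```java
import java.util.List;

// Hypothetical illustration of "last matching rule wins"; not docker-java code.
class IgnoreRuleOrderDemo {

    // Returns true if path is excluded from the build context.
    // Assumption: each rule is a literal path prefix, optionally negated with '!'.
    static boolean isExcluded(String path, List<String> rules) {
        boolean excluded = false;
        for (String rule : rules) {
            boolean negated = rule.startsWith("!");
            String prefix = negated ? rule.substring(1) : rule;
            boolean matches = path.equals(prefix) || path.startsWith(prefix + "/");
            if (matches) {
                // A later matching rule overrides every earlier one.
                excluded = !negated;
            }
        }
        return excluded;
    }
}
```

With the rules `node_modules` followed by `!node_modules/keep/me.txt`, the path node_modules/keep/me.txt is included again, so a walker that skipped node_modules entirely would miss it. With the same two rules in reverse order, the file is excluded. A short circuit therefore seems safe only for directories that no later exception rule can reach.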
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.