Build context scan feels slow when directory contains large file tree

Environment: IntelliJ IDEA 2020.2 EAP Build #IU-202.5103.13 with the bundled Docker plugin 202.5103.13. The Docker plugin appears to use docker-java and connects to Docker Desktop for Windows via npipe. I could not find out the exact version of docker-java that is used.

Given the following minimal project setup:

  • node_modules with more than 67k files
  • Dockerfile with the single line FROM hello-world
  • .dockerignore with the single line node_modules

Then, run docker build . in the root directory.

Expected behaviour: The image is built almost instantly, because the build context contains only the Dockerfile (the large node_modules directory is ignored) and the Dockerfile itself has a single instruction.

Observed behaviour: For more than 30 seconds the screen shows “Building image…”, and after that only one file is added to the build context archive.

I assume the build traverses all 67k+ files in the node_modules directory, only to find that none of them has to be included. That would be very inefficient. When no exception rule allows a file below node_modules, the scan could skip the whole directory.
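
The idea of skipping an ignored directory can be sketched with java.nio.file.Files.walkFileTree and FileVisitResult.SKIP_SUBTREE. This is only an illustration, not how docker-java scans the build context. The class PruningContextScan and its scan method are hypothetical, and the sketch assumes the ignore list holds plain directory paths relative to the context root, with no wildcards and no ! exception rules.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

public class PruningContextScan {

    /**
     * Collects the files below baseDir, returned relative to baseDir.
     * Hypothetical helper: ignoredDirs holds directory paths relative to baseDir,
     * written with '/' separators. Wildcards and "!" rules are NOT supported.
     */
    static List<Path> scan(Path baseDir, List<String> ignoredDirs) throws IOException {
        List<Path> filesToAdd = new ArrayList<>();
        Files.walkFileTree(baseDir, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {
                // Normalize to '/' so the comparison also works on Windows.
                String relative = baseDir.relativize(dir).toString().replace('\\', '/');
                if (ignoredDirs.contains(relative)) {
                    // Do not descend: none of the files below are visited.
                    return FileVisitResult.SKIP_SUBTREE;
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                filesToAdd.add(baseDir.relativize(file));
                return FileVisitResult.CONTINUE;
            }
        });
        return filesToAdd;
    }
}
```

With an ignore list containing only node_modules, the walk never enters that directory, so its cost no longer depends on how many files are below it. The shortcut is only safe while no exception rule can re-include a file under the skipped directory.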

Update: I ran the following unit test in docker-java-core. testdir contains the large folder as a child, and .dockerignore is absent:

    @Test
    public void can_parse_dockerfile() throws IOException {
        File dockerFile = new File(".\\testdir\\Dockerfile");
        File baseDirectory = new File(".\\testdir");
        Dockerfile dockerfile = new Dockerfile(dockerFile, baseDirectory);
        // Time only the build context scan, not the setup above.
        Stopwatch stopwatch = Stopwatch.createStarted();
        Dockerfile.ScannedResult parse = dockerfile.parse();
        System.out.println("Parse took " + stopwatch.elapsed(TimeUnit.MILLISECONDS) + "ms");
        System.out.println(parse.filesToAdd.size() + " files added");
    }

Output:

    Parse took 28008ms
    4 files added

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

swiedenfeld commented, Jun 12, 2020 (1 reaction)

I did a bit of research on this.

At first glance, the idea of @viatra2 in https://github.com/docker-java/docker-java/issues/1409#issuecomment-640615130 sounds correct and promising.

Following that idea, I implemented a file walk with java.nio.file.Files.walkFileTree, but it did not traverse the tree considerably faster.

Then I added a short circuit that stops node_modules from being traversed any further if it is ignored. But the ! exception rules bit me. Many combinations of exclude and include lines are possible, and their order in .dockerignore matters. It may therefore be unavoidable to visit each file and check whether it is effectively included or excluded.
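
To illustrate why the order of lines matters, here is a minimal sketch of a "last matching rule wins" evaluation, which is how the Docker documentation describes .dockerignore. The class DockerignoreOrder and its methods are hypothetical and are not part of docker-java. As a simplifying assumption, a rule here is a literal path that matches itself and everything below it. Real .dockerignore patterns also support wildcards, which this sketch leaves out.

```java
import java.util.Arrays;
import java.util.List;

public class DockerignoreOrder {

    /** True if rulePath is relativePath itself or one of its parent directories. */
    static boolean matches(String rulePath, String relativePath) {
        return relativePath.equals(rulePath) || relativePath.startsWith(rulePath + "/");
    }

    /** Returns true if relativePath ends up excluded; the last matching rule decides. */
    static boolean isExcluded(List<String> rules, String relativePath) {
        boolean excluded = false;
        for (String rule : rules) {
            boolean negated = rule.startsWith("!");
            String rulePath = negated ? rule.substring(1) : rule;
            if (matches(rulePath, relativePath)) {
                // A later match overrides whatever an earlier rule decided.
                excluded = !negated;
            }
        }
        return excluded;
    }

    public static void main(String[] args) {
        List<String> rules = Arrays.asList("node_modules", "!node_modules/keep/index.js");
        System.out.println(isExcluded(rules, "node_modules/lodash/index.js")); // true
        System.out.println(isExcluded(rules, "node_modules/keep/index.js"));   // false

        // Same two lines in the opposite order: the exception is overridden.
        List<String> reversed = Arrays.asList("!node_modules/keep/index.js", "node_modules");
        System.out.println(isExcluded(reversed, "node_modules/keep/index.js")); // true
    }
}
```

Because a later ! line can re-include a file deep inside an excluded directory, a scanner may only skip a directory when it can prove that no later rule could match anything below it.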

Nevertheless, one possible improvement I found is #1412.
