Split debug symbols into a separate filesystem layer
So in the lead-up to the release of Unreal Engine 4.27 and the impending availability of prebuilt container images for all licensees, I’ve been thinking about the experience for developers who are pulling these images, and in particular what will happen when they pull more than one image variant. As can be seen in the generation script and build script for the official images, there will be two development images available which are based on ue4-minimal:
- ghcr.io/epicgames/unreal-engine:dev-4.27.0, which includes debug symbols and templates
- ghcr.io/epicgames/unreal-engine:dev-slim-4.27.0, which excludes debug symbols and templates
As the Dockerfile currently stands, the majority of the data in the largest filesystem layer (the one containing the Installed Build of the Engine) is going to be duplicated when pulling the two image variants. The removal of debug symbols takes place prior to copying the Installed Build into a new build stage, so the version of the COPY layer with debug symbols included is completely unrelated to the version without them. This isn’t ideal, since users who want both image variants will be pulling the same data twice.
The ideal situation is one where the debug symbols are in a separate layer that stacks on top of the Installed Build, ensuring any given file is only pulled once. This should be feasible if we move the files to a separate directory rather than deleting them, and then merge them in using a second COPY directive in the subsequent build stage. Unfortunately, we can’t just perform a straight merge this way, as evidenced when testing a simple example Dockerfile:
# Create some files and directories
FROM ubuntu:20.04 as first
RUN mkdir /root
RUN mkdir /root/a && mkdir /root/b && mkdir /root/a/c
RUN echo '1' > /root/a/1.txt
RUN echo '2' > /root/b/2.txt
RUN echo '3' > /root/a/c/3.txt
# Create additional files and directories under the same parent directory
FROM ubuntu:20.04 as second
RUN mkdir /root
RUN mkdir /root/d && mkdir -p /root/a/e
RUN echo '4' > /root/d/4.txt
RUN echo '5' > /root/a/e/5.txt
# Copy the first set of files and directories
FROM ubuntu:20.04 as final
COPY --from=first /root /root
# Attempt to merge the second set of files and directories with the first
# (These directives both fail with the error `mkdir: cannot create directory '/root': File exists`)
COPY --from=second /root /root
COPY --from=second /root/**/*.txt /root
I can think of two potential approaches to work around this limitation:
- Copy the files to a separate parent directory in the COPY directive and then hardlink or symlink them into the correct locations using a RUN directive, since performing a move operation here will duplicate the file data in the new filesystem layer. (OCI container image filesystem layers have no concept of renaming a file or directory, just adding, modifying and removing.) This approach should be compatible across both Linux and Windows containers.
- Use the --mount=type=bind feature from BuildKit to mount the separate parent directory from the previous build stage and then copy the files into place. This approach is far cleaner, but is only compatible with Linux containers until the efforts to add Windows container support to BuildKit are complete.
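To make the two options concrete, here is a rough Linux-only sketch of each. The paths /opt/engine, /opt/symbols and /mnt/symbols, the stage names and the placeholder files are all hypothetical and chosen purely for illustration; they are not the paths used by the real Dockerfile. The first variant uses symlinks rather than hardlinks, on the assumption that hardlinking a file from a lower layer may still trigger a copy of its data on some storage drivers.

```dockerfile
# syntax=docker/dockerfile:1

# Hypothetical builder stage: the "Installed Build" lives in /opt/engine and
# the debug symbols have been moved (not deleted) into a parallel tree under
# /opt/symbols that mirrors the same relative paths.
FROM ubuntu:20.04 AS builder
RUN mkdir -p /opt/engine/bin /opt/symbols/bin && \
    echo 'binary' > /opt/engine/bin/tool && \
    echo 'symbols' > /opt/symbols/bin/tool.debug

# Approach 1: COPY the symbols into their own parent directory (this layer
# holds the file data), then symlink each file into place in a small extra layer.
FROM ubuntu:20.04 AS with-symlinks
COPY --from=builder /opt/engine /opt/engine
COPY --from=builder /opt/symbols /opt/symbols
# GNU cp: -r recurses into directories, -s creates symlinks instead of copies
RUN cp -rs /opt/symbols/. /opt/engine/

# Approach 2 (requires BuildKit): bind-mount the symbols from the builder
# stage and copy them into place, so the data is written to a single new layer.
FROM ubuntu:20.04 AS with-bind-mount
COPY --from=builder /opt/engine /opt/engine
RUN --mount=type=bind,from=builder,source=/opt/symbols,target=/mnt/symbols \
    cp -a /mnt/symbols/. /opt/engine/
```

Either variant can be selected with the --target flag of docker build (for example --target with-bind-mount). In both cases the layer created by COPY --from=builder /opt/engine is identical to the one a slim image would use, which is the property that allows it to be shared between the two image variants.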
It is worth noting that this improvement will do nothing to address the frustrations related to Windows filesystem size limit bugs, since the debug symbols will still be committed in the same filesystem layer as the Installed Build of the Engine in the build stage that creates them, and the split layers only exist in the final image.
@slonopotamus @TBBle what are your thoughts on the best way to approach this? Is there value in using the same approach across Linux and Windows for the sake of consistency, or should we use what works best for each platform? Are there any alternative approaches that I’ve overlooked?
Initial implementation is in commit 4e6e64b, currently testing to ensure I haven’t inadvertently broken anything.
As I noted in my initial comment, the debug symbols are still committed in the same filesystem layer as the Installed Build itself when they’re created, and it’s only afterwards that we split them out. Sadly, I think we’ll still be bottlenecked by those bugs on the original filesystem layer before we ever reach that point.