
Split debug symbols into a separate filesystem layer

See original GitHub issue

So in the lead up to the release of Unreal Engine 4.27 and the impending availability of prebuilt container images for all licensees, I’ve been thinking about the experience for developers who are pulling these images and what will happen when they pull different image variants. As can be seen in the generation script and build script for the official images, there will be two development images available which are based on ue4-minimal:

ghcr.io/epicgames/unreal-engine:dev-4.27.0, which includes debug symbols and templates
ghcr.io/epicgames/unreal-engine:dev-slim-4.27.0, which excludes debug symbols and templates

As the Dockerfile currently stands, most of the data in the largest filesystem layer (the one containing the Installed Build of the Engine) will be duplicated when pulling both image variants. The debug symbols are removed before the Installed Build is copied into a new build stage, so the COPY layer that includes debug symbols is completely unrelated to the one that excludes them. This isn’t ideal, since users who want both image variants will be pulling the same data twice.
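To make the duplication concrete, here is a minimal sketch of the pattern described above. It is not the actual ue4-docker Dockerfile: the stage names, paths and the `*.debug` naming are all hypothetical stand-ins chosen for illustration.

```dockerfile
# Hypothetical stand-in for the stage that creates the Installed Build
FROM ubuntu:20.04 AS builder
RUN mkdir -p /build/engine/bin && \
    echo 'binary' > /build/engine/bin/tool && \
    echo 'symbols' > /build/engine/bin/tool.debug

# Variant with debug symbols: a single COPY layer containing everything
FROM ubuntu:20.04 AS full
COPY --from=builder /build/engine /opt/engine

# Slim variant: the symbols are deleted *before* the COPY...
FROM builder AS stripped
RUN find /build/engine -name '*.debug' -delete

# ...so this COPY layer has different contents from the one in `full`,
# and the two layers cannot be shared even though most files are identical
FROM ubuntu:20.04 AS slim
COPY --from=stripped /build/engine /opt/engine
```

Because a layer is identified by its contents as a whole, pulling both `full` and `slim` would download `/opt/engine/bin/tool` twice in this sketch.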

The ideal situation is one where the debug symbols are in a separate layer that stacks on top of the Installed Build, ensuring any given file is only pulled once. This should be feasible if we move the files to a separate directory rather than deleting them, and then merge them in using a second COPY directive in the subsequent build stage. Unfortunately, we can’t just perform a straight merge this way, as evidenced when testing a simple example Dockerfile:

# Create some files and directories
FROM ubuntu:20.04 as first
RUN mkdir /root
RUN mkdir /root/a && mkdir /root/b && mkdir /root/a/c
RUN echo '1' > /root/a/1.txt
RUN echo '2' > /root/b/2.txt
RUN echo '3' > /root/a/c/3.txt

# Create additional files and directories under the same parent directory
FROM ubuntu:20.04 as second
RUN mkdir /root
RUN mkdir /root/d && mkdir -p /root/a/e
RUN echo '4' > /root/d/4.txt
RUN echo '5' > /root/a/e/5.txt

# Copy the first set of files and directories
FROM ubuntu:20.04 as final
COPY --from=first /root /root

# Attempt to merge the second set of files and directories with the first
# (These directives both fail with the error `mkdir: cannot create directory '/root': File exists`)
COPY --from=second /root /root
COPY --from=second /root/**/*.txt /root

I can think of two potential approaches to work around this limitation:

Copy the files to a separate parent directory in the COPY directive and then hardlink or symlink them into the correct locations using a RUN directive, since performing a move operation here will duplicate the file data in the new filesystem layer. (OCI container image filesystem layers have no concept of renaming a file or directory, just adding, modifying and removing.) This approach should be compatible across both Linux and Windows containers.
Use the --mount=type=bind feature from BuildKit to mount the separate parent directory from the previous build stage and then copy the files into place. This approach is far cleaner, but is only compatible with Linux containers until the efforts to add Windows container support to BuildKit are complete.
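As a rough sketch of what the two approaches might look like for Linux containers (this is not the implementation used in ue4-docker, and the stage names, paths and file names are hypothetical), the following Dockerfile moves the symbols aside in the builder stage and then layers them on top of a shared base. The first variant uses symlinks rather than hardlinks, on the assumption that a symlink adds no file data to the new layer under any storage driver; a Windows version would need different commands.

```dockerfile
# syntax=docker/dockerfile:1

# Hypothetical stand-in for the stage that creates the Installed Build.
# Debug symbols are moved to a separate directory tree instead of being deleted.
FROM ubuntu:20.04 AS builder
RUN mkdir -p /build/engine/bin /build/symbols/bin && \
    echo 'binary' > /build/engine/bin/tool && \
    echo 'symbols' > /build/symbols/bin/tool.debug

# Layer shared by both image variants: the Installed Build without symbols
FROM ubuntu:20.04 AS slim
COPY --from=builder /build/engine /opt/engine

# Approach 1: COPY the symbols to a separate parent directory,
# then symlink each file into the matching location in the engine tree
FROM slim AS full-symlink
COPY --from=builder /build/symbols /opt/symbols
RUN cp --recursive --symbolic-link /opt/symbols/. /opt/engine/

# Approach 2 (requires BuildKit): bind-mount the symbols from the builder
# stage and copy them into place, so only the RUN layer contains them
FROM slim AS full-bindmount
RUN --mount=type=bind,from=builder,source=/build/symbols,target=/mnt/symbols \
    cp --archive /mnt/symbols/. /opt/engine/
```

Each variant can be selected with `docker build --target <stage> .`, for example `--target slim` or `--target full-bindmount`. In both cases the symbols end up in a layer that stacks on top of the `COPY` layer from `slim`, which is the property the issue is after.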

It is worth noting that this improvement will do nothing to address the frustrations related to Windows filesystem size limit bugs, since the debug symbols will still be committed in the same filesystem layer as the Installed Build of the Engine in the build stage that creates them, and the split layers only exist in the final image.

@slonopotamus @TBBle what are your thoughts on the best way to approach this? Is there value in using the same approach across Linux and Windows for the sake of consistency, or should we use what works best for each platform? Are there any alternative approaches that I’ve overlooked?

Issue Analytics

State:
Created 2 years ago
Comments: 15 (5 by maintainers)

Top GitHub Comments

2 reactions

adamrehn commented, Aug 11, 2021

Initial implementation is in commit 4e6e64b, currently testing to ensure I haven’t inadvertently broken anything.

1 reaction

adamrehn commented, Aug 12, 2021

As I noted in my initial comment, the debug symbols are still committed in the same filesystem layer as the Installed Build itself when they’re created, and it’s only afterwards that we split them out. Sadly, I think we’ll still be bottlenecked by those bugs on the original filesystem layer before we ever get to the point of splitting.

