
Docker multi-stage build Dockerfile best practices

See original GitHub issue
  • I have searched the issues of this repo and believe that this is not a duplicate.
  • I have searched the documentation and believe that my question is not covered.

Question

I continue to try to replace our pip/setuptools-based system with poetry, but hit a new snag when it comes to how we build our Docker images.

Here’s the basic pattern we use for our Docker images, in a build and a deploy stage:

  1. (build) Resolve all dependencies and build wheels for them
  2. (build) Build the actual project as a wheel too
  3. (deploy) Take all of those wheels we built and install them into a lightweight image (that has no build tools)

Here’s how this translates into a Dockerfile:

# ---------------------------------------------------------------------------------------
# BUILD
# ---------------------------------------------------------------------------------------

FROM gitlab.example.com:4567/namespace/build-base:py37-1.0.4 as builder

RUN mkdir -p /build/wheels

# This is separated out to take advantage of caching
ADD requirements.txt /tmp/requirements.txt

RUN pip3.7 wheel --trusted-host pypi.example.com  \
    --wheel-dir=/tmp/python-wheels --index-url http://pypi.example.com/simple/ \
    -r /tmp/requirements.txt

ADD . /src
WORKDIR /src

RUN pip3.7 wheel --find-links /tmp/python-wheels --trusted-host=pypi.example.com --wheel-dir=/build/wheels .

# ---------------------------------------------------------------------------------------
# DEPLOY
# ---------------------------------------------------------------------------------------

FROM gitlab.example.com:4567/namespace/deploy-base:py37-1.0.0 as deploy

WORKDIR /opt/app

# Copy the already-built wheels
COPY --from=builder /build/wheels /tmp/wheels

# Install into main system python.
RUN pip3.7 install --no-cache-dir /tmp/wheels/* && rm -rf /tmp/wheels

CMD ["myproject-server"]

How do I do this with poetry? In the most short-sighted form, I’d like to know how to collect all dependencies as wheels in order to match this pattern.

However, my real requirement here is just to have separate build and deploy stages where the deploy image has no python (or lower-level) build-related tools installed (but does have pip) and simply takes artifacts from the build image.

(I suppose one idea would be to treat the entire virtualenv from the build stage as an artifact? That seems a little dirty, but provided the base OS images were the same, might work?)
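One way to keep the wheel-building pattern above while adopting poetry (a sketch, not an endorsed answer: it assumes a Poetry version that provides the `export` command, and it reuses the `pypi.example.com` index and builder base image from the Dockerfile above) is to have poetry emit a locked requirements file and leave the existing `pip wheel` flow untouched:

```dockerfile
# Build stage only: export the locked dependencies to a requirements file,
# then build wheels with pip exactly as before. The deploy stage is unchanged.
FROM gitlab.example.com:4567/namespace/build-base:py37-1.0.4 as builder

RUN pip3.7 install poetry

# Only the manifests are needed to resolve dependencies; copying them first
# preserves the layer-caching behaviour of the original requirements.txt step.
COPY pyproject.toml poetry.lock /tmp/
RUN cd /tmp && poetry export -f requirements.txt --output requirements.txt --without-hashes

RUN pip3.7 wheel --trusted-host pypi.example.com \
    --wheel-dir=/tmp/python-wheels --index-url http://pypi.example.com/simple/ \
    -r /tmp/requirements.txt
```

`--without-hashes` is only needed if the private index serves artifacts whose hashes differ from the lock file; drop it if hash-checking works in your setup.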

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Reactions: 8
  • Comments: 13 (2 by maintainers)

Top GitHub Comments

17 reactions
foosinn commented, Sep 7, 2022

Hey,

I found this solution for me to work best:

FROM python:3.9-alpine AS builder
WORKDIR /app
ADD pyproject.toml poetry.lock /app/

RUN apk add build-base libffi-dev
RUN pip install poetry
RUN poetry config virtualenvs.in-project true
# --no-root: only the manifests are present at this point, so install
# dependencies without trying to install the project itself
RUN poetry install --no-ansi --no-root

# ---

FROM python:3.9-alpine
WORKDIR /app

COPY --from=builder /app /app
ADD . /app

RUN addgroup -g 1000 app
RUN adduser -h /app -u 1000 -G app -D -H app
USER 1000

# change this to match your application
CMD /app/.venv/bin/python -m module_name
# or
CMD /app/.venv/bin/python app.py

Don’t forget a .dockerignore:

.git/
__pycache__/
**/__pycache__/
*.py[cod]
*$py.class

Ticks all my boxes:

  • no need for a requirements file
  • the virtualenv is managed by poetry
  • no poetry in the final image
  • application and venv contained in one folder
  • the python application cannot write to its own files or the virtualenv
  • the virtualenv is only rebuilt if pyproject.toml or poetry.lock change

Just make sure to use the same path in the builder and the final image, virtualenv uses some hardcoded paths. Change the CMD to match your application.

EDIT: Updated to reflect some of @alexpovel’s criticisms

13 reactions
dpraul commented, Aug 2, 2019

We’ve had relatively good success copying virtualenvs between images with pip. I’m just beginning to see if we can transition to poetry; we’ve only used it for small utilities so far, but we’re using the following code to do what you’re describing:

FROM python:3.7.4-slim as python-base
ENV PIP_NO_CACHE_DIR=off \
    PIP_DISABLE_PIP_VERSION_CHECK=on \
    PIP_DEFAULT_TIMEOUT=100 \
    POETRY_PATH=/opt/poetry \
    VENV_PATH=/opt/venv \
    POETRY_VERSION=0.12.17
ENV PATH="$POETRY_PATH/bin:$VENV_PATH/bin:$PATH"

FROM python-base as poetry
RUN apt-get update \
    && apt-get install --no-install-recommends -y \
        # deps for installing poetry
        curl \
        # deps for building python deps
        build-essential \
    \
    # install poetry - uses $POETRY_VERSION internally
    && curl -sSL https://raw.githubusercontent.com/sdispater/poetry/master/get-poetry.py | python \
    && mv /root/.poetry $POETRY_PATH \
    && poetry --version \
    \
    # configure poetry & make a virtualenv ahead of time since we only need one
    && python -m venv $VENV_PATH \
    && poetry config settings.virtualenvs.create false \
    \
    # cleanup
    && rm -rf /var/lib/apt/lists/*

COPY poetry.lock pyproject.toml ./
RUN poetry install --no-interaction --no-ansi -vvv

FROM python-base as runtime
WORKDIR /app

COPY --from=poetry $VENV_PATH $VENV_PATH
COPY . ./

ENTRYPOINT ["python", "-m", "app"]

Haven’t figured out a clean way to handle prod vs. dev dependencies yet, though.
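For the prod/dev split, one hedged option (assuming a Poetry version that supports the flag; newer releases replace it with `--only main`) is to skip dev-dependencies inside the image while a plain `poetry install` locally still gets them:

```dockerfile
# In the `poetry` stage above, install runtime dependencies only.
# Dev dependencies (linters, test runners) never enter the image;
# developers still get them from a plain `poetry install` on their machines.
RUN poetry install --no-interaction --no-ansi --no-dev
```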

