[FR] reduce docker image size

Willingness to contribute

Yes. I would be willing to contribute this feature with guidance from the MLflow community.

Proposal Summary

The command mlflow models build-docker -m "runs:/$(RUN_ID)/sklearn-model/" -n "my-image-name" --env-manager virtualenv generates a Docker image.

The image is about 3 GB, which is a pain point when running such images in the cloud at large scale. With --env-manager conda, the size is 3.4 GB, so the difference between the two env managers is small. We would expect a much smaller image with virtualenv.

Motivation

What is the use case for this feature?

Why is this use case valuable to support for MLflow users in general?

Why is this use case valuable to support for your project(s) or organization?

It is easier to make the case for MLflow as a model-serving solution if the Docker image is small.

Why is it currently difficult to achieve this use case?

Even without conda, the image size is still big.

Details

Optimize the temporary dockerfile.

Some options:

  • reduce the number of layers by aggregating the RUN commands
  • remove temporary files with rm -rf /var/lib/apt/lists/* after the last install and update
  • use a more compact base image?
  • is Java needed to serve a model?
  • run apt-get clean?
  • use a starting image with the appropriate Python version already installed?
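
To illustrate the first option, here is a minimal Python sketch that collapses adjacent RUN instructions in Dockerfile text into a single RUN joined with &&. The function name merge_consecutive_runs and the sample Dockerfile are hypothetical and are not part of MLflow. The sketch only handles single-line, shell-form RUN instructions. Note that merging only saves space when files are created and deleted within the same RUN: deleting files in a later layer does not shrink the layers below it.

```python
def merge_consecutive_runs(dockerfile_text: str) -> str:
    """Collapse each streak of adjacent single-line RUN instructions into one RUN.

    Assumption: every RUN is shell-form and fits on one line (no trailing
    backslash continuations, no exec-form RUN ["..."]).
    """
    merged = []   # output Dockerfile lines
    pending = []  # shell commands collected from the current streak of RUN lines

    def flush():
        # Emit the collected commands as a single RUN, one command per line.
        if pending:
            merged.append("RUN " + " \\\n && ".join(pending))
            pending.clear()

    for line in dockerfile_text.splitlines():
        stripped = line.strip()
        if stripped.upper().startswith("RUN "):
            pending.append(stripped[4:].strip())
        else:
            flush()
            if stripped:  # drop blank lines, keep every other instruction
                merged.append(stripped)
    flush()
    return "\n".join(merged)


# Hypothetical input, loosely modelled on the first steps of the build log.
sample = """\
FROM ubuntu:18.04
RUN apt-get -y update
RUN apt-get install -y --no-install-recommends wget curl
RUN rm -rf /var/lib/apt/lists/*
ENV PYENV_ROOT="/root/.pyenv"
"""

print(merge_consecutive_runs(sample))
```

In the merged output, the apt cache is removed in the same layer that created it, which is the case where aggregation actually reduces image size.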

Current docker steps:

Step 1/28 : FROM ubuntu:18.04
 ---> c6ad7e71ba7d
Step 2/28 : RUN apt-get -y update
 ---> Using cache
 ---> 465ef70c5320
Step 3/28 : RUN apt-get install -y --no-install-recommends          wget          curl          nginx          ca-certificates          bzip2          build-essential          cmake          openjdk-8-jdk          git-core          maven     && rm -rf /var/lib/apt/lists/*
 ---> Using cache
 ---> bde17bfdce18
Step 4/28 : RUN apt -y update
 ---> Using cache
 ---> c9e7b69d3116
Step 5/28 : RUN DEBIAN_FRONTEND=noninteractive TZ=Etc/UTC apt-get -y install tzdata
 ---> Using cache
 ---> 137c0be7b82a
Step 6/28 : RUN apt-get install -y     libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm     libncursesw5-dev xz-utils tk-dev libxml2-dev libxmlsec1-dev libffi-dev liblzma-dev
 ---> Using cache
 ---> ac75a70a2d1a
Step 7/28 : RUN git clone     --depth 1     --branch $(git ls-remote --tags https://github.com/pyenv/pyenv.git | grep -o -E 'v[1-9]+(\.[1-9]+)+$' | tail -1)     https://github.com/pyenv/pyenv.git /root/.pyenv
 ---> Using cache
 ---> 0d03bcce2efa
Step 8/28 : ENV PYENV_ROOT="/root/.pyenv"
 ---> Using cache
 ---> 92fb9793c77b
Step 9/28 : ENV PATH="$PYENV_ROOT/bin:$PATH"
 ---> Using cache
 ---> 8642a6c16205
Step 10/28 : RUN apt install -y python3.7
 ---> Using cache
 ---> d588276163ce
Step 11/28 : RUN ln -s -f $(which python3.7) /usr/bin/python
 ---> Using cache
 ---> 53ca1893b359
Step 12/28 : RUN wget https://bootstrap.pypa.io/get-pip.py -O /tmp/get-pip.py
 ---> Using cache
 ---> 1fe30bf7f007
Step 13/28 : RUN python /tmp/get-pip.py
 ---> Using cache
 ---> a05fff5fc7a5
Step 14/28 : RUN pip install virtualenv
 ---> Using cache
 ---> 8bbf4438031a
Step 15/28 : ENV JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
 ---> Using cache
 ---> 6df68b7bba6b
Step 16/28 : ENV GUNICORN_CMD_ARGS="--timeout 60 -k gevent"
 ---> Using cache
 ---> 3d3742873a5b
Step 17/28 : WORKDIR /opt/mlflow
 ---> Using cache
 ---> e88d7c3168ae
Step 18/28 : RUN pip install mlflow==1.26.0
 ---> Using cache
 ---> dc429cc74965
Step 19/28 : RUN mvn  --batch-mode dependency:copy -Dartifact=org.mlflow:mlflow-scoring:1.26.0:pom -DoutputDirectory=/opt/java
 ---> Using cache
 ---> 44d0a3a04830
Step 20/28 : RUN mvn  --batch-mode dependency:copy -Dartifact=org.mlflow:mlflow-scoring:1.26.0:jar -DoutputDirectory=/opt/java/jars
 ---> Using cache
 ---> 2435b92cee67
Step 21/28 : RUN cp /opt/java/mlflow-scoring-1.26.0.pom /opt/java/pom.xml
 ---> Using cache
 ---> 0928aa6feb91
Step 22/28 : RUN cd /opt/java && mvn --batch-mode dependency:copy-dependencies -DoutputDirectory=/opt/java/jars
 ---> Using cache
 ---> ad8a2bede9fd
Step 23/28 : COPY model_dir/ /opt/ml/model
 ---> Using cache
 ---> 5e2d64709562
Step 24/28 : RUN python -c                 'from mlflow.models.container import _install_pyfunc_deps;                _install_pyfunc_deps(                    "/opt/ml/model",                     install_mlflow=False,                     enable_mlserver=False,                     env_manager="virtualenv")'
 ---> Using cache
 ---> 8f9d56ff5803
Step 25/28 : ENV MLFLOW_DISABLE_ENV_CREATION="true"
 ---> Using cache
 ---> 2d0523944c25
Step 26/28 : ENV ENABLE_MLSERVER=False
 ---> Using cache
 ---> d061c9805c17
Step 27/28 : RUN chmod o+rwX /opt/mlflow/
 ---> Using cache
 ---> 4ef8e53d018a
Step 28/28 : ENTRYPOINT ["python", "-c", "from mlflow.models import container as C;C._serve('virtualenv')"]
 ---> Using cache
 ---> 47d5dd9a9f94
Successfully built 47d5dd9a9f94
Successfully tagged my-image-name-venv:latest
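
As a rough aid for deciding which steps to aggregate, the following Python sketch tallies the instruction types in a classic docker build log of the "Step N/M : INSTRUCTION ..." form shown above. The name count_instructions and the shortened sample log are hypothetical. The sketch counts instructions only and says nothing about how large each layer is.

```python
import re
from collections import Counter

# Matches lines such as "Step 3/28 : RUN apt-get install ..." and captures
# the instruction keyword (FROM, RUN, ENV, ...).
STEP_RE = re.compile(r"^Step \d+/\d+ : (\w+)")


def count_instructions(build_log: str) -> Counter:
    """Count how often each Dockerfile instruction appears in a build log."""
    counts = Counter()
    for line in build_log.splitlines():
        match = STEP_RE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


# Shortened, hypothetical excerpt in the same format as the log above.
sample_log = """\
Step 1/28 : FROM ubuntu:18.04
 ---> c6ad7e71ba7d
Step 2/28 : RUN apt-get -y update
 ---> Using cache
Step 4/28 : RUN apt -y update
 ---> Using cache
Step 8/28 : ENV PYENV_ROOT="/root/.pyenv"
"""

for instruction, count in count_instructions(sample_log).most_common():
    print(instruction, count)
```

A high RUN count suggests candidates for aggregation, but the per-layer sizes still need to be measured on the built image.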

What component(s) does this bug affect?

  • area/artifacts: Artifact stores and artifact logging
  • area/build: Build and test infrastructure for MLflow
  • area/docs: MLflow documentation pages
  • area/examples: Example code
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • area/models: MLmodel format, model serialization/deserialization, flavors
  • area/projects: MLproject format, project running backends
  • area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • area/server-infra: MLflow Tracking server backend
  • area/tracking: Tracking Service, tracking client APIs, autologging

What interface(s) does this bug affect?

  • area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • area/docker: Docker use across MLflow’s components, such as MLflow Projects and MLflow Models
  • area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
  • area/windows: Windows support

What language(s) does this bug affect?

  • language/r: R APIs and clients
  • language/java: Java APIs and clients
  • language/new: Proposals for new client languages

What integration(s) does this bug affect?

  • integrations/azure: Azure and Azure ML integrations
  • integrations/sagemaker: SageMaker integrations
  • integrations/databricks: Databricks integrations

Issue Analytics

  • State: open
  • Created a year ago
  • Comments: 5 (1 by maintainers)

Top GitHub Comments

3 reactions
rafaelvp-db commented, Jun 10, 2022

@BenWilson2 @sebastien-genete I’d be willing to help here as well. Multi-stage builds could also be an option - I will run a couple of tests and report back.

0 reactions
sebastien-genete commented, Jul 19, 2022

I no longer have time to spend on this topic. Maybe someone else can try to solve this issue.
