Proposal PIN: Build Serializer and Environment Separation

Status

Proposed

Context

Currently, environments contain all of their own build and execution logic, which creates friction when customizing them. For example, the current Kubernetes-based environments all implement the same build function, which takes specifications for building a Docker image that is used to store and retrieve the serialized flow. This leads to confusion and lock-in for the environments.

Moving the build step out of the execution environment itself will make it easier for users to supply their own Docker image.

Decision

Break out the build aspect of an environment into its own component that is passed to the environment and becomes one of its attributes.

An example environment definition would now look something like:

from prefect import Flow
from prefect.environments.kubernetes import DaskOnKubernetesEnvironment
from prefect.environments.build_serializers import DockerBuildSerializer

build_serializer = DockerBuildSerializer(registry_url="url", custom_dockerfile="dockerfile_info")
env = DaskOnKubernetesEnvironment(build_serializer=build_serializer)

with Flow("my flow", env=env) as f:
    ...  # flow tasks here

f.deploy(project_id="id")

The build serializer would contain the information for building, serializing, and storing the Flow. When flow.serialize(build=True) is called, it takes the environment and calls the build serializer’s build function, which returns the relevant information needed to retrieve the flow. That information is then serialized and populated in the environment’s metadata.
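The build step described above can be illustrated with a minimal, self-contained sketch. It does not use Prefect; every class and function name here (BuildSerializer, FakeDockerBuildSerializer, Environment, serialize) is a hypothetical stand-in, and the "build" only computes metadata rather than building an image.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


class BuildSerializer:
    """Hypothetical base class: knows how to build and store a flow."""

    def build(self, flow_name: str) -> Dict[str, Any]:
        raise NotImplementedError


@dataclass
class FakeDockerBuildSerializer(BuildSerializer):
    """Stand-in for a Docker build serializer; it builds nothing."""

    registry_url: str
    image_tag: str = "latest"

    def build(self, flow_name: str) -> Dict[str, Any]:
        # A real implementation would build and push an image here.
        # This sketch only returns the information needed to find it later.
        return {
            "registry_url": self.registry_url,
            "image_name": flow_name.replace(" ", "-"),
            "image_tag": self.image_tag,
        }


@dataclass
class Environment:
    """Hypothetical environment that holds a build serializer."""

    build_serializer: BuildSerializer
    metadata: Dict[str, Any] = field(default_factory=dict)


def serialize(flow_name: str, env: Environment, build: bool = False) -> Dict[str, Any]:
    """Mimics flow.serialize(build=True) as described in the proposal."""
    if build:
        # The build result is stored in the environment's metadata.
        env.metadata = env.build_serializer.build(flow_name)
    return {"name": flow_name, "environment": {"metadata": dict(env.metadata)}}


env = Environment(
    build_serializer=FakeDockerBuildSerializer(registry_url="registry.example.com")
)
serialized = serialize("my flow", env, build=True)
```

In this sketch the environment never needs to know how the flow was built; it only sees the metadata that the build serializer returned.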

In the context of a Docker build serializer with a k8s-related environment, the build serializer will have values such as image_name, image_tag, registry_url, etc. When the k8s-related environment ingests that metadata on setup, it expects the build serializer metadata to contain the fields needed to create resources on k8s.

Consequences

The only immediate consequence is the need to change some of the setup and execute functionality of the environments.

REQUEST FOR COMMENTS HERE: The name Build Serializer makes sense from the serialization standpoint, but the actual build-related classes need a better name.

This is ongoing; I am going to update this over the next few hours. I am just organizing my thoughts for now.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

2 reactions
cicdw commented, Mar 28, 2019

I think for “context” you could also add a blurb about making the interface easier for users who want to provide their own Docker image? I think this separation of “serializing the flow into a known location” from “defining its execution environment” will make that much simpler.

0 reactions
jlowin commented, Apr 12, 2019

Yes, but let’s link to this issue in the new PIN as a record of other considerations.
