[Feature] Automatic documentation in Flyte


Motivation: Why do you think this is important? Flyte is a type-safe orchestration platform. It understands the flow of data and quickly becomes a one-stop shop for most teams' pipelining use cases. As users build a repository of workflows, tasks, and launch plans, it is essential to associate documentation with these entities. The documentation would help new team members ramp up quickly on the projects and individual entities. It also lays the foundation for improved discoverability and shareability. For example, a workflow or task name alone is not enough to describe the intent, but associated documentation could provide a detailed description and insight into the algorithms, business use case, etc.

This document proposes a simple and extensible way to support documentation for all Flyte entities. The documentation is surfaced in Flyte Console and is updated with each new version of an entity.

Goal: What should the final outcome look like, ideally?


@workflow(docs=Documentation(
                     short="This pipeline is used to target the right set of drivers for incentives",
                     long_file="/path/to/file",
                     long_format=Documentation.MARKDOWN,
                     source_code=Documentation.source_code_from_config(),
                     icon="http://....",
                     tags=["planning", "campaign"]
                     )
          )
class DriverTargetingWorkflow():
  ....


@workflow(docs=Documentation(
                     short="This pipeline is used to target the right set of drivers for incentives",
                     # source_code defaults to Documentation.source_code_from_config().
                     # long defaults to the class docstring, interpreted as rST.
                   )
          )
class DriverTargetingWorkflow():
  """
    My RST documentation
  """
  ....
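
The docstring default in the second example above could work roughly as follows. This is a minimal sketch, not flytekit's API: the `Documentation` dataclass, the `workflow` decorator, and the `_docs` attribute are hypothetical stand-ins that only mirror the names used in this proposal.

```python
import inspect
from dataclasses import dataclass, field, replace
from typing import List, Optional

# Hypothetical format constants mirroring the proposal (not a real flytekit API).
MARKDOWN = "markdown"
RST = "rst"


@dataclass(frozen=True)
class Documentation:
    short: str
    long: Optional[str] = None
    long_format: Optional[str] = None
    tags: List[str] = field(default_factory=list)


def workflow(docs: Documentation):
    """Attach docs to the decorated class, defaulting `long` to its docstring."""
    def decorator(cls):
        resolved = docs
        if docs.long is None:
            # Proposed default: use the class docstring and treat it as rST.
            resolved = replace(
                docs,
                long=inspect.cleandoc(cls.__doc__ or ""),
                long_format=RST,
            )
        cls._docs = resolved  # hypothetical attribute name
        return cls
    return decorator


@workflow(docs=Documentation(short="Targets the right set of drivers for incentives"))
class DriverTargetingWorkflow:
    """
    My RST documentation
    """
```

With this sketch, `DriverTargetingWorkflow._docs.long` holds the cleaned docstring and `long_format` is set to `RST`, while an explicitly supplied `long` value is left untouched.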


Config:
[docs]
  Repo: github.com/flyteorg/xyz
  project_icon: http://...
  tags: python, java, spark, dnn
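
A `[docs]` section like the one above can be read with Python's standard `configparser`. The sketch below assumes an INI-style file; the `load_docs_defaults` helper and the shape of the returned dictionary are illustrative choices, not part of the proposal.

```python
import configparser

# Sample config taken from the proposal; the elided icon URL is kept as-is.
CONFIG_TEXT = """\
[docs]
Repo: github.com/flyteorg/xyz
project_icon: http://...
tags: python, java, spark, dnn
"""


def load_docs_defaults(text: str) -> dict:
    """Parse the [docs] section into project-level documentation defaults."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["docs"]
    # configparser lower-cases option names, so "Repo" is read back as "repo".
    raw_tags = section.get("tags", "")
    return {
        "repo": section.get("repo"),
        "project_icon": section.get("project_icon"),
        "tags": [tag.strip() for tag in raw_tags.split(",") if tag.strip()],
    }


defaults = load_docs_defaults(CONFIG_TEXT)
```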

Describe alternatives you’ve considered: NA

Flyte component

  • Overall
  • Flyte Setup and Installation scripts
  • Flyte Documentation
  • Flyte communication (slack/email etc)
  • FlytePropeller
  • FlyteIDL (Flyte specification language)
  • Flytekit (Python SDK)
  • FlyteAdmin (Control Plane service)
  • FlytePlugins
  • DataCatalog
  • FlyteStdlib (common libraries)
  • FlyteConsole (UI)
  • Other

[Optional] Propose: Link/Inline Specification of the protobuf in FlyteIDL that will be added to all entities - Workflow, Tasks, LaunchPlans, Project

message SourceCode {
    // File where the code is located
    string file = 1;
    // Line number where the task definition, workflow definition, etc starts at
    int32 line_number = 2;
    // git repository
    string repo = 3;
    // branch of the repository
    string branch = 4;
    // link to the original repository
    string link = 5;
    // language of the code
    string language = 6;
}
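
The issue does not specify how `Documentation.source_code_from_config()` would collect the file and line number. One possible approach on the Python side is to read them from the function's code object, as sketched below. The `SourceCode` dataclass and the `source_code_of` helper are hypothetical and only mirror the message fields above.

```python
from dataclasses import dataclass


@dataclass
class SourceCode:
    # Fields mirror the proposed SourceCode message.
    file: str
    line_number: int
    repo: str = ""
    branch: str = ""
    link: str = ""
    language: str = "python"


def source_code_of(fn, repo: str = "", branch: str = "") -> SourceCode:
    """Build a SourceCode record from a function's code object (hypothetical helper)."""
    code = fn.__code__
    return SourceCode(
        file=code.co_filename,            # file where the function is defined
        line_number=code.co_firstlineno,  # line where the definition starts
        repo=repo,
        branch=branch,
    )


def target_drivers():
    return []


info = source_code_of(target_drivers, repo="github.com/flyteorg/xyz", branch="master")
```

The repository and branch cannot be discovered from the code object, so in this sketch they are passed in explicitly, for example from the `[docs]` config.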

message Documentation {
    // short description - no more than 256 characters
    string short = 1;
    // Optional information about the source code
    SourceCode info = 3;
    // Optional Tags for easy searching, categorizing etc
    repeated string tags = 4;
}

message LongDocumentation {
    // long description - no more than 4kb
    string long = 1;
    enum DescriptionFormat {
         UNKNOWN = 0;
         MARKDOWN = 1;
         HTML = 2;
         // default format for Python documentation comments (docstrings) is rST
         RST = 3;
    }
     // format of the long description
    DescriptionFormat long_format = 2;
    // Optional link to an icon for the entity
    string icon_link = 5;
}


Create*Request(
....
   Documentation docs = ...,
...
)

// We will add a special API to create and associate long form documentation
CreateDocs(
  Identifier id = 1;
  LongDocumentation docs = 2;
)

// The long documentation will be stored in the blob store, and a reference to it will be added to the metastore
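
The storage split described above (body in the blob store, reference in the metastore) might look like this in miniature. Both stores are plain dictionaries here, and `create_docs`, `get_docs`, and the tuple-shaped entity ID are assumptions made for illustration, not FlyteAdmin's actual implementation.

```python
import hashlib

# In-memory stand-ins for the real services (assumptions for this sketch).
blob_store = {}  # blob key -> raw bytes of the long documentation
metastore = {}   # entity id -> small record pointing at the blob


def create_docs(entity_id: tuple, long_text: str, long_format: str = "MARKDOWN") -> str:
    """Store the long documentation as a blob and record a reference to it."""
    payload = long_text.encode("utf-8")
    blob_key = "docs/" + hashlib.sha256(payload).hexdigest()
    blob_store[blob_key] = payload
    # Only the reference and the format live in the metastore, not the body.
    metastore[entity_id] = {"blob_key": blob_key, "long_format": long_format}
    return blob_key


def get_docs(entity_id: tuple) -> str:
    """Resolve the metastore reference and fetch the body from the blob store."""
    ref = metastore[entity_id]
    return blob_store[ref["blob_key"]].decode("utf-8")


# The entity id includes the version, so docs are versioned with the entity.
v1 = ("project", "domain", "DriverTargetingWorkflow", "v1")
v2 = ("project", "domain", "DriverTargetingWorkflow", "v2")
create_docs(v1, "# Driver targeting\nFirst draft.")
create_docs(v2, "# Driver targeting\nRevised algorithm notes.")
```

Keeping the body out of the metastore means the entity record stays small regardless of how long the documentation is.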

Additional context: The UI should be able to show this information. Also, the documentation is implicitly versioned with the entities themselves.

Is this a blocker for you to adopt Flyte? NA

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Reactions: 1
  • Comments: 17 (16 by maintainers)

Top GitHub Comments

1 reaction
kumare3 commented, Mar 12, 2021

@kumare3 that’s what the NamedEntity stuff already covers, right? +1 we should improve how we expose it to users to edit and update

@katrogan yup, but I think the git links, line no etc should be added to every taskspec?

1 reaction
honnix commented, Sep 30, 2020

This looks great and I am all for attaching doc to workflow. Just a minor concern that this could potentially easily make the payload a few times bigger. Will the size be enforced at the client side or server?
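
Whether the limits are enforced on the client or the server is left open in the thread. As one option, a client-side check against the limits stated in the proposal (256 characters for the short description, 4 KB for the long one) could be sketched as below. The `validate_docs` helper is hypothetical, and a server would likely still need to repeat the check.

```python
MAX_SHORT_CHARS = 256      # "no more than 256 characters" in the proposal
MAX_LONG_BYTES = 4 * 1024  # "no more than 4kb" in the proposal


def validate_docs(short: str, long: str = "") -> list:
    """Return a list of human-readable size violations (empty if the docs fit)."""
    errors = []
    if len(short) > MAX_SHORT_CHARS:
        errors.append(
            f"short description has {len(short)} characters, limit is {MAX_SHORT_CHARS}"
        )
    # Measure the encoded size, since that is what travels in the payload.
    long_size = len(long.encode("utf-8"))
    if long_size > MAX_LONG_BYTES:
        errors.append(
            f"long description is {long_size} bytes, limit is {MAX_LONG_BYTES}"
        )
    return errors


ok = validate_docs("Targets the right set of drivers", "Some details.")
too_big = validate_docs("s" * 257, "x" * 5000)
```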
