Would like a guide for How-To deploy Amundsen in production
Please add points on what you expect from such a guide in a comment below. I will then try to consolidate input and draft up an outline in this comment.
The guide can end up as /docs/deployment.md, or is /docs/owners_manual.md better?
Initial outline:
- Basic install of services (in different environments)
  - Docker-compose “vanilla”, but with Gunicorn (WIP #109), data in volumes etc.
  - AWS ECS. Original PR: https://github.com/lyft/amundsenfrontendlibrary/pull/216 (or EC2): https://github.com/lyft/amundsenfrontendlibrary/issues/186
  - Kubernetes helm chart install (convert from Compose using https://kompose.io?) (upcoming PR, see https://github.com/lyft/amundsen/issues/53#issuecomment-538575978 below)
- Setting up ingest (with or without Airflow, see https://github.com/lyft/amundsen/issues/53#issuecomment-617370073). Figure out which parts of this belong in Architecture.md and which in the Databuilder repo?
  - Compared to Quickstart ingest (https://github.com/lyft/amundsen/issues/75)
  - Then mention source by source: Extractor(s), Model, Metadata
    - Table Metadata:
    - Users
    - Table Usage: (_How it works and why in https://github.com/lyft/amundsen/issues/381#issuecomment-613387814_)
    - …
- Configuration
  - Custom build of frontend (to not have to maintain a fork we need to get https://github.com/lyft/amundsen/issues/408 transmogrified into proper documentation/tooling)
  - Small tweaks to turn features on/off, adding a logo etc. (mostly done): https://github.com/lyft/amundsenfrontendlibrary/commit/c256115f7d64da121de4ea36ea9c55592c11f9d5 in PR https://github.com/lyft/amundsenfrontendlibrary/pull/255
  - Config of email notification/feedback: done in PR https://github.com/lyft/amundsenfrontendlibrary/pull/291
  - Data preview (integration with Superset): https://github.com/lyft/amundsen/issues/27#issuecomment-517477074 has some draft contextual lead-in and reasoning and a link to an example setup. But ultimately what ticks the box for this is Tao's guide in https://github.com/lyft/amundsen/blob/master/docs/tutorials/data-preview-with-superset.md (or on the https://lyft.github.io/amundsen/ site, search for Superset!)
- Security
  - Auth: passwords etc.
  - Secure communication
  - Production-grade Docker as per "Production-ready Docker images" (via https://www.youtube.com/watch?v=cDzFm68aMao)
- Backup and restore: initial WiP in https://github.com/lyft/amundsen/issues/53#issuecomment-516159598 below … current result in https://github.com/lyft/amundsen/issues/381#issuecomment-614534794; restore (on K8s) implemented in https://github.com/lyft/amundsen/pull/394
- Monitoring (statsd etc.?)
- Handling upgrades
- …
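For the "Docker-compose vanilla, but with Gunicorn" item in the outline above, a Gunicorn settings file might look roughly like the sketch below. This is a minimal illustration under stated assumptions, not something taken from the Amundsen repos: the file name, the `GUNICORN_*` environment variable names, the port, and the default values are all hypothetical. Only the setting names (`bind`, `workers`, `timeout`, `accesslog`, `errorlog`) are standard Gunicorn settings.

```python
# gunicorn.conf.py (hypothetical file name; not part of the Amundsen repos)
import multiprocessing
import os

# Address the service listens on inside the container.
# The GUNICORN_* variable names and port 5000 are assumptions for illustration.
bind = os.environ.get("GUNICORN_BIND", "0.0.0.0:5000")

# Rule of thumb from the Gunicorn docs: (2 x CPU cores) + 1 workers.
workers = int(
    os.environ.get("GUNICORN_WORKERS", multiprocessing.cpu_count() * 2 + 1)
)

# Seconds a worker may stay silent before it is killed and restarted.
timeout = int(os.environ.get("GUNICORN_TIMEOUT", "60"))

# "-" sends access and error logs to stdout/stderr so Docker can collect them.
accesslog = "-"
errorlog = "-"
```

Each service container would then be started with `gunicorn -c gunicorn.conf.py <module>:<app>`, where the WSGI entry point depends on the service and is deliberately left unspecified here.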
I’m going to pick this up. I think this will be a nontrivial project, mostly in the form of soliciting feedback from the community. Part of the appeal of Amundsen is its flexibility: there’s no one right way to install it. However, for a guide to be broadly useful, I believe it needs concrete steps, so we’ll need to make some opinionated decisions.
Here’s how I’m planning on structuring this project:
If anyone has thoughts about this process, happy to hear.
There’s some question as to which docs should be in the top repo vs service repos. My only strong feeling is that there be a single top-level doc that one can follow and find everything they need. Procedurally, it’s much easier to make changes to the docs if they’re all in one repo, rather than scattered between them. And given that the individual components aren’t super useful when used independently, I default to just putting it into the larger repo. Open to feedback.
hey – we’ve packaged some of the learnings from this thread and other places into a recommended pathway https://medium.com/stemma/amundsen-deployment-best-practices-740a1800518e – would love anyone who’s worked through this stuff to try it out and give feedback, we’d like to eventually get this upstreamed into main repo once it’s better battle tested