[RFC] Automated Entity Creation from AWS
Status: Open for Comments
Need
Keeping Backstage entities up to date can be challenging. We attempt to reduce the likelihood of stale or inaccurate catalog data by automating its creation from its source of truth. For “resource” entities, this can be done by leveraging APIs from the cloud hosting provider; AWS in our case. Our organization has standardized tagging, which we can leverage to discover AWS resources and create them in Backstage’s catalog. We are looking for feedback from other organizations on whether there’s an interest in this contribution.
Proposal
AWS provides an API (Resource Groups Tagging API) which can be used to find all tagged (or previously tagged) resources in a given region: https://docs.aws.amazon.com/resourcegroupstagging/latest/APIReference/API_GetResources.html.
Resource Entities are fairly basic in terms of minimal requirements: https://backstage.io/docs/features/software-catalog/descriptor-format#kind-resource. They effectively require owner, type, and system.
The plugin would specify what AWS services are considered as resources. For example:
catalog:
  providers:
    aws:
      awsServicesAsReources:
        - "sns"
        - "rds"
        - "s3"
This list would be passed as the ResourceTypeFilters parameter of the GetResources request, so that the response includes only these types of AWS resources.
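As a rough sketch of how the configured service list could feed the request, the following pages through GetResources results. The field names (ResourceTypeFilters, ResourcesPerPage, PaginationToken, ResourceTagMappingList) follow the API reference linked above; `listTaggedResources` and the injected `getResources` function are hypothetical names used only for illustration, standing in for whatever AWS SDK client wrapper the provider would use.

```typescript
// Subset of the GetResources request/response shapes from the
// Resource Groups Tagging API reference.
interface GetResourcesInput {
  ResourceTypeFilters?: string[];
  ResourcesPerPage?: number;
  PaginationToken?: string;
}

interface ResourceTagMapping {
  ResourceARN: string;
  Tags?: { Key: string; Value: string }[];
}

interface GetResourcesOutput {
  PaginationToken?: string;
  ResourceTagMappingList?: ResourceTagMapping[];
}

// Hypothetical: any function that performs a single GetResources call,
// for example a thin wrapper around an AWS SDK client.
type GetResourcesFn = (input: GetResourcesInput) => Promise<GetResourcesOutput>;

// Collects every page of results for the configured services.
async function listTaggedResources(
  getResources: GetResourcesFn,
  services: string[],
): Promise<ResourceTagMapping[]> {
  const all: ResourceTagMapping[] = [];
  let token: string | undefined;
  do {
    const page = await getResources({
      ResourceTypeFilters: services, // e.g. ["sns", "rds", "s3"]
      ResourcesPerPage: 100,
      PaginationToken: token,
    });
    all.push(...(page.ResourceTagMappingList ?? []));
    // An empty or missing token means there are no more pages.
    token = page.PaginationToken || undefined;
  } while (token);
  return all;
}
```

Injecting the call as a function keeps the sketch independent of a particular SDK version and makes the paging logic easy to test with a fake.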
Another configuration could be:
catalog:
  providers:
    aws:
      awsTagForOwner: "org-team"
      awsTagForSystem: "org-system"
This would tell the entity provider to use the value of the tag org-team as the owner of the resource. A similar concept can be applied to the system. Type would be the name of the service in the ARN of the response. Additional optional tags can be added for metadata.description.
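The mapping described above could look roughly like the sketch below. It assumes the standard ARN layout (arn:partition:service:region:account-id:resource) and uses local stand-in types rather than Backstage's own. The helper names (`parseArn`, `toEntityName`, `toResourceEntity`), the name-sanitizing rule, and the choice to skip resources without an owner tag are illustrative assumptions, not part of the proposal.

```typescript
// One item of GetResources' ResourceTagMappingList (subset).
interface ResourceTagMapping {
  ResourceARN: string;
  Tags?: { Key: string; Value: string }[];
}

// Hypothetical provider options, named after the config keys proposed above.
interface AwsProviderConfig {
  awsTagForOwner: string;
  awsTagForSystem: string;
}

// Minimal local stand-in for a Backstage Resource entity.
interface ResourceEntity {
  apiVersion: 'backstage.io/v1alpha1';
  kind: 'Resource';
  metadata: { name: string };
  spec: { type: string; owner: string; system?: string };
}

// ARN layout: arn:partition:service:region:account-id:resource
// The resource part may itself contain colons (e.g. "db:mydb").
function parseArn(arn: string): { service: string; resource: string } {
  const parts = arn.split(':');
  if (parts.length < 6 || parts[0] !== 'arn') {
    throw new Error(`Not an ARN: ${arn}`);
  }
  return { service: parts[2], resource: parts.slice(5).join(':') };
}

// Assumption: collapse every non-alphanumeric run to "-" and cap the
// length at 63 characters so the result is a valid entity name.
function toEntityName(service: string, resource: string): string {
  return `${service}-${resource}`
    .replace(/[^a-zA-Z0-9]+/g, '-')
    .slice(0, 63)
    .replace(/^-+|-+$/g, '');
}

// Returns undefined when the owner tag is missing (an assumed policy).
function toResourceEntity(
  mapping: ResourceTagMapping,
  config: AwsProviderConfig,
): ResourceEntity | undefined {
  const tags: Record<string, string> = {};
  for (const tag of mapping.Tags ?? []) {
    tags[tag.Key] = tag.Value;
  }
  const owner = tags[config.awsTagForOwner];
  if (!owner) {
    return undefined;
  }
  const { service, resource } = parseArn(mapping.ResourceARN);
  return {
    apiVersion: 'backstage.io/v1alpha1',
    kind: 'Resource',
    metadata: { name: toEntityName(service, resource) },
    spec: { type: service, owner, system: tags[config.awsTagForSystem] },
  };
}
```

A real provider would likely also need to decide how to keep names unique across accounts and regions, which this sketch does not attempt.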
For every API call, all of the resources are constructed for a specific account and are applied as a full mutation. This handles the deletion of resources without resorting to more complicated methods, such as subscribing to events related to resource deletion.
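To illustrate why a full mutation takes care of deletions, here is a small sketch. The `Connection` and `FullMutation` types are local stand-ins that mirror the shape of the `applyMutation` call on Backstage's entity provider connection; `refreshAccount`, the `locationKey` format, and `InMemoryConnection` are hypothetical and exist only to demonstrate the replace-everything behavior.

```typescript
interface Entity {
  apiVersion: string;
  kind: string;
  metadata: { name: string };
  spec?: Record<string, unknown>;
}

// Local stand-ins mirroring the shape of Backstage's entity provider
// connection; a real provider would use Backstage's own types instead.
type FullMutation = {
  type: 'full';
  entities: { entity: Entity; locationKey?: string }[];
};

interface Connection {
  applyMutation(mutation: FullMutation): Promise<void>;
}

// Hypothetical refresh step: emit the complete, current set of entities
// discovered for one AWS account as a single full mutation.
async function refreshAccount(
  connection: Connection,
  accountId: string,
  entities: Entity[],
): Promise<void> {
  await connection.applyMutation({
    type: 'full',
    entities: entities.map(entity => ({
      entity,
      locationKey: `aws-provider:${accountId}`, // assumed key format
    })),
  });
}

// Toy connection showing the semantics: each full mutation replaces the
// previous set, so anything absent from the new set disappears.
class InMemoryConnection implements Connection {
  names: string[] = [];

  async applyMutation(mutation: FullMutation): Promise<void> {
    this.names = mutation.entities.map(e => e.entity.metadata.name);
  }
}
```

One caveat worth checking: as far as we understand the catalog, a full mutation replaces everything previously emitted by the same provider, so applying mutations per account would presumably require one provider instance (with a distinct name) per account.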
Dependencies
Currently, the default app-config.yaml only supports S3:
https://backstage.io/docs/integrations/aws-s3/locations
There is definitely a need for more generic AWS authentication, and the following RFC needs to be addressed:
https://github.com/backstage/backstage/issues/12844
We need more secure ways of handling AWS authentication, such as IRSA, as opposed to accessKeyId and secretAccessKey.
We have already done some work extending the ScmIntegration to support IRSA and multi-account configurations, but a standard way is needed.
Considerations
At first glance, the existing entity providers seem to load pre-configured entities from different locations. This potential contribution seems a bit different from those providers in the sense that it dynamically constructs the entities. If this RFC gets positive feedback and it is decided to move forward, should it be contributed to https://backstage.io/docs/integrations/?
Disclaimer: As new potential contributors, we are still familiarizing ourselves with the Backstage architecture.
Top GitHub Comments
Yea. It’s an interesting space. And it greatly depends upon the other tools and processes you have access to. If your organization has other tools in place which bubble up this information, it’s not necessary. If you are using K8s pretty much exclusively, there are some very powerful resource introspection tools available for such requests. If you are using hand-rolled ECS pipelines and custom CDK or legacy CloudFormation templates, maybe not so much.
Another thing at play is Conway’s law and culture. Different companies have different communication patterns and different ownership cultures in place. More mature Backstage instances also have teams that adopted the product when it was even younger than it is now, and have developed an internal culture which has grown with it. Also, when you are attempting to map the systems that a company has, it quickly uncovers the communication patterns of the company as well. This can be painful and uncomfortable if unhealthy patterns have not been addressed or discussed much before.
All this to “tee up” one of the things that is both Backstage’s greatest strength and its greatest hurdle: each company can make it what they want. It is extremely customizable, and this is a great tool, not a hindrance.
So I would say that yes, that is the team at Spotify’s viewpoint and opinion, and it is a knowledgeable and well-thought-out opinion which should be given weight. But on the flip side, it is also coming from a company whose culture is one which chose to spend considerable resources developing a whole platform for this. And this just isn’t every company’s story.
All that to say, take advice and leadership from those that came before, but also recognize that every company’s journey will be different. Every company has a different culture. And you can leverage the flexibility which Backstage provides to uniquely satisfy and serve your company.
Have you considered using Cloud Control API as a way to discover entities of many resource types? For example, you can do:
The list of supported resource types for Cloud Control API is here: https://docs.aws.amazon.com/cloudcontrolapi/latest/userguide/supported-resources.html
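As a rough illustration of this suggestion (not the snippet from the original comment), listing resource identifiers per type through Cloud Control might look like the sketch below. The request and response field names (TypeName, NextToken, ResourceDescriptions, Identifier, Properties) follow the Cloud Control ListResources API; `listIdentifiers` and the injected `listResources` function are hypothetical names standing in for an AWS SDK client call.

```typescript
// Subset of the Cloud Control ListResources request/response shapes.
interface ListResourcesInput {
  TypeName: string; // e.g. "AWS::SNS::Topic"
  NextToken?: string;
}

interface ListResourcesOutput {
  TypeName?: string;
  NextToken?: string;
  // Properties is a JSON string describing the resource.
  ResourceDescriptions?: { Identifier?: string; Properties?: string }[];
}

// Hypothetical: any function that performs a single ListResources call.
type ListResourcesFn = (input: ListResourcesInput) => Promise<ListResourcesOutput>;

// Returns the identifiers found for each requested resource type.
async function listIdentifiers(
  listResources: ListResourcesFn,
  typeNames: string[],
): Promise<Record<string, string[]>> {
  const result: Record<string, string[]> = {};
  for (const typeName of typeNames) {
    const ids: string[] = [];
    let token: string | undefined;
    do {
      const page = await listResources({ TypeName: typeName, NextToken: token });
      for (const description of page.ResourceDescriptions ?? []) {
        if (description.Identifier) {
          ids.push(description.Identifier);
        }
      }
      // A missing token means the last page has been reached.
      token = page.NextToken || undefined;
    } while (token);
    result[typeName] = ids;
  }
  return result;
}
```

Unlike the tagging API, this approach takes one call sequence per resource type, and tags would have to be read from each resource's properties where the type exposes them.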