Entity has a broken parent reference chain when using GitLab discovery
Expected Behavior
When using the GitLab discovery plugin, the refresh button on the component overview page should work, and the generated Locations should not have the orphan annotation.
Actual Behavior
We are using GitLab discovery to discover all our repositories. Entities are discovered and added to the catalog as expected. However, after a few minutes the refresh button on every component page stops working, with the following error:
{"error":{"name":"NotFoundError","message":"Entity component:default/app has a broken parent reference chain at location:default/generated-f6dc147878776f5631e8f704d1d96af8ce958462"},"request":{"method":"POST","url":"/refresh"},"response":{"statusCode":404}}
This also happens when starting Backstage with an empty database. The generated Location carries a ‘backstage.io/orphan’ annotation and shows the warning ‘This entity is not referenced by any location and is therefore not receiving updates’.
Steps to Reproduce
- Set up gitlab-discovery and run Backstage
- Entities are added to the catalog; during the first minute, Locations are not yet marked as orphaned and the refresh button still works
- Wait a few minutes
- Locations (type: url) are marked orphaned
- Refresh button stops working
Context
Discovery config
catalog:
  locations:
    - type: gitlab-discovery
      target: https://gitlab.company.com/blob/*/catalog-info.yaml
Resulting entities
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  annotations:
    backstage.io/managed-by-location: url:https://gitlab.company.com/tstgroup/app/-/blob/main/catalog-info.yaml
    backstage.io/managed-by-origin-location: gitlab-discovery:https://gitlab.company.com/blob/*/catalog-info.yaml
  name: app
---
apiVersion: backstage.io/v1alpha1
kind: Location
metadata:
  namespace: default
  annotations:
    backstage.io/managed-by-location: gitlab-discovery:https://gitlab.company.com/blob/*/catalog-info.yaml
    backstage.io/managed-by-origin-location: gitlab-discovery:https://gitlab.company.com/blob/*/catalog-info.yaml
    backstage.io/orphan: "true"
  name: generated-f6dc147878776f5631e8f704d1d96af8ce958462
relations: []
spec:
  type: url
  target: https://gitlab.company.com/tstgroup/app/-/blob/main/catalog-info.yaml
  presence: optional
---
apiVersion: backstage.io/v1alpha1
kind: Location
metadata:
  namespace: default
  annotations:
    backstage.io/managed-by-location: gitlab-discovery:https://gitlab.company.com/blob/*/catalog-info.yaml
    backstage.io/managed-by-origin-location: gitlab-discovery:https://gitlab.company.com/blob/*/catalog-info.yaml
  name: generated-86ff86809f940627ce1acae8a4ad92a0f1584ba6
relations: []
spec:
  type: gitlab-discovery
  target: https://gitlab.company.com/blob/*/catalog-info.yaml
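For anyone checking a larger catalog for the same symptom, the orphaned entities can be picked out by the annotation shown above. This is only a minimal sketch: `EntityLike`, `isOrphan` and `orphanRefs` are hypothetical names for illustration and are not part of the Backstage API. It also assumes a missing namespace means `default`, which matches the entity refs in the error message.

```typescript
// Hypothetical, trimmed-down entity shape; only the fields used here.
interface EntityLike {
  kind: string;
  metadata: {
    name: string;
    namespace?: string;
    annotations?: Record<string, string>;
  };
}

// Annotation name taken from the orphaned Location above.
const ORPHAN_ANNOTATION = "backstage.io/orphan";

// True when the entity carries the orphan annotation with value "true".
function isOrphan(entity: EntityLike): boolean {
  return entity.metadata.annotations?.[ORPHAN_ANNOTATION] === "true";
}

// Returns refs in the kind:namespace/name form used by the error message.
// Assumption: a missing namespace is treated as "default".
function orphanRefs(entities: EntityLike[]): string[] {
  return entities
    .filter(isOrphan)
    .map(e => `${e.kind.toLowerCase()}:${e.metadata.namespace ?? "default"}/${e.metadata.name}`);
}
```

Fed the three entities above, this would report only the `url` Location, not the Component or the `gitlab-discovery` Location.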
Stack trace
{
  "error": {
    "name": "NotFoundError",
    "message": "Entity component:default/app has a broken parent reference chain at location:default/generated-f6dc147878776f5631e8f704d1d96af8ce958462",
    "stack": "NotFoundError: Entity component:default/app has a broken parent reference chain at location:default/generated-f6dc147878776f5631e8f704d1d96af8ce958462
      at DefaultProcessingDatabase.listAncestors (/app/node_modules/@backstage/plugin-catalog-backend/dist/index.cjs.js:1359:15)
      at runMicrotasks (<anonymous>)
      at processTicksAndRejections (node:internal/process/task_queues:96:5)
      at async /app/node_modules/@backstage/plugin-catalog-backend/dist/index.cjs.js:2888:30
      at async options.database.transaction.doNotRejectOnRollback (/app/node_modules/@backstage/plugin-catalog-backend/dist/index.cjs.js:1388:18)"
  },
  "request": {
    "method": "POST",
    "url": "/refresh"
  },
  "response": {
    "statusCode": 404
  }
}
Your Environment
- Backstage version 1.0.3
- Issue occurs when running locally using sqlite and also on Kubernetes using PostgreSQL
- On-prem hosted GitLab
OS: Linux 5.10.102.1-microsoft-standard-WSL2 - linux/x64
node: v14.18.2
yarn: 1.22.10
cli: 0.16.0 (installed)
Dependencies:
@backstage/app-defaults 1.0.0
@backstage/backend-common 0.13.1
@backstage/backend-tasks 0.2.1
@backstage/catalog-client 1.0.0
@backstage/catalog-model 1.0.0
@backstage/cli-common 0.1.8
@backstage/cli 0.16.0
@backstage/config-loader 1.0.0
@backstage/config 1.0.0
@backstage/core-app-api 1.0.0
@backstage/core-components 0.9.2
@backstage/core-plugin-api 1.0.0
@backstage/errors 1.0.0
@backstage/integration-react 1.0.0
@backstage/integration 1.0.0
@backstage/plugin-allure 0.1.19
@backstage/plugin-api-docs 0.8.3
@backstage/plugin-app-backend 0.3.30
@backstage/plugin-auth-backend 0.12.3
@backstage/plugin-auth-node 0.1.6
@backstage/plugin-catalog-backend-module-gitlab 0.1.1
@backstage/plugin-catalog-backend-module-ldap 0.4.1
@backstage/plugin-catalog-backend 1.0.0
@backstage/plugin-catalog-common 1.0.0
@backstage/plugin-catalog-graph 0.2.15
@backstage/plugin-catalog-graphql 0.3.7
@backstage/plugin-catalog-import 0.8.6
@backstage/plugin-catalog-react 1.0.0
@backstage/plugin-catalog 1.0.0
@backstage/plugin-explore-react 0.0.15
@backstage/plugin-explore 0.3.34
@backstage/plugin-github-actions 0.5.3
@backstage/plugin-graphiql 0.2.35
@backstage/plugin-graphql-backend 0.1.20
@backstage/plugin-jenkins-backend 0.1.19
@backstage/plugin-jenkins-common 0.1.2
@backstage/plugin-jenkins 0.7.2
@backstage/plugin-org 0.5.3
@backstage/plugin-permission-common 0.5.3
@backstage/plugin-permission-node 0.5.5
@backstage/plugin-permission-react 0.3.4
@backstage/plugin-proxy-backend 0.2.24
@backstage/plugin-scaffolder-backend 1.0.0
@backstage/plugin-scaffolder-common 1.0.0
@backstage/plugin-scaffolder 1.0.1
@backstage/plugin-search-backend-module-elasticsearch 0.1.2
@backstage/plugin-search-backend-node 0.5.2
@backstage/plugin-search-backend 0.4.8
@backstage/plugin-search-common 0.3.2
@backstage/plugin-search 0.7.4
@backstage/plugin-shortcuts 0.2.4
@backstage/plugin-sonarqube 0.3.3
@backstage/plugin-tech-insights-backend-module-jsonfc 0.1.14
@backstage/plugin-tech-insights-backend 0.2.10
@backstage/plugin-tech-insights-common 0.2.4
@backstage/plugin-tech-insights-node 0.2.8
@backstage/plugin-tech-insights 0.1.13
@backstage/plugin-tech-radar 0.5.10
@backstage/plugin-techdocs-backend 1.0.0
@backstage/plugin-techdocs-node 1.0.0
@backstage/plugin-techdocs 1.0.1
@backstage/plugin-user-settings 0.4.2
@backstage/release-manifests 0.0.2
@backstage/search-common 0.3.2
@backstage/test-utils 1.0.0
@backstage/theme 0.2.15
@backstage/types 1.0.0
@backstage/version-bridge 1.0.0
Issue Analytics
- State:
- Created a year ago
- Reactions:6
- Comments:7 (6 by maintainers)
Top GitHub Comments
@manuelstein Yeah you are right, the way that that processor is written it sure looks like it will lead to orphans basically all the time. 🤔 That’s unfortunate. To be able to leverage the lastActivity mechanism, it might have fit better to write it in the form of an entity provider rather than a processor. Or, to store all of the previously emitted results in the cache as well, so we can re-emit known-good old locations again on the next round, keeping them non-orphaned.
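To make the second idea concrete, here is a rough sketch of what "re-emit known-good old locations" could look like. It is not the processor's actual code: `DiscoveryCache` and `targetsToEmit` are hypothetical names, and it assumes that each round only reports repositories with activity since the previous run.

```typescript
// Hypothetical cache contents: every target emitted on earlier rounds.
interface DiscoveryCache {
  targets: string[];
}

// changedTargets: the targets found this round. Assumption: only repositories
// with recent activity show up here, so the list is often partial or empty.
function targetsToEmit(cache: DiscoveryCache, changedTargets: string[]): string[] {
  for (const target of changedTargets) {
    if (cache.targets.indexOf(target) === -1) {
      cache.targets.push(target); // remember newly discovered targets
    }
  }
  // Emit the full known set every round, so that locations found on earlier
  // rounds keep a parent and are not marked as orphaned.
  return cache.targets.slice();
}
```

A caveat of this simple form is that a target is never dropped from the cache, so repositories that are deleted later would need separate handling.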
Either way, confusingly, things being orphaned is actually not the same thing as the parent reference chain being severed. Let’s see if I can explain it well enough.
Entities always ultimately originate from an entity provider, as in, they are rooted in one. When an entity provider emits an entity, it lands in the database along with a special “key” string. Specifically, we generate a kind of “bootstrapping” entry in the refresh_state_references table that has the source_key column set. This column is only set for things that are directly emitted by entity providers.
When the processing loop runs and an entity emits other entities, the source_entity_ref column is used instead.
So if you trace a lineage through the refresh_state_references table, you are expected to be able to track upwards and find a series of non-null source_entity_ref pointers to track, and then a final entry with non-null source_key. All other chain shapes are considered malformed. That’s what the error message signals has happened.
It may sound surprising that this guarantee is meant to be upheld. It seems to clash with the orphaning (doesn’t that sever links in a sense too?). But the reason it works out is, when entity providers delete entity “roots”, an eager deletion happens throughout this entire subtree of things that spawned from it unless there are other roots that point to any given such child. More info here https://backstage.io/docs/features/software-catalog/life-of-an-entity#implicit-deletion
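The chain rule described above can be sketched in a few lines. This is an illustration only, not the catalog backend's implementation: `ReferenceRow` and `traceToRoot` are hypothetical names, the `targetEntityRef` field is an assumption about how each row identifies its child, and the walk follows only the first parent it finds.

```typescript
// Hypothetical, simplified row shape modelled on the description above.
interface ReferenceRow {
  sourceKey?: string;       // set only for entities emitted directly by a provider
  sourceEntityRef?: string; // set for entities emitted by another entity
  targetEntityRef: string;  // assumption: the entity this row points at
}

// Walks upwards from entityRef until it reaches a row with a sourceKey.
// Returns the entity refs visited, starting with entityRef itself.
function traceToRoot(rows: ReferenceRow[], entityRef: string): string[] {
  const chain: string[] = [];
  let current = entityRef;
  for (;;) {
    if (chain.indexOf(current) !== -1) {
      throw new Error(`Entity ${entityRef} has a cyclic parent reference chain at ${current}`);
    }
    chain.push(current);
    const parents = rows.filter(r => r.targetEntityRef === current);
    if (parents.some(r => r.sourceKey !== undefined)) {
      return chain; // reached an entry rooted in an entity provider
    }
    const viaEntity = parents.filter(r => r.sourceEntityRef !== undefined);
    if (viaEntity.length === 0) {
      // Neither a source key nor a parent entity: the chain is malformed.
      throw new Error(`Entity ${entityRef} has a broken parent reference chain at ${current}`);
    }
    current = viaEntity[0].sourceEntityRef as string;
  }
}
```

With rows for a provider-rooted Location, a generated Location and a Component, the walk ends at the root. Removing the row that links the generated Location to its parent makes the same call throw at that Location, which is the shape of the error in this issue.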
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.