"Opbeans" stage of release pipeline fails
Recently in https://github.com/elastic/apm-agent-nodejs/issues/2625 we automated releases: when a version tag (“vN.N.N”) is pushed, a Jenkins “Release” stage will build and publish the Lambda layer, do a GitHub release, npm publish, and attempt to update opbeans-node.git to use this new APM agent release.
That “Opbeans” stage is flaky (or perhaps fails every time), as discussed here: https://github.com/elastic/apm-agent-nodejs/issues/2625#issuecomment-1137881801 This issue is about making the release process reliable by doing something about this stage.
The Opbeans stage effectively does this: https://github.com/elastic/apm-agent-nodejs/pull/2723#issuecomment-1137922102
Options
Option 1: npm publish early and hope
Do the ‘npm publish’ step earlier in the pipeline and hope that the lambda layer publishing steps take enough time that the Opbeans stage will work then.
I don’t love this idea because relying on “hope” means that it may fail sometime, just less frequently, which just means a more subtle bug. Also see the “timeout” discussion below.
Option 2: wait for npm install to work
Add a spin loop at the start of the Opbeans stage process to retry the npm install if it gets an ETARGET, with a timeout, to account for being run soon after a publish.
The “ETARGET” refers to the specific error you get from npm install when this issue happens:
[2022-05-25T21:23:57.820Z] + CI=true npm install --ignore-scripts elastic-apm-node@3.34.0
[2022-05-25T21:23:59.440Z] npm ERR! code ETARGET
[2022-05-25T21:23:59.440Z] npm ERR! notarget No matching version found for elastic-apm-node@3.34.0.
[2022-05-25T21:23:59.440Z] npm ERR! notarget In most cases you or one of your dependencies are requesting
[2022-05-25T21:23:59.440Z] npm ERR! notarget a package version that doesn't exist.
Theoretically this option would be straightforward to implement, but what should that timeout be? Granted the issue is old (from 2018) but user reports from https://github.com/npm/npm/issues/20574 suggest that the time for all npm servers to update could be an hour or more. That’s too long to have as a timeout in a release process.
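For illustration, the spin loop in Option 2 might look roughly like the bash sketch below. It is not the existing pipeline code: the function name install_with_retry and the RETRY_TIMEOUT_S / RETRY_INTERVAL_S variables are hypothetical, and the default timeout is an arbitrary placeholder, since the right value is exactly the open question above.

```shell
#!/usr/bin/env bash
# Sketch only: retry "npm install" while the registry still reports ETARGET.
# RETRY_TIMEOUT_S and RETRY_INTERVAL_S are hypothetical knobs, not part of
# the existing release pipeline.

# Usage: install_with_retry <package>@<version>
# Returns 0 on success, 1 on a non-ETARGET failure, 2 on timeout.
install_with_retry() {
  local spec="$1"
  local timeout_s="${RETRY_TIMEOUT_S:-600}"
  local interval_s="${RETRY_INTERVAL_S:-15}"
  local deadline=$(( SECONDS + timeout_s ))
  local output

  while true; do
    # Capture stdout and stderr so the npm error code can be inspected.
    if output="$(CI=true npm install --ignore-scripts "$spec" 2>&1)"; then
      printf '%s\n' "$output"
      return 0
    fi
    printf '%s\n' "$output" >&2

    # Only ETARGET ("No matching version found") is worth retrying;
    # any other failure is reported immediately.
    if ! grep -q 'ETARGET' <<<"$output"; then
      return 1
    fi

    if (( SECONDS >= deadline )); then
      echo "Gave up waiting for $spec after ${timeout_s}s" >&2
      return 2
    fi
    sleep "$interval_s"
  done
}
```

The Opbeans stage could then call something like install_with_retry "elastic-apm-node@3.34.0" in place of the bare npm install. Returning a distinct exit code on timeout lets the pipeline tell "registry not caught up yet" apart from a genuine install failure.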
Option 3: use dependabot to update opbeans
Configure dependabot to look for an agent update daily.
Some issues with this:
- The current “bump-version.sh” script also updates a label in the repo’s Dockerfile, which dependabot will not update. So either we drop using that label, or an option would be to have a separate lint GitHub check that fails the dependabot PR until it is manually updated to tweak the Dockerfile as well. This is pretty indirect and laborious.
- There is no way to have this process create a git tag on the opbeans repo, which the current process does. I am not sure those git tags are being used. They do result in tagged builds of the opbeans Docker image (see https://hub.docker.com/r/opbeans/opbeans-node/tags). However, I’m not sure if anyone uses anything but the “latest” of those Docker images.
Option 4: use a Jenkins pipeline in the opbeans repo
Add a stage to the Jenkinsfile in the opbeans repo(s) on a cron(@daily) to look for a new agent version, then do the update, commit, and tag.
I don’t see any issues with this approach other than:
- It means that a new opbeans update (and Docker image build) will take up to a day after an agent release.
- It will take some dev effort to make this work.
This is my current preferred option.
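As a rough sketch of the “look for a new agent version” step, a script in the opbeans repo could compare the version in package.json with the latest version published on npm. The function names below are hypothetical, and the sketch assumes the agent is listed under "dependencies" in package.json. It also treats any difference as an update rather than doing a real semver comparison.

```shell
#!/usr/bin/env bash
# Sketch only: report a newer elastic-apm-node version, if one is published.
# Function names are hypothetical; run from the root of an opbeans repo.

# Print the agent version currently declared in package.json,
# dropping a leading "^" or "~" range operator if present.
current_agent_version() {
  local v
  v="$(node -p "require('./package.json').dependencies['elastic-apm-node']")" || return 1
  v="${v#[~^]}"
  printf '%s\n' "$v"
}

# Print the latest agent version known to the npm registry.
latest_agent_version() {
  npm view elastic-apm-node version
}

# Print the latest version only when it differs from the current one;
# print nothing when the repo is already up to date.
avail_agent_update_ver() {
  local current latest
  current="$(current_agent_version)" || return 1
  latest="$(latest_agent_version)" || return 1
  if [[ "$current" != "$latest" ]]; then
    printf '%s\n' "$latest"
  fi
}
```

A daily Jenkins stage could run avail_agent_update_ver and, only if it prints a version, pass that version to the existing bump-version.sh script. Because the check runs long after the npm publish, the ETARGET race from the release pipeline does not arise.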
@elastic/observablt-robots @astorm Thoughts?
Top GitHub Comments
@cachedout Thanks and understood. I’ll take a stab at it and get review from y’all.
As a sanity check, my plan is to add an optional stage('Update Agent Dep') { to opbeansPipeline here: https://github.com/elastic/apm-pipeline-library/blob/main/vars/opbeansPipeline.groovy#L193 that will handle updating the APM agent dep if there is a new one available. It will be off by default so the opbeans-FOO.git repos that are using opbeansPipeline() can opt into it. It will expect a new .ci/avail-agent-update-ver.sh script (beside the existing .ci/bump-version.sh script) in each opbeans repo that will use it. Please let me know if this sounds crazy. 😃

I’d personally prefer Option 4 as well. Opbeans is not a public artifact that is tied to this repository. It should not influence our ability to execute the release of the agent IMO. Moving the opbeans update completely out of band seems appropriate.