
Optimise upload of deployment artifacts


Use case description

Side note: related to https://github.com/serverless/serverless/issues/8499

Currently, on service deployment, we generate and upload all artifacts to the Serverless deployment bucket, even if they are unchanged since the previous deployment.

This is highly inefficient, as in many cases the upload and the related resource updates take a significant part of the deploy process (I haven't investigated how AWS treats the case where the same zip file, with the same hash, is provided for a lambda from a different location (a different URI), but I assume it is still unconditionally treated as a code update).

While, to keep the process stateless locally, we need to generate artifacts for all resources on each service deployment, once that is done we can compare their hashes against the already-deployed artifacts and, on that basis, avoid unnecessary uploads.
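
As a building block for that comparison, here is a minimal sketch of computing a stable content hash for a locally generated artifact (the helper name and path are hypothetical; this is not Framework code):

```js
'use strict';

const crypto = require('crypto');
const fs = require('fs');

// Stream the packaged artifact through an md5 hash so large zips
// are never held in memory at once.
function resolveArtifactHash(artifactPath) {
  return new Promise((resolve, reject) => {
    const hash = crypto.createHash('md5');
    fs.createReadStream(artifactPath)
      .on('data', (chunk) => hash.update(chunk))
      .on('error', reject)
      .on('end', () => resolve(hash.digest('hex')));
  });
}

// Usage: resolveArtifactHash('.serverless/my-service.zip').then(console.log);
```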

Proposed solution

Note: this is based on the implementation idea presented in @remi00's PR, which appears to let us introduce this improvement transparently (without additional flags or breaking changes). It additionally ensures that when sls package and sls deploy --package are run as separate steps, we do not accidentally produce an erroneous deploy.

Change the location where artifacts are stored in the S3 bucket to a common folder shared by all deployments, with artifacts named after their md5 hash. That makes it easy to confirm whether a given artifact has already been uploaded.

I propose to store them in the <deployment-prefix>/<service>/<stage>/code-artifacts folder.
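
To illustrate the scheme, a hedged sketch of deriving the shared key and checking for an existing upload; the key layout follows the proposal above, while the AWS SDK v2 calls and helper names are assumptions, not the actual implementation:

```js
'use strict';

const AWS = require('aws-sdk'); // assumes AWS SDK v2, as used by the Framework at the time

const s3 = new AWS.S3();

// Build the content-addressed key shared by all deployments of a service/stage.
function codeArtifactKey(deploymentPrefix, service, stage, hash) {
  return `${deploymentPrefix}/${service}/${stage}/code-artifacts/${hash}.zip`;
}

// Returns true if an object with that key already exists, in which case
// the upload (and the related resource update) can be skipped.
async function isAlreadyUploaded(bucket, key) {
  try {
    await s3.headObject({ Bucket: bucket, Key: key }).promise();
    return true;
  } catch (error) {
    if (error.statusCode === 404) return false;
    throw error;
  }
}
```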

In packaging step:

  • When configuring the lambda artifact location in the CF template, internally resolve the hash for the given artifact and return the name dedicated for the S3 bucket. Additionally, store the resolved hash name in a map that is persisted in serverless-state.json (so that at the deployment step we do not need to extract the generated hash names from the generated CF template, which could be problematic)
  • Ensure that the hashes we calculate for lambda versioning rely on the same hashing logic, and that we do not calculate the hash for the same file twice (a sketch of both points follows this list)
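
A hedged sketch of those two points (memoizing the hash calculation, and the possible shape of the persisted map); the cache and property names are hypothetical:

```js
'use strict';

// Memoize hash computation per artifact path so the lambda-versioning
// logic and the S3 naming logic share a single calculation per file.
const hashCache = new Map();

function getArtifactHash(artifactPath) {
  if (!hashCache.has(artifactPath)) {
    // resolveArtifactHash is the streaming md5 helper sketched earlier
    hashCache.set(artifactPath, resolveArtifactHash(artifactPath));
  }
  return hashCache.get(artifactPath);
}

// The map persisted in serverless-state.json could then look like:
// {
//   "artifactsHashNamesMap": {
//     ".serverless/my-service.zip": "d41d8cd98f00b204e9800998ecf8427e"
//   }
// }
```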

In deployment step:

  • Resolve artifact S3 location paths from the hash map stored in the serverless-state.json file. For convenience, the hash map would ideally be assigned to serverless.getProvider('aws').artifactsHashNamesMap in the context of extendedValidate, where the actual serverless-state.json is read.
  • On old-versions cleanup, we should deduce from the CF templates of the versions that stay which code artifacts should remain in the S3 bucket, and on that basis remove everything found in the code-artifacts folder that is not referenced by the kept CF templates (sketched below).
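
A sketch of that cleanup pass, assuming the referenced keys have already been collected from the kept CF templates (function and parameter names are hypothetical):

```js
'use strict';

const AWS = require('aws-sdk');

const s3 = new AWS.S3();

// Delete every object under the code-artifacts prefix that is not referenced
// by the CF templates of the versions we keep.
async function cleanupCodeArtifacts(bucket, codeArtifactsPrefix, referencedKeys) {
  const kept = new Set(referencedKeys);
  let ContinuationToken;
  do {
    const page = await s3
      .listObjectsV2({ Bucket: bucket, Prefix: codeArtifactsPrefix, ContinuationToken })
      .promise();
    const stale = (page.Contents || [])
      .map(({ Key }) => Key)
      .filter((key) => !kept.has(key));
    if (stale.length) {
      // listObjectsV2 pages are capped at 1000 keys, which is also the
      // deleteObjects limit, so one delete call per page is sufficient.
      await s3
        .deleteObjects({ Bucket: bucket, Delete: { Objects: stale.map((Key) => ({ Key })) } })
        .promise();
    }
    ContinuationToken = page.NextContinuationToken;
  } while (ContinuationToken);
}
```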

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Reactions: 2
  • Comments: 49 (49 by maintainers)

Top GitHub Comments

2 reactions
medikoo commented, Aug 19, 2021

@remi00 I’ve updated the above spec to match what you’ve proposed in your PR. You’ve shown that we do not necessarily need to introduce the new hashing behind a flag, and that’s a big win.

Still, we also need to take into account scenarios where packaging and deployment are done as distinct steps; in such a case the service may have been packaged with an older version of the Framework, or the package artifacts may have been manipulated in the meantime (although the latter seems controversial). I believe that what I proposed will handle such scenarios appropriately.

@pgrzesik do you see any potential issues with the newly specified approach?

@mnapoli I’ve updated the spec towards the direction we’re currently aiming for

2 reactions
remi00 commented, Jul 28, 2021

I just realized that in configurations without serverless-webpack, ZIP artifacts created by pure serverless have their last-modified timestamps zeroed (@pgrzesik’s change from ca. half a year ago), and serverless-webpack is going to address this the same way with PR serverless-heaven/serverless-webpack#911. Therefore, my main concerns are gone and the solution will be much, much simpler, without changes to the hash calculation.
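
For context on why the timestamps matter: zip entries embed their last-modified time, so re-packaging byte-identical sources would otherwise yield a different archive (and a different hash) on every run. A minimal sketch of the idea using the archiver package (an illustration only, not the change referenced above):

```js
'use strict';

const archiver = require('archiver');
const fs = require('fs');

// Append every entry with a fixed date so identical inputs always
// produce a byte-identical, hash-stable zip.
function createDeterministicZip(files, outPath) {
  return new Promise((resolve, reject) => {
    const output = fs.createWriteStream(outPath);
    const archive = archiver('zip');
    output.on('close', resolve);
    archive.on('error', reject);
    archive.pipe(output);
    for (const { name, content } of files) {
      archive.append(content, { name, date: new Date(0) });
    }
    archive.finalize();
  });
}
```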


