Optimise upload of deployment artifacts
Use case description
_Related to https://github.com/serverless/serverless/issues/8499_
Currently, on service deployment, we generate and upload all artifacts to the Serverless deployment bucket, even if they are unchanged from the previous deployment.
This is highly inefficient, as in many cases the upload and the related resource updates take up a significant part of the deploy process. (I haven’t investigated how AWS treats the case where the same zip file, with the same hash, is provided for a lambda from a different location (different URI), but I guess it’s still treated as a code update unconditionally.)
While, to maintain the stateless nature (locally), we still need to generate artifacts for all resources on each service deployment, once that is done we can compare them against the deployed ones to check whether their hash changed, and on that basis avoid unnecessary uploads.
Proposed solution
Note: This is based on the implementation idea presented in @remi00’s PR, which seems to give us the means to introduce this improvement transparently (without additional flags or breaking changes). It additionally ensures that when `sls package` and `sls deploy --package` are run as separate steps, we do not accidentally produce an erroneous deploy.
Change the location where artifacts are stored in the S3 bucket to a common folder holding artifacts from all deployments, named after their md5 hash. That will make it easy to confirm whether a given artifact is already uploaded.
I propose to store them in a `<deployment-prefix>/<service>/<stage>/code-artifacts` folder.
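The proposed key layout could look like this. A minimal sketch: the parameter names are illustrative (the framework would derive them from provider config), and the `.zip` suffix is an assumption.

```javascript
// Every artifact lands in a shared code-artifacts folder and is named
// after its md5 hash, so an existence check on the key tells us whether
// it was already uploaded by a previous deployment.
function codeArtifactKey(deploymentPrefix, service, stage, md5Hash) {
  return `${deploymentPrefix}/${service}/${stage}/code-artifacts/${md5Hash}.zip`;
}
```

Because the name is content-derived, two deployments of identical code resolve to the same key, which is what makes the "already uploaded?" check trivial.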
In the packaging step:

- When configuring the lambda artifact location in the CF template, internally resolve the hash for the given artifact and return a name dedicated to the S3 bucket. Additionally, store the resolved hash names in a map saved to `serverless-state.json` (so that at the deployment step we do not need to extract the generated hash names from the generated CF template, as that can be problematic).
- Ensure that the hashes we calculate for lambda versioning rely on the same hashing logic, and that we do not calculate the hash for the same file twice.
In the deployment step:

- Resolve artifact S3 location paths from the hash map stored in the `serverless-state.json` file. For convenience, ideally the hash map is assigned to `serverless.getProvider('aws').artifactsHashNamesMap` in the context of `extendedValidate`, where the actual `serverless-state.json` is read.
- On old versions cleanup, we should deduce from the CF templates of the versions that stay which code artifacts should remain in the S3 bucket, and on that basis remove everything found in the `code-artifacts` folder that is not referenced by the kept CF templates.
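The cleanup rule from the last bullet could be sketched like this. This is a simplified illustration: template traversal is reduced to scanning lambda `Code.S3Key` properties, and `staleArtifacts` is a hypothetical helper name.

```javascript
// Collect every S3 key still referenced by the CF templates of the kept
// versions; anything else in the code-artifacts folder is safe to delete.
function staleArtifacts(keptTemplates, codeArtifactKeys) {
  const referenced = new Set();
  for (const template of keptTemplates) {
    for (const resource of Object.values(template.Resources || {})) {
      const key =
        resource.Properties &&
        resource.Properties.Code &&
        resource.Properties.Code.S3Key;
      if (key) referenced.add(key);
    }
  }
  return codeArtifactKeys.filter((key) => !referenced.has(key));
}
```

Deriving the keep-set from the retained templates (rather than from deployment timestamps) means shared artifacts referenced by several versions are never removed prematurely.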
Issue Analytics

- Created 3 years ago
- Reactions: 2
- Comments: 49 (49 by maintainers)
@remi00 I’ve updated the above spec to match what you’ve proposed in your PR. You’ve shown that we do not necessarily need to introduce the new hashing behind a flag, and that’s a big win.
Still, we also need to take into account scenarios where packaging and deployment are done as distinct steps; in such a case the service can be packaged with an older version of the Framework, or some manipulation could have been done to the package artifacts in the meantime (although the latter seems controversial). I believe that what I proposed will handle such scenarios appropriately.
@pgrzesik do you see any potential issues with the newly specified approach?
@mnapoli I’ve updated the spec to reflect the direction we’re currently aiming for.
I just realized that in configurations without serverless-webpack, ZIP artifacts created with pure serverless have the last-modified timestamp zeroed (a change by @pgrzesik from ca. half a year ago), and serverless-webpack is going to address this issue the same way with PR serverless-heaven/serverless-webpack#911. Therefore, my main concerns are gone and the solution will be much, much simpler, without changes to hash calculation.