RFC: Mechanism for end2end testing
Is this related to an existing feature request or issue?
https://github.com/awslabs/aws-lambda-powertools-python/issues/1009
Which AWS Lambda Powertools utility does this relate to?
Other
Summary
Build a mechanism to run end-to-end tests on the Lambda Powertools library using real AWS services (Lambda, DynamoDB). Initially, tests can be run manually by maintainers on a specific branch/commit_id to ensure an expected feature works. Tests should be triggered in GitHub, but maintainers/contributors should also be able to run them in their local environment using their own AWS account.
Use case
Providing a mechanism to run end-to-end tests in a real, live environment allows us to discover classes of problems we cannot find otherwise by running unit or integration tests. For example, how the code base behaves in Lambda during cold and warm starts, event source misconfiguration, IAM permissions, etc. It also allows us to validate integration with external services (CloudWatch Logs, X-Ray) and ensure the final, real user experience is what we expect.
When it should be used
- Test a feature from the end user's perspective
- Test external integration with AWS services and applied policies
- Test event source configurations and/or combinations
- Test whether our documented IAM permissions work as expected
Examples
- Test if structured logs generated by the library are visible in Amazon CloudWatch Logs and have all necessary attributes
- Test if a generated trace is visible in AWS X-Ray and includes all provided metadata and annotations
- Test if a business metric generated by the library is visible in CloudWatch under the expected namespace and with the expected value
When integration test may be more appropriate instead
Integration testing is a better fit when we can increase confidence by covering code base -> AWS service(s). These tests give us a faster feedback loop while reducing the permutations of E2E tests we might need to cover the end user perspective, permissions, etc.
Examples
- Test if a pure Python function is idempotent and subsequent calls return the same value
- Test whether Feature Flags can fetch schema
- Test whether Parameters utility can fetch values from SSM, Secrets Manager, AppConfig, DynamoDB
Proposal
Overview
- Use CDK SDK (lib) to describe infrastructure: currently lambda + powertools layer
- Run tests in parallel and separate them by feature directory e.g. metrics/, tracer/
- Every feature group has a separate infrastructure deployed e.g., metrics stack, tracer stack
- Enable running them from GitHub Actions and from local machine on specified AWS Account
- Clean up all resources at the end of the test
What an E2E test would look like
More details in the "What's in a test" section.

Details
GitHub configuration
- Integrate GitHub with an AWS account using OIDC: https://docs.github.com/en/actions/deployment/security-hardening-your-deployments/configuring-openid-connect-in-amazon-web-services
- Specify end-to-end test commands in GitHub Actions
Test setup
Tests will follow a common directory structure to allow us to parallelize infrastructure creation and test execution for all feature groups. Tests within a feature group, say tracer/, are run sequentially and independently from other feature groups.
Test fixtures will provide the necessary infrastructure and any relevant information tests need to succeed, e.g. the Lambda function ARN. Helper methods will also be provided to hide integration details and ease test creation.
Once no tests remain, infrastructure resources are automatically cleaned up, and results are collected and returned to the user.
Directory Structure
tests/e2e
├── conftest.py
├── logger
│   ├── handlers
│   └── test_logger.py
├── metrics
│   ├── handlers
│   └── test_metrics.py
├── tracer
│   ├── handlers
│   └── test_tracer.py
└── utils
    ├── helpers.py
    └── infrastructure.py
Explanation
- We keep our end-to-end tests under the tests/e2e directory.
- We split tests into groups matching different Powertools features - in this example we have 3 groups (logger, metrics, and tracer).
- Our test mechanism parallelizes test execution by looking at those groups.
- The utils directory has utilities to simplify writing tests, and an infrastructure module used for deploying infrastructure.
Note: In the first phase we may reuse one infrastructure helper class across all test groups. If we decide we need more infrastructure configuration granularity per group, we will create subclasses of the core infra class and override the method responsible for describing infrastructure in CDK.
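As a sketch of that idea (class and method names are hypothetical stand-ins, not the real framework API):

```python
# Hypothetical sketch of per-group infrastructure specialization: a core class
# describes the shared stack, and feature groups override the describing method.
class BaseInfrastructure:
    def create_resources(self) -> list[str]:
        # shared infrastructure for every group: Lambda function + Powertools layer
        return ["lambda_function", "powertools_layer"]

class MetricsStack(BaseInfrastructure):
    def create_resources(self) -> list[str]:
        # reuse the core resources, then add metrics-specific pieces
        return super().create_resources() + ["metrics_namespace_config"]
```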
Reasoning
Keeping the infrastructure creation module separate from test groups helps reuse infrastructure across multiple tests within a feature group. It also allows us to benchmark tests and infrastructure separately in the future, and helps contributors write tests without having to dive deep into the infrastructure creation mechanism.
General Flow Diagram
graph TD
A([Start]) -->|Run e2e tests| B[Find all test groups AND parallelize execution]
B -->|group 1|C1[Deploy infrastructure]
B --> |group 2|C2[Deploy infrastructure]
C1 --> F1{Is another test available?}
F1 --> |no|G1[Destroy infrastructure]
F1 --> |yes|I1[Run test]
I1 -->|Find next test|F1
C2 --> F2{Is another test available?}
F2 --> |no|G2[Destroy infrastructure]
F2 --> |yes|I2[Run test]
I2 -->|Find next test|F2
G1 --> K[Return results]
G2 --> K
K -->L([Stop])
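The "find all test groups" step at the top of the diagram could be sketched like this, assuming the directory layout shown earlier (every directory under tests/e2e except utils/ is one parallelizable group):

```python
# Hypothetical sketch of test group discovery: each directory under tests/e2e,
# except utils/, is treated as one independently parallelizable feature group.
from pathlib import Path

def find_feature_groups(e2e_root: Path) -> list[str]:
    return sorted(
        entry.name
        for entry in e2e_root.iterdir()
        if entry.is_dir() and entry.name != "utils" and not entry.name.startswith("__")
    )
```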
What's in a test
Sample test using Pytest as our test runner

- The execute_lambda fixture is responsible for deploying infrastructure and running our Lambda functions. It yields back their ARNs, execution time, etc., which can be used by helper functions, the tests themselves, and possibly other fixtures.
- helpers.get_logs functions fetch logs from CloudWatch Logs.
- Tests follow the GIVEN/WHEN/THEN structure used in other parts of the project.
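A minimal sketch of that shape (the fixture wiring is omitted and get_logs is stubbed locally, so these names are illustrative rather than the real helpers):

```python
# Hypothetical GIVEN/WHEN/THEN shape of an E2E logger test; get_logs is a local
# stub standing in for a helper that would query CloudWatch Logs.
import json

def get_logs(function_name: str) -> list[dict]:
    # stand-in: the real helper would fetch this function's CloudWatch log events
    raw_events = ['{"service": "payment", "message": "collecting payment"}']
    return [json.loads(event) for event in raw_events]

def test_structured_log_visible():
    # GIVEN a deployed Lambda function instrumented with Logger
    function_name = "BasicHandler"
    # WHEN the function has been invoked (done by the execute_lambda fixture)
    logs = get_logs(function_name)
    # THEN the structured log is visible with the expected attributes
    assert any(log.get("service") == "payment" for log in logs)
```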
Out of scope
- Automated tests run on PR opened/comments/labels
- Extensive set of tests
Potential challenges
Multiple Lambda Layers creation
By using the pytest-xdist plugin we can easily parallelize tests per group, creating infrastructure and running tests in parallel. However, this leads to the Powertools Lambda layer being created 3 times, which puts unnecessary pressure on CPU/RAM/IOPS. We should optimize the solution to create the layer only once and then run the parallelized tests with a reference to this layer.
CDK owned S3 bucket
As a CDK prerequisite, we bootstrap the account for CDK usage by issuing cdk bootstrap. Since the S3 bucket created by CDK doesn't have a lifecycle policy to remove old artefacts, we need to customize the default template used by the cdk bootstrap command, and attach it to the feature README file with a good description of how to use it.
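One way to do that is a sketch like the following; the --show-template and --template flags exist in the CDK CLI, while the lifecycle rule itself is the customization we would add:

```shell
# Export the default bootstrap template, customize it, then bootstrap with it.
cdk bootstrap --show-template > bootstrap-template.yaml
# ...edit bootstrap-template.yaml: add an S3 LifecycleConfiguration that
# expires old staging artefacts from the assets bucket...
cdk bootstrap --template bootstrap-template.yaml
```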
Dependencies and Integrations
CDK
AWS CDK is responsible for synthesizing the provided code into a CloudFormation stack, not for deployment. We will use the AWS SDK to deploy the generated CloudFormation stack instead.
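Deploying the synthesized template via the SDK might look roughly like this sketch; the stack naming and capabilities are assumptions:

```python
# Hypothetical sketch: CDK synthesizes the template, boto3 deploys it.
def deploy_template(stack_name: str, template_body: str) -> None:
    import boto3  # imported lazily; requires AWS credentials at call time

    cloudformation = boto3.client("cloudformation")
    cloudformation.create_stack(
        StackName=stack_name,
        TemplateBody=template_body,
        Capabilities=["CAPABILITY_IAM"],  # stacks create IAM roles for Lambda
    )
    # block until the stack is fully created before tests start
    cloudformation.get_waiter("stack_create_complete").wait(StackName=stack_name)
```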
During evaluation (see the Alternative solutions section), this approach offered the best compromise between deployment speed, infrastructure code readability, and maintainability.
Helper functions
Helper functions will be testing utilities that integrate with the AWS services tests need, hiding unnecessary complexity.
Examples
- Fetch structured logs from Amazon CloudWatch Logs
- Fetch newly created metrics in Amazon CloudWatch Metrics
- Fetch newly emitted traces in AWS X-Ray
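For example, a log-fetching helper might poll CloudWatch Logs until structured entries appear. This is a sketch; the signature is an assumption, and a client can be injected so the helper is testable without AWS:

```python
# Hypothetical helper: poll CloudWatch Logs until a function's structured log
# entries become visible, hiding retry logic from the tests that use it.
import json
import time

def get_logs(function_name: str, minimum_log_entries: int = 1,
             retries: int = 10, client=None) -> list[dict]:
    if client is None:
        import boto3  # real AWS client unless a stub is injected for testing
        client = boto3.client("logs")
    log_group = f"/aws/lambda/{function_name}"
    for _ in range(retries):
        events = client.filter_log_events(logGroupName=log_group)["events"]
        logs = [json.loads(e["message"]) for e in events if e["message"].startswith("{")]
        if len(logs) >= minimum_log_entries:
            return logs
        time.sleep(2)  # log delivery is asynchronous; wait and poll again
    raise TimeoutError(f"Expected {minimum_log_entries} log entries in {log_group}")
```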
Alternative solutions
- Use the CDK CLI to deploy infrastructure directly, instead of custom code that synthesizes CDK code, deploys assets, and runs the AWS CloudFormation deployment. Dropped to avoid running the CLI from a subprocess (added latency) and to avoid additional Node.js dependencies.
- Write an AWS CodeBuild pipeline on the AWS account side that runs tests stored somewhere outside the project, with no configuration or tests exposed in the Powertools repo. Dropped due to the initial assumption that we want end-to-end tests to be part of the project, increase visibility, and allow contributors to run those tests on their own during development.
- Instead of using CloudFormation with multiple Lambdas deployed, I also considered the hot-swap mechanism, either via CDK or a direct call. Based on measured latency, CloudFormation seems the fastest option. Attaching my findings.
Additional material
Acknowledgment
- This feature request meets Lambda Powertools Tenets
- Should this be considered in other Lambda Powertools languages? i.e. Java, TypeScript
Updating here for future correctness: we moved to the CDK CLI because context methods (from_lookup) only work with the CLI. We also implemented Lambda layer caching and moved away from Docker.
Pasting the section from the maintainers playbook about the framework.
E2E framework
Structure
Our E2E framework relies on Pytest fixtures to coordinate infrastructure and test parallelization - see Test Parallelization and CDK CLI Parallelization.
tests/e2e structure
Where:
- <feature>/infrastructure.py. Uses CDK to define the infrastructure a given feature needs.
- <feature>/handlers/. Lambda function handlers to build, deploy, and expose as stack outputs in PascalCase (e.g., BasicHandler).
- utils/. Test utilities to build data and fetch AWS data to ease assertions.
- conftest.py. Deploys and deletes a given feature's infrastructure. Hierarchy matters:
  - e2e/conftest. Builds the Lambda layer only once and blocks I/O across all CPU workers.
  - e2e/<feature>/conftest. Deploys stacks in parallel and makes them independent of each other.
Mechanics
Under BaseInfrastructure, we hide the complexity of deployment and deletion coordination behind the deploy, delete, and create_lambda_functions methods.
This allows us to benefit from test and deployment parallelization, use IDE step-through debugging for a single test, and run one, a subset, or all tests while deploying only their related infrastructure, without any custom configuration.
Authoring a new feature E2E test
Imagine you're going to create an E2E test for the Event Handler feature for the first time. Keep the following mental model in mind when reading:
1. Define infrastructure
We use CDK as our Infrastructure as Code tool of choice. Before you start using CDK, you'd take the following steps:
- Create a tests/e2e/event_handler/infrastructure.py file
- Create an EventHandlerStack class that inherits from BaseInfrastructure
- Implement the create_resources method and define your infrastructure using CDK
- Create your Lambda handlers, e.g. handlers/alb_handler.py
2. Deploy/Delete infrastructure when tests run
We need to create a Pytest fixture for our new feature under tests/e2e/event_handler/conftest.py.
This will instruct Pytest to deploy our infrastructure when our tests start, and to delete it when they complete, whether tests succeed or not. Note that this file will not need any modification in the future.
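The fixture could be sketched as follows; EventHandlerStack is stubbed here, while the real fixture builds on the framework's deploy/delete methods:

```python
# Hypothetical tests/e2e/event_handler/conftest.py: deploy on start, always delete.
import pytest

class EventHandlerStack:  # stand-in for the real CDK-backed class
    def deploy(self) -> dict:
        # would deploy the stack and return its outputs
        return {"ALBDnsName": "example.elb.amazonaws.com"}

    def delete(self) -> None:
        ...

@pytest.fixture(autouse=True, scope="module")
def infrastructure():
    stack = EventHandlerStack()
    try:
        yield stack.deploy()  # tests receive stack outputs via injection
    finally:
        stack.delete()        # runs whether tests passed or failed
```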
3. Access stack outputs for E2E tests
Within our tests, we should now have access to the infrastructure fixture we defined earlier in tests/e2e/event_handler/conftest.py.
We can access any stack output using Pytest dependency injection.
Internals
Test runner parallelization
Besides speed, we parallelize our end-to-end tests to ease asserting async side effects, which may take a while per test, e.g., waiting for traces to become available.
The following diagram demonstrates the process we take every time you use make e2e locally or in CI:
CDK CLI parallelization
For CDK CLI to work with independent CDK Apps, we specify an output directory when synthesizing our stack and deploy from said output directory.
We create the typical CDK app.py at runtime when tests run, since we know which feature and Python version we're dealing with (locally or in CI).
When we run E2E tests for a single feature or all of them, our cdk.out looks like this:
Where:
- <feature>. Contains CDK assets, the CDK manifest.json, our cdk_app_<PyVersion>.py, and stack_outputs.json
- layer_build. Contains our Lambda layer source code, built once and used by all stacks independently
- layer_build.diff. Contains a hash indicating whether our source code has changed, to speed up further deployments and E2E tests
Together, all of this allows us to use Pytest like we would for any project, use the CDK CLI and its context methods (
from_lookup), and use step-through debugging for a single E2E test without any extra configuration.
This is now merged 🎉 We'll be communicating more details as part of the release. Additional enhancements and E2E tests for other utilities will be dealt with separately.
HUGE thank you @mploski for going to such lengths - intense review process, benchmarking multiple options, documentation, etc. - and thanks to all reviewers, truly; it takes a village!