
RFC: Mechanism for end2end testing


Is this related to an existing feature request or issue?

https://github.com/awslabs/aws-lambda-powertools-python/issues/1009

Which AWS Lambda Powertools utility does this relate to?

Other

Summary

Build a mechanism to run end-to-end tests on the Lambda Powertools library using real AWS services (Lambda, DynamoDB). Initially, tests can be run manually by maintainers on a specific branch/commit_id to ensure the expected feature works. Tests should be triggered from GitHub, but maintainers/contributors should also be able to run them in their local environment using their own AWS account.

Use case

Providing a mechanism to run end-to-end tests in a real, live environment allows us to discover a different class of problems we cannot catch with unit or integration tests: for example, how the code base behaves in Lambda during cold and warm starts, event source misconfiguration, IAM permissions, etc. It also allows us to validate integrations with external services (CloudWatch Logs, X-Ray) and ensure the final end-user experience is what we expect.

When it should be used

  • Test a feature from the end user's perspective
  • Test external integrations with AWS services and the policies applied to them
  • Test event source configurations and/or combinations
  • Test whether our documented IAM permissions work as expected

Examples

  • Test that structured logs generated by the library are visible in Amazon CloudWatch Logs and contain all necessary attributes
  • Test that a generated trace is visible in AWS X-Ray and includes all provided metadata and annotations
  • Test that a business metric generated by the library is visible in CloudWatch under the expected namespace and with the expected value

When an integration test may be more appropriate instead

Integration testing is a better fit when we can increase confidence by covering the code base -> AWS service(s) boundary. These tests give us a faster feedback loop while reducing the permutations of E2E tests we would otherwise need to cover the end-user perspective, permissions, etc.

Examples

  • Test that a pure Python function is idempotent and that subsequent calls return the same value
  • Test whether Feature Flags can fetch its schema
  • Test whether the Parameters utility can fetch values from SSM, Secrets Manager, AppConfig, DynamoDB
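
As an illustration of the last example, a minimal integration test for the Parameters utility might look like the sketch below. The parameter name, and the assumption that the test is allowed to create it in the target account, are hypothetical; the Powertools call used is the public parameters.get_parameter API.

import boto3
import pytest

from aws_lambda_powertools.utilities import parameters

PARAM_NAME = "/powertools-e2e/sample-param"  # hypothetical parameter used only for this sketch


@pytest.fixture(scope="module")
def ssm_parameter():
    # GIVEN a parameter exists in SSM Parameter Store
    ssm = boto3.client("ssm")
    ssm.put_parameter(Name=PARAM_NAME, Value="hello", Type="String", Overwrite=True)
    yield PARAM_NAME
    ssm.delete_parameter(Name=PARAM_NAME)


def test_parameters_fetches_value_from_ssm(ssm_parameter):
    # WHEN the Parameters utility fetches it
    value = parameters.get_parameter(ssm_parameter)

    # THEN the value matches what was stored
    assert value == "hello"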

Proposal

Overview

  • Use the CDK SDK (lib) to describe infrastructure: currently a Lambda function + the Powertools layer
  • Run tests in parallel, separated by feature directory, e.g. metrics/, tracer/
  • Every feature group has its own infrastructure deployed, e.g. a metrics stack and a tracer stack
  • Enable running them from GitHub Actions and from a local machine against a specified AWS account
  • Clean up all resources at the end of the test run

What an E2E test would look like

More details in the What's in a test section

(Screenshot: sample E2E test written with Pytest; see the What's in a test section.)

Details

GitHub configuration
  1. Integrate GitHub with the AWS account using OIDC: https://docs.github.com/en/actions/deployment/security-hardening-your-deployments/configuring-openid-connect-in-amazon-web-services
  2. Specify end-to-end test commands in GitHub Actions

Test setup

Tests will follow a common directory structure to allow us to parallelize infrastructure creation and test execution for all feature groups. Tests within a feature group, say tracer/, run sequentially and independently of other feature groups.

Test fixtures will provide the necessary infrastructure and any relevant information tests need to succeed, e.g. a Lambda function ARN. Helper methods will also be provided to hide integration details and ease test creation.

Once there are no more tests to run, infrastructure resources are automatically cleaned up and results are synchronized and returned to the user.

Directory Structure

tests/e2e
├── conftest.py
├── logger
│   ├── handlers
│   └── test_logger.py
├── metrics
│   ├── handlers
│   └── test_metrics.py
├── tracer
│   ├── handlers
│   └── test_tracer.py
└── utils
    ├── helpers.py
    └── infrastructure.py

Explanation

  1. We keep our end-to-end tests under the tests/e2e directory.
  2. We split tests into groups matching different Powertools features - in this example we have 3 groups (logger, metrics and tracer).
  3. Our test mechanism parallelizes test execution across those groups.
  4. The utils directory has utilities to simplify writing tests, and an infrastructure module used for deploying infrastructure.

Note: In the first phase we may reuse a single infrastructure helper class across all test groups. If we decide we need more infrastructure configuration granularity per group, we will create sub-classes of the core infra class and override the method responsible for describing the infrastructure in CDK, as sketched below.
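
If we do reach that point, the shape might look like this minimal sketch. The class names, the self.stack attribute, and the DynamoDB example are all illustrative; only the idea of overriding a single CDK-describing hook comes from the proposal above.

from aws_cdk import aws_dynamodb as dynamodb


class BaseInfra:
    """Hypothetical core infra class: owns the stack and the shared Lambda + layer setup."""

    def create_resources(self) -> None:
        ...  # common Lambda function + Powertools layer definition lives here


class IdempotencyInfra(BaseInfra):
    """A feature group needing extra resources overrides only the CDK description."""

    def create_resources(self) -> None:
        super().create_resources()
        # Feature-specific addition, e.g. a DynamoDB table for idempotency records
        dynamodb.Table(
            self.stack,
            "IdempotencyTable",
            partition_key=dynamodb.Attribute(name="id", type=dynamodb.AttributeType.STRING),
        )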

Reasoning

Keeping the infrastructure creation module separate from the test groups helps us reuse infrastructure across multiple tests within a feature group. It also allows us to benchmark tests and infrastructure separately in the future, and it helps contributors write tests without having to dive deep into the infrastructure creation mechanism.

General Flow Diagram
graph TD
    A([Start]) -->|Run e2e tests| B[Find all test groups AND parallelize execution]
    B -->|group 1|C1[Deploy infrastructure]
    B --> |group 2|C2[Deploy infrastructure]
    C1 --> F1{Is another test available?}
    F1 --> |no|G1[Destroy infrastructure]
    F1 --> |yes|I1[Run test]
    I1 -->|Find next test|F1
    C2 --> F2{Is another test available?}
    F2 --> |no|G2[Destroy infrastructure]
    F2 --> |yes|I2[Run test]
    I2 -->|Find next test|F2
    G1 --> K[Return results]
    G2 --> K
    K -->L([Stop])
What’s in a test

Sample test using Pytest as our test runner

(Screenshot: sample E2E test written with Pytest; an illustrative sketch follows the list below.)
  1. The execute_lambda fixture is responsible for deploying infrastructure and running our Lambda functions. It yields back their ARNs, execution times, etc., which can be used by helper functions, the tests themselves, and possibly other fixtures
  2. The helpers.get_logs function fetches logs from CloudWatch Logs
  3. Tests follow the GIVEN/WHEN/THEN structure used in other parts of the project
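
A minimal sketch of what such a test could look like, assuming the execute_lambda fixture and helpers.get_logs helper described above; the fixture's attribute names and the helper's signature are illustrative, and get_logs is assumed to return already-parsed JSON log records.

from tests.e2e.utils import helpers


def test_basic_lambda_logs_are_structured(execute_lambda):
    # GIVEN a Lambda function instrumented with Logger was executed by the fixture
    function_name = execute_lambda.function_name   # illustrative attribute
    start_time = execute_lambda.execution_time     # illustrative attribute

    # WHEN we fetch its logs from CloudWatch Logs
    logs = helpers.get_logs(lambda_function_name=function_name, start_time=start_time)

    # THEN every log record is structured and carries the expected attributes
    for log in logs:
        assert "level" in log
        assert "message" in log
        assert "service" in log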

Out of scope

  1. Automated tests run on PR opened/comments/labels
  2. Extensive set of tests

Potential challenges

Multiple Lambda Layers creation

By using the pytest-xdist plugin we can easily parallelize tests per group, creating infrastructure and running tests in parallel. However, this leads to the Powertools Lambda layer being created three times, which puts unnecessary pressure on CPU/RAM/IOPS. We should optimize the solution to create the layer only once and then run the parallelized tests with a reference to that layer.
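
One possible way to address this, sketched below, follows the common pytest-xdist pattern of a lock-guarded session-scoped fixture, so only one worker builds the layer and the others reuse the cached result. The build_layer helper and the cache file name are hypothetical; worker_id and tmp_path_factory are standard pytest/pytest-xdist fixtures, and FileLock comes from the third-party filelock package.

import json

import pytest
from filelock import FileLock


@pytest.fixture(scope="session")
def powertools_layer(tmp_path_factory, worker_id):
    """Build the Powertools Lambda layer once per test session, across xdist workers."""
    if worker_id == "master":
        # Not running under xdist: just build it
        return build_layer()  # hypothetical helper returning the layer artifact path

    # Under xdist: serialize the build behind a file lock shared by all workers
    root_tmp_dir = tmp_path_factory.getbasetemp().parent
    cache_file = root_tmp_dir / "layer_build.json"
    with FileLock(str(cache_file) + ".lock"):
        if cache_file.is_file():
            layer_path = json.loads(cache_file.read_text())
        else:
            layer_path = build_layer()
            cache_file.write_text(json.dumps(layer_path))
    return layer_path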

CDK owned S3 bucket

As a CDK prerequisite, we bootstrap the account for CDK usage by issuing cdk bootstrap. Since the S3 bucket created by CDK doesn't have a lifecycle policy to remove old artefacts, we need to customize the default template used by the cdk bootstrap command and attach it to the feature readme file with a good description of how to use it.

Dependencies and Integrations

CDK

AWS CDK is responsible for synthesizing the provided code into a CloudFormation template, not for deployment. We will use the AWS SDK to deploy the generated CloudFormation stack instead.

During evaluation (see the Alternative solutions section), this approach offered the best compromise between deployment speed, infrastructure code readability, and maintainability.

Helper functions

Helper functions will be testing utilities that integrate with the AWS services tests need, hiding unnecessary complexity.

Examples

  1. Fetch structured logs from Amazon CloudWatch Logs
  2. Fetch newly created metrics in Amazon CloudWatch Metrics
  3. Fetch newly emitted traces in AWS X-Ray
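
For instance, a log-fetching helper might be a thin wrapper around CloudWatch Logs, in the spirit of this sketch. The function name mirrors helpers.get_logs above; the log group naming convention and the absence of retries and pagination are simplifications.

import json
from datetime import datetime
from typing import List

import boto3


def get_logs(lambda_function_name: str, start_time: datetime, log_client=None) -> List[dict]:
    """Fetch structured (JSON) log events emitted by a Lambda function since start_time."""
    log_client = log_client or boto3.client("logs")
    response = log_client.filter_log_events(
        logGroupName=f"/aws/lambda/{lambda_function_name}",
        startTime=int(start_time.timestamp() * 1000),  # CloudWatch expects epoch millis
    )

    structured_logs = []
    for event in response["events"]:
        try:
            structured_logs.append(json.loads(event["message"]))
        except json.JSONDecodeError:
            # Skip platform lines like START/END/REPORT that aren't JSON
            continue
    return structured_logs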

Alternative solutions

  1. Use the CDK CLI to deploy infrastructure directly, rather than custom code that synthesizes the CDK app, deploys assets, and runs the AWS CloudFormation deployment - dropped to avoid running the CLI from a subprocess (added latency) and to avoid additional Node.js dependencies.

  2. Write an AWS CodeBuild pipeline on the AWS account side that would run tests stored outside of the project, with no configuration or tests exposed in the Powertools repo - dropped due to the initial assumption that we want end-to-end tests to be part of the project, to increase visibility, and to allow contributors to run those tests on their own during development.

  3. Instead of using CloudFormation with multiple Lambdas deployed, I also considered using a hot-swap mechanism - either via CDK or a direct API call. Based on the latency measured, CloudFormation seems the fastest option. Attaching my findings.

Additional material

Acknowledgment

Issue Analytics

  • State: closed
  • Created: a year ago
  • Reactions: 2
  • Comments: 14 (11 by maintainers)

Top GitHub Comments

3 reactions
heitorlessa commented, Sep 9, 2022

Updating here for future correctness: we moved to the CDK CLI because context methods such as from_lookup only work with the CLI. We also implemented Lambda Layer caching and moved away from Docker.

Pasting the section from the maintainers playbook about the framework.


E2E framework

Structure

Our E2E framework relies on Pytest fixtures to coordinate infrastructure and test parallelization - see Test Parallelization and CDK CLI Parallelization.

tests/e2e structure

.
├── __init__.py
├── conftest.py # builds Lambda Layer once
├── logger
│   ├── __init__.py
│   ├── conftest.py  # deploys LoggerStack
│   ├── handlers
│   │   └── basic_handler.py
│   ├── infrastructure.py # LoggerStack definition
│   └── test_logger.py
├── metrics
│   ├── __init__.py
│   ├── conftest.py  # deploys MetricsStack
│   ├── handlers
│   │   ├── basic_handler.py
│   │   └── cold_start.py
│   ├── infrastructure.py # MetricsStack definition
│   └── test_metrics.py
├── tracer
│   ├── __init__.py
│   ├── conftest.py  # deploys TracerStack
│   ├── handlers
│   │   ├── async_capture.py
│   │   └── basic_handler.py
│   ├── infrastructure.py  # TracerStack definition
│   └── test_tracer.py
└── utils
    ├── __init__.py
    ├── data_builder  # build_service_name(), build_add_dimensions_input, etc.
    ├── data_fetcher  # get_traces(), get_logs(), get_lambda_response(), etc.
    ├── infrastructure.py # base infrastructure like deploy logic, etc.

Where:

  • <feature>/infrastructure.py. Uses CDK to define the infrastructure a given feature needs.
  • <feature>/handlers/. Lambda function handlers to build and deploy, exposed as stack outputs in PascalCase (e.g., BasicHandler).
  • utils/. Test utilities to build data and fetch AWS data to ease assertions.
  • conftest.py. Deploys and deletes a given feature's infrastructure. Hierarchy matters:
    • Top-level (e2e/conftest). Builds the Lambda Layer only once and blocks I/O across all CPU workers.
    • Feature-level (e2e/<feature>/conftest). Deploys stacks in parallel and makes them independent of each other.

Mechanics

Under BaseInfrastructure, we hide the complexity of deployment and deletion coordination behind the deploy, delete, and create_lambda_functions methods.

This allows us to benefit from test and deployment parallelization, use IDE step-through debugging for a single test, and run one, a subset, or all tests while only deploying their related infrastructure, without any custom configuration.
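
Expressed in Python, that coordination contract amounts to roughly the sketch below; the method signatures are approximations of the real ones in utils/infrastructure.py, and the Dict return types mirror the class diagram that follows.

from abc import ABC, abstractmethod
from typing import Dict


class InfrastructureProvider(ABC):
    """Contract every feature stack fulfils; concrete logic lives in BaseInfrastructure."""

    @abstractmethod
    def create_resources(self) -> None:
        """Define the CDK resources a feature needs (overridden per feature stack)."""

    @abstractmethod
    def deploy(self) -> Dict[str, str]:
        """Deploy the stack and return its CloudFormation outputs."""

    @abstractmethod
    def delete(self) -> None:
        """Tear the stack down once the feature's tests are done."""

    @abstractmethod
    def create_lambda_functions(self) -> Dict[str, object]:
        """Create Lambda functions from handlers/ and return their CDK Function constructs."""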

Class diagram to understand abstraction built when defining a new stack (LoggerStack)

classDiagram
    class InfrastructureProvider {
        <<interface>>
        +deploy() Dict
        +delete()
        +create_resources()
        +create_lambda_functions() Dict~Functions~
    }

    class BaseInfrastructure {
        +deploy() Dict
        +delete()
        +create_lambda_functions() Dict~Functions~
        +add_cfn_output()
    }

    class TracerStack {
        +create_resources()
    }

    class LoggerStack {
        +create_resources()
    }

    class MetricsStack {
        +create_resources()
    }

    class EventHandlerStack {
        +create_resources()
    }

    InfrastructureProvider <|-- BaseInfrastructure : implement
    BaseInfrastructure <|-- TracerStack : inherit
    BaseInfrastructure <|-- LoggerStack : inherit
    BaseInfrastructure <|-- MetricsStack : inherit
    BaseInfrastructure <|-- EventHandlerStack : inherit

Authoring a new feature E2E test

Imagine you're going to create E2E tests for the Event Handler feature for the first time. Keep the following mental model in mind when reading:

graph LR
    A["1. Define infrastructure"]-->B["2. Deploy/Delete infrastructure"]-->C["3.Access Stack outputs" ]

1. Define infrastructure

We use CDK as our Infrastructure as Code tool of choice. Before you start using CDK, you’d take the following steps:

  1. Create tests/e2e/event_handler/infrastructure.py file
  2. Create a new class EventHandlerStack and inherit from BaseInfrastructure
  3. Override create_resources method and define your infrastructure using CDK
  4. (Optional) Create a Lambda function under handlers/alb_handler.py

Excerpt tests/e2e/event_handler/infrastructure.py

class EventHandlerStack(BaseInfrastructure):
    def create_resources(self):
        functions = self.create_lambda_functions()

        self._create_alb(function=functions["AlbHandler"])
        ...

    def _create_alb(self, function: Function):
        vpc = ec2.Vpc.from_lookup(
            self.stack,
            "VPC",
            is_default=True,
            region=self.region,
        )

        alb = elbv2.ApplicationLoadBalancer(self.stack, "ALB", vpc=vpc, internet_facing=True)
        CfnOutput(self.stack, "ALBDnsName", value=alb.load_balancer_dns_name)
        ...

Excerpt tests/e2e/event_handler/handlers/alb_handler.py

from aws_lambda_powertools.event_handler import ALBResolver, Response, content_types

app = ALBResolver()


@app.get("/todos")
def hello():
    return Response(
        status_code=200,
        content_type=content_types.TEXT_PLAIN,
        body="Hello world",
        cookies=["CookieMonster", "MonsterCookie"],
        headers={"Foo": ["bar", "zbr"]},
    )


def lambda_handler(event, context):
    return app.resolve(event, context)

2. Deploy/Delete infrastructure when tests run

We need to create a Pytest fixture for our new feature under tests/e2e/event_handler/conftest.py.

This will instruct Pytest to deploy our infrastructure when our tests start, and to delete it when they complete, whether the tests succeed or not. Note that this file will not need any modification in the future.

Excerpt conftest.py for Event Handler

import pytest

from tests.e2e.event_handler.infrastructure import EventHandlerStack


@pytest.fixture(autouse=True, scope="module")
def infrastructure():
    """Setup and teardown logic for E2E test infrastructure

    Yields
    ------
    Dict[str, str]
        CloudFormation Outputs from deployed infrastructure
    """
    stack = EventHandlerStack()
    try:
        yield stack.deploy()
    finally:
        stack.delete()

3. Access stack outputs for E2E tests

Within our tests, we should now have access to the infrastructure fixture we defined earlier in tests/e2e/event_handler/conftest.py.

We can access any Stack Output using pytest dependency injection.

Excerpt tests/e2e/event_handler/test_header_serializer.py

@pytest.fixture
def alb_basic_listener_endpoint(infrastructure: dict) -> str:
    dns_name = infrastructure.get("ALBDnsName")
    port = infrastructure.get("ALBBasicListenerPort", "")
    return f"http://{dns_name}:{port}"


def test_alb_headers_serializer(alb_basic_listener_endpoint):
    # GIVEN
    url = f"{alb_basic_listener_endpoint}/todos"
    ...

Internals

Test runner parallelization

Besides speed, we parallelize our end-to-end tests because asserting async side effects can take a while per test too, e.g., waiting for traces to become available.

The following diagram demonstrates the process we go through every time you use make e2e locally or in CI:

graph TD
    A[make e2e test] -->Spawn{"Split and group tests <br>by feature and CPU"}

    Spawn -->|Worker0| Worker0_Start["Load tests"]
    Spawn -->|Worker1| Worker1_Start["Load tests"]
    Spawn -->|WorkerN| WorkerN_Start["Load tests"]

    Worker0_Start -->|Wait| LambdaLayer["Lambda Layer build"]
    Worker1_Start -->|Wait| LambdaLayer["Lambda Layer build"]
    WorkerN_Start -->|Wait| LambdaLayer["Lambda Layer build"]

    LambdaLayer -->|Worker0| Worker0_Deploy["Launch feature stack"]
    LambdaLayer -->|Worker1| Worker1_Deploy["Launch feature stack"]
    LambdaLayer -->|WorkerN| WorkerN_Deploy["Launch feature stack"]

    Worker0_Deploy -->|Worker0| Worker0_Tests["Run tests"]
    Worker1_Deploy -->|Worker1| Worker1_Tests["Run tests"]
    WorkerN_Deploy -->|WorkerN| WorkerN_Tests["Run tests"]

    Worker0_Tests --> ResultCollection
    Worker1_Tests --> ResultCollection
    WorkerN_Tests --> ResultCollection

    ResultCollection{"Wait for workers<br/>Collect test results"}
    ResultCollection --> TestEnd["Report results"]
    ResultCollection --> DeployEnd["Delete Stacks"]

CDK CLI parallelization

For CDK CLI to work with independent CDK Apps, we specify an output directory when synthesizing our stack and deploy from said output directory.

flowchart TD
    subgraph "Deploying distinct CDK Apps"
        EventHandlerInfra["Event Handler CDK App"] --> EventHandlerSynth
        TracerInfra["Tracer CDK App"] --> TracerSynth
       EventHandlerSynth["cdk synth --out cdk.out/event_handler"] --> EventHandlerDeploy["cdk deploy --app cdk.out/event_handler"]

       TracerSynth["cdk synth --out cdk.out/tracer"] --> TracerDeploy["cdk deploy --app cdk.out/tracer"]
    end

We create the typical CDK app.py at runtime when tests run, since we know which feature and Python version we’re dealing with (locally or at CI).

Excerpt cdk_app_V39.py for Event Handler created at deploy phase

from tests.e2e.event_handler.infrastructure import EventHandlerStack
stack = EventHandlerStack()
stack.create_resources()
stack.app.synth()
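
For illustration, deploying that synthesized app from Python might shell out to the CDK CLI roughly as in the sketch below. The flags used are standard CDK CLI options, but the exact invocation and output handling inside BaseInfrastructure may differ; the directory and file names follow the cdk.out listing shown next.

import json
import subprocess

OUTPUT_DIR = "cdk.out/event_handler"          # per-feature cloud assembly directory
OUTPUTS_FILE = f"{OUTPUT_DIR}/stack_outputs.json"

# Synthesize the feature app into its own directory, then deploy from that assembly
subprocess.run(
    ["cdk", "synth", "--app", "python cdk_app_V39.py", "--output", OUTPUT_DIR],
    check=True,
)
subprocess.run(
    [
        "cdk", "deploy",
        "--app", OUTPUT_DIR,                  # deploy the pre-synthesized cloud assembly
        "--require-approval", "never",        # non-interactive, suitable for CI
        "--outputs-file", OUTPUTS_FILE,       # persist CloudFormation outputs for tests
    ],
    check=True,
)

with open(OUTPUTS_FILE) as f:
    stack_outputs = json.load(f)              # e.g., {"StackName": {"ALBDnsName": "..."}}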

When we run E2E tests for a single feature or all of them, our cdk.out looks like this:

total 8
drwxr-xr-x  18 lessa  staff   576B Sep  6 15:38 event-handler
drwxr-xr-x   3 lessa  staff    96B Sep  6 15:08 layer_build
-rw-r--r--   1 lessa  staff    32B Sep  6 15:08 layer_build.diff
drwxr-xr-x  18 lessa  staff   576B Sep  6 15:38 logger
drwxr-xr-x  18 lessa  staff   576B Sep  6 15:38 metrics
drwxr-xr-x  22 lessa  staff   704B Sep  9 10:52 tracer
classDiagram
    class CdkOutDirectory {
        feature_name/
        layer_build/
        layer_build.diff
    }

    class EventHandler {
        manifest.json
        stack_outputs.json
        cdk_app_V39.py
        asset.uuid/
        ...
    }

    class StackOutputsJson {
        BasicHandlerArn: str
        ALBDnsName: str
        ...
    }

    CdkOutDirectory <|-- EventHandler : feature_name/
    StackOutputsJson <|-- EventHandler

Where:

  • <feature>. Contains CDK assets, the CDK manifest.json, our cdk_app_<PyVersion>.py, and stack_outputs.json
  • layer_build. Contains our Lambda Layer source code, built once and used by all stacks independently
  • layer_build.diff. Contains a hash used to detect whether our Lambda Layer source code has changed, so unchanged layers don't slow down further deployments and E2E tests; a sketch of one way to implement this check follows the list.
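
A minimal sketch of how such a change check could work; the file names follow the listing above, while the hashing scheme itself is an assumption rather than the framework's actual implementation.

import hashlib
from pathlib import Path


def source_digest(source_dir: Path) -> str:
    """Hash every file under the Lambda Layer source tree into a single digest."""
    digest = hashlib.sha256()
    for path in sorted(source_dir.rglob("*")):
        if path.is_file():
            digest.update(path.read_bytes())
    return digest.hexdigest()


def layer_needs_rebuild(source_dir: Path, diff_file: Path) -> bool:
    """Compare the current digest against the one stored in layer_build.diff."""
    current = source_digest(source_dir)
    previous = diff_file.read_text().strip() if diff_file.exists() else ""
    if current == previous:
        return False                    # unchanged: reuse cdk.out/layer_build as-is
    diff_file.write_text(current)       # changed: record the new digest and rebuild
    return True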

Together, all of this allows us to use Pytest like we would for any project, use CDK CLI and its context methods (from_lookup), and use step-through debugging for a single E2E test without any extra configuration.

NOTE: VS Code doesn't support debugging processes that spawn sub-processes (as the CDK CLI does with the shell and the CDK App). Maybe this works. PyCharm works just fine.

2 reactions
heitorlessa commented, Jul 14, 2022

This is now merged 😃 We'll be communicating more details as part of the release. Additional enhancements and E2E tests for other utilities will be dealt with separately.

HUGE thank you @mploski for going to such lengths - an intense review process, benchmarking multiple options, documentation, etc. - and thanks to all reviewers, truly; it takes a village!
