Simplify and Accelerate Integration Tests

Motivation

Today, Druid integration tests are slow to run in Travis and painful for developers to use. We wish to fix this issue.

This proposal starts the discussion. A prototype of the redesign is in progress. However, the integration tests are complex: let’s get more eyes on the issue to ensure we have a workable plan to modernize the tests.

Proposed changes

Proposed is a restructuring of the integration tests to allow faster Travis runs and far easier debugging. Key goals include:

  • Speed up the Druid test image build by avoiding download of dependencies. (Instead, any such dependencies are managed by Maven and reside in the local build cache.)
  • Use official images for dependencies to avoid the need to download, install, and manage those dependencies.
  • Ensure that it is easy to manually build the image, launch a cluster, and run a test against the cluster.
  • Convert tests to JUnit so that they will easily run in your favorite IDE, just like other Druid tests.
  • Use the actual Druid build from distribution so we know what is tested.
  • Leverage, don’t fight, Maven.
  • Run the integration tests easily on a typical development machine.

This project is mostly a matter of managing many details. Rather than spell out the gory details here, please see the project documentation in the prototype branch.

Rationale

The current integration tests are characterized by a number of quirks resulting from their evolution.

  • The tests run in Maven before the distribution project, forcing the integration tests to do their own build. This means that what we test is not the software we “ship.”
  • The tests are in a single Maven project. That single project can start the cluster, run tests, and shut down the cluster. Since we have multiple test groups, this means that we need a separate Travis run for each test group.
  • The test image contains Druid plus dependencies such as MySQL, ZooKeeper and Kafka. Each build pulls these dependencies down from the public repository, resulting in very slow image builds (and an unkind load on the upstream repositories).
  • The plumbing (scripts, Maven tasks, etc.) shows its age: it is very complex and quite difficult to understand and extend.
  • Tests are based on the TestNG framework and are hard to run within an IDE.
  • The directory mounted into the container is placed in the user’s home directory, which makes it hard to manage and outside of the Maven build tree. When run locally, tests share the same directory, leading to non-deterministic results.

Each of these issues contains the seed of its resolution:

  • Move tests to run in Maven after distribution so that the tests run against the artifacts produced by the main Maven build. This eliminates the need for a second build within the tests.
  • Split the tests into multiple Maven subprojects. Each can start a Docker cluster as that test needs, run the test, and shut down the cluster. Maven will step through the subprojects (former test groups) one by one in a single run.
  • MySQL, ZK and Kafka all provide “official” images. We can use those to avoid pulling down the software on each build. We can use Maven to pull down things like the MySQL driver, MariaDB driver and Kafka protobuf provider. Both Docker and Maven provide caches, so that we hit the upstream repositories only when the dependencies change, not once per test run.
  • By splitting tests into multiple projects, we can simplify the “plumbing”: each test project performs the setup that it needs, without complex scripts and if-statements.
  • TestNG usage is replaced by the JUnit library which has built-in IDE support. A bit of shim code ensures that most test code can remain unchanged.
  • The shared directory mounted into the containers now resides in each project’s target folder, so that Maven will clean up the directory the same way it cleans all other build artifacts.
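
The last point, moving the shared directory, can be illustrated with a minimal sketch. The class, method, and directory names below are hypothetical and are not taken from the prototype branch; the sketch only shows why a per-project location under target keeps test groups from sharing state when run locally.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper (not part of Druid): chooses the directory mounted into test containers. */
final class SharedDirs {
    private SharedDirs() {}

    /** Old layout: a single directory under the user's home, shared by every test group. */
    static Path legacySharedDir(Path userHome) {
        // "shared" is an assumed directory name used only for illustration.
        return userHome.resolve("shared");
    }

    /** Proposed layout: one directory per test project, inside that project's target folder. */
    static Path projectSharedDir(Path projectBaseDir) {
        return projectBaseDir.resolve("target").resolve("shared");
    }

    public static void main(String[] args) {
        Path home = Paths.get(System.getProperty("user.home"));
        // Hypothetical subproject names standing in for two former test groups.
        Path groupA = Paths.get("it-group-a");
        Path groupB = Paths.get("it-group-b");

        // Every group resolves to the same legacy directory, so local runs can collide.
        System.out.println("legacy:  " + legacySharedDir(home));
        // Each group gets its own directory, which a Maven clean of that project removes.
        System.out.println("group A: " + projectSharedDir(groupA));
        System.out.println("group B: " + projectSharedDir(groupB));
    }
}
```

Because each path sits under its own project's target folder, no extra cleanup script is needed in this sketch: cleaning the project discards the directory along with the other build artifacts.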

Migration

The Druid integration tests represent a large amount of code. One-shot conversion will not be possible. Instead, we must work step-by-step.

  • The prototype branch has worked out how to convert a single test group using the new structure. (The current code is a work in progress.)
  • We propose to check in a final version of that code as a new Druid project. The project would not yet be built in Travis, but can be used for local testing by developers.
  • Once we feel the new code is stable, we can replace the existing Travis tests group-by-group. A single new Travis job will eventually include all of the current test groups, resulting in a faster build cycle.
  • If we don’t have capacity to convert all tests, then some groups may continue to run in the old system for the time being.

This approach does mean some short-term redundancy, but provides a safe roll-out path.

Future work

This proposal is an outline of an approach, along with a prototype to demonstrate the idea. Additional work includes:

  • Complete the foundational work. Mostly a matter of final clean-ups, verifying that tests flow smoothly, etc.
  • Conversion of the remaining tests.
  • Ensure the tests work in the Kubernetes and Quickstart environments which some tests seem to support.
  • Support multiple environments: MariaDB vs. MySQL drivers, Hadoop vs. S3 vs. Azure vs. GCS, etc.

Issue Analytics

  • State: open
  • Created: 2 years ago
  • Reactions: 6
  • Comments: 13 (13 by maintainers)

Top GitHub Comments

paul-rogers commented, Mar 28, 2022 (1 reaction)

There is a PR for this work: #12368. With this, it is now possible to:

  1. Build Druid in the normal way.
  2. Build the test image: about 1 minute.
  3. Run a test: depending on task, 1 minute to several minutes.

It is also possible to speed up the process for debugging.

  2a. After the first image build, you can use a script to rebuild in just a few seconds.
  3a. The converted ITs can be run from your favorite IDE as a JUnit test.

The PR is pretty big, but most of it is configs and refactored code. Please review and provide suggestions.

paul-rogers commented, Nov 3, 2022

I realized that this issue hadn’t been updated in some time. The “new IT framework” has been merged into master. Several existing tests have been converted. New tests should use the new framework. We already have new tests for MSQ and the Catalog project.

The next step is to convert the remaining “old” ITs. Previously, such a conversion was a “nice to have” done when someone needed to work with a test and found it more convenient to do so using the new framework. It may be that the required transition off Travis and onto GitHub Actions (GHA) will force this issue: we may find it faster to convert each test and then port the converted version to GHA than to fight the old framework. Such a decision is still likely to be case-by-case: whoever is tasked with doing a GHA conversion will likely choose the path of least resistance.
