Create a guide on how to write tests for nodes and pipelines
Description
Our users have expressed a need to learn how to write tests for their nodes and pipelines. We encourage learning about software-engineering best practice and want to include a guide for this in our documentation.
Possible Implementation
This guide must focus on explaining:
- How `kedro test` works
- How to write a test for a node, using an example
- How to write a test for a pipeline, using an example
This guide can fall into our Development chapter. Remember to follow our guidelines for contributing to the documentation.
I’m not able to access those links, but in any case, in my experience writing tests for pipelines comes into its full power when doing operations on Spark DataFrames. Usually, for pipelines, one test with data suffices. Here is a suggestion for what this page could look like, using Spark DataFrames:
Writing Tests for Nodes and Pipelines
In this section we introduce how to write unit tests and integration tests for Nodes and Pipelines respectively. Each Node in a pipeline should have its own (parametrised) unit tests. Tests for Pipelines, on the other hand, should cover a number of sequential nodes: the input of the first node is run through those nodes and tested against the expected output of the last node. Imagine the following nodes, where we compute the total equity of a store’s inventory:
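A minimal sketch of what these nodes could look like (the function and column names are hypothetical):

```python
# nodes.py -- hypothetical Inventory nodes operating on Spark DataFrames
import pyspark.sql.functions as F
from pyspark.sql import DataFrame


def calculate_price_per_product(inventory: DataFrame) -> DataFrame:
    """Multiply each product's unit price by the quantity in stock."""
    return inventory.withColumn("total_price", F.col("price") * F.col("quantity"))


def calculate_total_equity(total_price_per_product: DataFrame) -> DataFrame:
    """Sum the per-product totals into a single total-equity figure."""
    return total_price_per_product.agg(F.sum("total_price").alias("total_equity"))
```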
The pipeline:
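A corresponding pipeline definition could look like this (a sketch; the module layout follows Kedro's conventional project template, but your paths may differ):

```python
# pipeline.py -- wiring the two nodes into an Inventory pipeline (sketch)
from kedro.pipeline import Pipeline, node

from .nodes import calculate_price_per_product, calculate_total_equity


def create_pipeline(**kwargs) -> Pipeline:
    return Pipeline(
        [
            node(
                calculate_price_per_product,
                inputs="inventory",
                outputs="total_price_per_product",
            ),
            node(
                calculate_total_equity,
                inputs="total_price_per_product",
                outputs="total_equity",
            ),
        ]
    )
```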
Unit Tests
We begin writing the first unit test by spinning up two sample datasets from CSV files: `inventory` and `total_price_per_product`.
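As a hypothetical example, the two CSV files could contain the following, with `total_price_per_product` holding the expected result of the first node.

`tests/data/inventory.csv`:

```csv
product,price,quantity
apple,2.0,100
banana,1.0,50
```

`tests/data/total_price_per_product.csv`:

```csv
product,price,quantity,total_price
apple,2.0,100,200.0
banana,1.0,50,50.0
```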
Since we need to be able to read these CSV datasets into Spark DataFrames, we construct a Spark session in the root of our testing package:
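For instance, in `tests/conftest.py` (a sketch; the session scope means the Spark session is created once for the whole test run):

```python
# tests/conftest.py -- a session-scoped Spark session shared by all tests (sketch)
import pytest
from pyspark.sql import SparkSession


@pytest.fixture(scope="session")
def spark_session():
    spark = (
        SparkSession.builder
        .master("local[2]")
        .appName("inventory-tests")
        .getOrCreate()
    )
    yield spark
    spark.stop()
```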
Now, we can create fixtures for loading these datasets:
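Still in `tests/conftest.py`, one fixture per sample dataset (the file paths are assumptions):

```python
# tests/conftest.py (continued) -- fixtures loading the sample CSVs (sketch)
import pytest


@pytest.fixture
def inventory(spark_session):
    return spark_session.read.csv(
        "tests/data/inventory.csv", header=True, inferSchema=True
    )


@pytest.fixture
def total_price_per_product(spark_session):
    return spark_session.read.csv(
        "tests/data/total_price_per_product.csv", header=True, inferSchema=True
    )
```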
In `test_nodes.py` we use the above fixtures to test our Nodes:
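A sketch of the unit tests (the import path `my_project.pipelines.inventory` is a placeholder for your own package):

```python
# tests/test_nodes.py -- unit tests for the individual nodes (sketch)
from my_project.pipelines.inventory.nodes import (  # hypothetical import path
    calculate_price_per_product,
    calculate_total_equity,
)


def test_calculate_price_per_product(inventory, total_price_per_product):
    result = calculate_price_per_product(inventory)
    # Compare collected rows, not DataFrame objects, which never compare equal
    assert result.collect() == total_price_per_product.collect()


def test_calculate_total_equity(total_price_per_product):
    result = calculate_total_equity(total_price_per_product)
    # 200.0 (apples) + 50.0 (bananas) from the sample data above
    assert result.collect()[0]["total_equity"] == 250.0
```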
Integration Test
Now that we have unit tests for each individual node in our Inventory pipeline, we would like an integration test for the pipeline as a whole. Again, we start by creating a number of fixtures in the root of the testing package:
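A sketch of such a fixture; note that `MemoryDataSet` is spelled `MemoryDataset` in recent Kedro versions:

```python
# tests/conftest.py (continued) -- a fresh DataCatalog for every test (sketch)
import pytest
from kedro.io import DataCatalog, MemoryDataSet


@pytest.fixture
def catalog(inventory):
    # Function-scoped, so each test gets a clean catalog holding
    # only the pipeline's free input.
    return DataCatalog({"inventory": MemoryDataSet(inventory)})
```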
The fixture guarantees we are working with a clean instance of the `DataCatalog` in our tests. Finally, we can write an integration test of the Inventory pipeline:
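A sketch of the integration test, feeding the first node's input through the whole pipeline and asserting on the last node's output (again, the import path is a placeholder):

```python
# tests/test_pipeline.py -- integration test for the Inventory pipeline (sketch)
from kedro.runner import SequentialRunner

from my_project.pipelines.inventory.pipeline import create_pipeline  # hypothetical path


def test_inventory_pipeline(catalog):
    pipeline = (
        create_pipeline()
        .from_inputs("inventory")
        .to_outputs("total_equity")
    )
    # run() returns every output that is not registered in the catalog
    output = SequentialRunner().run(pipeline, catalog)
    assert output["total_equity"].collect()[0]["total_equity"] == 250.0
```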
Make sure to add `from_inputs(...)` and `to_outputs(...)` to `create_pipeline()`: if you add nodes to the pipeline later, you'd still like to keep this integration test working.

Let me know what you guys think of this setup 😄
This is actively being worked on 😃