Reduce parallelism and split validation per language

Today, there is one validation workflow per building block (new quickstarts), and each workflow runs a matrix that multiplies programming language by protocol/SDK. This keeps validation from failing because of state left over from a previous scenario, but it triggers too many jobs and exhausts the workers available to other workflows.

Instead, split the workflows per programming language. Within each workflow, use a matrix only for the protocol, and run validation for each building block in a loop inside the workflow. Each building block would still be isolated, but fewer jobs would be triggered.

Summary:

  • Move from one workflow file per building block to one workflow file per language in .github/workflows
  • The matrix covers only the variant: [sdk, http]
  • Building blocks run in sequence in a single step, in a loop that understands the folder structure, so a new building block quickstart added for a language is picked up automatically without code changes:
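# Get the list of building blocks from the folders (or from an env variable in global.env).
for building_block in */; do
  building_block="${building_block%/}"
  # Skip building blocks that have no quickstart for this language and variant.
  [ -d "$building_block/$language/$variant" ] || continue
  # Run in a subshell so the working directory resets for the next building block.
  (cd "$building_block/$language/$variant" && make validate) || exit 1
done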
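
One possible variation on this loop, sketched under assumptions rather than taken from the issue: keep going after a building block fails and report every failure at the end, so one flaky quickstart does not hide the results of the others. The function name `validate_all` and its argument order are hypothetical; the `<building_block>/<language>/<variant>` layout and the `make validate` target come from the proposal above.

```shell
#!/usr/bin/env bash
# Sketch only: validate every building block for one language/variant pair.
# Assumed layout: <root>/<building_block>/<language>/<variant>/Makefile
# validate_all is a hypothetical helper name, not part of the repository.

validate_all() {
  local root="$1" language="$2" variant="$3"
  local failed="" dir block

  for dir in "$root"/*/; do
    block="$(basename "$dir")"
    # Skip building blocks that have no quickstart for this language/variant.
    [ -d "$dir$language/$variant" ] || continue
    echo "Validating $block ($language/$variant)"
    # The subshell keeps the working directory from leaking between blocks.
    if ! (cd "$dir$language/$variant" && make validate); then
      failed="$failed $block"
    fi
  done

  # Report all failures together and signal them through the exit status.
  if [ -n "$failed" ]; then
    echo "Failed building blocks:$failed" >&2
    return 1
  fi
}
```

A workflow step could then call, for example, `validate_all . python sdk`, where `python` and `sdk` stand in for the language of the workflow and the matrix variant. The trade-off is a longer run when something fails early, in exchange for a complete list of failures from a single job.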

Issue Analytics

  • State: closed
  • Created 7 months ago
  • Comments: 9 (9 by maintainers)

Top GitHub Comments

1 reaction
paulyuk commented, Feb 23, 2023

Note @ASHIQUEMD is signing on to drive this. Thank you MD!

1 reaction
paulyuk commented, Feb 8, 2023

I like that there’s still some split, so that:

  • one workflow isn’t too long
  • the isolation prevents one flaky issue from taking down the whole matrix

The split by building block has been good because any new building block we introduce does not affect the others that are in good shape. We’ll add building blocks more often than languages. Also, you sort of have to “test in main” with these things, so I think authoring will get harder. But if it makes daily runtime better, I’m still good with trying something and optimizing that way.

We should definitely solve the problem of too many jobs being triggered.
