Would like to handle jobs in batch

Hi!

Something I was wondering about: could we add batch support for workers? Let's say I have 500 jobs, each of which fetches data from an external API or database. Batching them into a single request would reduce the load on these external resources and could also reduce overall latency. Example:

// Proposed API (does not exist yet)
const zbBatchWorker = zbc.createBatchWorker(
    'test-worker',
    'demo-service',
    {
        batch: 50, // # of jobs per batch
        timeout: 1000, // or 1 sec
    },
    async (jobs, complete) => {
        const ids = jobs.map((j) => j.id);
        // One request for the whole batch instead of one per job.
        // Assumes the endpoint returns a JSON array in the same order as ids.
        const response = await fetch(`someUrl/user?ids=${ids.join(',')}`);
        const users = await response.json();

        for (const index of ids.keys()) {
            const user = users[index];

            try {
                // ...Do some work...

                if (...) {
                    complete.success(index, ...)
                } else {
                    complete.failure(index, ...)
                }
            } catch (error) {
                complete.error(index, ...)
            }
        }

        /*
            When the batch is done, this commits each job
            and calls the right procedure (gRPC) based on
            what was recorded on 'complete'.
        */
        await complete.done();
    }
);

I can pass the number of jobs per batch, or a timeout after which the batch is handled anyway. That said, it could be linked to the maxActiveJobs property.

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 20 (13 by maintainers)

Top GitHub Comments

1 reaction
jbeaudoin11 commented, Mar 2, 2020

I think it’s exactly what I want in terms of functionality. Thanks!

1 reaction
jwulf commented, Mar 2, 2020

Will be out this week in the 0.23.0-alpha.1 release.

Here are the docs from the README:

The ZBBatchWorker Job Worker

The ZBBatchWorker Job Worker batches jobs before calling the job handler. Its fundamental differences from the ZBWorker are:

  • Its job handler receives an array of one or more jobs.
  • The jobs have success, failure, error, and forwarded methods attached to them.
  • The handler is not invoked immediately, but rather when enough jobs are batched, or a job in the batch is at risk of being timed out by the Zeebe broker.

You can use the batch worker if you have tasks that benefit from processing together, but are not related in the BPMN model.

An example would be a high volume of jobs that require calls to an external system, where you have to pay per call to that system. In that case, you may want to batch up jobs, make one call to the external system, then update all the jobs and send them on their way.

The batch worker triggers on whichever is reached first: the batch size or the batch timeout.

You must configure both jobBatchMinSize and jobBatchMaxTime. Whichever condition is met first will trigger the processing of the jobs:

  • Enough jobs are available to the worker to satisfy the minimum job batch size;
  • The batch has been building for the maximum amount of time - “we’re doing this now, before the earliest jobs in the batch time out on the broker”.
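The "whichever comes first" rule above can be illustrated with a minimal sketch. This is not how zeebe-node implements ZBBatchWorker internally; `FirstOfBatcher`, `minSize`, `maxTimeMs`, and `onBatch` are hypothetical names chosen for illustration, and the clock starts when the first item lands in an empty batch.

```typescript
// Hypothetical helper, not part of zeebe-node: collects items and hands them
// to a callback when either the minimum batch size is reached or the maximum
// wait time has elapsed since the first item of the batch arrived.
class FirstOfBatcher<T> {
    private items: T[] = []
    private timer: ReturnType<typeof setTimeout> | undefined = undefined
    private readonly minSize: number
    private readonly maxTimeMs: number
    private readonly onBatch: (batch: T[]) => void

    constructor(minSize: number, maxTimeMs: number, onBatch: (batch: T[]) => void) {
        this.minSize = minSize
        this.maxTimeMs = maxTimeMs
        this.onBatch = onBatch
    }

    add(item: T): void {
        this.items.push(item)
        if (this.items.length >= this.minSize) {
            // Condition 1: enough items are available
            this.flush()
        } else if (this.timer === undefined) {
            // Condition 2: start the clock for this batch
            this.timer = setTimeout(() => this.flush(), this.maxTimeMs)
        }
    }

    flush(): void {
        if (this.timer !== undefined) {
            clearTimeout(this.timer)
            this.timer = undefined
        }
        if (this.items.length === 0) return
        const batch = this.items
        this.items = []
        this.onBatch(batch)
    }
}
```

With `new FirstOfBatcher(10, 60_000, handleBatch)`, the callback would run as soon as ten items have been added, or 60 seconds after the first item of a smaller batch, whichever happens first.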

Be sure to specify a timeout for your worker that is at least jobBatchMaxTime plus the expected latency of the external call plus your processing time and network latency. Otherwise the broker may time out your batch worker’s lock and make the jobs available to another worker, which would defeat the whole purpose.
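As a rough worked example of that sum, the sketch below adds up the components. Only `jobBatchMaxTime` comes from the README; the other field names and all latency figures are assumptions chosen so that the total matches the 80 seconds used in the README example.

```typescript
// All values in seconds. Every field except jobBatchMaxTime is a
// hypothetical estimate you would measure for your own system.
interface BatchTimeoutBudget {
    jobBatchMaxTime: number
    expectedApiLatency: number
    processingTime: number
    networkLatency: number
}

// Smallest worker timeout that still covers the full batch lifecycle
function minimumWorkerTimeout(budget: BatchTimeoutBudget): number {
    return (
        budget.jobBatchMaxTime +
        budget.expectedApiLatency +
        budget.processingTime +
        budget.networkLatency
    )
}

const timeout = minimumWorkerTimeout({
    jobBatchMaxTime: 60,
    expectedApiLatency: 10,
    processingTime: 5,
    networkLatency: 5,
})
console.log(timeout) // 80
```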

Here is an example of using the ZBBatchWorker:

import { API } from './lib/my-awesome-external-api'
// ZBBatchWorker is assumed to be exported alongside BatchedJob
import { ZBClient, BatchedJob, ZBBatchWorker } from 'zeebe-node'

const zbc = new ZBClient()

// Helper function to find a job by its key (undefined if there is no match)
const findJobByKey = jobs => key => jobs.find(job => job.jobKey === key)

const handler = async (jobs: BatchedJob[], worker: ZBBatchWorker) => {
    worker.log("Let's do this!")
    // Construct some hypothetical payload with correlation ids and requests
    const req = jobs.map(({ jobKey, variables }) => ({ id: jobKey, data: variables.request }))
    // An uncaught exception will not be managed by the library
    try {
        // Our API wrapper turns that into a request, and returns
        // an array of results with ids
        const outcomes = await API.post(req)
        // Construct a find function for these jobs
        const getJob = findJobByKey(jobs)
        // Iterate over the results and call the success method on the corresponding job,
        // passing in the correlated outcome of the API call
        outcomes.forEach(res => getJob(res.id)?.success(res.data))
    } catch (e) {
        // The single external call failed, so fail every job in the batch
        jobs.forEach(job => job.failure(e.message))
    }
}

const batchWorker = zbc.createBatchWorker({
    taskType: 'get-data-from-external-api',
    taskHandler: handler,
    jobBatchMinSize: 10, // at least 10 at a time
    jobBatchMaxTime: 60, // or every 60 seconds, whichever comes first
    timeout: 80 // 80 second timeout leaves at least 20 seconds to process a batch
})

Would like to handle jobs in batch · Issue #134 - GitHub