Feature Request: ForEach-Object BatchSize Parameter

See original GitHub issue

Summary of the new feature/enhancement

As a user, I want my ForEach-Object -Parallel jobs to be evenly distributed so that they all finish at about the same time, instead of all but one finishing early while the last one keeps running for an excessive amount of extra time.

Example use case

  1. This one works, but it creates a thread for each and every host, and because of the “initialization part” (loading the module and connecting to the server) it becomes quite bottlenecked.
$ansibleInventory["all"]["children"].GetEnumerator() |  # List of hosts grouped by tenant
  Where-Object {$null -ne $_.Value["hosts"]} |  # Filter out all tenants without hosts
  ForEach-Object {$_.Value["hosts"].GetEnumerator()} |  # Get a list of all host objects
  ForEach-Object -ThrottleLimit 20 -UseNewRunspace -AsJob -Parallel {
  Import-Module "VMware.PowerCLI"
  Connect-ViServer ...
  Do-Stuff
}
  2. Iterate over each tenant in parallel and, within that loop, iterate over each customer VM sequentially. This, however, results in an uneven distribution. For example, customers 0-9 have 100 VMs each, but customer 10 has 1000 VMs, so the total run time is dominated by that one customer. We’re now basically back at single-threaded performance.
$ansibleInventory["all"]["children"].GetEnumerator() |  # List of hosts grouped by tenant
  Where-Object {$null -ne $_.Value["hosts"]} |  # Filter out all tenants without hosts
  ForEach-Object -ThrottleLimit 20 -UseNewRunspace -AsJob -Parallel {
  $_.Value["hosts"].GetEnumerator() |
    ForEach-Object {
      Import-Module "VMware.PowerCLI"
      Connect-ViServer ...
      Do-Stuff
    }
}
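
To make the imbalance concrete, here is a back-of-the-envelope calculation for the numbers in the second example. It is only a rough sketch: it assumes every VM takes one identical unit of time, ignores the initialization cost, and uses illustrative variable names that are not part of the original script.

```powershell
# Workload from the example: tenants 0-9 have 100 VMs each, tenant 10 has 1000 VMs.
# Assumption: each VM costs exactly one time unit.
$vmCounts = @(100) * 10 + 1000
$throttleLimit = 20

$totalVms = ($vmCounts | Measure-Object -Sum).Sum

# All 11 tenants fit within the throttle limit, so they run concurrently and
# the wall-clock time is set by the largest tenant.
$perTenantWallTime = ($vmCounts | Measure-Object -Maximum).Maximum

# Lower bound if the VMs were spread evenly across all workers.
$idealWallTime = [math]::Ceiling($totalVms / $throttleLimit)

"Total VMs: $totalVms, per-tenant wall time: $perTenantWallTime, ideal: $idealWallTime"
```

Under these assumptions the per-tenant approach needs about 1000 time units, against roughly 100 for a perfectly even split across 20 workers.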

Proposed technical implementation details (optional)

Either add a Batch-Object -Size command, or add -BatchSize as an additional parameter to ForEach-Object:

$ansibleInventory["all"]["children"].GetEnumerator() `
| Where-Object {$null -ne $_.Value["hosts"]} `
| ForEach-Object {$_.Value["hosts"].GetEnumerator()} `
| ForEach-Object -Throttlelimit 20 -BatchSize 20 -UseNewRunspace -AsJob -Parallel {
  Import-Module "VMware.PowerCLI"
  Connect-ViServer ...
  Do-Stuff
}

or

$ansibleInventory["all"]["children"].GetEnumerator() `
| Where-Object {$null -ne $_.Value["hosts"]} `
| ForEach-Object {$_.Value["hosts"].GetEnumerator()} `
| Batch-Object -Size 20 `
| ForEach-Object -Throttlelimit 20 -UseNewRunspace -AsJob -Parallel {
  $_ | ForEach-Object {
    Import-Module "VMware.PowerCLI"
    Connect-ViServer ...
    Do-Stuff
  }
}
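
Until something like this exists, the batching step can be approximated in user code. The following is a minimal sketch, not the proposed cmdlet: `Split-IntoBatch` is a hypothetical helper name, and it simply groups pipeline input into arrays of at most `-Size` items.

```powershell
# Hypothetical helper (not a built-in cmdlet): collects pipeline input and
# emits it as arrays containing at most $Size items each.
function Split-IntoBatch {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [object] $InputObject,

        [Parameter(Mandatory)]
        [ValidateRange(1, [int]::MaxValue)]
        [int] $Size
    )
    begin {
        $batch = [System.Collections.Generic.List[object]]::new()
    }
    process {
        $batch.Add($InputObject)
        if ($batch.Count -ge $Size) {
            # The unary comma wraps the array so it travels down the
            # pipeline as one object instead of being unrolled.
            , $batch.ToArray()
            $batch.Clear()
        }
    }
    end {
        # Emit the final, possibly smaller, batch.
        if ($batch.Count -gt 0) {
            , $batch.ToArray()
        }
    }
}

# Seven items in batches of three: two full batches and one remainder.
$batches = 1..7 | Split-IntoBatch -Size 3
```

Piped into `ForEach-Object -Parallel`, each iteration would then receive one whole batch in `$_`, so the module import and server connection could run once per batch rather than once per host.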

Issue Analytics

  • State: open
  • Created 3 years ago
  • Comments: 7 (2 by maintainers)

Top GitHub Comments

1 reaction
MartinGC94 commented, Dec 30, 2020

Not to shill myself too much, but I recently published my SplitCollection module to the PS gallery. It splits arrays into smaller arrays like this:

PS C:\> $DemoArray=1..20
PS C:\> $SplitArray=Split-Collection -InputObject $DemoArray -ChunkSize 2
PS C:\> $SplitArray.Count
10
PS C:\> $SplitArray[0].Count
2
PS C:\> $SplitArray=Split-Collection -InputObject $DemoArray -AmountOfParts 2
PS C:\> $SplitArray.Count
2
PS C:\> $SplitArray[0].Count
10

I think this should work in your scenario, OP. For anyone curious, the source code is available here: https://github.com/MartinGC94/SplitCollection/blob/master/SplitCollection/SplitCollectionCommand.cs

1 reaction
mklement0 commented, Dec 30, 2020

If I understand your intent correctly, then a general-purpose batching mechanism may solve your problem. Such a mechanism is proposed in #8270.

Also, can you please update the OP to provide a syntax-highlighting hint for the code blocks? Currently, the code is hard to read; use ```powershell
