Add EverypipeTimeoutSecond to ForEach-Object -Parallel
Summary of the new feature / enhancement
Context:
Q: How do you limit the time for each pipeline item in ForEach-Object -Parallel? Apparently, -TimeoutSeconds can’t do that.
It can be understood like this: without the -ThrottleLimit parameter, -TimeoutSeconds behaves roughly as expected. But once -ThrottleLimit is added, -TimeoutSeconds can no longer bound the runtime of each parallel script block: **combining -ThrottleLimit with -TimeoutSeconds causes items queued behind the throttle to time out without ever starting execution.**
So we need an -EverypipeTimeoutSecond parameter whose clock starts when each parallel script block begins running.
-TimeoutSeconds can’t do that.
-TimeoutSeconds limits the total time for all pipeline items. When there are many items and their execution times vary, a limit on total time is a poor fit. Both the manual and actual tests demonstrate this:
# With -ThrottleLimit 1 the items run one at a time, so the 2-second
# -TimeoutSeconds budget expires after roughly the first two items;
# items 3..10 time out without ever starting.
1..10 | ForEach-Object -ThrottleLimit 1 -TimeoutSeconds 2 -Parallel {
    Start-Sleep -Seconds 1
    $_
}
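As a workaround until something like -EverypipeTimeoutSecond exists, a per-item timeout can be approximated by wrapping each item’s work in a thread job and waiting on it with a budget. This is a hedged sketch, not the proposed parameter: it assumes the ThreadJob module bundled with PowerShell 7 is available in each parallel runspace, and the 2-second budget and Start-Sleep workload are illustrative.

1..10 | ForEach-Object -ThrottleLimit 5 -Parallel {
    $item = $_
    # Illustrative workload: item $i takes $i seconds.
    $job = Start-ThreadJob -ScriptBlock {
        param($i)
        Start-Sleep -Seconds $i
        $i
    } -ArgumentList $item
    if (Wait-Job -Job $job -Timeout 2) {
        Receive-Job -Job $job              # finished within its 2-second budget
    }
    else {
        Stop-Job -Job $job                 # overran: stop it so the slot is freed
        Write-Warning "Item $item exceeded its 2-second budget and was stopped."
    }
    Remove-Job -Job $job -Force
}

Here the clock starts when each script block begins, which is the semantics the proposal asks for; the cost is one extra thread per in-flight item.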
Proposed technical implementation details (optional)
No response
@dkaszews I can see a wish for both. 100 items to process, average time 1 second, 10 threads, more than 10 cores to run them on: the whole process should run in 10 seconds. We might want to stop a single item that is still running after 10 seconds. Or we might also want to stop the whole set if it has not completed within 2 minutes.
The second seems like a terminating error: “We stopped without processing everything.” The first would be a way to fail sooner while re-using the thread to process another item (maybe with a warning), but that strikes me as wrong - my instinct says something should either process everything or throw an error; it should not silently drop items. But I can see someone saying “My output tells me what has been processed; the commands I’m running in parallel can’t change their timeouts, so let me set one in the ‘command-runner’.”
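A sketch of the two timeouts side by side, using the numbers above. The per-item wrapper is the same hypothetical thread-job workaround as earlier, and whether -TimeoutSeconds surfaces as a terminating error is the commenter’s reading, not something this sketch verifies.

1..100 | ForEach-Object -ThrottleLimit 10 -TimeoutSeconds 120 -Parallel {
    # -TimeoutSeconds 120 caps the whole set; Wait-Job -Timeout 10 caps this item.
    $job = Start-ThreadJob -ScriptBlock { param($i) Start-Sleep -Seconds 1; $i } -ArgumentList $_
    if (Wait-Job -Job $job -Timeout 10) { Receive-Job -Job $job }
    else { Stop-Job -Job $job; Write-Warning "Item $_ was dropped after 10 seconds." }
    Remove-Job -Job $job -Force
}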
In my company we have hundreds of scripts that do parallel processing against many machines every day.
Prior to moving to PowerShell 7 we had to manage the jobs manually, which made the scripts about an order of magnitude more complicated than they should be. For us, ForEach-Object is perfect and we enjoy it very much.
The only caveat is that we randomly have sweeps that never return. They go through their whole process using about 500 threads, and each process finishes and shows its output, but the ForEach-Object pipeline never returns.
I’m not clear on why this happens, but a timeout for each process would be fantastic. It’s hard to predict the full time a sweep will take for us (typically 10 to 60 minutes); however, it is very easy to determine the maximum time a single process should take, normally no more than 10 to 30 seconds depending on the script. Anything beyond that means it is stuck somehow and we need to stop it and free up the pool.
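A hedged sketch of the per-process watchdog described here; $machines and scan.exe are hypothetical stand-ins, and the 30-second budget matches the upper bound above:

$machines | ForEach-Object -ThrottleLimit 500 -Parallel {
    # Launch the external tool for this machine and keep a handle to it.
    $proc = Start-Process -FilePath 'scan.exe' -ArgumentList $_ -PassThru -NoNewWindow
    # WaitForExit takes milliseconds and returns $false on timeout.
    if (-not $proc.WaitForExit(30000)) {
        $proc.Kill()   # free the pool slot rather than hanging the whole sweep
        Write-Warning "$_ did not finish within 30 seconds and was killed."
    }
}

As the comment itself notes below, a process wedged in kernel I/O may resist Kill(), so this mitigates rather than eliminates the stuck-sweep problem.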
Some suggest that ForEach-Object is not well suited for such long-running processes, but I don’t think that should be a scapegoat for not addressing the issues around ForEach-Object, which I think is one of the best features of PowerShell. I understand if at a certain load it is not performant enough; however, it shouldn’t just crash or get stuck because you are using too many threads and too much memory. If the demand is more than it can handle, it should have a graceful way of reporting it.
If we can make it so each thread has a timeout limit, and we can improve the termination of the processes it runs, it would be a tremendous win for PowerShell. I do realize that processes getting stuck on I/O that can’t be terminated is a Windows thing, but I’m hoping the PowerShell team can help bridge the gap with the Windows team and resolve this. Otherwise, who can?