Local Scheduler with large number of worker threads


The task is to run processor.Process on 100 parallel threads. processor.Process is a non-async, IO (network) bound method that belongs to a third-party API.

let threadCount = 100

let scheduler =
    Scheduler.create
        { Foreground = None
          IdleHandler = None
          MaxStackSize = None
          NumWorkers = Some threadCount
          TopLevelHandler = 
            Some ^ fun e -> job { Console.WriteLine e }}

using (new ObjectPool<_>((fun _ -> new KlDfs(settings)), capacity = 200u)) <| fun pool -> 
    File.ReadLines fileName
    |> Stream.ofSeq
    |> Stream.chooseFun parser
    |> Stream.afterEach (timeOut afterEachTimeout)
    |> Stream.mapPipelinedJob threadCount (fun (key, input) -> 
        job {
            let! res = 
                pool.WithInstanceJobChoice <| fun fs -> 
                    job { return processor.Process (key, input) fs }
            return key, res
        })
    |> Stream.foldFun (fun total (key, res) ->
          Interlocked.Increment count |> ignore
          if !count % 100 = 0 then printfn "..."
        )
    |> Scheduler.run scheduler
    |> ignore

It seems to work, but when I look at the threads with dotTrace, it shows only a small number of worker threads (equal to the logical processor count, 16):

[image]

Issue Analytics

  • State:closed
  • Created 7 years ago
  • Comments:14 (12 by maintainers)

Top GitHub Comments

polytypic commented, Sep 25, 2016 (3 reactions)

FYI, I made a new release with some potential fixes.

Now all the Streams benchmarks seem to run without leaks on .NET Core/Framework. I tested this by replacing Stream.iter with a version that periodically shows GC.GetTotalMemory true. Note that streams allocate heavily and this can make the runtime grow the heap aggressively to avoid spending time in GC.

One thing I found out is that it seems that Mono and .NET Core/Framework behave differently in terms of space safety: some programs (unexpectedly) leak under Mono, but do not leak (as expected) under .NET Core/Framework. I have not investigated this in enough detail to find out the exact reason, but I suspect differences in tail call behavior.

slav commented, Nov 14, 2016 (1 reaction)

Please correct me if I’m wrong, but here’s how I understand Hopac works:

  1. When an app begins to use Hopac, a number of worker threads equal to the number of cores (the default) is created. Usually all Hopac jobs are executed on those threads, but a job can be executed on the local thread if it is started via start. In those cases the job eventually migrates to one of the Hopac worker threads (for example when the job begins to wait for input from a channel, but there are other circumstances when that can happen). It’s also possible to push a job to a Hopac thread via switchToWorker.
  2. When a job is to be executed on a Hopac thread, the scheduler puts it in a queue of jobs (stash). There’s a separate stash for each thread.
  3. Each job executes to completion. While it’s running on a Hopac worker thread, no other job can be launched on that thread. In effect the job blocks the thread. Hence the need for jobs to be small and cooperative, so that they free the thread up and allow other jobs to execute.
  4. isolate simply moves the jobs stuck behind the current job to a global stack to be executed on other Hopac worker threads. It simply “reshuffles” jobs, but if there are multiple long-running blocking jobs, they’ll take over all Hopac worker threads and no other regular jobs will execute, in which case I need to use onThreadPool.
  5. Does the scheduler take jobs from the global stack and put them back into the per-worker-thread stash? Or did I get the concept of stash wrong?
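
The effect described in points 3 and 4 can be reproduced without Hopac. The following is a minimal sketch that uses only plain .NET threads; it is a simplified model, not Hopac's actual scheduler. The names runOnFixedWorkers, blockingCall and elapsedMs are hypothetical, and Thread.Sleep stands in for a blocking network call such as processor.Process. With a fixed number of workers that each run one job to completion, the total time is governed by the worker count, however many jobs are queued.

```fsharp
open System.Collections.Concurrent
open System.Diagnostics
open System.Threading

/// Runs every job on one of `workerCount` dedicated threads. Each thread
/// takes a job from the shared queue and runs it to completion before
/// taking the next one (a simplified model, not Hopac's scheduler).
let runOnFixedWorkers (workerCount: int) (jobs: seq<unit -> unit>) =
    let queue = ConcurrentQueue<unit -> unit>(jobs)
    let rec drain () =
        match queue.TryDequeue() with
        | true, job ->
            job ()
            drain ()
        | false, _ -> ()
    let workers =
        [ for _ in 1 .. workerCount -> Thread(ThreadStart(fun () -> drain ())) ]
    workers |> List.iter (fun t -> t.Start())
    workers |> List.iter (fun t -> t.Join())

/// Hypothetical stand-in for a blocking, IO-bound call.
let blockingCall () = Thread.Sleep(200)

/// Wall-clock time needed to run `jobCount` blocking calls on `workerCount` threads.
let elapsedMs (workerCount: int) (jobCount: int) =
    let sw = Stopwatch.StartNew()
    runOnFixedWorkers workerCount (Seq.init jobCount (fun _ -> blockingCall))
    sw.ElapsedMilliseconds

printfn "4 workers, 16 jobs:  %d ms" (elapsedMs 4 16)
printfn "16 workers, 16 jobs: %d ms" (elapsedMs 16 16)
```

The first run should take roughly four times as long as the second, because each worker has to get through about four blocking calls in sequence. This is the same reason blocking jobs need either more threads or a different pool than the cooperative worker threads.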
