Issue in parallel execution

I was trying to concurrently update a HashMap using ParSeq's Task.par(...), expecting that race conditions would produce an anomaly. However, I ran the parallel task multiple times and no race condition ever occurred. ParSeq also ran considerably (20x) slower than ExecutorService. This suggests that either the parallel tasks are not actually running in parallel, or I am using the ParSeq API incorrectly.

Consider the following comparison between ParSeq's Task.par() and a raw ExecutorService:

import com.linkedin.parseq.Engine;
import com.linkedin.parseq.EngineBuilder;
import com.linkedin.parseq.ParTask;
import com.linkedin.parseq.Task;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class TestingParseqParallelTasks {
  private final static int expectedCount = 5000000;

  private static ExecutorService getExecutorService(){
    int numCores = Runtime.getRuntime().availableProcessors(); // 16
    return Executors.newFixedThreadPool(numCores + 1);
  }

  private static Engine getEngine() {
    final ExecutorService taskScheduler = getExecutorService();
    final ScheduledExecutorService timerScheduler = Executors.newSingleThreadScheduledExecutor();
    return new EngineBuilder().setTaskExecutor(taskScheduler).setTimerScheduler(timerScheduler).build();
  }

  private static ParTask<Void> parallelUpdateTask(Map<String, Integer> map, int count) throws InterruptedException {
    List<Task<Void>> tasks = new ArrayList<>();
    for (int i = 0; i < count; i++) {
      tasks.add(Task.action(() -> map.put("abcd", map.get("abcd") + 1)));
    }
    return Task.par(tasks);
  }

  public static int testParseq() throws Exception {
    Map<String, Integer> map = new HashMap<>();
    map.put("abcd", 0);
    Engine engine = getEngine();
    Task<List<Void>> t1 = parallelUpdateTask(map, expectedCount);
    engine.run(t1);
    t1.await();
    return map.get("abcd");
  }

  public static int testExecutorService() throws Exception {
    Map<String, Integer> map = new HashMap<>();
    map.put("abcd", 0);
    ExecutorService executor = getExecutorService();
    for (int i = 0; i < expectedCount; i++) {
      executor.execute(() -> map.put("abcd", map.get("abcd") + 1));
    }
    executor.shutdown();
    boolean done = executor.awaitTermination(60, TimeUnit.SECONDS);
    if(!done){
      throw new Exception("Taking too much time!");
    }
    return map.get("abcd");
  }

  public static void main(String[] args) throws Exception {
    long before, after;
    before = System.nanoTime();
    final int observedCountFromExecutor = testExecutorService();
    after = System.nanoTime();
    System.out.println(String.format("Executor returned in %s nanoseconds and returned %s", (after-before), observedCountFromExecutor));
    // This runs way faster and is always less than expectedCount

    before = System.nanoTime();
    final int observedCountFromParseq = testParseq();
    after = System.nanoTime();
    System.out.println(String.format("Parseq returned in %s nanoseconds and returned %s", (after-before), observedCountFromParseq));
    // This runs considerably slower and is always equal to expectedCount

    System.exit(0);
    
    /*
    Output:
        Executor returned in 1616721786 nanoseconds and returned 4980309
        Parseq returned in   21630724933 nanoseconds and returned 5000000
    */
  }
}
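
For comparison, here is a minimal sketch (not part of the original issue) of how the ExecutorService variant could avoid the lost updates. It assumes the shortfall in the Executor count comes from the unsynchronized get-then-put on a plain HashMap, and swaps in ConcurrentHashMap.merge, which performs the read-modify-write atomically. The class and method names are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class AtomicCounterSketch {
  /** Increments one key `count` times from a thread pool and returns the final value. */
  static int countWithMerge(int count) {
    Map<String, Integer> map = new ConcurrentHashMap<>();
    map.put("abcd", 0);
    int numCores = Runtime.getRuntime().availableProcessors();
    ExecutorService executor = Executors.newFixedThreadPool(numCores + 1);
    for (int i = 0; i < count; i++) {
      // merge() does the read-modify-write as one atomic step,
      // unlike map.put("abcd", map.get("abcd") + 1) on a HashMap.
      executor.execute(() -> map.merge("abcd", 1, Integer::sum));
    }
    executor.shutdown();
    try {
      if (!executor.awaitTermination(60, TimeUnit.SECONDS)) {
        throw new IllegalStateException("Taking too much time!");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return map.get("abcd");
  }

  public static void main(String[] args) {
    System.out.println(countWithMerge(100_000));
  }
}
```

With this change the returned count should always equal the number of submitted increments. The cost is contention on a single key, so this is a correctness illustration rather than a throughput benchmark.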

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 8 (8 by maintainers)

Top GitHub Comments

junchuanwang commented, Sep 2, 2020

@Anmol-Singh-Jaggi

Hi Anmol, that is not the best analogy, because you can read a file either synchronously or asynchronously.

If you read it synchronously, it is a blocking operation, just like Thread.sleep(). In ParSeq you can wrap such an operation with Task.blocking() so that it runs asynchronously.

If you use an asynchronous library to read the file directly, you can check this section to see how to integrate it with a ParSeq task. In that case the read is asynchronous, which is optimal.

As for Thread.sleep(): if you don't use Task.blocking() to run it in ParSeq, it is treated as a synchronous operation and will block the task engine from picking up other tasks.
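
The effect described above can be sketched with the JDK alone. This is only an analogy and does not use or reproduce ParSeq's API: a hypothetical single-threaded executor stands in for the task engine, and a second pool stands in for the executor that blocking work would be handed to. All names here are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class BlockingOffloadSketch {
  // Stand-in for any synchronous, blocking operation (Thread.sleep, a blocking file read, ...).
  private static void blockingCall() {
    try {
      Thread.sleep(200);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  /** Runs one slow and one fast task and returns the order in which they finished. */
  static List<String> completionOrder(boolean offload) {
    // Hypothetical single-threaded "engine"; this is NOT how ParSeq is implemented.
    ExecutorService engine = Executors.newSingleThreadExecutor();
    // Separate pool reserved for blocking work.
    ExecutorService blockingPool = Executors.newSingleThreadExecutor();
    List<String> order = new CopyOnWriteArrayList<>();
    try {
      Runnable slow = () -> { blockingCall(); order.add("slow"); };
      Runnable fast = () -> order.add("fast");
      // Either block the engine thread itself, or hand the blocking call to the other pool.
      Future<?> slowDone = (offload ? blockingPool : engine).submit(slow);
      Future<?> fastDone = engine.submit(fast);
      slowDone.get();
      fastDone.get();
      return new ArrayList<>(order);
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException(e);
    } finally {
      engine.shutdown();
      blockingPool.shutdown();
    }
  }

  public static void main(String[] args) {
    System.out.println("blocking on the engine thread: " + completionOrder(false));
    System.out.println("offloaded to a separate pool:  " + completionOrder(true));
  }
}
```

When the blocking call runs on the engine thread, the fast task has to wait behind it. When it is offloaded, the engine thread stays free and the fast task should finish first.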

Anmol-Singh-Jaggi commented, Sep 3, 2020

Thanks for the reply, @junchuanwang. Closing the issue for now.
