Issue in parallel execution


I was trying to concurrently update a HashMap using ParSeq's Task.par(...), expecting that race conditions would produce an anomaly (lost updates). However, I ran the parallel task multiple times and no race condition ever happened. Also, ParSeq ran considerably (about 20x) slower than ExecutorService. This seems to suggest that either the parallel tasks are not actually running in parallel, or I am using the ParSeq API incorrectly.

Consider the following comparison between ParSeq's Task.par() and a raw ExecutorService:

import com.linkedin.parseq.Engine;
import com.linkedin.parseq.EngineBuilder;
import com.linkedin.parseq.ParTask;
import com.linkedin.parseq.Task;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class TestingParseqParallelTasks {
  private final static int expectedCount = 5000000;

  private static ExecutorService getExecutorService(){
    int numCores = Runtime.getRuntime().availableProcessors(); // 16
    return Executors.newFixedThreadPool(numCores + 1);
  }

  private static Engine getEngine() {
    final ExecutorService taskScheduler = getExecutorService();
    final ScheduledExecutorService timerScheduler = Executors.newSingleThreadScheduledExecutor();
    return new EngineBuilder().setTaskExecutor(taskScheduler).setTimerScheduler(timerScheduler).build();
  }

  private static ParTask<Void> parallelUpdateTask(Map<String, Integer> map, int count) throws InterruptedException {
    List<Task<Void>> tasks = new ArrayList<>();
    for (int i = 0; i < count; i++) {
      tasks.add(Task.action(() -> map.put("abcd", map.get("abcd") + 1)));
    }
    return Task.par(tasks);
  }

  public static int testParseq() throws Exception {
    Map<String, Integer> map = new HashMap<>();
    map.put("abcd", 0);
    Engine engine = getEngine();
    Task<List<Void>> t1 = parallelUpdateTask(map, expectedCount);
    engine.run(t1);
    t1.await();
    return map.get("abcd");
  }

  public static int testExecutorService() throws Exception {
    Map<String, Integer> map = new HashMap<>();
    map.put("abcd", 0);
    ExecutorService executor = getExecutorService();
    for (int i = 0; i < expectedCount; i++) {
      executor.execute(() -> map.put("abcd", map.get("abcd") + 1));
    }
    executor.shutdown();
    boolean done = executor.awaitTermination(60, TimeUnit.SECONDS);
    if(!done){
      throw new Exception("Taking too much time!");
    }
    return map.get("abcd");
  }

  public static void main(String[] args) throws Exception {
    long before, after;
    before = System.nanoTime();
    final int observedCountFromExecutor = testExecutorService();
    after = System.nanoTime();
    System.out.println(String.format("Executor returned in %s nanoseconds and returned %s", (after-before), observedCountFromExecutor));
    // This runs way faster and is always less than expectedCount

    before = System.nanoTime();
    final int observedCountFromParseq = testParseq();
    after = System.nanoTime();
    System.out.println(String.format("Parseq returned in %s nanoseconds and returned %s", (after-before), observedCountFromParseq));
    // This runs considerably slower and is always equal to expectedCount

    System.exit(0);
    
    /*
    Output:
        Executor returned in 1616721786 nanoseconds and returned 4980309
        Parseq returned in   21630724933 nanoseconds and returned 5000000
    */
  }
}
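
The difference between the two results can be reproduced without ParSeq. The sketch below uses only java.util.concurrent, and the class name LostUpdateSketch and its run helper are hypothetical names introduced for illustration. It runs the same read-modify-write on a multi-threaded pool (where updates can be lost), on a single thread (where they cannot), and with ConcurrentHashMap.merge (which makes each increment atomic). The single-thread case is only an analogy for the ParSeq result above, not a description of how ParSeq is implemented.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class LostUpdateSketch {
  // Runs `count` increments of key "abcd" on a pool of `threads` threads
  // and returns the final value stored in the map.
  static int run(Map<String, Integer> map, int threads, int count, boolean atomic) {
    map.put("abcd", 0);
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    for (int i = 0; i < count; i++) {
      if (atomic) {
        // Atomic on a ConcurrentHashMap: read, add, and write happen as one step.
        pool.execute(() -> map.merge("abcd", 1, Integer::sum));
      } else {
        // Same non-atomic get-then-put as in the code above.
        pool.execute(() -> map.put("abcd", map.get("abcd") + 1));
      }
    }
    pool.shutdown();
    try {
      if (!pool.awaitTermination(60, TimeUnit.SECONDS)) {
        throw new IllegalStateException("Timed out waiting for tasks");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while waiting for tasks", e);
    }
    return map.get("abcd");
  }

  public static void main(String[] args) {
    int count = 200_000;
    // May print less than `count`: concurrent get-then-put can lose updates.
    System.out.println("HashMap, 8 threads: " + run(new HashMap<>(), 8, count, false));
    // Always `count`: one thread runs the increments one after another.
    System.out.println("HashMap, 1 thread: " + run(new HashMap<>(), 1, count, false));
    // Always `count`: merge is atomic on ConcurrentHashMap.
    System.out.println("ConcurrentHashMap.merge, 8 threads: " + run(new ConcurrentHashMap<>(), 8, count, true));
  }
}
```

If the count has to be exact under real concurrency, the merge variant (or an AtomicInteger) is the usual choice; the plain HashMap version is only safe when something else guarantees that the updates never overlap.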

Issue Analytics

Created 3 years ago
Comments: 8 (8 by maintainers)

Top GitHub Comments

junchuanwang commented, Sep 2, 2020

@Anmol-Singh-Jaggi

Hi Anmol, it is not the best analogy, because you can read a file either synchronously or asynchronously.

If you read it synchronously, it is just like any other blocking operation; in that case it is the same as Thread.sleep(). In ParSeq you can wrap such a call in Task.blocking() so that it runs asynchronously.

If you use an asynchronous library to read the file directly, you can check this section to integrate it with a ParSeq task. In that case the operation is asynchronous, which is optimal.

As for Thread.sleep(): if you don't use Task.blocking() to run it in ParSeq, it is actually considered a synchronous operation and will block the task engine from picking up other tasks.
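
The cost of running blocking calls on the thread that is supposed to pick up other tasks can be seen in a small timing sketch. It uses plain java.util.concurrent rather than ParSeq, and BlockingOffloadSketch, blockingCall, and elapsedMillis are hypothetical names. A pool of one thread stands in for an engine that is blocked by each Thread.sleep(); a pool with one thread per call stands in for handing the blocking work to a dedicated executor, which is roughly the role Task.blocking() plays in the comment above. Treat it as an analogy under those assumptions, not as ParSeq's actual scheduling.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class BlockingOffloadSketch {
  static final int CALLS = 4;
  static final long SLEEP_MS = 50;

  // Stand-in for any synchronous, blocking operation (file read, Thread.sleep, ...).
  static void blockingCall() {
    try {
      Thread.sleep(SLEEP_MS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  // Runs CALLS blocking calls on a pool of `threads` threads and
  // returns the wall-clock time in milliseconds until all have finished.
  static long elapsedMillis(int threads) {
    ExecutorService executor = Executors.newFixedThreadPool(threads);
    try {
      long start = System.nanoTime();
      CompletableFuture<?>[] pending = new CompletableFuture<?>[CALLS];
      for (int i = 0; i < CALLS; i++) {
        pending[i] = CompletableFuture.runAsync(BlockingOffloadSketch::blockingCall, executor);
      }
      CompletableFuture.allOf(pending).join();
      return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    } finally {
      executor.shutdown();
    }
  }

  public static void main(String[] args) {
    // One thread: each sleep blocks the next call, so the times add up.
    System.out.println("1 thread: " + elapsedMillis(1) + " ms");
    // One thread per call: the sleeps overlap.
    System.out.println(CALLS + " threads: " + elapsedMillis(CALLS) + " ms");
  }
}
```

With one thread the elapsed time grows with the number of calls, while with a thread per call it stays close to a single sleep. The same reasoning is why blocking work is usually given its own executor instead of sharing the threads that drive everything else.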

Anmol-Singh-Jaggi commented, Sep 3, 2020

Thanks for the reply, @junchuanwang. Closing the issue for now.