Adding asynchronous execution to TaskSchedule

Currently, the TaskSchedule API contains only blocking versions of the execute methods:

void execute();
void execute(GridTask gridTask);
void executeWithProfiler(Policy policy);
void executeWithProfilerSequential(Policy policy);
void executeWithProfilerSequentialGlobal(Policy policy);

All of these methods block the calling Java thread until all computations are done. However, the computations are typically offloaded to the GPU, so the CPU thread just sits waiting in the meantime.

I think it should be both beneficial and possible to add asynchronous versions of the same methods with the following signatures:

CompletableFuture executeAsync();
CompletableFuture executeAsync(GridTask gridTask);
CompletableFuture executeWithProfilerAsync(Policy policy);
CompletableFuture executeWithProfilerSequentialAsync(Policy policy); // Not sure about "sequential async"
CompletableFuture executeWithProfilerSequentialGlobalAsync(Policy policy); // Not sure about "sequential async"
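
To illustrate what a caller would gain, here is a minimal sketch. BlockingScheduleStub is a hypothetical stand-in, not the real TaskSchedule, and the CompletableFuture<Void> result type is an assumption (the signatures above use the raw type). The executeAsync() adapter in the stub is deliberately naive: it only moves the blocking wait onto a pool thread, which is exactly what a callback-driven implementation would avoid.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for TaskSchedule; only the blocking execute() mirrors the issue text.
class BlockingScheduleStub {
    final AtomicInteger runs = new AtomicInteger();

    // Simulates a blocking device computation with a short sleep.
    void execute() {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        runs.incrementAndGet();
    }

    // Naive adapter (assumption): still parks a pool thread for the whole computation.
    CompletableFuture<Void> executeAsync() {
        return CompletableFuture.runAsync(this::execute);
    }
}

class CallerSketch {
    // Starts the "device" work, does unrelated CPU work, then waits for completion.
    static int runOverlapped(BlockingScheduleStub schedule) {
        CompletableFuture<Void> done = schedule.executeAsync();
        int cpuWork = 0;
        for (int i = 1; i <= 100; i++) {
            cpuWork += i; // runs while the simulated device work is in flight
        }
        done.join(); // wait only at the point where the result is needed
        return cpuWork + schedule.runs.get();
    }
}
```

The caller thread stays free between executeAsync() and join(), which is the benefit the proposal is after.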

Thoughts about implementation

As I understand it, TaskSchedule delegates to TornadoTaskSchedule.

That object in turn waits on a driver-specific Event object. Each driver has its own class: CLEvent and PTXEvent.

OpenCL provides clSetEventCallback and CUDA has cudaLaunchHostFunc, so it’s possible to get asynchronous notifications from both the OpenCL and PTX drivers.

So it should be possible to extend CLEvent and PTXEvent (plus PTXStream) with some form of listeners, so that a concrete listener inside TornadoTaskSchedule can complete the CompletableFuture returned from the proposed TaskSchedule.executeAsync().
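
A rough sketch of the listener idea, under stated assumptions: SketchEvent, CompletionListener and AsyncExecuteSketch are hypothetical names, not TornadoVM classes, and the convention that status 0 means success is also an assumption. In a real driver, fire(...) would be called from the native callback registered through clSetEventCallback or cudaLaunchHostFunc.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical listener type; status 0 is assumed to mean success.
interface CompletionListener {
    void onComplete(int status);
}

// Hypothetical stand-in for a driver event such as CLEvent or PTXEvent.
class SketchEvent {
    private final List<CompletionListener> listeners = new ArrayList<>();
    private Integer status; // null until the event has fired

    synchronized void addListener(CompletionListener listener) {
        if (status != null) {
            listener.onComplete(status); // already fired: notify immediately
        } else {
            listeners.add(listener);
        }
    }

    // Would be invoked from the native completion callback in a real driver.
    void fire(int newStatus) {
        List<CompletionListener> toNotify;
        synchronized (this) {
            if (status != null) {
                return; // fire only once
            }
            status = newStatus;
            toNotify = new ArrayList<>(listeners);
            listeners.clear();
        }
        for (CompletionListener listener : toNotify) {
            listener.onComplete(newStatus);
        }
    }
}

class AsyncExecuteSketch {
    // Bridges an event to a future without blocking any Java thread.
    static CompletableFuture<Void> toFuture(SketchEvent event) {
        CompletableFuture<Void> future = new CompletableFuture<>();
        event.addListener(status -> {
            if (status == 0) {
                future.complete(null);
            } else {
                future.completeExceptionally(
                        new IllegalStateException("Device event failed with status " + status));
            }
        });
        return future;
    }
}
```

No thread waits here: the future is settled on whichever thread delivers the driver notification.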

Thoughts?

Issue Analytics

  • State: open
  • Created 3 years ago
  • Comments: 24 (24 by maintainers)

Top GitHub Comments

vsilaev commented, Jan 26, 2021

There are two main points: At the task-schedule level and the TornadoVM level

Yep, found it. There were many other issues that I had to fix when trying to add true out-of-order execution (even inside the JNI code: blocking waits for read/write). Attached is a patch with the fixed runtime and OpenCL driver, plus an API for executeAsync(...).

Currently, with OpenCL, all tests pass on NVIDIA OpenCL with both -Dtornado.ooo-execution.enable=true (ooo) and -Dtornado.ooo-execution.enable=false (partly blocking).

On Intel OpenCL everything is fine with -Dtornado.ooo-execution.enable=false, but with -Dtornado.ooo-execution.enable=true I get a segmentation fault for just about everything. I need someone who can debug the code and find the cause.

It would be great if someone could apply this patch to the current develop branch and run the tests (with both settings of tornado.ooo-execution.enable) on AMD or another device.

0001-Enable-full-Out-of-order-execution-Adding-executeAsy.zip

vsilaev commented, Feb 2, 2021

Status update.

  1. Obviously, just keeping a reference to the buffer array so that it isn’t garbage-collected doesn’t help: its location in memory may change during GC.
  2. I rewrote OCLCommandQueue to use DirectByteBuffer for non-blocking calls and kept the existing “array-copying” code only for blocking reads/writes.
  3. Fixed issues with event processing in TornadoVM and OCLTornadoDevice.
  4. Fixed issues in OCLEvent with static buffer usage.
  5. Added an OpenCL 1.1-compatible version of enqueue barrier / marker that works exactly the same as the OpenCL 1.2 one (OCLCommandQueue and OCLDeviceContext).
  6. Finally, added the code for CompletableFuture TaskSchedule.executeAsync(...).
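
To make points 1 and 2 concrete, here is a minimal sketch of staging data in a direct buffer. DirectStagingSketch and its methods are hypothetical and are not the code from the patch; the sketch only shows why a direct buffer suits non-blocking transfers: its memory lives outside the garbage-collected heap, so its address stays fixed while a transfer is in flight.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

// Hypothetical helper; not the OCLCommandQueue code from the patch.
class DirectStagingSketch {
    // Copies a heap array into off-heap memory that the GC will not relocate.
    static ByteBuffer stage(float[] data) {
        ByteBuffer buffer = ByteBuffer
                .allocateDirect(data.length * Float.BYTES)
                .order(ByteOrder.nativeOrder()); // match the device-side byte layout
        buffer.asFloatBuffer().put(data); // the view has its own position; buffer stays at 0
        return buffer;
    }

    // Copies the staged contents back into a fresh heap array.
    static float[] unstage(ByteBuffer buffer) {
        FloatBuffer view = buffer.asFloatBuffer();
        float[] out = new float[view.remaining()];
        view.get(out);
        return out;
    }
}
```

The cost is one extra copy per transfer, which is presumably why the blocking path can keep the existing array-based code.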

Tested on both NVIDIA and AMD. Almost all of the tests – ~325 out of 331 – run ok in blocking, default (mixed) and non-blocking modes. The only exceptions are:

        Running test: testVectorChars            ................  [FAILED]
                \_[REASON] expected:<102> but was:<33126>

Sporadically happens on both AMD and NVIDIA in any mode (default, blocking, non-blocking).

  102 = 0x0066
33126 = 0x8166
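
The two values can be split into bytes to check the pattern. This is a small sketch; ByteMismatchSketch is a hypothetical helper, not part of the test suite.

```java
// Hypothetical helper for inspecting the two bytes of a 16-bit value.
class ByteMismatchSketch {
    // Least significant byte of the value.
    static int lowByte(int value) {
        return value & 0xFF;
    }

    // Second least significant byte of the value.
    static int highByte(int value) {
        return (value >>> 8) & 0xFF;
    }

    // True when two values agree in the low byte but differ in the high byte.
    static boolean onlyHighByteDiffers(int expected, int actual) {
        return lowByte(expected) == lowByte(actual)
                && highByte(expected) != highByte(actual);
    }
}
```

For the failing case, lowByte(33126) is 0x66 (102, the expected value) and highByte(33126) is 0x81, while highByte(102) is 0x00, so only the upper byte is wrong.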

I guess this is something related to the handling of two-byte types. From the tests I saw that you implemented special handling for the single-byte type. Two-byte types probably need to be addressed as well, because it’s always garbage in one byte (0x81 here) while the other byte (0x66) is set correctly.

        Running test: testProfilerEnabled        ................  [FAILED]
                \_[REASON] null

Always happens on both AMD and NVIDIA in non-blocking mode. Blocking and default modes are ok.

        Running test: testComputePi              ................  [FAILED]
                \_[REASON] expected:<3.14> but was:<5.1518049240112305>

Always happens on AMD (any mode). Need to check on Tornado 0.8 – probably this is not my regression at all.

And the asynchronous invocation itself (i.e. Tornado.executeAsync(...)) works as expected.

Waiting for your approval of my previous PR, so I’ll share these results.
