Task completion at external events
This API was developed in response to feedback from the OpenMP community and to concerns raised by some vendors’ implementers about blocking the execution of the current task.
This approach supports only asynchronous communication primitives, not synchronous ones. A task can issue asynchronous communication primitives that do not block it, but task finalization (i.e. the point at which the dependencies of the task are released) is delayed until the in-flight MPI operations complete.
Within the OpenMP Language Committee, this proposal is built on the concept of a detached task. Detached tasks are marked by means of a new clause: the detach clause. If no detach clause is present on a task construct, the generated task is completed when the execution of its associated structured block is completed. If a detach clause is present on a task construct, the task is completed only when the execution of its associated structured block is completed and the detach-event is fulfilled. In either case, the task data environment is destroyed at the end of the associated structured block.
A new type is defined: omp_event_t. This type plays the role of the event handle. A variable of this type is declared, and it is initialized when it is passed as the argument of the detach clause. The event and the task are associated at that point.
The handle can then be passed to the omp_fulfill_event routine, which fulfills and destroys the OpenMP event passed as its parameter. Once the event has been fulfilled and the structured block has finished, the task it was associated with is complete: its dependences are released and any task finalization notifications are delivered.
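The interplay between the detach clause and omp_fulfill_event can be illustrated with a minimal sketch. It assumes a compiler that implements detachable tasks as finally published in OpenMP 5.0, where the handle type is named omp_event_handle_t rather than omp_event_t. The function detach_demo and the variables payload and seen are hypothetical names chosen for illustration, and a sibling task stands in for the completion callback of a real asynchronous operation. The serial fallback at the top is also an assumption of this sketch, added only so that the file builds without OpenMP.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallback, an assumption of this sketch: without OpenMP the pragmas
 * are ignored and the three blocks below simply run in program order. */
typedef int omp_event_handle_t;
static void omp_fulfill_event(omp_event_handle_t ev) { (void)ev; }
#endif

/* Hypothetical demo: returns the value observed by the dependent task. */
int detach_demo(void)
{
    int payload = 0;  /* stands in for a receive buffer */
    int seen = -1;    /* what the dependent task observed */

    #pragma omp parallel num_threads(2) shared(payload, seen)
    #pragma omp single
    {
        omp_event_handle_t ev = 0;  /* overwritten by the detach clause */

        /* Detached task: its body returns at once, but the task is not
         * complete (and payload is not released) until ev is fulfilled. */
        #pragma omp task detach(ev) depend(out: payload)
        {
            /* A real task would start a non-blocking operation here. */
        }

        /* Stands in for the completion callback of the external operation:
         * it delivers the data and then fulfills the event. */
        #pragma omp task firstprivate(ev) shared(payload)
        {
            payload = 42;
            omp_fulfill_event(ev);
        }

        /* Dependent task: may start only after the detached task completes,
         * that is, after both its body and the event are done. */
        #pragma omp task depend(in: payload) shared(payload, seen)
        {
            seen = payload;
        }
    }
    return seen;
}
```

Under these assumptions, detach_demo() returns 42: the dependent task cannot read payload before the event has been fulfilled, even though the body of the detached task finished long before. With GCC or Clang the sketch would typically be built with the -fopenmp flag.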
In the INTERTWinE project we have implemented a similar API, which differs from the OpenMP detachable tasks in two main ways. First, we do not define (nor require) a new runtime-specific type to represent the event. All operations are performed within the context of the task, and a getter routine is used to obtain the event (or, in our case, the event counter). Second, our proposal accepts multiple increases and decreases of this counter, so a single event counter can stand for more than one asynchronous external event.
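The counter semantics can be modelled in a few lines of plain C. This is only a sketch of the idea, not the Nanos6 interface: the type event_counter_t and the functions counter_init, counter_increase, counter_decrease and counter_released are hypothetical names, and the convention that the counter starts at 1 to account for the task body itself is an assumption made for this illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of a per-task event counter. */
typedef struct {
    atomic_int  pending;   /* 1 for the task body + 1 per in-flight event */
    atomic_bool released;  /* true once dependences would be released */
} event_counter_t;

/* A new task starts with one pending unit: its own body (assumption). */
static void counter_init(event_counter_t *c)
{
    atomic_init(&c->pending, 1);
    atomic_init(&c->released, false);
}

/* Called inside the task, once per group of asynchronous operations. */
static void counter_increase(event_counter_t *c, int n)
{
    atomic_fetch_add(&c->pending, n);
}

/* Called when the body ends or when external events complete.
 * Whoever brings the counter to zero releases the task. */
static void counter_decrease(event_counter_t *c, int n)
{
    if (atomic_fetch_sub(&c->pending, n) == n)
        atomic_store(&c->released, true);
}

static bool counter_released(event_counter_t *c)
{
    return atomic_load(&c->released);
}
```

In this model a task that starts two asynchronous operations calls counter_increase(&c, 2) from its body. The task is released only after three decrements in total: one when the body ends and one for each completed operation, in any order.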
The API based on external events has been implemented successfully in OmpSs-2 (Nanos6) and in the open-source Intel OpenMP runtime, which can be considered the reference implementation of the interface.
The detached task event API proposal will almost certainly be included in the next version of OpenMP (i.e. OpenMP 5.0, to appear in November 2018). With this extension, Task-Based Runtime Systems will have External Event API capabilities that allow task blocking/unblocking to be controlled by events that occur outside their scope, but more work is needed to push the complete pause/resume functionality into the standard.
Our work on Task Pause/Resume has also been presented to the OpenMP ARB.