GASPI Forum Meeting - report
The first of the two annual GASPI Forum Meetings in 2018 was held in Frankfurt on January 10th. Participants came from Fraunhofer ITWM, DLR, and T-Systems SfR, all of which are represented in the INTERTWinE project, as well as from Tartu University, LRZ, and ZIH TU Dresden. The meeting focused on two topics: a generic C++ interface to GASPI, and the concept of GASPI shared segments.
Daniel Grünewald from Fraunhofer ITWM presented key concepts for the design of the C++ interface. Its main goal is to make GASPI easier to use while retaining the performance of notified communication in GASPI. To achieve this, the C++ interface internally manages all configuration and allocation of GASPI resources such as queues, segments, notifications, and groups. The interface (not dissimilar to ideas used in MPI persistent communication) then splits communication into a setup part and an execution part, on the assumption that the latter will be executed many times (e.g. in ghost cell exchanges for subsequent iterations).
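The split into a setup part and an execution part can be illustrated with a small in-process sketch. It does not use GASPI and does not reproduce the interface presented at the meeting: the names `Subdomain` and `GhostExchangePlan` are hypothetical, and a plain copy between vectors stands in for notified communication. The only point is that all lookup and bookkeeping happens once, while the per-iteration call just replays a precomputed list of transfers.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model, not GASPI API. A 1-D subdomain: cells[0] and
// cells.back() are ghost cells, everything in between is interior data.
// Assumption: every subdomain has at least three cells.
using Subdomain = std::vector<double>;

class GhostExchangePlan {
public:
    // Setup part: executed once. Resolves which cell is copied where,
    // for every pair of neighbouring subdomains.
    explicit GhostExchangePlan(std::vector<Subdomain>& domains)
        : domains_(domains) {
        for (std::size_t d = 0; d + 1 < domains.size(); ++d) {
            const std::size_t lastInterior = domains[d].size() - 2;
            const std::size_t rightGhost = domains[d].size() - 1;
            // Right edge of subdomain d -> left ghost of subdomain d+1.
            transfers_.push_back({d, lastInterior, d + 1, 0});
            // Left edge of subdomain d+1 -> right ghost of subdomain d.
            transfers_.push_back({d + 1, 1, d, rightGhost});
        }
    }

    // Execution part: called in every iteration. No allocation and no
    // neighbour lookup, only the precomputed transfers are replayed.
    void execute() {
        for (const Transfer& t : transfers_) {
            domains_[t.dstDomain][t.dstCell] = domains_[t.srcDomain][t.srcCell];
        }
    }

    std::size_t transferCount() const { return transfers_.size(); }

private:
    struct Transfer {
        std::size_t srcDomain, srcCell, dstDomain, dstCell;
    };
    std::vector<Subdomain>& domains_;
    std::vector<Transfer> transfers_;
};
```

A caller would construct one `GhostExchangePlan` before the time loop and call `execute()` once per iteration. In a real implementation the constructor is where queues, segments, and notification IDs would be set up.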
I presented ideas that extend the concept of System-V shared memory (or MPI shared windows) towards notified communication in shared memory. Here, the shared application memory visible across multiple processes is complemented with a shared notification space. Every process (or rank) with access to the common shared memory (such processes are called ‘local’) can now read and write the common notification space. This implies that every local process can see all incoming messages for all other local ranks with access to the shared window.
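As a rough illustration of a notification space that all local ranks can read and write, the following sketch keeps one payload slot and one notification slot per local rank in a single object. It is a toy model under stated assumptions, not GASPI code: `SharedSegmentModel`, `deliver`, and `tryConsume` are hypothetical names, threads of one process would stand in for local ranks, and a tag value of 0 is assumed to mean "no notification pending".

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model, not GASPI API: one payload slot and one notification
// slot per local rank, all of them visible to every local rank.
class SharedSegmentModel {
public:
    explicit SharedSegmentModel(std::size_t localRanks)
        : payload_(localRanks, 0.0), notifications_(localRanks) {
        for (auto& n : notifications_) n.store(0, std::memory_order_relaxed);
    }

    // Sender side: write the payload first, then publish the notification.
    // The release store makes the payload visible to whoever observes the tag.
    // Assumption: tag is non-zero, since 0 marks an empty slot.
    void deliver(std::size_t targetRank, double value, std::uint32_t tag) {
        payload_[targetRank] = value;
        notifications_[targetRank].store(tag, std::memory_order_release);
    }

    // Any local rank may poll any slot, not only its own. Returns true and
    // resets the slot if a message was pending, false otherwise.
    bool tryConsume(std::size_t slotOfRank, double& valueOut) {
        const std::uint32_t tag =
            notifications_[slotOfRank].exchange(0, std::memory_order_acq_rel);
        if (tag == 0) return false;
        valueOut = payload_[slotOfRank];
        return true;
    }

private:
    std::vector<double> payload_;
    std::vector<std::atomic<std::uint32_t>> notifications_;
};
```

Because `tryConsume` takes the slot index as an argument, a rank can pick up a message addressed to a different local rank. That is the property the shared notification space adds over per-process notifications.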
One example where such a feature is particularly useful is the pipelined ring used in an allreduce function. Here a local process can reduce and forward all incoming messages for the entire shared segment. This makes it possible not only to hide latency in the pipelined allreduce communication entirely, but also to use all available bandwidth from local ranks for the actual reduction. By exploiting this feature we were able to outperform the best currently available allreduce implementations by a factor of up to three (see “Interoperability of GASPI and MPI in a large scale Lattice-Boltzmann code”, Proceedings of PPAM 2017).
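For readers unfamiliar with ring-based allreduce, the sketch below simulates the textbook ring algorithm (a reduce-scatter phase followed by an allgather phase) inside a single process. It is not the shared-segment implementation from the cited paper and models neither pipelining nor notifications. The function name `ringAllreduceSum` is hypothetical, and each rank's buffer is assumed to have the same length.

```cpp
#include <cstddef>
#include <vector>

// In-process model of a ring allreduce: buffers[r] is the local vector of
// rank r. On return, every buffers[r] holds the element-wise sum over all
// ranks. Assumption: all buffers have the same length.
void ringAllreduceSum(std::vector<std::vector<double>>& buffers) {
    const std::size_t P = buffers.size();
    if (P < 2) return;
    const std::size_t N = buffers[0].size();

    // Chunk c covers the index range [chunkBegin(c), chunkBegin(c + 1)).
    auto chunkBegin = [N, P](std::size_t c) { return c * N / P; };

    // Phase 1, reduce-scatter: in step s, rank r sends chunk (r - s) mod P
    // to its right neighbour, which adds it to its own copy. After P - 1
    // steps, rank r holds the fully reduced chunk (r + 1) mod P.
    for (std::size_t s = 0; s + 1 < P; ++s) {
        for (std::size_t r = 0; r < P; ++r) {
            const std::size_t c = (r + P - s) % P;
            const std::size_t next = (r + 1) % P;
            for (std::size_t i = chunkBegin(c); i < chunkBegin(c + 1); ++i) {
                buffers[next][i] += buffers[r][i];
            }
        }
    }

    // Phase 2, allgather: in step s, rank r forwards chunk (r + 1 - s) mod P,
    // which is already fully reduced, and the neighbour overwrites its copy.
    for (std::size_t s = 0; s + 1 < P; ++s) {
        for (std::size_t r = 0; r < P; ++r) {
            const std::size_t c = (r + 1 + P - s) % P;
            const std::size_t next = (r + 1) % P;
            for (std::size_t i = chunkBegin(c); i < chunkBegin(c + 1); ++i) {
                buffers[next][i] = buffers[r][i];
            }
        }
    }
}
```

Within one step, the chunk a rank receives is never the chunk it sends, so the sequential loop over ranks gives the same result as simultaneous sends would. In the shared-segment variant described above, the reduce and forward work of each step could in principle be picked up by whichever local rank sees the incoming notification first.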
By combining shared GASPI windows with dependencies between local ranks, the concept can also be used to migrate (flat) MPI-only applications towards an asynchronous communication model. Explicit local communication (within the shared window) is replaced with data dependencies on local semaphores and read access to local ranks. Communication across shared windows is replaced with notified GASPI communication.
While the concept of shared GASPI windows can be implemented as an additional memory policy (which would then need to go into the GASPI specification), it is not yet clear whether the GASPI API will require additional changes (such as local semaphores).
Further discussion on these topics is expected at our second meeting, to be held on 27th June 2018 (parallel to ISC) in Frankfurt. The GASPI Forum always welcomes new members and fresh ideas. Participation is free of charge and easy: an email (for organizational purposes to christian.simmendinger [at] t-systems.com) is all the GASPI Forum requires.