Optimising shared reduction variables in MPI programs
AJ Field, PHJ Kelly, TL Hansen - Euro-Par 2002 Parallel Processing: 8th …, 2002 - Springer
Euro-Par 2002 Parallel Processing: 8th International Euro-Par Conference …, 2002 • Springer
Abstract
CFL (Communication Fusion Library) is an experimental C++ library which supports shared reduction variables in MPI programs. It uses overloading to distinguish private variables from replicated, shared variables, and automatically introduces MPI communication to keep replicated data consistent. This paper concerns a simple but surprisingly effective technique which improves performance substantially: CFL operators are executed lazily in order to expose opportunities for run-time, context-dependent optimisations such as message aggregation and operator fusion. We evaluate the idea using both toy benchmarks and a ‘production’ code for simulating plankton population dynamics in the upper ocean. The results demonstrate the library’s software engineering benefits, and show that performance close to that of manually optimised code can be achieved automatically in many cases.
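The lazy-execution idea in the abstract can be illustrated with a minimal C++ sketch. This is not the CFL API: the names FakeComm, LazyContext, SharedSum and run_demo are hypothetical, and the communication layer is a single-process stand-in that pretends four ranks contribute identical values. In a real MPI program the stand-in would presumably be a single MPI_Allreduce over the packed buffer. The sketch shows only message aggregation: updates to shared variables are recorded locally, and the first read triggers one combined reduction for every pending variable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the communication layer (not MPI, not CFL).
// A real version might call MPI_Allreduce with MPI_SUM on buf instead.
struct FakeComm {
    int messages = 0;  // number of collective operations issued
    int ranks = 4;     // assumption: every rank contributes the same values
    void all_reduce_sum(std::vector<double>& buf) {
        ++messages;
        for (double& x : buf) x *= ranks;  // simulated sum over all ranks
    }
};

class SharedSum;

// Remembers which shared variables have local updates not yet communicated.
class LazyContext {
public:
    explicit LazyContext(FakeComm& comm) : comm_(comm) {}
    void mark_dirty(SharedSum* v) { dirty_.push_back(v); }
    void flush();
private:
    FakeComm& comm_;
    std::vector<SharedSum*> dirty_;
};

// A replicated sum-reduction variable. Its overloaded += differs from += on a
// plain double: it only records the contribution and defers communication.
class SharedSum {
public:
    explicit SharedSum(LazyContext& ctx) : ctx_(ctx) {}
    SharedSum& operator+=(double x) {
        if (!dirty_) { dirty_ = true; ctx_.mark_dirty(this); }
        pending_ += x;  // repeated updates fold into one local contribution
        return *this;
    }
    // Reading the value forces any deferred reductions to run.
    double value() {
        if (dirty_) ctx_.flush();
        return global_;
    }
private:
    friend class LazyContext;
    LazyContext& ctx_;
    double global_ = 0.0;   // last globally consistent value
    double pending_ = 0.0;  // local contribution not yet reduced
    bool dirty_ = false;
};

// Pack every pending contribution into one buffer and reduce it in one call.
inline void LazyContext::flush() {
    if (dirty_.empty()) return;
    std::vector<double> buf;
    for (SharedSum* v : dirty_) buf.push_back(v->pending_);
    comm_.all_reduce_sum(buf);
    assert(buf.size() == dirty_.size());
    for (std::size_t i = 0; i < dirty_.size(); ++i) {
        dirty_[i]->global_ += buf[i];
        dirty_[i]->pending_ = 0.0;
        dirty_[i]->dirty_ = false;
    }
    dirty_.clear();
}

struct DemoResult { double a; double b; int messages; };

DemoResult run_demo() {
    FakeComm comm;
    LazyContext ctx(comm);
    SharedSum a(ctx), b(ctx);
    double local = 1.5;     // ordinary private variable
    a += local;
    a += 0.5;
    b += 3.0;
    double av = a.value();  // first read: one aggregated reduction for a and b
    double bv = b.value();  // already consistent, no further communication
    return {av, bv, comm.messages};
}
```

Calling run_demo() performs three updates to two shared variables yet issues a single simulated collective, whereas an eager scheme would issue one per update. Deferring work until a value is read is what gives the runtime a chance to see several pending operations at once.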