
MetaMPICH - Flexible Coupling of Heterogeneous MPI Systems

The MetaComm Library

MetaComm is a small supplementary communication library for MetaMPICH that makes it easy to adapt iterative grid-based algorithms to the structure of metacomputers.

While MetaMPICH's transparency simplifies the porting of existing MPI applications to metacomputers, the slowest network connection and the slowest processor dominate performance and scalability, as illustrated in Figure 1. In this example, the slow TCP connection between the two SCI clusters obviously constitutes the system's bottleneck.

Figure 1: Example of a metacomputer system

Since algorithms for physical simulation represent a wide range of parallel applications, we developed MetaComm as an intermediate layer on top of MetaMPICH to which a class of such applications can easily be attached and which avoids these communication bottlenecks where possible. This class of grid-based algorithms comprises all those whose cores solve discrete boundary value problems with iterative relaxation methods, the best known being the Jacobi and the Gauss-Seidel method.
One of the simplest boundary value problems is the steady-state distribution of temperature in a two-dimensional plate, described by the well-known Laplace equation and the prescribed temperature values on the plate's borders. This problem can be solved by iterating repeatedly over a grid of discretized values, as indicated in Figure 2.
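
To make the method concrete, here is a minimal serial sketch of such a Jacobi relaxation for the two-dimensional Laplace problem; this is not MetaComm code, and the grid size, boundary temperature, and convergence threshold are arbitrary assumptions:

    #include <math.h>
    #include <stdio.h>

    #define N   64      /* interior grid points per dimension (assumption) */
    #define EPS 1e-6    /* convergence threshold (assumption) */

    /* Jacobi relaxation: each interior value is replaced by the average of
     * its four neighbours until the largest update falls below EPS. */
    int main(void)
    {
        static double u[N + 2][N + 2], unew[N + 2][N + 2];

        /* Fixed boundary values: the top border is held at 100 degrees,
         * all other borders stay at 0 (assumption). */
        for (int j = 0; j < N + 2; j++)
            u[0][j] = unew[0][j] = 100.0;

        double diff;
        do {
            diff = 0.0;
            for (int i = 1; i <= N; i++)
                for (int j = 1; j <= N; j++) {
                    unew[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                       + u[i][j - 1] + u[i][j + 1]);
                    diff = fmax(diff, fabs(unew[i][j] - u[i][j]));
                }
            for (int i = 1; i <= N; i++)
                for (int j = 1; j <= N; j++)
                    u[i][j] = unew[i][j];
        } while (diff > EPS);

        printf("converged, centre value = %f\n", u[N / 2][N / 2]);
        return 0;
    }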

Figure 2: Illustration of a parallelised grid solver

When parallelised, each process works on a section of the entire grid, so values on the borders of these sections have to be exchanged between the processes. The communication effort of a process therefore depends on the border length of its section and on the location of its neighbours, because communication with neighbours on other metahosts reduces performance.
The question is whether the effort for inter-metahost communication can be reduced by partitioning these sections in a smart way.
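
For a one-dimensional strip decomposition, this border exchange looks roughly like the following plain-MPI sketch; the row length and buffer layout are assumptions, and these are exactly the kind of explicit calls that MetaComm's functions can replace:

    #include <mpi.h>

    #define NX 256   /* grid columns per row (assumption) */

    /* Exchange the two border rows of a horizontal strip with the upper and
     * lower neighbour process. The strip holds local_rows interior rows plus
     * one halo row above (row 0) and one below (row local_rows + 1);
     * MPI_PROC_NULL makes the outermost strips skip the transfer. */
    void exchange_borders(double *strip, int local_rows, int rank, int size)
    {
        int up   = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
        int down = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;

        /* Send the first interior row up, receive the lower halo from below. */
        MPI_Sendrecv(&strip[1 * NX], NX, MPI_DOUBLE, up, 0,
                     &strip[(local_rows + 1) * NX], NX, MPI_DOUBLE, down, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        /* Send the last interior row down, receive the upper halo from above. */
        MPI_Sendrecv(&strip[local_rows * NX], NX, MPI_DOUBLE, down, 1,
                     &strip[0], NX, MPI_DOUBLE, up, 1,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }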
In most cases, the given boundary value problem specifies additional boundary values within the grid. These inner boundary values are fixed or can be computed by each process on its own.
The underlying idea is to disregard such fixed values during the inter-metahost communication phase, because they do not need to be exchanged. Hence, with a smart partition scheme, the message length, and thus the communication effort between the metahosts, decreases.
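
A minimal sketch of this idea, with a hypothetical helper rather than actual MetaComm internals: when a border row is packed for an inter-metahost message, cells marked as fixed are simply skipped, which shortens the message:

    /* Pack only the non-fixed values of a border row into a send buffer.
     * 'fixed' marks inner boundary cells whose values never change and thus
     * need not be exchanged between metahosts (hypothetical sketch). */
    int pack_border(const double *row, const char *fixed, int n, double *buf)
    {
        int len = 0;
        for (int i = 0; i < n; i++)
            if (!fixed[i])
                buf[len++] = row[i];
        return len;   /* effective message length, at most n */
    }

The receiving metahost unpacks with the same fixed-cell mask, so both sides agree on which values the shortened message contains.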
Figure 3 shows a simple example of such a smart partition scheme, where the essential message length is reduced by a barrier in the domain, as one might imagine, for example, in a CFD simulation of a flow channel.

Figure 3: Example of a smart partition scheme

MetaComm provides simple communication functions that can replace all explicit MPI function calls within a parallel application and can easily be included in serial applications to make them parallel. The functions are designed to be called by all application processes, so that the target SPMD paradigm is strictly maintained.
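
The intended usage pattern could look roughly like the skeleton below; all mc_* names, as well as relax() and converged(), are purely illustrative placeholders and not the documented MetaComm interface:

    /* Hypothetical SPMD skeleton: every process runs the same program and
     * calls the communication functions collectively. The declarations are
     * placeholders only. */
    void    mc_init(int *argc, char ***argv);   /* start-up (placeholder)   */
    double *mc_alloc_section(void);             /* local grid section       */
    void    mc_exchange(double *section);       /* border exchange          */
    void    mc_finalize(void);                  /* shutdown (placeholder)   */
    void    relax(double *section);             /* local relaxation sweep   */
    int     converged(const double *section);   /* convergence test         */

    int main(int argc, char **argv)
    {
        mc_init(&argc, &argv);
        double *section = mc_alloc_section();

        while (!converged(section)) {
            relax(section);        /* compute on the local grid section */
            mc_exchange(section);  /* exchange borders, skipping fixed
                                      values between metahosts */
        }

        mc_finalize();
        return 0;
    }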
The optimizing features are based on an appropriate, predetermined partition scheme for the entire grid. To find such a smart partition scheme, we developed a software tool named SmartPart, which automates the search for the optimal partition of a given problem. The tool analyses the metacomputer's structure (number of metahosts, number of nodes per metahost, etc.), measures the respective computational power, and then checks every possible grid allocation, taking the known bandwidth and latency of the communication bottlenecks into account.
Since MetaComm allows an almost arbitrary allocation of the grid and SmartPart iterates over all possibilities, an effective, and in most cases optimal, partition scheme will be found.
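
The core of such an exhaustive search can be pictured with a simple cost model; the structure and the formula below are illustrative assumptions, not SmartPart's actual implementation:

    /* Estimate the per-iteration time of one candidate partition as the
     * maximum over all metahosts of local compute time plus inter-metahost
     * transfer time (latency + message volume / bandwidth). Illustrative
     * cost model only; the fields are assumptions. */
    typedef struct {
        double cells;      /* grid cells assigned to this metahost        */
        double speed;      /* measured computational power (cells/second) */
        double msg_bytes;  /* border bytes crossing the slow link         */
    } host_part;

    double partition_cost(const host_part *h, int nhosts,
                          double latency, double bandwidth)
    {
        double worst = 0.0;
        for (int i = 0; i < nhosts; i++) {
            double t = h[i].cells / h[i].speed
                     + latency + h[i].msg_bytes / bandwidth;
            if (t > worst)
                worst = t;   /* the slowest metahost dominates */
        }
        return worst;
    }

The search then enumerates the candidate allocations and keeps the one with the smallest estimated cost.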

Figure 4 shows an example screenshot of a CFD simulation of a flow channel, in which three metahosts work together on a discretized grid using the optimizing communication functions of MetaComm.

Figure 4: Screenshot of a flow channel simulation
