This topic focuses on getting you started with SMPI (Simulated MPI) so that you are ready for the modules in the next topics. Simply read through the 4 tabs below:
This CourseWare was tested with SimGrid version 3.13 (small changes may be needed with other versions).
On a Debian/Ubuntu (virtual) machine, you can retrieve SimGrid directly from the package repositories:
sudo apt-get install simgrid
For other systems, please refer to the Download page, which has all the information you need. Just to be sure, here is a typical way to compile the version you want on a Linux (virtual) box:
sudo apt-get install g++ libboost-all-dev
tar -xvf SimGrid-x.x.x.tar.gz
cd SimGrid-x.x.x
cmake -DCMAKE_INSTALL_PREFIX=/usr/local -Denable_smpi=on -Denable_documentation=off .
make
sudo make install
Assuming your path includes
/usr/local/bin (which it should), you can now invoke two new commands, smpicc and smpirun.
If you are not a superuser on your system, then you have to install SimGrid in your home directory, say in a directory named
local, with the modified cmake
invocation below. The binaries are then available under $HOME/local/bin:
cmake -DCMAKE_INSTALL_PREFIX=$HOME/local/ -Denable_smpi=on -Denable_documentation=off
SMPI simulates the execution of MPI applications by relying on the fast and accurate simulation core provided by SimGrid. The SMPI user (you!) must describe simulated platforms (i.e., sets of simulated hosts and network links, with some performance characteristics). The SimGrid documentation provides ample information on platform descriptions, which are written in XML. Below we simply show a series of examples, which should be sufficient for our purpose. Note that platform files are typically provided in each pedagogic module, but you may have to modify them.
A simple 3-host example: At the most basic level, you can describe your simulated platform as a graph of hosts and network links. For instance:
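A 3-host platform description in the (pre-3.13) v3 XML format could look like the following sketch; the host names, compute speeds, and link characteristics here are purely illustrative:

```xml
<?xml version='1.0'?>
<!DOCTYPE platform SYSTEM "http://simgrid.gforge.inria.fr/simgrid.dtd">
<platform version="3">
  <AS id="AS0" routing="Full">
    <!-- Three hosts with different compute speeds -->
    <host id="host1" power="1Gf"/>
    <host id="host2" power="2Gf"/>
    <host id="host3" power="4Gf"/>
    <!-- Two network links -->
    <link id="link1" latency="10us" bandwidth="100MBps"/>
    <link id="link2" latency="10us" bandwidth="100MBps"/>
    <!-- Routes between host pairs, as lists of links -->
    <route src="host1" dst="host2"><link_ctn id="link1"/></route>
    <route src="host2" dst="host3"><link_ctn id="link2"/></route>
    <route src="host1" dst="host3">
      <link_ctn id="link1"/><link_ctn id="link2"/>
    </route>
  </AS>
</platform>
```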
In the above XML, note the way in which hosts, links, and routes are defined. All hosts are defined with a
power (i.e., compute speed in Gflops), and links with a
latency (in us) and a
bandwidth (in MBytes per second). Other units are possible and written as expected.
By default, routes are symmetrical. See more information on the
SimGrid Web site.
Note that this XML file is intended for SimGrid v3.12 or earlier. To
use it with SimGrid v3.13 or above, you have to convert it using the
simgrid_update_xml program (which updates the file in place), as follows:
simgrid_update_xml <path to your XML file>
A homogeneous cluster with a crossbar switch: A very common parallel computing platform is a homogeneous cluster in which hosts are interconnected via a crossbar switch with as many ports as hosts, so that any disjoint pairs of hosts can communicate concurrently at full speed. For instance:
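A sketch of such a cluster in the v3 XML format, using the cluster tag; the prefix and suffix are illustrative, while the numbers match the description that follows:

```xml
<?xml version='1.0'?>
<!DOCTYPE platform SYSTEM "http://simgrid.gforge.inria.fr/simgrid.dtd">
<platform version="3">
  <AS id="AS0" routing="Full">
    <!-- 256 hosts named host-0.example.org ... host-255.example.org,
         each connected to a crossbar switch via its own link -->
    <cluster id="cluster0" prefix="host-" suffix=".example.org"
             radical="0-255" power="1Gf" bw="225MBps" lat="5us"/>
  </AS>
</platform>
```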
In the above XML, note that one simply specifies a name prefix and suffix for each host, and then gives an integer range (in the example the cluster contains 256 hosts). All hosts have the same power (1 Gflop/sec) and are connected to the switch via links with the same latency (5 microseconds) and bandwidth (225 MBytes/sec). See more information on the SimGrid Web site.
A homogeneous cluster with a shared backbone: Another popular model for a parallel platform is that of a set of homogeneous hosts connected to a shared communication medium, a backbone, with some finite bandwidth capacity and on which communicating host pairs can experience contention. For instance:
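A sketch of such a platform in the v3 XML format, assuming the bb_bw and bb_lat cluster attributes for the backbone characteristics (the prefix and suffix are illustrative; the numbers match the description that follows):

```xml
<?xml version='1.0'?>
<!DOCTYPE platform SYSTEM "http://simgrid.gforge.inria.fr/simgrid.dtd">
<platform version="3">
  <AS id="AS0" routing="Full">
    <!-- Each host connects to a shared backbone; bw/lat describe the
         per-host links, bb_bw/bb_lat describe the backbone itself -->
    <cluster id="cluster0" prefix="host-" suffix=".example.org"
             radical="0-255" power="1Gf" bw="125MBps" lat="50us"
             bb_bw="2.25GBps" bb_lat="500us"/>
  </AS>
</platform>
```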
In the above XML, note that one specifies the latency and bandwidth of the link that connects a host to the backbone (in this example 50 microsec and 125 MByte/sec), as well as the latency and bandwidth of the backbone itself (in this example 500 microsec and 2.25 GByte/sec). See more information on the SimGrid Web site.
Two interconnected clusters: One can connect clusters together and in fact build simulated platforms hierarchically in arbitrary fashions. For instance:
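One possible sketch, in the v3 XML format, of two clusters joined by a backbone link via an ASroute between their gateway routers; all names and exact attribute choices here are illustrative:

```xml
<?xml version='1.0'?>
<!DOCTYPE platform SYSTEM "http://simgrid.gforge.inria.fr/simgrid.dtd">
<platform version="3">
  <AS id="AS0" routing="Full">
    <!-- Two clusters, each with its own gateway router -->
    <cluster id="cluster1" prefix="c1-" suffix=".example.org"
             radical="0-15" power="1Gf" bw="125MBps" lat="50us"
             router_id="c1-router.example.org"/>
    <cluster id="cluster2" prefix="c2-" suffix=".example.org"
             radical="0-15" power="2Gf" bw="125MBps" lat="50us"
             router_id="c2-router.example.org"/>
    <!-- The link interconnecting the two clusters -->
    <link id="backbone" latency="500us" bandwidth="2.25GBps"/>
    <!-- Inter-cluster traffic goes through the gateways and the backbone -->
    <ASroute src="cluster1" dst="cluster2"
             gw_src="c1-router.example.org" gw_dst="c2-router.example.org">
      <link_ctn id="backbone"/>
    </ASroute>
  </AS>
</platform>
```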
The above XML is a bit more involved. See all details and documentation on the SimGrid Web site.
We are now ready to simulate the execution of an MPI program using SMPI. Let us use the following simple program,
roundtrip.c, in which the processes pass around a message and
print the elapsed time:
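The original roundtrip.c is not reproduced in this text; a minimal version consistent with that description (the payload size and exact message pattern below are illustrative) could look like this:

```c
/* roundtrip.c: pass a message around a ring of MPI processes and
 * have the root print the elapsed time. Assumes at least 2 processes. */
#include <stdio.h>
#include <mpi.h>

#define PAYLOAD_SIZE 1000  /* message size in bytes; arbitrary choice */

int main(int argc, char *argv[])
{
  int rank, size;
  char buffer[PAYLOAD_SIZE];
  double start_time, elapsed;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  if (rank == 0) {
    start_time = MPI_Wtime();
    /* Start the round trip, then wait for the message to come back
     * around the ring. */
    MPI_Send(buffer, PAYLOAD_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
    MPI_Recv(buffer, PAYLOAD_SIZE, MPI_CHAR, size - 1, 0,
             MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    elapsed = MPI_Wtime() - start_time;
    printf("Round trip took %f seconds\n", elapsed);
  } else {
    /* Receive from the previous process, forward to the next. */
    MPI_Recv(buffer, PAYLOAD_SIZE, MPI_CHAR, rank - 1, 0,
             MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    MPI_Send(buffer, PAYLOAD_SIZE, MPI_CHAR, (rank + 1) % size, 0,
             MPI_COMM_WORLD);
  }

  MPI_Finalize();
  return 0;
}
```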
Say we want to simulate the execution of this program on a homogeneous cluster, such as the one we saw in the "XML Platforms" tab: cluster_crossbar.xml. We also need an "MPI host file", that is, a simple text file that lists all hosts on which we wish to start an MPI process: cluster_hostfile.txt.
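The host file format is the same as for regular MPI: one simulated host name per line, where each name must match a host defined in the platform file. For example (the names here are illustrative and depend on the prefix/suffix used in your platform file):

```
host-0.example.org
host-1.example.org
host-2.example.org
host-3.example.org
```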
Compiling the program is straightforward, using the smpicc compiler wrapper:
smpicc -O3 roundtrip.c -o roundtrip
Running (simulating) it using 16 processes on the cluster is done as follows:
smpirun -np 16 -hostfile ./cluster_hostfile.txt -platform ./cluster_crossbar.xml ./roundtrip
The -np 16 option, just like in regular MPI, specifies the number of MPI processes to use.
The -hostfile ./cluster_hostfile.txt option, just like in regular MPI, specifies the host file.
The -platform ./cluster_crossbar.xml option, which doesn't exist in regular MPI, specifies the platform configuration to be simulated.
Feel free to tweak the content of the XML platform file and the program to see the effect on the simulated execution time. Note that the simulation accounts for realistic network protocol effects and MPI implementation effects. As a result, you may see "unexpected behavior" just as in the real world (e.g., sending a message that is 1 byte larger may lead to significantly higher execution time).
SMPI is robust, but it still has some limitations, as listed (and further explained) below:
smpicc compiles your code and is fairly robust. But if you go crazy with tons of macros and C oddness,
smpicc may get confused. This should happen only in rather extreme cases, but you have now been warned.