Topic #0 Module: Hello World!

We are now ready to simulate the execution of an MPI program using SMPI. Let us use the following simple MPI program, roundtrip.c, in which the processes pass a message around a ring and rank 0 prints the elapsed time:

#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>  // gettimeofday(), struct timeval
#include <mpi.h>

#define N (1024 * 1024 * 1)

int main(int argc, char *argv[])
{
  int size, rank;
  struct timeval start, end;
  char hostname[256];
  int hostname_len;

  MPI_Init(&argc, &argv);

  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);
  MPI_Get_processor_name(hostname,&hostname_len);

  // Allocate a 1 MiB buffer (N bytes)
  char *buffer = malloc(sizeof(char) * N);


  // Communicate along the ring
  if (rank == 0) {
    // Rank 0 starts the ring and times the full round trip
    gettimeofday(&start, NULL);
    printf("Rank %d (running on '%s'): sending the message to rank %d\n", rank, hostname, 1);
    MPI_Send(buffer, N, MPI_BYTE, 1, 1, MPI_COMM_WORLD);
    MPI_Recv(buffer, N, MPI_BYTE, size - 1, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    printf("Rank %d (running on '%s'): received the message from rank %d\n", rank, hostname, size - 1);
    gettimeofday(&end, NULL);
    // Print the elapsed time in seconds
    printf("%f\n", (end.tv_sec * 1000000.0 + end.tv_usec -
                    start.tv_sec * 1000000.0 - start.tv_usec) / 1000000.0);
  } else {
    // Every other rank receives from its predecessor and forwards to its successor
    MPI_Recv(buffer, N, MPI_BYTE, rank - 1, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    printf("Rank %d (running on '%s'): received the message and sending it to rank %d\n", rank, hostname, (rank + 1) % size);
    MPI_Send(buffer, N, MPI_BYTE, (rank + 1) % size, 1, MPI_COMM_WORLD);
  }

  free(buffer);
  MPI_Finalize();
  return 0;
}
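
The elapsed-time expression printed by rank 0 can be easier to read when factored into a small helper. The following is a minimal sketch of that arithmetic only; the name elapsed_seconds is hypothetical and does not appear in roundtrip.c.

```c
#include <assert.h>
#include <sys/time.h>

/* Hypothetical helper (not part of roundtrip.c): seconds elapsed between
 * two gettimeofday() samples, equivalent to the expression printed by rank 0. */
static double elapsed_seconds(struct timeval start, struct timeval end)
{
  /* Coarse ordering check: 'end' must not be earlier than 'start'. */
  assert(end.tv_sec >= start.tv_sec);
  double sec  = (double)(end.tv_sec - start.tv_sec);
  double usec = (double)(end.tv_usec - start.tv_usec); /* may be negative */
  return sec + usec / 1000000.0;
}
```

With such a helper, the last printf in the rank 0 branch could become printf("%f\n", elapsed_seconds(start, end));. In an MPI program, the standard MPI_Wtime() function, which returns wall-clock seconds as a double, is a common alternative to gettimeofday().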

  

Say we want to simulate the execution of this program on a homogeneous cluster, such as the one we saw in the "XML Platforms" tab: cluster_crossbar.xml. We need an "MPI host file", that is a simple text file that lists all hosts on which we wish to start an MPI process: cluster_hostfile.txt.
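
If you need a host file for a different number of hosts, it can be generated rather than typed by hand. The sketch below assumes one host name per line and assumes the hosts are named by a common prefix, an index, and a common suffix; the function name write_hostfile and the example names used with it are hypothetical. The names that end up in the file must match the hosts declared in your platform file.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write 'count' host names to 'out', one per line,
 * as <prefix><index><suffix> with index starting at 0.
 * Returns 0 on success and -1 if writing fails. */
static int write_hostfile(FILE *out, const char *prefix,
                          const char *suffix, int count)
{
  assert(out != NULL && prefix != NULL && suffix != NULL && count > 0);
  for (int i = 0; i < count; i++) {
    if (fprintf(out, "%s%d%s\n", prefix, i, suffix) < 0)
      return -1;
  }
  return 0;
}
```

For instance, calling write_hostfile(f, "node-", ".example.org", 16) on a file opened for writing would produce 16 lines, from node-0.example.org to node-15.example.org. These names are placeholders, not the ones used in cluster_crossbar.xml.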

Compiling the program is straightforward:

  % smpicc -O4 roundtrip.c -o roundtrip

Running (simulating) it using 16 hosts on the cluster is done as follows:

  % smpirun -np 16 -hostfile ./cluster_hostfile.txt -platform ./cluster_crossbar.xml ./roundtrip

The -np 16 option, just like in regular MPI, specifies the number of MPI processes to use.
The -hostfile ./cluster_hostfile.txt option, just like in regular MPI, specifies the host file.
The -platform ./cluster_crossbar.xml option, which doesn't exist in regular MPI, specifies the platform configuration to be simulated.
At the end of the command, one finds the executable name and command-line arguments (in this case there are none).

You will see some warnings/information regarding setting some SMPI configuration parameters. Ignore them for now. One of them will say something about not setting the power of the host running the simulation. This is fine here because in this small example we wish only to simulate the network.

Feel free to tweak the content of the XML platform file and the program to see the effect on the simulated execution time. Note that the simulation accounts for realistic network protocol effects and MPI implementation effects. As a result, you may see "unexpected behavior" as in the real world (e.g., sending a message that is 1 byte larger may lead to a significantly higher execution time).
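
To build some intuition for why one extra byte can matter, here is a toy model. It is not SMPI's actual model. It only assumes that an MPI implementation pays an extra fixed cost once a message exceeds a size threshold, as commonly happens when an implementation switches from an eager protocol to a rendezvous protocol. The struct, the function name, and every number used with them are illustrative assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model: all values stored here are illustrative assumptions,
 * not parameters taken from SMPI or from a real MPI implementation. */
struct toy_model {
  size_t threshold_bytes; /* messages larger than this pay the handshake */
  double latency_s;       /* fixed cost paid by every message */
  double handshake_s;     /* extra cost paid only above the threshold */
  double bandwidth_Bps;   /* bytes per second */
};

/* Estimated transfer time in seconds for a message of 'bytes' bytes. */
static double toy_transfer_time(const struct toy_model *m, size_t bytes)
{
  assert(m != NULL && m->bandwidth_Bps > 0.0);
  double t = m->latency_s + (double)bytes / m->bandwidth_Bps;
  if (bytes > m->threshold_bytes)
    t += m->handshake_s; /* discontinuity at the threshold */
  return t;
}
```

With hypothetical values such as a 65536-byte threshold, a 50-microsecond handshake, and a bandwidth of 1.25e8 bytes per second, going from 65535 to 65536 bytes adds only a few nanoseconds, while going from 65536 to 65537 bytes adds roughly 50 microseconds. The jumps you observe in the simulation depend on the configured platform and on SMPI's own models.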