$ cp -r /share/apps/samples/lammpi-c++ ./
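// Filename: mpi_hello.cpp
// Description: A parallel hello world program
#include <iostream>
#include <mpi.h>
#include <cstdlib>   // std::system

int main(int argc, char *argv[])
{
    // Start MPI; this must come before any other MPI call.
    MPI::Init(argc, argv);

    int rank = MPI::COMM_WORLD.Get_rank();  // ID of this process
    int size = MPI::COMM_WORLD.Get_size();  // total number of processes

    // Run a short external command and print its exit status.
    std::cout << "Returned: " << std::system("sleep 2") << " ";
    std::cout << "Hello World! I am " << rank << " of " << size << std::endl;

    // Shut MPI down before exiting.
    MPI::Finalize();
    return 0;
}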
$ module load lam-intel
(Loads the lammpi/intel-12 module)
$ module load lammpi/intel-11
$ module load lammpi/intel-12
$ module load lammpi/gnu
$ module load lam-gnu
(Loads the lammpi/gnu module)
$ mpiCC -o lam-hello mpi_hello.cpp
#PBS -N lam-hello
#PBS -q @nic-cluster.mst.edu
#PBS -l nodes=2
#PBS -l walltime=01:00:00
#PBS -m abe
#PBS -M joeminer@mst.edu
#PBS -V
cd "$PBS_O_WORKDIR"
mpiexec -v -boot -n 2 ./lam-hello
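The job script above hard-codes the process count (`-n 2`) to match `nodes=2`, so the two values have to be kept in sync by hand. A minimal sketch of deriving the count instead, assuming the scheduler exports `PBS_NODEFILE` with one line per allocated slot (the usual PBS/Torque behaviour, but not confirmed for this cluster). The helper name `count_slots` is hypothetical.

```shell
# Hypothetical helper: count allocated slots listed in a PBS node file.
# Assumption: the file holds one host name per slot, one per line.
count_slots() {
    # $1: path to the node file; grep -c . counts the non-empty lines.
    grep -c . "$1"
}

# Possible use inside the job script (assumes PBS_NODEFILE is set):
#   NP=$(count_slots "$PBS_NODEFILE")
#   mpiexec -v -boot -n "$NP" ./lam-hello
```

With this in place, changing `#PBS -l nodes=...` would not require touching the `mpiexec` line.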
$ qsub jobfile.job
mpiexec: Booting lam..
Running: lamboot -v
LAM 7.1.4/MPI 2 C++/ROMIO - Indiana University
mpiexec: Lamboot Complete
mpiexec: Launching MPI programs
Running: mpirun -v /tmp/lam_appschema_5Q0gj7
22678 ./lam-hello running on n0 (o)
15318 ./lam-hello running on n1
Returned: 0 Hello World! I am 1 of 2
Returned: 0 Hello World! I am 0 of 2
mpiexec: MPI program execution over..
mpiexec: Performing Lamhalt
LAM 7.1.4/MPI 2 C++/ROMIO - Indiana University
Shutting down LAM
hreq: received HALT_ACK from n1 (compute-0-1.local)
hreq: received HALT_ACK from n0 (compute-0-2.local)
lamhalt: sleeping to wait for lamds to die
LAM halted
mpiexec: Lamhalt complete
--------------------------------------------------------------------------
Failed to find the following executable:
Host: compute-0-2.local
Executable: -boot
Cannot continue.
--------------------------------------------------------------------------
mpiexec: Booting lam..
Running: lamboot -v
LAM 7.1.4/MPI 2 C++/ROMIO - Indiana University
mpiexec: Lamboot Complete
mpiexec: Launching MPI programs
Running: mpirun -v /tmp/lam_appschema_wrR54i
12722 ./broken-lam running on n0 (o)
5600 ./broken-lam running on n1
Returned: 0 Hello World! I am 0 of 1
LAM 7.1.4/MPI 2 C++/ROMIO - Indiana University
Shutting down LAM
hreq: received HALT_ACK from n1 (compute-0-1.local)
hreq: received HALT_ACK from n0 (compute-0-2.local)
lamhalt: sleeping to wait for lamds to die
LAM halted
mpiexec: Lamhalt complete
n-1<12717> ssi:boot:base:linear_windowed: booting n0 (compute-0-2.local)
n-1<12717> ssi:boot:base:linear_windowed: booting n1 (compute-0-1.local)
-----------------------------------------------------------------------------
It seems that [at least] one of the processes that was started with
mpirun did not invoke MPI_INIT before quitting (it is possible that
more than one process did not invoke MPI_INIT -- mpirun was only
notified of the first one, which was on node n0).
mpirun can *only* be used with MPI programs (i.e., programs that
invoke MPI_INIT and MPI_FINALIZE). You can use the "lamexec" program
to run non-MPI programs over the lambooted nodes.
-----------------------------------------------------------------------------
mpirun failed with exit status 252
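The logs above show three distinct failure signatures: the launcher treating `-boot` as an executable name, a process reporting itself as `0 of 1` even though two nodes were booted, and `mpirun` complaining that `MPI_INIT` was never invoked. The sketch below classifies a saved job log by those strings. The function name and the labels it prints are hypothetical, and the patterns are taken only from the sample output shown here, so treat it as a starting point rather than a complete diagnostic.

```shell
# Hypothetical helper: classify a saved job log by the message patterns
# that appear in the sample outputs above.
diagnose_lam_log() {
    # $1: path to the job's output file.
    if grep -q 'Failed to find the following executable' "$1"; then
        # The launcher did not understand an option such as -boot.
        echo "launcher-rejected-argument"
    elif grep -q 'did not invoke MPI_INIT' "$1"; then
        # mpirun started a program that never initialised LAM's MPI.
        echo "no-mpi-init"
    elif grep -q 'I am 0 of 1$' "$1"; then
        # A process saw a world of size 1.
        echo "single-process-world"
    elif grep -q 'Hello World! I am' "$1"; then
        echo "ok"
    else
        echo "unknown"
    fi
}
```

A call might look like `diagnose_lam_log lam-hello.o1234`, where the file name is a made-up example of a PBS output file. The first two labels often point to the launcher or the binary coming from a different MPI installation than the loaded module, though that interpretation is an assumption and should be checked with `module list` and `which mpiexec`.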