Table of contents

BASIC
How can I get QCG-OMPI?
Is there any howto available?

INSTALLATION
How can I compile / install QCG-OMPI?
I cannot find the configure script, where is it?
What software do I need to have installed on my computer?
Configuration fails with ...

USING
How do I start my application?
mpirun or mpiexec won't work, why?
mpiexec crashes with a SEGFAULT in OOB TCP (persistent services)

ADVANCED
Why do I have to change the range of ports used between two consecutive deployments of the grid infrastructure?
What inter-cluster communication techniques are implemented, and how can I choose the one I want to use?
How do I specify the number of processes I want to run? How do I specify the number of processes I want to execute on a given machine?

TROUBLESHOOTING
A page here is dedicated to finding why your job failed.

How can I get QCG-OMPI?
Get it from the SVN. The current version is in the directory named "INRIA". Other directories named "INRIA-*" contain previous versions.
Is there any howto available?
Yes, you can find it in the doc/ directory of your QCG-OMPI directory. It is a LaTeX file: build it with pdflatex, latex, or any other tool that produces a document from LaTeX sources.
How can I compile / install QCG-OMPI?
Everything is explained in the documentation that comes with QCG-OMPI, in the doc/ directory.
I cannot find the configure script, where is it?
As explained in the installation documentation ("Howto"), you need to generate it with autogen.sh.
What software do I need to have installed on my computer?
You will need:
Of course, all of those programs need to be in your PATH, and the libraries must be in your LD_LIBRARY_PATH.
Configuration fails with ...
With what? TODO
How do I start my application?
You have to use a deployer that deploys the grid infrastructure, sets up some parameters and starts the MPI application with the appropriate parameters. Please read the documentation: it is short, and the whole procedure is described there.
mpiexec crashes with a SEGFAULT in OOB TCP (persistent services)
If you get the following error:
[qcg:22817] *** Process received signal ***
[qcg:22817] Signal: Segmentation fault (11)
[qcg:22817] Signal code: Address not mapped (1)
[qcg:22817] Failing at address: 0x8
[qcg:22817] [ 0] [0xffffe440]
[qcg:22817] [ 1] /home/inria/persistent/lib/openmpi/mca_oob_tcp.so [0xb7c6593b]
[qcg:22817] [ 2] /home/inria/persistent/lib/openmpi/mca_oob_tcp.so(mca_oob_tcp_peer_resolved+0x43) [0xb7c67b98]
[qcg:22817] [ 3] /home/inria/persistent/lib/openmpi/mca_oob_tcp.so(mca_oob_tcp_resolve+0x54) [0xb7c5e315]
[qcg:22817] [ 4] /home/inria/persistent/lib/openmpi/mca_oob_tcp.so(mca_oob_tcp_peer_send+0x86) [0xb7c637e2]
[qcg:22817] [ 5] /home/inria/persistent/lib/openmpi/mca_oob_tcp.so(mca_oob_tcp_send+0x537) [0xb7c6a627]
[qcg:22817] [ 6] /home/inria/persistent/lib/libopen-rte.so.0(mca_oob_send_packed+0x9b) [0xb7f26e80]
[qcg:22817] [ 7] /home/inria/persistent/lib/openmpi/mca_ns_proxy.so(orte_ns_proxy_create_my_name+0x1e3) [0xb7cacbee]
[qcg:22817] [ 8] /home/inria/persistent/lib/openmpi/mca_sds_singleton.so(orte_sds_singleton_set_name+0x1d) [0xb7c067b1]
[qcg:22817] [ 9] /home/inria/persistent/lib/libopen-rte.so.0(orte_sds_base_set_name+0x37) [0xb7f4fe87]
[qcg:22817] [10] /home/inria/persistent/lib/libopen-rte.so.0(orte_init_stage1+0x488) [0xb7ef6268]
[qcg:22817] [11] /home/inria/persistent/lib/libopen-rte.so.0(orte_system_init+0x24) [0xb7efb2fc]
[qcg:22817] [12] /home/inria/persistent/lib/libopen-rte.so.0(orte_init+0x7b) [0xb7ef5b7f]
[qcg:22817] [13] mpiexec(orterun+0x20f) [0x804a28f]
[qcg:22817] [14] mpiexec(main+0x22) [0x804a076]
[qcg:22817] [15] /lib/i686/cmov/libc.so.6(__libc_start_main+0xe0) [0xb7cf4450]
[qcg:22817] [16] mpiexec [0x8049ff1]
[qcg:22817] *** End of error message ***
Segmentation fault
you may have to restart the grid infrastructure.
Why do I have to change the range of ports used between two consecutive deployments of the grid infrastructure?
Because the ports used by the grid infrastructure components in the previous execution may still be in the TIME_WAIT state, which makes them unusable until that state expires.
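To check in advance whether a candidate range of ports is usable, one option is to try binding each port. The sketch below is only an illustration and is not part of QCG-OMPI: the function names port_is_free and first_free_range are hypothetical, and it assumes Linux-like behaviour, where binding without SO_REUSEADDR fails with EADDRINUSE while the port is listening or still has connections in TIME_WAIT.

```python
import errno
import socket


def port_is_free(port, host=""):
    """Return True if a TCP socket can be bound to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # SO_REUSEADDR is deliberately left unset, so that a port which is
        # still held (listening, or lingering in TIME_WAIT) counts as busy.
        try:
            sock.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return False
            raise  # e.g. permission denied on a privileged port
        return True


def first_free_range(start, count, stop=65536, host=""):
    """Return the first base port in [start, stop) with `count` consecutive
    free ports, or None if no such range exists."""
    base = start
    while base + count <= stop:
        busy = [p for p in range(base, base + count)
                if not port_is_free(p, host)]
        if not busy:
            return base
        base = busy[-1] + 1  # restart the search just past the last busy port
    return None
```

For example, first_free_range(20000, 50) would return the base of the first block of 50 bindable ports at or above 20000; the value 20000 is an arbitrary example, not a QCG-OMPI default. A port that is free at the time of the check can still be taken by another process before the infrastructure starts, so treat the result as a hint.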
What inter-cluster communication techniques are implemented, and how can I choose the one I want to use?
Three techniques are implemented: direct connection, the proxy method and TCP traversing. The method is set in the configuration file of the broker: use 1 to select direct connection, 2 to select the proxy method and 3 to select TCP traversing.
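If you generate or validate broker configuration files from a script, the three numeric codes can be given readable names. This is a minimal sketch: the names InterClusterMethod and parse_method are hypothetical and do not come from QCG-OMPI, and only the mapping 1, 2, 3 is taken from the answer above.

```python
from enum import IntEnum


class InterClusterMethod(IntEnum):
    """Numeric codes accepted by the broker configuration (names are illustrative)."""
    DIRECT_CONNECTION = 1
    PROXY = 2
    TCP_TRAVERSING = 3


def parse_method(value):
    """Convert a raw configuration value such as "2" into an InterClusterMethod."""
    try:
        return InterClusterMethod(int(value))
    except ValueError:
        # Raised both for non-numeric text and for numbers other than 1, 2, 3.
        raise ValueError(
            f"unknown inter-cluster method {value!r}; expected 1, 2 or 3"
        ) from None
```

Rejecting unknown values early gives a clearer message than letting a mistyped code reach the broker.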
How do I specify the number of processes I want to run on a single machine?
Two solutions:
