I am attempting to run on multiple nodes in my organization's high-performance computing cluster. I know this requires enabling MPI. I loaded Open MPI version 4.1.4 (ompi/4.1.4) and then ran:
make mpi=1
This seems to stall: it never progresses or errors out, no matter how long I wait. Is it possible this is an issue with the Open MPI version I use, or with my approach? The output is shown in the attached image.
Note: my organization's HPC only provides ompi/4.1.0 - 6. I have tried some other versions, with no different results.
Hello @Ethan_Evans
I seem to recall hearing of a similar issue in the past, but I can't find a report about it, so I'm afraid we'll have to do some debugging. The first thing to check is whether at least make show works and, if so, what its output is. If make show works, the next thing to try is make echo=1 plain=1 -j1 to see where the build stalls.
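Since the build hangs rather than failing, it can help to run it under a time limit and keep a log, so that the last lines written before the stall are easy to find. The following is only a sketch: run_with_limit is a hypothetical helper name, and it assumes the timeout utility from GNU coreutils is available on the cluster.

```shell
# Hypothetical helper: run a command with a time limit and log its output.
# Assumes GNU coreutils `timeout`, which exits with status 124 on timeout.
run_with_limit() {
    limit=$1
    log=$2
    shift 2
    # Send both stdout and stderr of the command to the log file.
    timeout "$limit" "$@" > "$log" 2>&1
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "timed out after ${limit}s; last lines of $log:"
        # The last logged lines show what was running when the build stalled.
        tail -n 5 "$log"
    fi
    return "$status"
}
```

For example, run_with_limit 300 build.log make echo=1 plain=1 -j1 would stop the build after five minutes and print the tail of build.log.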
Thanks for the help @giuseppe.bilotta
make show does appear to work, and I attached the output below.
Then running make echo=1 plain=1 -j1 produces the following output, which stalls:
Appreciate the help and let me know if I can try anything else!
P.S. I know @AlirezaZarei will be interested in the results as well
OK, one issue I'm seeing is that CXX is being set to the MPI compiler. I suspect this is the reason for the stall. Did you set CXX manually, or was it autodetected that way?
It must be autodetected that way. I have done nothing to set CXX. Should I set it to something else?
You can override the autodetection by adding something like CXX=g++ to Makefile.local, and see if this solves the problem.
We should also try to understand why it's being detected this way. What does command -v c++ report on that system?
Here is what command -v c++ gives:
If it is helpful, this is what I currently have pertaining to CXX in Makefile.local:
CXXFLAGS=-std=c++14 -march=native
CPPFLAGS += -g
Also: adding CXX=g++ to Makefile.local works. It now compiles successfully!
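For anyone applying the same workaround from a script, the override could be appended only when the file does not already set CXX. This is a minimal sketch, not part of the project's tooling: set_cxx_override is a hypothetical helper name, and the pattern only looks for a plain CXX assignment (it deliberately does not match CXXFLAGS).

```shell
# Hypothetical helper: append "CXX=<compiler>" to a Makefile fragment
# unless the file already contains an assignment to CXX.
set_cxx_override() {
    file=$1
    compiler=$2
    # Match "CXX =", "CXX:=", "CXX?=", "CXX+=" but not "CXXFLAGS=".
    if grep -Eq '^[[:space:]]*CXX[[:space:]]*[:?+]?=' "$file" 2>/dev/null; then
        echo "$file already sets CXX; leaving it unchanged"
    else
        printf 'CXX=%s\n' "$compiler" >> "$file"
    fi
}
```

Calling set_cxx_override Makefile.local g++ would add the line used above and leave the existing CXXFLAGS and CPPFLAGS lines untouched.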
Interesting. So something is setting the default value of CXX for make to the MPI compiler, probably when the MPI module is loaded.
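One way to test that hypothesis is to compare what the shell environment and make itself report for CXX before and after loading the module. The sketch below assumes GNU make (for the origin function and for reading a makefile from standard input), and it assumes the module would set CXX through the environment, which has not been confirmed in this thread. The function names cxx_from_env and cxx_from_make are made up for illustration.

```shell
# Print the CXX value found in the environment, or a placeholder if unset.
cxx_from_env() {
    printf '%s\n' "${CXX:-<unset>}"
}

# Ask GNU make how it sees CXX, without reading any project Makefile.
# $(origin CXX) reports where the value comes from, for example
# "default" (make's built-in value) or "environment".
cxx_from_make() {
    printf 'all:\n\t@echo "$(origin CXX): $(CXX)"\n' | make -s -f -
}
```

If cxx_from_make reports an environment origin after module load ompi/4.1.4 but a default origin in a fresh shell, the module is the likely source. On systems using Environment Modules or Lmod, module show ompi/4.1.4 should list the variables the module sets.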