Running Molcas on a Linux Cluster


Posted by Jose C. Corchado on January 16, 2003 at 12:57:29:

Hi,

I am trying to install Molcas 5.2 on a Red Hat 7.2 Linux cluster with the Portland compilers (Portland's Cluster Development Kit). It works fine on one processor, but when I try to run it on two or more processors, I get the following error:

-------- Extract of the output file

ARMCI configured for 2 cluster nodes
-10001(s):armci_rcv_req: invalid to: 55535
-10001(s):armci_rcv_req: invalid to: 55535
1:Child process terminated prematurely, status=: 256
rm_l_1_4949: p4_error: net_recv read: probable EOF on socket: 1
Last System Error Message from Task -10001:: Resource temporarily unavailable
1:Child process terminated prematurely, status=: 256
Last System Error Message from Task 1:: Resource temporarily unavailable
-10000(s):armci_rcv_req: invalid to: 55536
-10000(s):armci_rcv_req: invalid to: 55536
Last System Error Message from Task -10000:: Resource temporarily unavailable
0:Child process terminated prematurely, status=: 256
0:Child process terminated prematurely, status=: 256
Last System Error Message from Task 0:: Resource temporarily unavailable
/usr/local/PGI/linux86/bin/mpirun: line 1: 29562 Broken pipe /home/corchado/molcas52/bin/seward.exe -p4pg /home/corchado/tmp/test001.input.29354/PI29485 -p4wd /home/corchado/tmp/test001.input.29354
--- Stop Module: seward at Thu Jan 16 12:14:38 CET 2003 /rc=98 ---
--- Stop Module: seward at Thu Jan 16 12:14:38 CET 2003 /rc=98 ---
bm_list_29563: p4_error: net_recv read: probable EOF on socket: 1
Non-zero return code - check program input.
--- Stop Module: automolcas at Thu Jan 16 12:14:38 CET 2003 /rc=98 ---
--- Stop Module: automolcas at Thu Jan 16 12:14:38 CET 2003 /rc=98 ---

-------- END

I have tried running it both in a directory local to each node and in an NFS-shared directory, but neither worked. When the working directory is not NFS-shared, I also get a missing-file error.
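
Whether the working directory is actually visible from every node can be checked with something like the sketch below. It is only an illustration: the function name check_workdir_on_nodes and the machines file (one host name per line) are hypothetical, and it assumes a remote shell such as rsh or ssh that works without a password.

```shell
#!/bin/sh
# Hypothetical helper: check that a directory exists on every host listed
# in a machines file (one host name per line, '#' starts a comment).
# Usage: check_workdir_on_nodes MACHINES_FILE WORKDIR [REMOTE_SHELL]
check_workdir_on_nodes() {
    machines_file=$1
    workdir=$2
    remote_shell=${3:-rsh}   # assumption: rsh (or ssh) needs no password
    status=0
    while read -r host _rest; do
        case $host in
            ''|'#'*) continue ;;   # skip blank lines and comments
        esac
        # </dev/null keeps the remote shell from swallowing the host list.
        if $remote_shell "$host" test -d "$workdir" </dev/null; then
            echo "$host: ok"
        else
            echo "$host: MISSING $workdir"
            status=1
        fi
    done < "$machines_file"
    return $status
}
```

For example, `check_workdir_on_nodes machines /home/corchado/tmp ssh` would print one line per node and return non-zero if the directory is missing anywhere.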
Below is the Symbols file that configure generated. I had to edit it slightly to get it to run on my cluster:

-------- Symbols file

# Molcas build symbols generated by ./configure on Fri Jan 10 18:43:27 CET 2003 for MOLCAS version 5.2 patch level 138.

# ./configure options, DO ONLY CHANGE BY RERUNNING CONFIGURE.
OS='Linux'
COMPILER='portland'
FAST='yes'
PARALLEL='yes'
MSGPASS='mpich'

# Machine.
HW='i686'

# Standard commands.
SH='/bin/ksh'
MAKE='/usr/bin/gmake'
CP='/bin/cp'
MV='/bin/mv'
RM='/bin/rm'
LS='/bin/ls'
AWK='/usr/bin/awk'
SED='/bin/sed'
GREP='/bin/grep'
CHMOD='/bin/chmod'
FIND='/usr/bin/find'
MKDIR='/bin/mkdir'
LN='/bin/ln'
SOFTLINK='-L'
WC='/usr/bin/wc'
MORE='/bin/more'
CAT='/bin/cat'
AR='/usr/bin/ar'
TIME='/usr/bin/time'
RANLIB='/usr/bin/ranlib'
LATEX='/usr/bin/latex'
DVIPS='/usr/bin/dvips'
MAKEINDEX='/usr/bin/makeindex'
BIBTEX='/usr/bin/bibtex'

# Compilers.
CPP='/usr/bin/cpp'
CPPFLAGS='-P -C -D_LINUX_ -D_MOLCAS_MPP_ -D_HAVE_UNISTD_ -I. -I${INCDIR} -I${GAINC}'
F77='/usr/local/PGI/linux86/bin/pgf77'
F77FLAGS='-fast -Minform,warn -Minfo=loop -Munixlogical -I/home/corchado/MOLCAS5/OTRO/molcas52/include -D_LINUX_ -D_MOLCAS_MPP_ -D_HAVE_UNISTD_ -I. -I${INCDIR} -I${GAINC}'
F77NOWARN=' '
F77STATIC=' '
F90=''
F90FLAGS=''
FPREPROC='F'
CC='/usr/local/PGI/linux86/bin/pgcc'
CFLAGS='-O2 -w -D_LINUX_ -D_MOLCAS_MPP_ -D_HAVE_UNISTD_ -I. -I${INCDIR} -I${GAINC}'
LDFLAGS=''

# External libraries.
XLIB='-llapack -lblas'

# Molcas.
MOLCAS='/home/corchado/MOLCAS5/OTRO/molcas52'
INCDIR='/home/corchado/MOLCAS5/OTRO/molcas52/include'
PRGM_LIST=' seward scf rasscf mipi check alaska caspt2 casvb cpfmcpf ffpt funi genano grid_it guga mbpt2 mckinley mclr motra mrci rasread rassi slapaf vibrot '
UTIL_LIST=' aces2_util amfi_util blas_util casvb_util clones_util dtraf_util essl_util integral_util io_util lapack_util memory_util molcas_ci_util molpro_util nq_util parallel_util pcm_util property_util runfile_util rys_util util '
MANUALS='manual'
MOLCASDRIVER='/home/corchado/bin'

# Global arrays.
GADIR='/home/corchado/molcas52/g'
GAINC='/home/corchado/molcas52/g/include'
GALIB='-L/home/corchado/molcas52/g/lib/LINUX -lma -ltcgmsg-mpi -lglobal -larmci -lpario -L/usr/local/PGI/linux86/lib -lmpich'
GATARGET='LINUX'
GAOPTIONS='MPI_LIB=/usr/local/PGI/linux86/lib MPI_INCLUDE=/usr/local/PGI/linux86/include LIBMPI=-lmpich USE_MPI=yes CC=pgcc FC=pgf77'

# Commands for running executables.
RUNSCRIPT='$program < $input'
RUNBINARY='/usr/local/PGI/linux86/bin/mpirun -np $CPUS $program'

-------- END
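
Since the file above was edited by hand, a quick check that the directories it names really exist may be useful. The following is a minimal sketch, assuming the KEY='value' layout shown above; the function name check_symbols_dirs is hypothetical and not part of Molcas.

```shell
#!/bin/sh
# Hypothetical helper: report whether the directories named by selected
# keys in a Symbols-style file (KEY='value' lines) exist on this machine.
# Usage: check_symbols_dirs SYMBOLS_FILE KEY [KEY ...]
check_symbols_dirs() {
    symbols_file=$1
    shift
    for key in "$@"; do
        # Extract the text between the single quotes on the KEY='...' line.
        value=$(sed -n "s/^${key}='\(.*\)'\$/\1/p" "$symbols_file" | head -n 1)
        if [ -z "$value" ]; then
            echo "$key: not set"
        elif [ -d "$value" ]; then
            echo "$key: ok $value"
        else
            echo "$key: MISSING $value"
        fi
    done
}
```

It could be run as `check_symbols_dirs Symbols MOLCAS INCDIR GADIR GAINC` on each node to compare the results.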

I also tried p4_ch as the message-passing method, but that did not work either, even though the GA test worked.

Any suggestions would be greatly appreciated.

Thank you very much for your help.
