MOLCAS manual:




9.1 Tailoring

MOLCAS, as shipped, is configured with default settings, some of which you can easily change. Defaults such as memory usage, the default scratch area, and the policy for saving files are set in a MOLCAS resource file: either the global resource file $MOLCAS/molcasrc or the user resource file $HOME/.Molcas/molcasrc.
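Before editing a resource file it can help to see which of the two locations actually exist on your system. The following is a minimal sketch; the function name list_molcasrc is hypothetical and not part of MOLCAS, and the script only tests for the two paths given above.

```shell
# Hypothetical helper (not part of MOLCAS): report which of the two
# resource files named in the text are present.
list_molcasrc() {
    for rc in "${MOLCAS:-}/molcasrc" "$HOME/.Molcas/molcasrc"; do
        if [ -f "$rc" ]; then
            echo "found:   $rc"
        else
            echo "missing: $rc"
        fi
    done
}

list_molcasrc
```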


9.1.1 Dynamic memory

Most modules in MOLCAS utilize dynamic memory allocation. The amount of memory each module allocates is controlled by the environment variable MOLCAS_MEM as follows:

  • MOLCAS_MEM is undefined -- 1024MB of memory is allocated (on a 32-bit installation)
  • MOLCAS_MEM=nn -- nnMB is allocated. If this amount cannot be allocated, the module stops.
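As an illustration, the variable can be set in the shell before a calculation is started. The value 2000 is only an example, not a recommendation.

```shell
# Request 2000 MB of dynamic memory per module (example value only).
export MOLCAS_MEM=2000

# Show the setting that a subsequently started module would see.
if [ -n "${MOLCAS_MEM:-}" ]; then
    echo "MOLCAS_MEM is set to ${MOLCAS_MEM} MB"
else
    echo "MOLCAS_MEM is undefined; the built-in default applies"
fi
```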


9.1.2 Disk usage

Today many workstations utilize 64-bit integers and addressing. However, older UNIX workstations and PCs had 32-bit integers, resulting in a file size limit of 2GB. To circumvent this limitation, the I/O routines of MOLCAS support multifile files, where a "file" is in reality a logical file consisting of several physical files. The size limit of these physical files is controlled by the environment variable MOLCAS_DISK as follows:

  • MOLCAS_DISK is undefined -- The modules limit each physical file to 2GB. This might be the appropriate setting for machines with 32-bit addressing.
  • MOLCAS_DISK=nn -- The modules limit each physical file to nnMB.
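To get a feeling for the effect of MOLCAS_DISK, the sketch below estimates how many physical files a logical file of a given size would occupy. The helper physical_files is hypothetical, and the estimate assumes that each physical file is filled up to the limit before the next one is started.

```shell
# Limit each physical file to 1000 MB (example value only).
export MOLCAS_DISK=1000

# Hypothetical helper: number of physical files needed for a logical file
# of the given size in MB, computed by ceiling division.
physical_files() {
    logical_mb=$1
    chunk_mb=${2:-$MOLCAS_DISK}
    echo $(( (logical_mb + chunk_mb - 1) / chunk_mb ))
}

# A 3500 MB logical file split into pieces of at most 1000 MB.
physical_files 3500
```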

To use files larger than 2GB, MOLCAS should be compiled as a 64-bit executable.

9.1.3 Improving CPU performance

MOLCAS is shipped with a number of default setup files located in the directory cfg/. The defaults in these files are set to a fairly safe level, but are not necessarily optimal. To improve performance you can change:

  • Compiler flags
  • Mathematical (BLAS) libraries

The simplest way to set the optimization level and/or compile MOLCAS with various BLAS libraries is to use configure -setup. This interactive script helps you select flags that improve MOLCAS performance.

If you do decide to try to improve the performance, we recommend that you create a new setup file, for example cfg/local.cfg, and modify this file. It is not unlikely that your attempts to optimize the code will lead to a case where some modules work and others do not. In such a scenario it can be fruitful to have two copies of MOLCAS: one "safe" copy where all modules work and one "fast" copy where some modules do not function properly.

Changing the compiler flags is the easiest option. Using the most aggressive optimization flags does sometimes lead to problems for some of the modules. We have tried to choose an optimization level that yields functioning code that is still reasonably fast. For some systems there is a predefined set of compiler flags for aggressive optimization. To compile MOLCAS with these flags, run configure with the flag -speed fast. Note that this aggressive optimization level is not supported by the MOLCAS team. In other words, you are using it at your own risk.

For some platforms you can utilize the vendor BLAS libraries. This will certainly yield better performance, but may not work on all platforms.

During configuration of MOLCAS it is possible to specify an external BLAS/LAPACK library. Use the flag -blas TYPE to specify the type of BLAS library: lapack (for a standard LAPACK library), Goto (for GotoBLAS), Atlas (for ATLAS), or MKL (for Intel MKL). You should also pass the link options with the flag -blas_lib, in the form

-blas_lib -Wl,-start-group -L/path/to/blas -lmy-blas -Wl,-end-group

For example, to configure MOLCAS with the Intel MKL library, issue the command

./configure -compiler intel -blas MKL -blas_lib -Wl,-start-group -L/opt/intel/mkl/lib/intel64 -lmkl_gf_ilp64 -lmkl_sequential -lmkl_core -Wl,-end-group

To compile MOLCAS with the CUDA BLAS library, first compile the Fortran wrapper provided by NVIDIA and then run configure:

CUDA=/path/to/cuda/
FLAGS=-m64
gcc $FLAGS -I$CUDA/include/ -I$CUDA/src/ -c $CUDA/src/fortran_thunking.c -o \
$MOLCAS/lib/fortran.o
./configure -blas CUDA -blas_dir $CUDA/lib
or, on a 64-bit system:
./configure -blas CUDA -blas_dir $CUDA/lib64

After making changes to the setup files you have to issue the commands make veryclean, ./configure and make in the MOLCAS root directory. It is highly recommended to run the verification suite after any change to the configuration.
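The rebuild sequence can be wrapped in a small shell function so that a failing step stops the remaining ones. This is only a sketch: the function name rebuild_molcas and the DRY_RUN switch are hypothetical conveniences, not part of MOLCAS. Any arguments after the root directory are passed on to configure.

```shell
# Hypothetical helper (not part of MOLCAS): rebuild after a change to the
# setup files. Usage: rebuild_molcas ROOT [configure options...]
# With DRY_RUN=1 the commands are printed instead of executed.
rebuild_molcas() {
    root=$1
    shift
    run=""
    if [ "${DRY_RUN:-0}" = "1" ]; then
        run="echo"
    fi
    (
        # Work inside the MOLCAS root directory; stop at the first failure.
        cd "$root" || exit 1
        $run make veryclean &&
        $run ./configure "$@" &&
        $run make
    )
}
```

For example, DRY_RUN=1 rebuild_molcas "$MOLCAS" -speed fast prints the three commands without running them, which is a cheap way to check the options before starting a long build.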


9.1.4 Improving I/O performance

To activate the FiM technology, which places a MOLCAS scratch file in RAM, you need to do three things. First, edit the external resource file *.prgm of the module concerned (for example, $MOLCAS/data/seward.prgm) in the $MOLCAS/data/ directory. If you do not have write access to the MOLCAS root directory, you can simply copy the needed resource file into your $HOME/.Molcas/ directory and edit it there. The edit consists of adding the 'e' character to the attributes of the file:

original: (file) ORDINT "$WorkDir/$Project."OrdInt rw*
modified: (file) ORDINT "$WorkDir/$Project."OrdInt rw*e

Second, you need to set the MOLCAS_FIM environment variable to 1, i.e.:

export MOLCAS_FIM=1

The third and final step is to specify the MOLCAS_MAXMEM parameter (which must be ≥ MOLCAS_MEM) such that the difference MOLCAS_MAXMEM - MOLCAS_MEM (in MW) is sufficient to host an entire file in RAM. In other words, this difference should exceed the original file size.
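The arithmetic behind the third step can be sketched as follows. The helper fim_maxmem and the safety margin are hypothetical, all numbers are examples, and the sketch assumes that the file size, the margin and both variables are expressed in the same unit; check this against your installation.

```shell
# Hypothetical helper: a MOLCAS_MAXMEM value that exceeds MOLCAS_MEM by the
# size of the file to be held in RAM plus a safety margin.
# Assumption: all three quantities use the same unit.
fim_maxmem() {
    mem=$1
    file_size=$2
    margin=${3:-100}
    echo $(( mem + file_size + margin ))
}

export MOLCAS_MEM=2000      # example value
export MOLCAS_FIM=1         # switch FiM on, as described above
export MOLCAS_MAXMEM=$(fim_maxmem "$MOLCAS_MEM" 1500)
echo "MOLCAS_MAXMEM=$MOLCAS_MAXMEM"
```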

In general, not all MOLCAS files are suitable for placing in RAM. In particular, it is a bad idea to activate FiM for RUNFILE. To identify which MOLCAS files are proper candidates for FiM, inspect the section "II. I/O Access Patterns" of a MOLCAS output. All files with a high ratio of random I/O Write/Read calls are good candidates for FiM. In the particular case of the SEWARD module, the ORDINT file is a very good candidate for FiM:

  II. I/O Access Patterns
  - - - - - - - - - - - - - - - - - - - -
  Unit  Name               % of random
                         Write/Read calls
  - - - - - - - - - - - - - - - - - - - -
   1  RUNFILE             28.6/  11.5
   2  ORDINT             100.0/  24.0
   3  DNSMAT               0.0/   0.0
   4  TWOHAM               0.0/   0.0
   5  GRADIENT            88.9/   0.0
   6  DNSMAX               0.0/   0.0
   7  TWOHAX               0.0/   0.0
   8  SODGRAD             85.7/   0.0
   9  SOXVEC              85.7/   0.0
  10  SODELTA             88.9/   0.0
  11  SOYVEC              88.9/   0.0
  12  ONEINT             100.0/  53.3
  - - - - - - - - - - - - - - - - - - - -
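A table like the one above can be screened automatically. The sketch below reads the table from standard input and prints the names of files whose percentage of random Write calls reaches a threshold, skipping RUNFILE as advised above. The function name fim_candidates, the default threshold of 90 and the choice to filter on the Write column are assumptions made for illustration only.

```shell
# Hypothetical helper: list FiM candidates from an "I/O Access Patterns"
# table read on standard input. Usage: fim_candidates [THRESHOLD] < output
fim_candidates() {
    threshold=${1:-90}
    awk -v min="$threshold" '
        # Turn "100.0/  24.0" into two separate fields.
        { gsub("/", " ") }
        # Data rows look like: unit number, file name, write %, read %.
        $1 ~ /^[0-9]+$/ && NF == 4 && $2 != "RUNFILE" && $3 + 0 >= min {
            print $2
        }
    '
}
```

Applied to the table above with the default threshold, this selects ORDINT and ONEINT; a lower threshold such as 80 would also select GRADIENT and the SO* files.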

