Parallel Job Failure



Posted by Helen Tsui on May 29, 2002 at 18:16:38:

Hi! I tried to run test029.input with the parallel version of MOLCAS 5.2, and got the errors shown below.

Does anyone have any idea how to solve the problem? I am not sure whether it is a hardware or a software problem.

The parallel version compiled without any problems, and only 7 test jobs failed: 029, 032, 048, 059, 088, 092 and 093. They all failed at the beginning of the RASSCF module.

Thanks for your help in advance!

Best wishes

Helen
------------------------------------

RASSCF iterations: Energy and convergence statistics
----------------------------------------------------

Iter CI SX CI RASSCF Energy max BLB max BLB max ROT Level Ln srch Step QN Time(min)
iter iter root energy change element value param shift minimum type update CPU
grabit: MOLCASMEM is 24
grabit: MOLCASDISK is 2048 MBytes
0:Floating Point Exception error, status=: 8
0:Floating Point Exception error, status=: 8
Last System Error Message from Task 0:: Not a typewriter
[0] Aborting program !
forrtl: error (76): IOT trap signal
0: __FINI_00_remove_gp_range [0x3ff81a6c374]
1: __FINI_00_remove_gp_range [0x3ff81a748f0]
2: __FINI_00_remove_gp_range [0x3ff800d0b9c]
3: __FINI_00_remove_gp_range [0x3ff800e1f24]
4: __FINI_00_remove_gp_range [0x3ffbff989d4]
5: __FINI_00_remove_gp_range [0x3ffbffa7810]
6: armci_msg_abort [0x12039f574]
7: [0x12039de38]

8: SigFpeHandler [0x1203abc60]
9: __FINI_00_remove_gp_range [0x3ff800d0b9c]
10: [0x1203a0f1c]

11: armci_msg_reduce_scope [0x1203a1d70]
12: [0x1203a1ec8]

13: ga_dgop [0x12038f298]
14: ga_dgop_ [0x12038f2d4]
15: gadsum_ [gamod.F: 188, 0x1200e9818]
16: tractl2_ [tractl2.F: 202, 0x12005edd8]
17: rasscf_ [rasscf.F: 216, 0x120048f24]
18: main_ [main.F: 6, 0x120047e00]
19: main [for_main.c: 203, 0x120047e6c]
20: __start [0x120047d38]
--- Stop Module: rasscf at Wed May 29 17:05:07 BST 2002 /rc=98 ---
Non-zero return code - check program input.
--- Stop Module: automolcas at Wed May 29 17:05:07 BST 2002 /rc=98 ---
grabit: MOLCASMEM is 24
grabit: MOLCASDISK is 2048 MBytes
1:SigIntHandler: interrupt signal was caught: 2
1:SigIntHandler: interrupt signal was caught: 2
Last System Error Message from Task 1:: Not a typewriter
[1] Aborting program !
forrtl: error (76): IOT trap signal
0: __FINI_00_remove_gp_range [0x3ff81a6c374]
1: __FINI_00_remove_gp_range [0x3ff81a748f0]
2: __FINI_00_remove_gp_range [0x3ff800d0b9c]
3: __FINI_00_remove_gp_range [0x3ff800e1f24]
4: __FINI_00_remove_gp_range [0x3ffbff989d4]
5: __FINI_00_remove_gp_range [0x3ffbffa7810]
6: armci_msg_abort [0x12039f574]
7: [0x12039de38]

8: SigIntHandler [0x1203ab7dc]
9: __FINI_00_remove_gp_range [0x3ff800d0b9c]
10: __FINI_00_remove_gp_range [0x3ff80570a44]
11: __FINI_00_remove_gp_range [0x3ffbff8798c]
12: __FINI_00_remove_gp_range [0x3ffbff90d28]
13: __FINI_00_remove_gp_range [0x3ffbff92df4]
14: __FINI_00_remove_gp_range [0x3ffbff98d3c]
15: __FINI_00_remove_gp_range [0x3ffbff9bffc]
16: armci_msg_rcv [0x12039fb28]
17: armci_msg_bcast_scope [0x12039f914]
18: armci_msg_bcast [0x12039f9f8]
19: [0x1203a1f24]

20: ga_dgop [0x12038f298]
21: ga_dgop_ [0x12038f2d4]
22: gadsum_ [gamod.F: 188, 0x1200e9818]
23: tractl2_ [tractl2.F: 202, 0x12005edd8]
24: rasscf_ [rasscf.F: 216, 0x120048f24]
25: main_ [main.F: 6, 0x120047e00]
26: main [for_main.c: 203, 0x120047e6c]
27: __start [0x120047d38]



