Ticket #706 (closed defect: fixed)

Opened 4 weeks ago

Last modified 4 weeks ago

hyp2000_mgr on CentOS 7.6 is failing in production for HVO lashup

Reported by: paulf
Owned by: somebody
Priority: major
Milestone: Linux 64bit
Component: hyp2000_mgr
Version:
Keywords:
Cc:


Okay, we got some weirdness. This is on CentOS 7.6, 64-bit. This same hyp2000_mgr passes the memphis test just fine, but when I run it on a production HVO lashup it fails with this -83 code.

20190322_UTC_13:41:24 hyp2000_mgr: Version 1.48 2019/2/19 hypoinverse-1.43
20190322_UTC_13:41:24 hyp2000_mgr: Read command file <eq/hyp2000_mgr.d>
20190322_UTC_13:41:24 hyp2000_mgr: Initialized hypoinverse with file <eq/hypoinverse/hvo2.hyp.2000>
20190322_UTC_13:48:00 hyp2000_mgr: HYPINV returned -83 while executing <LOC>.
20190322_UTC_13:48:00 hyp2000_mgr: Nonfatal error(s) locating event.
20190322_UTC_13:48:00 hyp2000_mgr: Error reading arcfile <hypoMgrArcOut32>

Change History

comment:1 Changed 4 weeks ago by paulf

Note that the file gets created:

-rw-rw-r--. 1 earthworm earthworm 0 Mar 13 11:08 hypoMgrArcOut32

comment:2 Changed 4 weeks ago by paulf

  • Status changed from new to closed
  • Resolution set to fixed

Okay, so this issue turned out to be a misconfiguration in hyp2000_mgr.d, i.e. operator error.

The SeparatePRTdir setting pointed to a directory that did not exist. A definite misconfiguration and a no-no. The upshot was that hypoinverse couldn't write the PRT file and stopped there, before writing out the ARC file.

The older code seems to have been more forgiving about this sort of failure to write the PRT file. Not so now.

comment:3 Changed 4 weeks ago by baker

No, the previous version of hyp2000 was not more forgiving. It was wrong. It deceived you by silently continuing after the OPEN failed, instead of notifying you. At HVO you probably see one of those fort.nn files created, like you did on Solaris, instead of the PRT files that should have been created.

In ticket #696 (hyp2000_mgr does not work on solaris 10), the OPEN failures on Solaris were due to Fortran code that was accepted on the other platforms but rejected on Solaris. However, the other defect that ticket #696 revealed was that hyp2000 improperly continued executing when that happened, silently, without emitting an error message or returning an error code to HYOPEN.

r7828 fixed OPENR/OPENW in allsubs.f to use standard Fortran 90 syntax for the OPEN statements. It also fixed HYOPEN to pass back OPEN failures to its caller. Error codes -81 through -84 were added to hyp2000 for those new failure situations. What you have encountered is the new code which reports back OPEN failures to HYOPEN. HYPINV passes the error code back to hyp2000_mgr and exits.

Unfortunately, you did not see more than the -83 error code in the hyp2000_mgr error message because I neglected to add explanatory text for those new error codes to LogHypoError() in hyp2000_mgr.c. I will fix that.
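
For illustration only, here is a minimal sketch of how explanatory text for the new code range might be looked up. The helper name describe_hypinv_open_error() and the message wording are hypothetical and are not taken from hyp2000_mgr.c. The individual meanings of -81 through -84 are not given in this ticket, so the sketch treats the whole range as a generic OPEN failure.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of hyp2000_mgr.c): returns a generic
 * explanation for the OPEN-failure codes -81 through -84 described in
 * this ticket, or NULL for any other code. The per-code meanings are
 * not stated in the ticket, so one message covers the whole range.
 */
const char *describe_hypinv_open_error(int code)
{
    if (code <= -81 && code >= -84) {
        return "hypoinverse failed to OPEN a file; check that the files "
               "and directories named in the configuration exist and "
               "are accessible";
    }
    return NULL; /* not one of the OPEN-failure codes */
}
```

A logging routine could call this with the value returned by HYPINV and append the text to the log line whenever the result is not NULL.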

The defect you are reporting is really in hyp2000_mgr. It does not validate the options in its configuration file.

This was fixed by Stefan in r7969.

comment:4 Changed 4 weeks ago by paulf

Note that I added a fix to the hyp2000_mgr.c code to actually check for the existence of the PRT directory now. The program exits if the directory doesn't already exist.

comment:5 Changed 4 weeks ago by baker

FYI. The mod in r7969 is necessary, but not sufficient. That code checks for the existence of the SeparatePRTdir file, but does not verify that it is a directory. You have to add more code to check that the S_IFDIR bit is set in the st_mode field in struct stat statbuf.
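
A minimal sketch of that additional check, assuming a POSIX system. The function name is_existing_directory() is hypothetical and this is not the code from r7969. It uses the S_ISDIR() macro, which is the standard shorthand for testing that the S_IFDIR file type is set in st_mode.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Hypothetical helper (not the r7969 code): returns 1 if path exists
 * and is a directory, 0 otherwise (missing, not stat-able, or some
 * other file type such as a regular file).
 */
int is_existing_directory(const char *path)
{
    struct stat statbuf;

    if (path == NULL || stat(path, &statbuf) != 0) {
        return 0; /* path is missing or stat() failed */
    }

    /* S_ISDIR(m) is equivalent to ((m & S_IFMT) == S_IFDIR) */
    return S_ISDIR(statbuf.st_mode) ? 1 : 0;
}
```

With a check like this, the configuration reader could reject a SeparatePRTdir value that names a regular file as well as one that does not exist. Whether the directory is also writable by the earthworm user would need a separate test.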
