source: trunk/ewdoc/WEB_DOC/USER_GUIDE/Inst_config_guide.htm @ 2163

Revision 2163, 37.3 KB checked in by paulf, 14 years ago (diff)

mitch's changes for his email address instead of Barbara

  • Property svn:eol-style set to native
  • Property svn:keywords set to Author Date Id Revision
Line 
1<HTML>
2<HEAD>
3<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=windows-1252">
4<META NAME="Generator" CONTENT="Microsoft Word 97">
5<TITLE>Earthworm Project Summary </TITLE>
6<META NAME="Template" CONTENT="C:\Program Files\Microsoft Office\Office\html.dot">
7</HEAD>
8<BODY TEXT="#000000" LINK="#0000ff" VLINK="#800080" BGCOLOR="#fac0a2" ALINK="#FF0000">
9
10
11<a href="http://www.usgs.gov/"><img src="../images/dark_blue.gif" width="480" height="72" border="0" alt="Visual identity banner and link to the U.S. Geological Survey's home page."></a> 
12
13<P><!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN"></P>
14<H1><IMG SRC="../GIFS/ew.logo.gif" WIDTH=104 HEIGHT=100 ALT="EARTHWORM Logo"></H1>
15
16
17<P><!doctype html public "-//w3c//dtd html 4.0 transitional//en"></P>
18<FONT FACE="Courier New"><H1 ALIGN="CENTER">Earthworm Installation &amp; Configuration Guide</H1>
19<B><P ALIGN="CENTER">April 17, 2002</P>
20</B><P>The following notes are intended to be a general guide to installing an Automatic Earthworm system. Note that, the Earthworm project consists of the Automatic and Interactive portions of Earthworm. As described in Section1,"Overview", the Automatic portion consists of the "real-time", rapid-response modules, and the Interactive portion consists of the DBMS and web-page codes. These notes deal with the installation of the Automatic portion of Earthworm.</FONT> </P>
21<FONT FACE="Courier New"><P>These notes are based on the experiences gained from numerous installations, and represent an idealized, average situation. In practice, of course, each installation presents unique challenges which require specific solutions. It is hoped, however, that these notes may be of use in preventing common problems and expediting the process.</FONT> </P>
22<FONT FACE="Courier New"><P>Installation tasks can be divided into those requiring primarily computer engineering skills, and those requiring seismological expertise. Engineering tasks include interfacing to communication links, integration of computer hardware, establishing network connections, and most of the software configuration.</FONT> </P>
23<FONT FACE="Courier New"><P>Seismological tasks include determining the desired processing functions, establishing agreed-upon data exchanges, and configuration of the core processing modules. The lists of such modules is growing, but currently consists of the P-phase picker ("pick_ew"), the event associator ("bind_ew"), the STA/LTA trigger ("carlstatrig" and "carlsubtrig"), and hypoinverse. In particular, configuring the hypoinverse magnitude calculation parameters has proven to be a significant seismological task.</FONT> <BR>
24&nbsp; <BR>
25&nbsp; </P>
26<FONT FACE="Courier New" SIZE=5><P>1. DETERMINE THE DESIRED FUNCTIONS</FONT><FONT SIZE=1> </P>
27</FONT><FONT FACE="Courier New"><P>The main body of documentation is available on the web (http://gldbrick.cr.usgs.gov/ew-doc), and additional features are being continuously developed. New processing modules are initially documented in the release notes and subsequently integrated into the documentation set at the above web site. However, for planning purposes, Earthworm currently offers the following general features:</FONT> </P>
28<FONT FACE="Courier New"><P>1.1. Analog data acquisition: 12 or 16 bit A/D conversion, up to 256 channels per Windows 2000 machine, any number of such machines per system.</FONT> </P>
29<FONT FACE="Courier New"><P>1.2. Digital acquisition: Currently, most commercial data loggers are supported. All data streams, whether analog or digital, can be processed in the same manner.</FONT> </P>
30<FONT FACE="Courier New"><P>1.3. Auto Location: Automatic P-picking, event association, and hypo-inverse solutions.</FONT> </P>
31<FONT FACE="Courier New"><P>1.4. Auto Notification: Selective event notification via e-mail or pager system.</FONT> </P>
32<FONT FACE="Courier New"><P>1.5. STA/LTA Events: Traditional STA/LTA coincidence event generation.</FONT> </P>
33<FONT FACE="Courier New"><P>1.6 Event Archive: Events can be archived in various existing formats, including PC-SUDS, SAC, AH, and UW2. Other archive formats will be integrated as required.</FONT> </P>
34<FONT FACE="Courier New"><P>1.7. Error Processing: email and pager notification of system malfunctions.</FONT> </P>
35<FONT FACE="Courier New"><P>1.8. Manual Events: Events can be declared manually within the time interval available in the on-line trace data storage (below).</FONT> </P>
36<FONT FACE="Courier New"><P>1.9 On-line Trace Data Storage: all acquired trace data can be made available via an IP server for a configurable time period, ranging from hours to weeks.</FONT> </P>
37<FONT FACE="Courier New"><P>1.10. Distributed systems: Any number of Earthworm systems can be linked via IP communication links. The extend of such linkages is configurable. Possible configurations include a 'master' central site and any number of remote nodes operating as one integrated system, as well as limited, negotiated linkages between two sovereign central systems.</FONT> </P>
38<FONT FACE="Courier New"><P>1.11. Multi-machine configurations: Any number of networked Windows 2000 or Solaris computers can be combined to run one Earthworm system.</FONT> </P>
39<FONT FACE="Courier New"><P>1.12 Data Exchange: Any type of internal data can be exchanged in real time with other Earthworm systems. This exchange mechanism has proved to be quite reliable, and has performed well in numerous applications. It requires a communications link capable of physically interfacing to a computer's ethernet port and capable of carrying IP messages. It functions over the public internet as well as private dedicated links.</FONT> </P>
40<FONT FACE="Courier New"><P>1.13. Long period Events: A volcanic long-period event detector, developed by John Evans at Menlo Park.</FONT> <BR>
41&nbsp; <BR>
42&nbsp; </P>
43<FONT FACE="Courier New" SIZE=5><P>2. ESTIMATE THE HARDWARE REQUIREMENTS:</FONT><FONT SIZE=1> </P>
44</FONT><FONT FACE="Courier New"><P>Earthworm has been developed under the mandate of speed and reliability. It therefore tends to use fast, low-level operating system services rather than high-level end-user packages. A side effect of this is that Earthworm is fairly economical in terms of hardware requirements. In general, a contemporary Pentium- or Ultra- class machine is adequate for most regional network loads. Specifically, the hardware requirements can be divided into four categories:</FONT> </P>
45<FONT FACE="Courier New"><P>a. Data acquisition: This can be via the ethernet network port, system bus, or serial ports. In practice, we have not observed problems in the first two cases. We have observed problems with serial port acquisition, especially when running on the same machine as the bus-based A/D acquisition. The symptoms are intermittent data loss from the serial ports, and warning messages from the A/D subsystem.</FONT> </P>
46<FONT FACE="Courier New"><P>b. Numeric processing: This is not generally a problem when the current, common processing features are enabled. A contemporary machine (PentiumII or Sparc Ultra) has been demonstrated to be capable of supporting the numeric and I/O requirements of processing a large events on over 500 channels in rear-real time. However, modules are currently being worked on which may require considerable numeric resources.</FONT> </P>
47<FONT FACE="Courier New"><P>c. On-line Trace Storage: This is provided by the WaveServerV module, which maintains a circular, on-line history of trace data . Each channel is maintained in a separate file, and the user is free to specify the location and size of each such file. In operation, trace data is written to these files as it is acquired, and multiple concurrent client programs can request data via network connections. Therefore, the critical hardware resources are disk storage capacity, disk access time, and network bandwidth. Disk storage requirements can be estimated by multiplying the number of channels to be served, the sampling rate, the number of bytes per sample, the time interval to be available, and an overhead factor of perhaps ten percent. Disk access time depends on the anticipated usage. A write is generated as each trace packet is acquired, and a burst of reads is generated with each client request. In practice, a contemporary 10,000 RPM SCSI disk supporting 64 channels seems conservative; twice that load is successfully functioning at some sites.</FONT> </P>
48<FONT FACE="Courier New"><P>Network bandwidth requirements are critical on the input side, as the trace input streams are real-time. The server-side responses, or course, are not. In situations where this is a concern, a second, dedicated ethernet subnet is usually established to carry real-time trace data, and the participating machines are equipped with two (or more) network adapters.</FONT> </P>
49<FONT FACE="Courier New"><P>d. Event storage and processing: As mentioned above, Earthworm can produce directories of event files from various triggers in various formats. Adequate disk space has to be provided to hold such events until they are moved to off-line media. Considerations there include the expected worst-case seismicity levels, time of storage, and off-line media capacity and life-time.</FONT> </P>
50<FONT FACE="Courier New"><P>* Determine the hardware required (or available) at each node and the central site.</FONT> </P>
51<FONT FACE="Courier New"><P>Given the desired features, as discussed above, and the above considerations, it is then possible to determine the hardware required at each node.</FONT> </P>
52<FONT FACE="Courier New"><P>* Identify available communication link between central and remote nodes. The relevant parameters include nominal bandwidth, likelyhood of degraded performance, and interruptions. This may affect the features to be activated at the various nodes: how autonomous should they be, how much data should they be able to store.</FONT> <BR>
53&nbsp; <BR>
54&nbsp; </P>
55<FONT FACE="Courier New" SIZE=5><P>3. PRODUCE A TEST EVENT</FONT><FONT SIZE=1> </P>
56</FONT><FONT FACE="Courier New"><P>This is an optional step which has proven to be extremely useful in rapidly producing a reliable configuration in low-seismicity areas. It will not be needed until testing, but the effort should be initiated early to produce a timely result. The procedure is to find a suitable past event recorded on digital media, and to convert this to an Earthworm 'player' file. Earthworm includes a 'player' module ("waveserver") which can be used to repeatedly inject the trace data of the event to test the operation of the system.</FONT> </P>
57<FONT FACE="Courier New"><P>This usually requires help from members of the Earthworm development team, and may require some effort, although some conversion software from other formats exists (such as PC-SUDS and CUSP).</FONT> <BR>
58&nbsp; <BR>
59&nbsp; </P>
60<FONT FACE="Courier New" SIZE=5><P>4. PRODUCE A NETWORK STATION LIST</FONT><FONT SIZE=1> </P>
61</FONT><FONT FACE="Courier New"><P>* Request an IRIS-sanctioned network identifier from Tim Ahern, IRIS DMC, Seattle Washington. This two-letter code is used throughout the system as part of the trace data naming convention. It is also essential if any data exchange with other networks is contemplated.</FONT> </P>
62<FONT FACE="Courier New"><P>* Request an Earthworm "Installation Id" from Mitch Withers, CERI, Memphis. Mitch will update the distribution to include your new installation id, and issue an updated "earthworm_global.d" to all involved installations.</FONT> </P>
63<FONT FACE="Courier New"><P>* Earthworm uses the IRIS "Station, Component, Network" channel naming convention. See the IRIS web site for details. An installation is free to define its own naming convention, but this may present problems if linkages to other institutions are desired.</FONT> </P>
64<FONT FACE="Courier New"><P>* Define all station names and locations, and produce a station list in hypoinverse format. If some parameters are not known, use unique temporary names and values which can later be search-and-replaced with real values.</FONT> <BR>
65&nbsp; <BR>
66&nbsp; </P>
67<FONT FACE="Courier New" SIZE=5><P>5. CREATE THE EARTHWORM SOFTWARE DIAGRAMS</FONT><FONT SIZE=1> </P>
68</FONT><FONT FACE="Courier New"><P>For each node, draw a diagram showing modules, message rings, and message paths. Examples are shown in <a href="examples.html">Examples</a>. The diagram should include:</FONT> </P>
69<FONT FACE="Courier New"><P>* Final naming structure:</FONT> </P>
70<FONT FACE="Courier New"><P>It is encouraged that each module in an installation be given a unique module name ( MOD_ID) and that the name include an identifier indicating which node this module runs in. This is not required, but in practice avoids confusion during initial configuration, and especially for later modifications. It has been found that finalizing the names (as discussed below) at this time can avoid a prolonged debugging period later.</FONT> </P>
71<FONT FACE="Courier New"><P>* Identify port numbers for each node:</FONT> </P>
72<FONT FACE="Courier New"><P>Various modules which send and receive data involve IP addresses and port numbers. To avoid later conflicts, the diagram should include the final values of these parameters.</FONT> </P>
73<FONT FACE="Courier New"><P>* Note that each node must include a 'startstop' and 'statmgr' module to permit startup and re-starts of modules.</FONT> <BR>
74&nbsp; <BR>
75&nbsp; </P>
76<FONT FACE="Courier New" SIZE=5><P>5.1 EARTHWORM NAMING SCHEME (BACKGROUND):</FONT><FONT SIZE=1> </P>
77</FONT><FONT FACE="Courier New"><P>There are four types of names within the Earthworm system: "Installation Id", "Module Id", "Message Type", and "Ring Name". Each type of name is represented as an ASCII string at the human interface level, but is resolved to a numeric value when the system is running. The motivation for the naming scheme is that</FONT> </P>
78<FONT FACE="Courier New"><P>* Earthworm is a message-passing system,</FONT> </P>
79<FONT FACE="Courier New"><P>* An Earthworm system consists of many modules which listen to and produce messages, and</FONT> </P>
80<FONT FACE="Courier New"><P>* Many Earthworm systems can exchange messages.</FONT> </P>
81<FONT FACE="Courier New"><P>To implement this, each message, as it is generated, is given a 'shipping label' which contains the numeric values of "Installation Id", "Module Id", "Message Type". The "Installation Id" is the name of the installation at which this Earthworm is running. The "Module Id" is the name of the module which generated this message, and "Message Type" is the name of the type of message this is (eg. trace data, pick, location, heartbeat, etc)</FONT> </P>
82<FONT FACE="Courier New"><P>Since any message may be shipped to any installation, this combination has to be unique amongst all Earthworm installations. This is assured by keeping the "Installation Id" unique. These are assigned by Mitch Withers, CERI Memphis Tennessee.</FONT> </P>
83<FONT FACE="Courier New"><P>A module specifies which messages it is interested in listening to by specifying the 'shipping label'. Therefore, to be general, the "Module Id" must be unique within an installation. (Note that a module may specify a list of desired 'shipping labels', as well as specify wild-card values for selected portions of a label.)</FONT> </P>
84<FONT FACE="Courier New"><P>In order to facilitate meaningful data exchange, there have to be "Message Type" values which are common to all installations. That is, if two installations decide to exchange pick data, both systems must have the same value for "Message Type" pick. Any installation can, of course, define additional private "Message Type" for its own internal purposes.</FONT> </P>
85<FONT FACE="Courier New"><P>"Ring Names" are used by each module to specify which message ring the module is to 'listen to', and which ring it is to broadcast messages into. These names must be unique to each node. A set of traditional names have evolved, (eg. WAVE_RING, HYPO_RING, etc) and it is recommend to adhere to those, as it greatly reduces the amount of editing required.</FONT> </P>
86<FONT FACE="Courier New"><P>The names appear in the configuration files of each module in the form of ASCII strings. The tables which define the numeric values of those strings are stored in two files: "earthworm.d" and "earthworm_global.d". "earthworm_global.d" contains the definitions which are common to all Earthworm sites: the "Installation Id" and "Message Type" values which are exchanged between installations. Changes to this file are coordinated through Mitch Withers, CERI, Memphis. "earthworm.d" contains definitions for local "Message Type", "Module Id", and "Ring Name".</FONT> </P>
87<FONT FACE="Courier New"><P>In the Earthworm release these two files are included in the subdirectory "environment". They have to be moved from there to the local /run/params directory (see section 7).</FONT> </P>
88<P>&nbsp;</P>
89<FONT FACE="Courier New" SIZE=5><P>6. LOAD THE EARTHWORM SOFTWARE</FONT><FONT SIZE=1> </P>
90</FONT><B><FONT FACE="Courier New"><P ALIGN="JUSTIFY">The Earthworm Directory Structure:</FONT> </P>
91<P><div align=left></B><FONT FACE="Courier New">The run-time directory structure traditionally includes the following sub-tree:</div></FONT> <div ALIGN=left></P><DIR>
92<DIR>
93
94<PRE>.../earthworm
95&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /run
96&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /params
97&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /log
98&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /vx.xx&nbsp;&nbsp;
99&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /bin
100&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /environment</PRE></DIR>
101</DIR>
102
103<P></div><div align=left><FONT FACE="Courier New">where</FONT> </P>
104<FONT FACE="Courier New"><P>/earthworm is the root of the Earthworm software;</FONT> </P>
105<FONT FACE="Courier New"><P>/run is the directory which specifies and contains all data specific to an earthworm configuration. Many such 'run' directories can be stored under /earthworm, each specifying a distinct construction. Note that only one such construction can be running at a time.</FONT> </P>
106<FONT FACE="Courier New"><P>/params contains the files specifying the configuration and environment for this earthworm construction.</FONT> </P>
107<FONT FACE="Courier New"><P>/log contains all log files generated by this construction.</FONT> </P>
108<FONT FACE="Courier New"><P>/vx.xx represents some earthworm version, as obtained from the distribution.</FONT> </P>
109<FONT FACE="Courier New"><P>/bin contains all executables for this version, as obtained from the distribution, or by local compilation.</FONT> </P>
110<FONT FACE="Courier New"><P>/environment contains files of global definitions a discussed in next section. Some of these must be moved to the /params directory.</FONT> <BR>
111&nbsp; <BR>
112&nbsp; </P>
113<B><FONT FACE="Courier New"><P>Loading the Software:</FONT> </P>
114</B><FONT FACE="Courier New"><P>* Create the root directory for Earthworm (e.g. c:\earthworm). Bring the current Earthworm release from FTP site into this directory, and unpack it there.</FONT> </P>
115<FONT FACE="Courier New"><P>* Note that the executable binaries are in separate directories of the ftp site. The appropriate version of the executables should be placed in the \bin directory (e.g. c:\earthworm\v#.#\bin).</FONT> </P>
116<FONT FACE="Courier New"><P>* Note that the "src" (\earthworm\v#.#\src) directory of the release contains a subdirectory for each module. Each such directory contains the source and make files, as well as example error processing configuration files (*.desc) and parameter files (*.d).</FONT> <BR>
117&nbsp; <BR>
118&nbsp; </P>
119<FONT FACE="Courier New" SIZE=5><P>7. CREATE THE RUN DIRECTORIES</FONT><FONT SIZE=1> </P>
120</FONT><FONT FACE="Courier New"><P>It is recommended that each node have the same directory structure. This is not required, but has been found useful in simplifying maintenance and updates.</FONT> </P>
121<FONT FACE="Courier New"><P>* Under the \earthworm root directory, create a 'run' directory for each node. (ie., c:\earthworm\run_central, c:\earthworm\run_remote1, c:\earthworm\run_remote2 etc)</FONT> </P>
122<FONT FACE="Courier New"><P>* Under each such run directory create a \log and a \params directory. The \log directory will contain all log files witten by modules in this node. The \params directory will contain all configuration files (.d and .desc) neccessary to run this node.</FONT> </P>
123<FONT FACE="Courier New"><P>* Copy three files from the release the each of the \parms directories. These are "earthworm_global.d", "earthworm.d", and "ew_nt.cmd" (or "ew_sol_sparc.cmd", or "ew_sol_intel.cmd", depending on the platform). These files are located in the earthworm release in the \environment directory.</FONT> <BR>
124&nbsp; <BR>
125&nbsp; </P>
126<FONT FACE="Courier New" SIZE=5><P>8. CONFIGURE THE EARTHWORM ENVIRONMENT FILES</FONT><FONT SIZE=1> </P>
127</FONT><FONT FACE="Courier New"><P>* The earthworm_global.d file should contain your Installation Id. Contact Mitch Withers, CERI, Memphis, who acts as control for this file. Local editing of this file is possible, but will prevent the installation form exchanging data with other sites.</FONT> </P>
128<FONT FACE="Courier New"><P>* Edit earthworm.d: Using your Earthworm software diagram, enter a definition for each each mesasage Ring Name and Module ID at each node. At this point, do not alter the Message Types found there, but use the default entries.</FONT> </P>
129<FONT FACE="Courier New"><P>* Create a .cmd file for each node. This file contains the environment variables used by the Earthworm system. As mentioned above, these files are platform specific, and currently three versions exist: for Windows 2000 and NT: "ew_nt.cmd", for Solaris on Sparc: "ew_sol_sparc.cmd", and for Solaris on Intel: "ew_sol_intel.cmd". (OS/2 has sadly been dropped). Thus, for example, one would create files such as "ew_sol_sparc_central.cmd", "ew_nt_remote1.cmd", and "ew_nt_remote2.cmd".</FONT> </P>
130<FONT FACE="Courier New"><P>In each file, edit the definitions for EW_INSTALLATION, EW_HOME, EW_VERSION, EW_PARAMS and EW_LOG. The remainder of the file requires no change:</FONT> </P>
131<FONT FACE="Courier New"><P>EW_INSTALLATION is the installation id of this institution, as listed in "earthworm_global.d".</FONT> </P>
132<FONT FACE="Courier New"><P>EW_HOME is the earthworm root directory ("/earthworm").</FONT> </P>
133<FONT FACE="Courier New"><P>EW_VERSION is the version of the Earthworm release to be used (eg. v4.0). This permits several versions to be stored concurrently. Changing this definition will cause various versions to be run.</FONT> </P>
134<FONT FACE="Courier New"><P>EW_PARAMS specifies the path to the dirctory from will paramete files will be read (eg. /earthworm/run_central/params). This will be different for each node.</FONT> </P>
135<FONT FACE="Courier New"><P>EW_LOG similarly determines where log files will be written.</FONT> <BR>
136&nbsp; <BR>
137&nbsp; </P>
138<FONT FACE="Courier New" SIZE=5><P>9. MODULE CONFIGURATION</FONT><FONT SIZE=1> </P>
139</FONT><FONT FACE="Courier New"><P>For each module within a node you will need to create a parameter (.d) and an error processing file (.desc). Samples of these files are in the source directory for each module (e.g. c:\earthworm\v#.#\src).</FONT> </P>
140<FONT FACE="Courier New"><P>To configure a module, first create the invocation paragraph for the module in startstop_nt.d (see below). Then copy the sample .d and .desc files from that module's source directory. Change the names to match the name of the Module Id. Edit the files to suit the configuration. At a minimum this involves changing the name of the Module Id (this is where the executable learns its Module Id), Installation Id (in the .desc file) and possibly the ring names. Change any other parameters to suit the configuration.</FONT> </P>
141<FONT FACE="Courier New"><P>"startstop" and "statmgr" are required to run in all nodes an an earthworm system. Since the "startstop" and "statmgr" module's parameter and error processing files contain settings that control and effect the overall node it is advised to first configure these modules. "startstop" is the module which starts first, and brings up the system by creating the specified rings and starting each module. "statmgr" is responsible for processing error messages created by other modules, monitoring their heartbeats, and issuing restart requests when a module's heartbeat ceases (see below).</FONT> </P>
142<FONT FACE="Courier New"><P>9.1 STARTSTOP</FONT> </P>
143<FONT FACE="Courier New"><P>This module is system-specific. Therefore, there are system-specific configuration files (startstop_nt.d and startstop_sol.d) It's configuration file contains the substance of the earthworm configuration diagram (Sect.4). It first lists the rings required for this construction, and then lists a paragraph for each module to be started. This paragraph lists the name of the module's executable, the configuration file it is to read when starting, the priority it is to run at, and (in the Windows 2000 case) whether the module is to have its own window.</FONT> </P>
144<FONT FACE="Courier New"><P>9.2 STATMGR</FONT> </P>
145<FONT FACE="Courier New"><P>This module's configuration file (statmgr.d) lists the error processing files (.desc) of all modules which it is to keep track of. Here are listed the procedures to be followed for the various error types which a client module may issue (email, page, etc), as well as the token "restartMe". If this token appears in a module's .desc file, statmgr will initiate the restart procedure for that module if it's heartbeat should not appear within the stated time period. This procedure involves issuing a 'kill' to the offending module's process id, and "startstop" will then restart the module in a manner identical to that used at startup.</FONT> </P>
146<FONT FACE="Courier New"><P>Note that "statmgr", like all modules, has one input. That is, it attaches and listens to one message ring. In order for it to receive error and heartbeat messages from all modules in a node, there are helper modules which 'conduct' such messages from one ring to another. These are the module "copystatus". They are extremely simple, and have no associated .d or .desc files. They are listed in "startstop_xx.d" as other modules, but their command line arguments, instead of being the name of the configuration file, name the input and output rings to which they are to attach.</FONT> <BR>
147&nbsp; <BR>
148&nbsp; </P>
149<FONT FACE="Courier New"><P>9.3 OTHER MODULES</FONT> </P>
150<FONT FACE="Courier New"><P>The Earthworm release currently includes over 50 modules. These are described on the Earthworm web site (http://gldbrick.cr.usgs.gov/ew-doc). However, several key modules are discussed below:</FONT> </P>
151<FONT FACE="Courier New"><P>* Import/Export</FONT> </P>
152<FONT FACE="Courier New"><P>These modules perform long-distance message exchange between Earthworm systems. The Import ("import_generic") module is supplied with the IP address and port of the Export it is to accept connections from. When it senses a connect request, it connects, and places all incoming messages on it's specified output message ring. The 'shipping label' of the message is that of the incoming message. It exchanges predetermined heartbeat texts with it's Import at specified rates. If the incoming heartbeat text does not match the specification in its configuration file, it closes the connection. If the heartbeat from its Import does not arrive on time, it re-initialized the connection.</FONT> </P>
153<FONT FACE="Courier New"><P>The Export module is given the address of the network port through which they are to communicate - the assumption is that the host machine may have several ports - and the port number to use. Its is given the heartbeat text which it is to send to its Imports, the rate to send them, as well as the text and expected rate of the incoming heartbeats. If the incoming heartbeat does not arrive on time, the connection is closed and re-initialized. It has buffering capability to prevent loss of messages during this process. The Export module comes in two varieties:</FONT> </P>
154<FONT FACE="Courier New"><P>"export_generic" accepts a list of shipping labels which it is to ship. "export_scn" ('scn' as in Station Component Network), specializes in shipping only trace messages, and accepts a list of SCN names to ship. This permits selective sending of only specified data channels.</FONT> </P>
155<FONT FACE="Courier New"><P>* RingtoCoax / CoaxtoRing</FONT> </P>
156<FONT FACE="Courier New"><P>These modules effect short-distance replication of messages on a ring. "ringtocoax" (the name dates back to thin-net times), will broadcast all messages in its specified input ring onto a network adapter. The messages are broadcast in the IP sense, in that no recipient is specified. This allows any number of unknown systems to be listening, without affecting the sending system.</FONT> </P>
157<FONT FACE="Courier New"><P>* Adsend</FONT> </P>
158<FONT FACE="Courier New"><P>* Picker</FONT> </P>
159<FONT FACE="Courier New"><P>* Assoc (binder_ew)</FONT> </P>
160<FONT FACE="Courier New"><P>* Carl Trig</FONT> </P>
161<FONT FACE="Courier New"><P>* HypoInverse</FONT> </P>
162<FONT FACE="Courier New"><P>* Etc.</FONT> <BR>
163&nbsp; <BR>
164&nbsp; </P>
165<FONT FACE="Courier New" SIZE=5><P>10. INITIAL STARTUP</FONT><FONT SIZE=1> </P>
166</FONT><FONT FACE="Courier New"><P>After all modules are configured, the system can be started.</FONT> </P>
167<FONT FACE="Courier New"><P>*For mulit-node configurations, it is recommended that at least one remote node be physically close to the central node for initial debugging.</FONT> </P>
168<FONT FACE="Courier New"><P>* Establish and test all communication links.</FONT> </P>
169<FONT FACE="Courier New"><P>* Start each node. This is initially best done manually: create a command window, and manually change the directory to the /run_xx/params of this node. Execute the applicable environment file (ew_nt.cmd, ew_sol_sparc.cmd, etc.). Enter the command 'startstop'.</FONT> </P>
170<FONT FACE="Courier New"><P>* The window should display a list of modules and their status, as produced by "startstop" presssing 'enter' will re-display this list. The window will also display any error messages from modules which are sharing this window (see below).</FONT> </P>
171<FONT FACE="Courier New" SIZE=5><P>11. DEBUGGING TOOLS AND HINTS:</FONT><FONT SIZE=1> </P>
172</FONT><FONT FACE="Courier New"><P>In the startstop configuration file (startstop_nt.d), each module which is to be run has a "NewConsole" / "NoNewConsole" specification. This determines whether the module's standard output is written to its own command window or into the window used to start the system. For initial testing, it is recommended to set all these to "NoNewConsole". In routine operation, individual windows are convenient for checking on system status. During initial debugging, however, modules tend to exit due to gross configuration errors (typical are invalid "Module Id" and "Ring Name" values). Under Windows 2000 this causes the window of the deceased module to vanish, carrying with it the module's error messages. The "NoNewConsole" setting permits the user to analyze terminal error messages.</FONT> </P>
173<FONT FACE="Courier New"><P>As mentioned above, Earthworm includes a logging scheme in which each module can produce a log file in the /run/log directory. When a module has identity information to create a log file, it will do so, creating a new log file at midnight. These log files are the primary method for analyzing system malfunctions after initial setup. Note that under Windows 2000,a log file which is in use cannot be edited. One solution is to copy the current log file to a scratch file, and then look at that scratch file with an editor.</FONT> </P>
174<FONT FACE="Courier New"><P>Another possible problem during initial configuration is that a configuration file error may cause "startstop" to exit. Since "startstop" is responsible for terminating any modules which may be running, this can leave modules running, which will cause chaos when the system is re-started again. To clean up such 'wild' modules, either terminate them via the Win2000 Task Manager, or simply reboot the machine.</FONT> </P>
175<FONT FACE="Courier New"><P>It is often useful to have direct confirmation of what (if any) messages are appearing in a given message ring. "sniffring" is a useful debugging program which will show a summary of all messages being broadcast into a specified ring. To run this program, open a command window, execute the currently used "ew_nt.cmd" file (see section 9), and then enter the command "sniffring RING_NAME", where RING_NAME is the name of the ring you wish to examine. The program will display the lengths, shipping labels, and summaries of various messages.</FONT> </P>
176<FONT FACE="Courier New"><P>Often, there are many messages in a ring, and one wishes to know if trace data from a specified channel is flowing through that ring. The program "sniffwave" is similar to "sniffring" above, but in addition takes as command line arguments the "Station" "Component" and "Network" name of the channel to be displayed.</FONT> </P>
177<FONT FACE="Courier New"><P>After starting an Earthworm system with "startstop", a short list of the modules in operation, and their status is displayed. This list can be re-displayed by pressing return in the window in which "startstop" was invoked. The same display can be produced by entering the command "status" in a window in which "ew_nt.cmd" has been executed.</FONT> <BR>
178&nbsp; <BR>
179&nbsp; </P>
180<FONT FACE="Courier New" SIZE=5><P>12. COMMON CONFIUGRATION ERRORS:</FONT><FONT SIZE=1> </P>
181</FONT><FONT FACE="Courier New"><P>* Module ID:</FONT> </P>
182<FONT FACE="Courier New"><P>A module will terminate because it was given a "Module Id" which is not defined in "earthworm.d".</FONT> </P>
183<FONT FACE="Courier New"><P>* Port Numbers:</FONT> </P>
184<FONT FACE="Courier New"><P>Strange behavior can occur if several modules are using the same port number. The simplest solution is to assign unique port numbers within an installation.</FONT> </P>
185<FONT FACE="Courier New"><P>* Ring Name:</FONT> </P>
186<FONT FACE="Courier New"><P>A problem of a module not functioning is that it has been configured to listen to the wrong ring, or that the module which is to be supplying the input messages is putting them on some other ring.</FONT> </P>
187<FONT FACE="Courier New"><P>* Heartbeat rates:</FONT> </P>
188<FONT FACE="Courier New"><P>Modules can produce heartbeat messages which "statmgr" can use to determine when a module has died. The rate at which heartbeats are produced (specified in a module's ".d" file) should be reasonably higher than the rate at which "statmgr" expects to see them (as specified in the module's ".desc" file). Setting them the same, or very close can cause a module to be spuriously declared dead (and optionally restarted).</FONT> </P>
189<FONT FACE="Courier New"><P>Heartbeats are also used by the "Import" and "Export" modules to assure that long-distance links are functional. Note that these heartbeats are sent over the long-distance link, and are distinct from the internal heartbeats mentioned above. Heartbeats are sent in both directions. The parameter file contains the rate at which they are generated, and the rate at which they are expected. The sending rate should be considerably larger than the expected rate, to allow for transmission delays. In practice, sending rates of 60 seconds, and expected rates of 300 seconds have been successful over long-haul, low-quality IP links.</FONT> </P>
190<FONT FACE="Courier New"><P>"Import" and "Export", like other modules, produce internal heartbeats. These should be configured to be slower than the heartbeats being exchanged over the long-haul link. Otherwise, "statmgr" may decide that the module has died, and perform needless numerous restarts.</div></FONT> </P>
191<P>&nbsp;</P>
192<FONT FACE="Courier New" SIZE=5><P>13. INCREASING THE NUMBER OF SHARED MEMORY RINGS UNDER SOLARIS</FONT><FONT SIZE=1> </P>
193</FONT><P>Under Solaris 2.6 (and probably other versions as well), the maximum number of shared memory segments is six. This means that on an out-of-the-box machine you can only configure six rings. If you try to configure more than that, you will see a cryptic message from tport_create about too many open files.  The fix to this problem is to add the following lines to the /etc/system file, and then reboot the system.
194</P>
195<PRE>
196 set shmsys:shminfo_shmmax = 4294967295
197 set shmsys:shminfo_shmmin = 1
198 set shmsys:shminfo_shmmni = 100
199 set shmsys:shminfo_shmseg = 20
200 set semsys:seminfo_semmns = 200
201 set semsys:seminfo_semmni = 70
202
203This allows for 20 rings.</PRE>
204<FONT SIZE=4></FONT></BODY>
205</HTML>
206
207<hr></center>
208<table width="87%" border="0">
209  <tr>
210    <td width="74%"><a href="http://www.doi.gov/">U.S. Department of the Interior</a>,
211      U.S. Geological Survey, Reston, VA, USA<br>
212      Contact: <b>bogaert@usgs.gov</b> <br>
213      <a href="http://www.usgs.gov/privacy.html">Privacy Statement</a> || <a href="http://www.usgs.gov/disclaimer.html">Disclaimer</a>
214
215      || <a href="http://www.usgs.gov/foia/">FOIA</a> || <a href="http://www.usgs.gov/accessibility.html">Accessibility</a>
216      <br>
217    </td>
218    <td width="26%"><a href="http://firstgov.gov/"><img src="../images/first_gov2_beveled.gif" width="160" height="47" border="0" alt="Li
219nk to First Gov web site."></a></td>
220  </tr>
221</table>
222<p>&nbsp;</p>
223
224<address>
225The URL of this page is <b>http://gldbrick.cr.usgs.gov/ew-doc/USER_GUIDE/Inst_config_guide.htm</b></address>
226<p>
227Date last modified: April 16, 2002
228</body>
229</html>
230
Note: See TracBrowser for help on using the repository browser.