The anvil command is designed to be a command line interface to the IceCube detector. It is based on the original rundaq, roadrunner and redaq commands and is designed to supersede them, providing a single point of access for managing the IceCube detector for operators who are not using the IceAxe portal.
Before explaining how anvil can be used, you should understand that it is not designed for scripting the detector's operations and so does not support any "flow of control" operations. Scripting is done by writing jyboss scripts. If scripting is the reason you started reading this document, it is recommended that you first read the "Developer's Introduction to using jyboss in IceCube" and then read this document, paying close attention to "Using anvil commands in jyboss scripts".
The anvil command is normally installed as part of the standard deployment of IceCube software. If you follow the instructions laid out here and cannot get anvil working, you should consult the installation procedure, which is documented elsewhere.
To use anvil on an IceCube cluster you need to log on to the *-expcont node of that cluster as the "jboss" user. While you can run the command from anywhere, it is recommended that you change into the ~/jyboss-scripts directory before you execute it. Having changed into the jyboss-scripts directory, you can start anvil by simply typing anvil, at which point you should see the following rubric.
jboss@spts-expcont[jyboss-scripts] 20:00:28 (0)$ anvil
Welcome to Anvil, the command line interface to IceCube.
For general help type 'help', for specific help type 'help <topic>'.
For help getting started type 'help getting_started'.
Anvil@spts-expcont**
You are now ready to use anvil to manage the detector.
The most basic level of detector management is making sure that ACME's watchdog task is running. This task is responsible for trying to keep the detector taking, processing and managing data. The current implementation of this task is fairly basic, but as we come to understand the foibles of the detector we can improve it. To check whether the watchdog task is running you can issue the following command at the anvil prompt.
Anvil@spts-expcont** check acme
The watchdog and data-taking tasks are running
The watchdog task always tries to restart data-taking if that task is not running. This means you will sometimes see a response to the check acme command like the following.
Anvil@spts-expcont** check acme
Only the watchdog task is running
This normally means that you have caught the system after one data-taking task has stopped but before the next has started, and is considered normal.
During normal running the watchdog task should always be running, but there are some circumstances where it can interfere with operations, especially during debugging or a major failure. Under these circumstances the task can be stopped using the following command.
Anvil@spts-expcont** stop
This command requests that the watchdog task stop once the current data-taking task has stopped, but leaves the data-taking task to finish its current run. If you want the current data-taking task to stop immediately, and thus get the watchdog task to also stop immediately, then you should use the halt command.
Anvil@spts-expcont** halt
As far as anvil is concerned, the acme subsystem is considered to be running successfully only when the watchdog task is running. Thus, if the data-taking task has been started by hand (see "Running the Data-Taking Task"), the check command will give you a result like the following.
Anvil@spts-expcont** check acme
Only the data-taking task is running
Elements failed Check: acme
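If you keep a copy of the check acme output, for example in a shift log, the three responses shown so far can be told apart with a simple lookup. The following Python sketch is not part of anvil. The message strings are copied from the transcripts above, while the function names and the labels they return are hypothetical and chosen only for illustration. It also assumes that the check passes whenever the watchdog task is reported as running.

```python
# Hypothetical helper, not part of anvil. The message strings come from the
# transcripts in this document; the labels are made up for illustration.
ACME_MESSAGES = {
    "The watchdog and data-taking tasks are running": "normal",
    "Only the watchdog task is running": "between-runs",
    "Only the data-taking task is running": "manual",
}


def classify_acme(output):
    """Return a label for the first recognised line of 'check acme' output."""
    for line in output.splitlines():
        label = ACME_MESSAGES.get(line.strip())
        if label is not None:
            return label
    return "unknown"


def acme_check_passes(output):
    """Assumption: the check passes whenever the watchdog task is running."""
    return classify_acme(output) in ("normal", "between-runs")


if __name__ == "__main__":
    sample = "Only the data-taking task is running\nElements failed Check: acme"
    print(classify_acme(sample))
    print(acme_check_passes(sample))
```

Any text that is not one of the three known messages is reported as "unknown" rather than guessed at.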
While the watchdog task normally keeps the detector running, there may be times when you want to check how various parts of the system are behaving. You can do this by using different options of the check command. For example, the default option for the check command is -s, which checks the state of all of the detector's subsystems. Thus using this command with no arguments generates something like the following.
Anvil@spts-expcont** check
DAQ status is "Running"
Dispatch-Delivery status is "Started"
PnF status is "running"
SPADE is RUNNING.
The watchdog and data-taking tasks are running
For complete details on the check command use the help facility.
Anvil@spts-expcont** help check
As noted above, there may be times when it helps to manage the detector directly rather than delegate that responsibility to the watchdog task. In those cases you will need to manage the subsystems of the detector by hand using anvil. Of course, before you start any of those actions you should make sure that the watchdog task is not running. Otherwise it will be executing commands at the same time as you are, which will lead to confusion at best!
Each subsystem that makes up the IceCube detector runs in one or more JBoss servers. These servers provide the standard execution environment for all of our Java software. Each JBoss server is identified by a cluster-independent label called its "location". All operations affecting JBoss servers are specified in terms of their locations. In the current organization of our clusters the location of a JBoss server can be derived simply by removing the cluster's prefix from its node name. For example, the spts-expcont node hosts the JBoss server whose location is expcont.
Under normal running the watchdog task makes sure the necessary JBoss servers are running. There are circumstances, however, where you may want to manage the servers yourself. In these circumstances you will want to make sure that the watchdog is not running (which you can do using the check acme command). To check the state of all of the main JBoss servers on a cluster you can simply use the following command.
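The relationship between node names and locations can be written down in a few lines of Python. This is only a sketch: the function names are hypothetical, and it assumes that the cluster's prefix is the cluster name followed by a hyphen, as in the spts-expcont example.

```python
# Hypothetical helpers, not part of anvil or jyboss.
# Assumption: a node name is "<cluster>-<location>", e.g. "spts-expcont".


def location_from_node(node_name, cluster):
    """Strip the cluster's prefix from a node name to get the location."""
    prefix = cluster + "-"
    if not node_name.startswith(prefix):
        raise ValueError("%r is not a node of cluster %r" % (node_name, cluster))
    return node_name[len(prefix):]


def node_from_location(location, cluster):
    """Rebuild the node name that hosts the given location on a cluster."""
    return cluster + "-" + location


if __name__ == "__main__":
    print(location_from_node("spts-expcont", "spts"))
    print(node_from_location("evbuilder", "spts"))
```

Because the location is cluster-independent, the same location string can be paired with a different cluster name to obtain the corresponding node on another cluster, provided that cluster follows the same naming scheme.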
Anvil@spts-expcont** check -j
JBoss is running on expcont
JBoss is running on evbuilder
JBoss is running on fpmaster
JBoss is running on stringproc01
JBoss is running on ichub21
JBoss is running on icetop01
JBoss is running on ithub01
To check the state of a particular JBoss server you simply need to specify its location as an argument to the command.
Anvil@spts-expcont** check -j evbuilder
JBoss is running on evbuilder
There are situations where the JBoss servers get into strange states (this is usually caused by a breakdown in the management communications layer), in which case they need to be stopped and restarted so that they are in a known state. The nojboss and jboss commands respectively do these tasks. Thus to cycle the main JBoss servers you can use the following commands. (You should note that these commands do take some time to complete, as they wait for the shutdowns and startups to finish before returning the prompt.)
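If you have saved the output of check -j, a short Python sketch like the one below can list the locations that were not reported as running. It is not part of anvil and its function names are hypothetical. It relies only on the "JBoss is running on" wording shown above, assumes one report per line, and makes no assumption about what anvil prints for a server that is down.

```python
# Hypothetical helpers, not part of anvil. Only the "JBoss is running on"
# wording from the transcripts above is relied upon.
RUNNING_PREFIX = "JBoss is running on "


def running_locations(output):
    """Return the locations reported as running, in the order printed."""
    found = []
    for line in output.splitlines():
        line = line.strip()
        if line.startswith(RUNNING_PREFIX):
            found.append(line[len(RUNNING_PREFIX):])
    return found


def not_reported_running(output, expected):
    """Return the expected locations that have no 'running' line."""
    running = set(running_locations(output))
    return [loc for loc in expected if loc not in running]


if __name__ == "__main__":
    sample = "\n".join([
        "Anvil@spts-expcont** check -j",
        "JBoss is running on expcont",
        "JBoss is running on evbuilder",
    ])
    print(not_reported_running(sample, ["expcont", "evbuilder", "fpmaster"]))
```

The list of expected locations has to be supplied by you, since it differs from cluster to cluster.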
Anvil@spts-expcont** nojboss
Shutting down JBoss servers at the following locations: expcont evbuilder fpmaster stringproc01 ichub21 icetop01 ithub01
JBoss stopping on expcont
JBoss stopping on evbuilder
JBoss stopping on fpmaster
JBoss stopping on stringproc01
JBoss stopping on ichub21
JBoss stopping on icetop01
JBoss stopping on ithub01
Anvil@spts-expcont** jboss
Starting JBoss servers at the following locations expcont evbuilder fpmaster stringproc01 ichub21 icetop01 ithub01
JBoss starting on expcont
JBoss starting on evbuilder
JBoss starting on fpmaster
JBoss starting on stringproc01
JBoss starting on ichub21
JBoss starting on icetop01
JBoss starting on ithub01
If you have worked your way through the previous section, the detector currently has all of its JBoss servers running and in a known state, but it will not be taking data. This section reviews how you can start and stop data-taking by hand.
To be completed.
As with the watchdog task, there may be occasions where you do not want to delegate the task of data-taking to the standard task, but wish to manage the whole process by hand. This section reviews the basic steps you have to go through to do that.
To begin this review, it helps to have the JBoss servers in a known state. If you have not already done so, you should cycle the main JBoss servers as discussed in "Managing the JBoss servers by hand". This leaves the dispatch-delivery subsystem shut down, so before you start up DAQ you should make sure it has somewhere to send its data. The following commands start up the dispatch-delivery subsystem and check that it has started properly.
Anvil@spts-expcont** startup dd
Anvil@spts-expcont** check dd
Dispatch-Delivery status is "Started"
You are now ready to start walking the DAQ through its paces. This starts with discovering which components are available for the next run using the following commands.
Anvil@spts-expcont** signal DiscoverSig
Anvil@spts-expcont** check daq
DAQ status is "Idle"
Elements failed Check: daq
(The "Elements failed Check" line means that DAQ is not running, so it is the correct response for the current state.)
You can list the discovered components and their states using the daq command.
Anvil@spts-expcont** daq
iceTopDataHandler_1 Idle
stringProcessor_21 Idle
snBuilder_0 Idle
tcalBuilder_0 Idle
globalTrigger_0 Idle
inIceTrigger_0 Idle
eventBuilder_0 Idle
monitorBuilder_0 Idle
domHub_81 Idle
domHub_21 Idle
iceTopTrigger_0 Idle
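When there are many components it can be tedious to scan the daq listing by eye. The following Python sketch, which is not part of anvil, turns a saved listing into a mapping from component to state and reports any component that is not yet in the state you expect. The function names are hypothetical, and the sketch assumes the listing has one "component state" pair per line.

```python
# Hypothetical helpers, not part of anvil.
# Assumption: the daq command prints one "<component> <state>" pair per line.


def parse_daq(output):
    """Map each component name to its state, ignoring anvil prompt lines."""
    states = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 2 and not parts[0].startswith("Anvil@"):
            states[parts[0]] = parts[1]
    return states


def all_in_state(states, expected):
    """True when at least one component is listed and all match expected."""
    return bool(states) and all(s == expected for s in states.values())


def lagging(states, expected):
    """Sorted names of the components that are not in the expected state."""
    return sorted(name for name, s in states.items() if s != expected)


if __name__ == "__main__":
    sample = "\n".join([
        "Anvil@spts-expcont** daq",
        "eventBuilder_0 Idle",
        "domHub_81 Idle",
    ])
    print(all_in_state(parse_daq(sample), "Idle"))
```

An empty listing is deliberately treated as a failure, since it usually means that no components have been discovered.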
With the DAQ in the Idle state you can now connect its output to the dispatch-delivery input with the following command.
Anvil@spts-expcont** connect daq dd
The next step in the process is configuring the DAQ by specifying the configuration Id and trigger files to use and moving the components into a configured state.
Anvil@spts-expcont** configure -i 204
Configuration 204:"daqSystem: DAQ system configuration to be used for LC with SN data taking on SPTS"
Anvil@spts-expcont** trigger -f /usr/local/icecube/scripts/configs/spts-triggers.xml
Anvil@spts-expcont** signal -w ConfigSig
Anvil@spts-expcont** daq
iceTopDataHandler_1 Ready
stringProcessor_21 Ready
snBuilder_0 Ready
tcalBuilder_0 Ready
globalTrigger_0 Ready
inIceTrigger_0 Ready
eventBuilder_0 Ready
monitorBuilder_0 Ready
domHub_81 ChannelsConfigured
domHub_21 ChannelsConfigured
iceTopTrigger_0 Ready
Anvil@spts-expcont** signal -w ConfigDOMsSig
Anvil@spts-expcont** daq
iceTopDataHandler_1 Ready
stringProcessor_21 Ready
snBuilder_0 Ready
tcalBuilder_0 Ready
globalTrigger_0 Ready
inIceTrigger_0 Ready
eventBuilder_0 Ready
monitorBuilder_0 Ready
domHub_81 Ready
domHub_21 Ready
iceTopTrigger_0 Ready
The -w option of the signal command waits for the transition to complete before returning the prompt.
The final step before starting data-taking is to get ACME to tell DAQ what the run number of the new run will be.
Anvil@spts-expcont** number -u
Current run number is 1665
You are now ready to start data-taking.
Anvil@spts-expcont** signal -w StartSig
Anvil@spts-expcont** daq
iceTopDataHandler_1 Running
stringProcessor_21 Running
snBuilder_0 Running
tcalBuilder_0 Running
globalTrigger_0 Running
inIceTrigger_0 Running
eventBuilder_0 Running
monitorBuilder_0 Running
domHub_81 Running
domHub_21 Running
iceTopTrigger_0 Running
Stopping data-taking is much more straightforward than starting it, as no configuration is needed. The following command can be used to stop data-taking.
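The manual start described above involves enough steps that a checklist can help. The following Python sketch, which is not part of anvil, simply builds the list of anvil commands in the order they were used in this section, with the configuration ID and trigger file supplied as parameters. The function name is hypothetical. The sketch does not run anvil, and the intermediate check daq and daq inspections are left out.

```python
# Hypothetical helper, not part of anvil. It only builds command strings in
# the order used in this section; it does not execute anything.


def manual_start_commands(config_id, trigger_file):
    """Return the anvil commands for a manual start, in order."""
    return [
        "startup dd",                     # start dispatch-delivery first
        "check dd",                       # confirm it reports "Started"
        "signal DiscoverSig",             # discover the available components
        "connect daq dd",                 # connect DAQ output to dispatch-delivery
        "configure -i %d" % config_id,    # select the configuration ID
        "trigger -f %s" % trigger_file,   # select the trigger file
        "signal -w ConfigSig",            # configure the components
        "signal -w ConfigDOMsSig",        # finish configuring the domHubs
        "number -u",                      # have ACME set the run number
        "signal -w StartSig",             # start data-taking
    ]


if __name__ == "__main__":
    triggers = "/usr/local/icecube/scripts/configs/spts-triggers.xml"
    for command in manual_start_commands(204, triggers):
        print(command)
```

The values 204 and the spts-triggers.xml path are the ones used in the SPTS example above; other clusters will need their own.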
Anvil@spts-expcont** signal -w StopSig
Anvil@spts-expcont** daq
iceTopDataHandler_1 Ready
stringProcessor_21 Ready
snBuilder_0 Ready
tcalBuilder_0 Ready
globalTrigger_0 Ready
inIceTrigger_0 Ready
eventBuilder_0 Ready
monitorBuilder_0 Ready
domHub_81 Ready
domHub_21 Ready
iceTopTrigger_0 Ready
If you are going to restart data-taking using the same configuration information then you are done. If, however, you want the system to pick up new configuration information, you need to move it back into the Idle state with the following command.
Anvil@spts-expcont** signal -w IdleSig
Anvil@spts-expcont** daq
iceTopDataHandler_1 Idle
stringProcessor_21 Idle
snBuilder_0 Idle
tcalBuilder_0 Idle
globalTrigger_0 Idle
inIceTrigger_0 Idle
eventBuilder_0 Idle
monitorBuilder_0 Idle
domHub_81 Idle
domHub_21 Idle
iceTopTrigger_0 Idle
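The signals used in this section and the states they left the components in can be summarised in a small table. The Python sketch below records only what the transcripts above show. It is not a description of the DAQ's full state machine, it does not check whether a given order of signals is legal, and the function name is hypothetical.

```python
# Summary of the transcripts in this section; not an authoritative
# description of the DAQ state machine.
STATE_AFTER_SIGNAL = {
    "DiscoverSig": "Idle",
    "ConfigSig": "Ready",      # the domHub components showed ChannelsConfigured
    "ConfigDOMsSig": "Ready",  # after this the domHub components were Ready too
    "StartSig": "Running",
    "StopSig": "Ready",
    "IdleSig": "Idle",
}


def expected_state(signals, initial=None):
    """Return the state seen after the last signal in the sequence."""
    state = initial
    for sig in signals:
        if sig not in STATE_AFTER_SIGNAL:
            raise KeyError("unknown signal: %s" % sig)
        state = STATE_AFTER_SIGNAL[sig]
    return state


if __name__ == "__main__":
    run = ["DiscoverSig", "ConfigSig", "ConfigDOMsSig", "StartSig", "StopSig"]
    print(expected_state(run))
    print(expected_state(run + ["IdleSig"]))
```

Comparing the expected state with the state column of the daq listing gives a quick sanity check after each signal.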
To be completed.