Monolith Test, Certification and Release Plan

This document details how Monolith Version 2 will be tested, certified for release and deployed to the South Pole System (SPS).

Table of Contents

  1. Build, Installation & Execution Environments
    1. Build
    2. Installation
    3. Execution
  2. Unit Tests
  3. Compare against Monolith Version 1
  4. Run against Simulated Data
  5. Confirm new 'coincident events' feature
  6. Confirm bugfixes
  7. Tests against Offline Software
    1. Monolith Data Members
    2. File movement for daq-dispatch
  8. Tests against TestDAQ Software
  9. Certification Process

1. Build, Installation & Execution Environments

1.1 Build

The certified build environment for Monolith will be a Platinum standard machine, specifically spts-access as the jboss user.

To setup and build, the following commands will be used (assuming a delivery tag of "V02-00-05"):

$ mkdir DAQ-MONOLITH_205_ws
$ cd DAQ-MONOLITH_205_ws
$ bfd init $ICECUBE_TOOLS
$ source setup.sh
$ bfd co -rV02-00-05 DAQ-MONOLITH
$ ant -q lib.all

To then create the distributable RPM run:

$ ./DAQ-MONOLITH/deployment/build-rpm.sh

The RPM file Monolith-2.0.5-1.noarch.rpm will then exist in the current directory.
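
As a quick sanity check (a suggested step, not part of the formal procedure), the RPM metadata and file list can be inspected before handing the package to an administrator:

$ rpm -qpi Monolith-2.0.5-1.noarch.rpm
$ rpm -qpl Monolith-2.0.5-1.noarch.rpm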

1.2 Installation

The certified execution environment is also a Platinum standard machine, specifically spts-stringproc02. Monolith will be installed from the RPM file created above by a system administrator using the following command:

# rpm -ivh Monolith-2.0.5-1.noarch.rpm

or, if upgrading an existing release:

# rpm -Uvh Monolith-2.0.5-1.noarch.rpm

This installs Monolith into /usr/local/monolith/, owned by the root user, with file modes of 755 for directories and executables and 744 for all other files.
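
One way to spot-check the installed ownership and file modes (a suggested check, not a required step) is to query the RPM database:

$ rpm -qlv Monolith
$ rpm -V Monolith

Note that rpm -V reports nothing when the installed files match the package.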

Monolith will be installed on the SPS in the same way, except that the target machine will be sps-stringproc01.

1.3 Execution

The execution of Monolith is done as follows:

$ source /usr/local/monolith/bin/setup-monolith.sh

This sets up the environment to run Monolith from the current working directory and needs to be done only once per execution environment. Then, to run Monolith:

$ run_monolith.pl --file=</path/to/testdaq/hitfile> --config=</path/to/monolith/trigger/config-file>
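
For example, a concrete invocation might look like the following (the hit file and configuration paths here are purely illustrative):

$ run_monolith.pl --file=/data/testdaq/run0001.hits --config=/usr/local/monolith/config/inice-trigger.xml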

At the Pole, Monolith runs as the testdaq user under the control of a script that also runs TestDAQ.

2. Unit Tests

The DAQ-MONOLITH meta-project must pass all unit tests as executed by running:

$ ant -q test.all

from the top of the workspace as checked out and built in the Build section.

3. Compare against Monolith Version 1

Verify that, when run on existing hit files with the same trigger configuration, Monolith V2 produces the same output as Monolith V1 did. Four runs will be conducted in the execution environment:

  1. V01-00-00 with Inice triggers
  2. V01-00-00 with Icetop triggers
  3. V02-00-05 with Inice triggers
  4. V02-00-05 with Icetop triggers

The outputs of runs 1 and 3, and of runs 2 and 4, will then be compared. A new feature of Monolith V2 required a change in the binary format of the event files, so the sizes of the event files will differ, but in a predictable way. Three comparisons will be made for each pair of runs: number of events, file size, and execution time. In addition, ASCII dumps of each pair of binary files will be compared to confirm that the fields shared between the V1 and V2 .events formats have the same values.
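
A minimal sketch of the dump comparison, assuming a hypothetical dump-events.pl script that writes one labeled field per line, and assuming the V2-only fields are labeled run_number and year (the actual script and field names may differ):

$ dump-events.pl run.v1.events > v1.txt
$ dump-events.pl run.v2.events > v2.txt
$ grep -v -e 'run_number' -e 'year' v2.txt | diff v1.txt -

Filtering the V2-only fields out of the V2 dump should leave exactly the shared fields, so the diff should be empty.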

4. Run against Simulated Data

The payload-generator V??-??-?? project will be used to generate simulated hit data. This simulator reads simple text files and produces either binary files or payload objects.

The simulator will be used to produce hits with specific characteristics or with random amounts of waveform data. As described in the following sections, these will be used as input to Monolith to verify bugfixes and new features and to confirm compatibility with existing functionality. Additionally, simulator runs with randomized configurations will probe as much of the configuration space as practical, looking for combinations that cause Monolith to fail.
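
A throwaway stress loop along these lines could drive the randomized runs; the generator script name and its options below are assumptions, not the real payload-generator interface:

#!/bin/sh
# Hypothetical stress loop: generate randomized hit files and record
# any simulator seed that causes Monolith to fail.
for seed in `seq 1 100`; do
    generate-payloads.pl --seed=$seed --random > hits-$seed.dat
    run_monolith.pl --file=hits-$seed.dat --config=stress-trigger.xml \
        || echo "seed $seed failed" >> failures.log
done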

5. Confirm new 'coincident events' feature

The new 'simultaneous Inice/Icetop trigger' feature of Monolith V2 must be tested separately. For a true coincident event there will be two overlapping triggers, one from Inice and one from Icetop, which should be merged into a single event. To exercise this feature, simulated hit files containing coincident events will be generated.

The event files produced by Monolith from these simulated hit files will then be analyzed for proper merging of triggers, and a coincidence rate will be calculated. Rates from various runs will be compared, looking for deviations.

Additional simulated events with predictable behavior will be generated so that the full parameter space can be tested.
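
One possible way to compute the coincidence rate from an ASCII dump, assuming the hypothetical dump-events.pl from above writes one line per event and tags merged events with MERGED (both assumptions):

$ dump-events.pl run.events | awk '/MERGED/ {m++} END {printf "%d of %d events merged\n", m, NR}'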

6. Confirm bugfixes

There are two bug fixes in this release of Monolith.

The first bug, a broken compareTo method, will be exercised by the comparison in the Compare against Monolith Version 1 section, since that method is invoked when merging the data described there. Unit tests have also been added to catch this bug.

The second bug affects flasherboard runs and occurs with a mix of hits carrying different amounts of waveform data (the flashing DOM had no waveform data while all other DOMs had 64 bytes). It will be tested via two paths:

  1. Study flasher data from the Pole.
  2. Set up the simulator to produce hits with random amounts of waveform data (a sketch follows below).
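
A sketch of the second path, using entirely hypothetical generator options for the flashing DOM and per-DOM waveform sizes:

$ generate-payloads.pl --flasher-dom=0 --waveform-bytes=random > flasher-mix.dat
$ run_monolith.pl --file=flasher-mix.dat --config=</path/to/monolith/trigger/config-file>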

7. Tests against Offline Software

Offline software depends on Monolith components. Monolith's certification for release and deployment to the SPS therefore depends on successful tests against the offline software that uses these components. To this end, the following tests of projects outside the DAQ-MONOLITH meta-project must succeed.

7.1 Monolith Data Members

Ensure that offline-software can read and fully use all Monolith data members. This involves feeding a V2 .events file to the monolith-reader module (via the triggerUtil-C++ JNI layer) and confirming that events look as expected. Existing test scripts, including ones that run the FeatureExtractor, will be used. The same run will be compared between V1 and V2 to confirm identical results, and the newly added data members (run number and year of run) will be checked to confirm they are present and return sensible values when accessed.
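
The rough shape of this check, with entirely hypothetical script names standing in for the existing offline test scripts:

$ run-reader-tests.py --input=run.v1.events > v1-summary.txt
$ run-reader-tests.py --input=run.v2.events > v2-summary.txt
$ diff v1-summary.txt v2-summary.txt
$ check-members.py --input=run.v2.events --members=run_number,year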

This requires only Monolith V2 (and V1) files for testing, along with all XML configuration files in the directory. The tests will be run on a RHEL 3 machine running the offline software suite (not SPTS).

One complication is the reported memory leak that appears in triggerUtil-C++ on long runs (such as flasher data); see the RT issue tracking this problem.

7.2 File movement for daq-dispatch

Make sure daq-dispatch can find, pick up, and read these files from their output directory structure and pass them to the PnF server at SPTS. Once the data is in the PnF server, it will be passed to clients running offline-software, which will decode it as described in section 7.1 above.

This requires a selection of Monolith V2 output placed into a directory tree structured as expected at the South Pole, so that DaqDispatch can run over it, find unsent files, and pass them on (a staging sketch follows).
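
A minimal staging sketch; the directory layout shown is illustrative only, and the real tree must match what runs at SP:

$ mkdir -p /data/dispatch-test/monolith
$ cp *.events /data/dispatch-test/monolith/
$ # point DaqDispatch at /data/dispatch-test and confirm it finds and forwards the unsent files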

8. Tests against TestDAQ Software

The new Monolith requires changes to one of the TestDAQ launch scripts, background_it.pl. Pat has already determined what changes need to be made. We will install these changes on PCTS, taking care not to discard any changes that are specific to PCTS (differences between systems are always minor). For example, background_it.pl on PCTS is currently configured not to run Monolith at all. The version of the script running at the Pole lives in the testdaq-launch project.

TestDAQ and Monolith will then be run on PCTS with the necessary changes to background_it.pl. We assume the new Monolith can trigger on the PCTS DOMs. (Note: there is currently little disk space available at PSL for this, so it will have to be coordinated with the FAT.)

The PCTS system differs greatly from the Pole in that it has only a few DOMs. One option, assuming Monolith supports it, is to require a 2-fold coincidence instead of a larger multi-DOM coincidence.

Take the following types of data repeatedly (say, 10 runs in a row for each type):

  1. Local Coincidence data - on PCTS the rate for this type of run is almost zero, so these runs would be by far the easiest for Monolith to process.
  2. DarkNoise data - with a 2-fold Monolith coincidence, this would produce a higher rate than case 1 above.
  3. External LED-pulser triggered LC data - depending on the pulser frequency, this would produce an extremely high trigger rate (an attempt to reproduce how Monolith might interact with a South Pole flasher run).

These three types of data effectively exercise three different regimes of Monolith trigger rates.

The questions to answer are:

  1. Is Monolith nohup'ed and running niced in the background, as intended (see the check sketched below)? Are there any memory issues?
  2. Does datacollector take data normally? We assume it would, i.e., that the nohup'ed Monolith background processes would not interfere with datacollector.
  3. Does TestDAQ encounter pauses between runs because Monolith fails to process all of the data quickly enough? If so, how long are the pauses for each of the run classifications above?
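
For question 1 above, one way to confirm the nohup'ed, niced background processes and watch their memory use is:

$ ps -eo pid,ni,stat,rss,args | grep '[m]onolith'

A positive value in the NI column confirms the process is niced, and the RSS column can be watched across runs for memory growth.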

Given that the scope of this release is not to improve Monolith run times, we expect that Monolith V2 will perform similarly to Monolith V1. Subsequent releases of Monolith may address timing issues.

9. Certification Process

The certification process will be as follows:

Tests will be executed and their results verified by the individuals listed below. The results of these tests, along with any reports or details, will be checked into CVS under the RelEng project in the Monolith_V2 directory. The contents of this directory will be published at http://glacier.lbl.gov/DAQ/RelEng

Upon successful completion of all tests, an email will be circulated to the certifiers below, who will 'sign off' that a specific delivery tag of the DAQ-MONOLITH meta-project has successfully passed all of their tests. This email will be added to the RelEng Monolith V2 site, and a certification announcement will be mailed to the daq-dev mailing list.

Certifiers

  Keith Beattie - Unit Tests
  Pat Toale     - Monolith V1 comparison, simulator runs, coincident events, bugfixes
  Erik Blaufuss - Tests against Offline Software
  Mark Krasberg - Tests against TestDAQ Software

Certification Announcement

The following is the template certification announcement email to be 'signed-off' by the certifiers:

<Date>

Monolith V2, as identified by the delivery tag <V??-??-??> of the
DAQ-MONOLITH meta-project, is hereby certified for release to the South
Pole System.

The following tests have been conducted and the above delivery of
DAQ-MONOLITH has successfully passed:

[ ] Keith Beattie - Unit Tests
[ ] Pat Toale     - Compare against Monolith Version 1
[ ] Pat Toale     - Run against simulator
[ ] Pat Toale     - Confirm new 'coincident events' feature
[ ] Pat Toale     - Confirm bugfixes
[ ] Erik Blaufuss - Tests against Offline Software
[ ] Mark Krasberg - Tests against TestDAQ Software
