Note: This web site is only kept up to date for OSG Software 1.2 (VDT 2.0.0). If you are looking for information for the most recent release, the RPM-based OSG Software 3.0, please see the OSG documentation web site.

What is WHAT, and why are we using it?

WHAT is the Wisconsin Host Administration Tool. It was written by the CSL, which uses it to maintain all the Linux and Solaris machines in the department. They're working on deploying it on Windows machines as well.

WHAT has several advantages over cfengine -- the configuration management tool that most people use. The first is that it can fetch files from a web server instead of requiring a shared filesystem or a special cfengine server. The second is that WHAT is designed for machine-generated config files, unlike cfengine, which is geared towards hand editing. This comes in very handy when managing a diverse set of machines.

How do I install WHAT on a machine?

Couldn't be easier:
  1. If you have a checkout of nightly-tests, make sure it’s up to date:
    cd SOMEWHERE/nightly-tests
    cvs update

    Otherwise, get a fresh checkout:

    cd SOMEWHERE
    cvs -d /p/vdt/workspace/vdt_cvs co nightly-tests
  2. Log on to the new machine as root:
    ssh root@NEW-MACHINE
  3. Copy the installation script from your checkout on AFS to root’s home directory:
    scp 'USER@AFS-HOST:SOMEWHERE/nightly-tests/what/install_what' .
  4. Run the installation script:
    ./install_what

How do I run it manually to fix a machine?

Run /opt/what/bin/what apply /opt/what/etc/real.xml

How do I “reserve” a machine for special testing?

If you need to stop running nightly tests on a machine for a while to allow you or someone else to run tests without interference, follow these steps.

To stop the machine from running WHAT and the nightly tests:

  1. sudo /bin/rm -f /etc/cron.d/vdt-*
  2. Email vdt-dev and let everyone know the machine is reserved

To return the machine to its normal state:

  1. sudo /opt/what/bin/what apply /opt/what/etc/bootstrap.xml
  2. Email vdt-dev and let everyone know the machine is back to normal

Note: It will take a few minutes for the vdt-what cron job to run after step 1, so be patient.
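
If you want to check whether the cron files are gone (or have come back yet), something like the following works. This is just a sketch: the is_reserved function and its optional directory argument are made up for illustration and are not part of WHAT or nightly-tests. It only assumes what is stated above, namely that the cron files live in /etc/cron.d and are named vdt-*.

```shell
#!/bin/sh
# Hypothetical helper, not part of WHAT or nightly-tests.
# Succeeds (exit 0) when no vdt-* cron files exist in the cron directory,
# i.e. the machine looks "reserved".
is_reserved() {
    # The optional directory argument is an assumption, handy for testing.
    cron_dir=${1:-/etc/cron.d}
    for f in "$cron_dir"/vdt-*; do
        # If the glob matched nothing, $f is the literal pattern and -e fails.
        if [ -e "$f" ]; then
            return 1
        fi
    done
    return 0
}
```

For example, after sourcing the function:

    if is_reserved; then echo "reserved"; else echo "normal"; fi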

Does it have docs?

Yes. Run "man -a what" for a TOC.

Ok, so how does it work?

All the stuff that drives our configuration comes off the website. Take a look at /p/vdt/public/html/what (you can't do this in a browser because access is restricted by IP address):
vail(nmueller): ls /p/vdt/public/html/what   
root/               # Becomes / on the test boxes
bootstrap.xml       # Bootstrap config.  Installs new versions of WHAT and itself.
real.xml            # The "real" configuration.  This is just a mason wrapper around gen_what_config.
gen_what_config*    # A script to create the configuration for a given host.
software/           # Source tarballs for software that we require.
grid-security/      # Grid security tarballs
what.tar.gz         # WHAT itself
cslmods.tar.gz      # Some non-standard perl modules that it requires

Without getting too in depth, WHAT runs periodically on the test boxes and processes bootstrap.xml. If bootstrap.xml, real.xml, or what.tar.gz changes (real.xml contains md5sums for all the files in root, so it changes every time they change), it will pull down the new copy, install it, and run WHAT on real.xml.
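
The "did anything change?" check boils down to comparing checksums. Here is a rough sketch of the idea in shell. It is not WHAT's actual code: the function names make_manifest and tree_changed are invented for illustration, and it assumes GNU md5sum is installed.

```shell
#!/bin/sh
# Hypothetical sketch of checksum-based change detection.
# Not taken from WHAT; function names are illustrative only.

# Print "md5  path" for every regular file under directory $1,
# sorted by path so the output is stable between runs.
make_manifest() {
    (cd "$1" && find . -type f -exec md5sum {} + | LC_ALL=C sort -k 2)
}

# Succeed (exit 0) if the tree under $1 differs from the manifest
# saved in file $2, or if no manifest has been saved yet.
tree_changed() {
    if [ ! -f "$2" ]; then
        return 0
    fi
    new_manifest=$(make_manifest "$1")
    old_manifest=$(cat "$2")
    [ "$new_manifest" != "$old_manifest" ]
}
```

A caller would save the output of make_manifest somewhere outside the tree after each successful run, and only redo the work when tree_changed succeeds.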

real.xml does all the rest of the work. It pulls down our stuff (/vdt-install-test, /home/vdttest, grid-security.tar.gz), performs some misc configuration (sudoers, ntp, selinux), manages users, installs software we need (postgresql, sudo) and perl modules that we need (lots).

How do I make changes?

Check out the nightly-tests module. If you want to add/edit any files we lay down on the machine, just edit the appropriate files under root. All of the WHAT machinery (bootstrap.xml, gen_what_config, WHAT itself) is sitting in the what subdir in case you need to muck with it. Run "cvs update; make install" to copy everything to the web.

As for editing gen_what_config, it shouldn't be too hard. At the top of the file are variables that control some of the most frequently changed behavior -- users, crontab entries and the pacman version. For anything more in depth just take a look at the code.

Local WHAT patches

In order to make it do what we wanted, I had to add several new actions (nodes) to WHAT. In cfengine this would be accomplished with external scripts, but WHAT uses plugin modules instead. All these are located under nightly_tests.what/what/what/lib/What/Node
Accounts
Adds and removes accounts and changes their uids. No sense setting up LDAP to distribute our four stupid accounts. Doesn't maintain passwords, but you shouldn't use passwords on these boxes anyway.
Pacman
Downloads a version of pacman from Saul's site, installs it and writes the little wrapper script we use to exec it. Also takes care of the /opt/pacman symlink.
PerlModules
Installs perl modules straight from CPAN.
Software
Installs required software packages (currently postgresql and ntp) if they can't be fetched with yum/up2date/apt-get. Currently it's only used on SuSE, but it will come in handy on non-Linux platforms.
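
The fallback decision made by the Software node can be pictured roughly like this. It is only a sketch: the real logic lives in the WHAT plugin module, and the detect_pkg_manager function here is made up. It just checks whether any of the three package managers is on the PATH, which is a cruder test than whether a particular package can actually be fetched.

```shell
#!/bin/sh
# Hypothetical sketch, not the Software node's real implementation.
# Prints the name of the first package manager found on PATH,
# or "none" if neither yum, up2date, nor apt-get is available.
detect_pkg_manager() {
    for pm in yum up2date apt-get; do
        if command -v "$pm" >/dev/null 2>&1; then
            echo "$pm"
            return 0
        fi
    done
    echo none
}
```

A result of "none" is the case where installing from the source tarballs under software/ would be the only option.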