Note: This version of the VDT (1.10.1) is supported, but is not our latest stable release. The current stable release is 2.0.0.

Post-install Configuration

Congratulations! You've installed the VDT. That wasn't so hard now, was it? If it was, please let us know. We're always working to make the VDT easier to install and your feedback is essential.

Post-installation steps for server administrators

After the VDT install completes there are still a few server components left unconfigured. Take a quick look at this list and handle anything that applies to you. You almost certainly do not need to do everything in this list--just do the ones that are relevant to you.

To learn more about the software you've just installed take a look at the documentation for VDT 1.10.1.

Start Services

After you install the VDT, none of the services are running. To start the services, you need to run vdt-control:

> cd $VDT_LOCATION
> . setup.sh
> vdt-control --on

This will install each service into a system-wide location (such as root's crontab, /etc/xinetd.d, or /etc/init.d). For programs that are run via an init script, the script is run.
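
The commands above assume that $VDT_LOCATION is set and that setup.sh is readable. A minimal sketch of a pre-flight check follows; the function name vdt_env_ready is hypothetical and is not part of the VDT.

```shell
# Hypothetical pre-flight check; vdt_env_ready is not a VDT command.
vdt_env_ready() {
    # $1: the VDT install directory (normally $VDT_LOCATION)
    if [ -z "$1" ]; then
        echo "VDT_LOCATION is not set" >&2
        return 1
    fi
    if [ ! -r "$1/setup.sh" ]; then
        echo "cannot read $1/setup.sh" >&2
        return 1
    fi
    return 0
}

# Example use (commented out so that nothing is started by accident):
# vdt_env_ready "$VDT_LOCATION" && cd "$VDT_LOCATION" && . ./setup.sh && vdt-control --on
```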

More documentation on vdt-control

Configure sudo for web services GRAM

In order to use web services GRAM, sudo must be installed and configured. The web services run as the globus user, but need to submit jobs as different users. Configuring sudo for this is more secure than allowing GRAM to run as root. The configuration is also documented in the post-install/README file, and it will differ in one or two ways from what we show below, so you should use that file as a reference:
Runas_Alias GLOBUSUSERS = user1, user2
globus ALL=(GLOBUSUSERS) \
       NOPASSWD: /opt/vdt/globus/libexec/globus-gridmap-and-execute \
       -g /etc/grid-security/grid-mapfile \
       /opt/vdt/globus/libexec/globus-job-manager-script.pl *
globus ALL=(GLOBUSUSERS) \
       NOPASSWD: /opt/vdt/globus/libexec/globus-gridmap-and-execute \
       -g /etc/grid-security/grid-mapfile \
       /opt/vdt/globus/libexec/globus-gram-local-proxy-tool *
Note that you must replace 'user1, user2' with a list of comma-separated user id names. If you prefer, you can allow Globus to sudo to all users except root by substituting the following Runas_Alias line for the one above:
Runas_Alias GLOBUSUSERS = ALL, !root
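
If the list of accounts is long, the Runas_Alias line can be generated instead of typed by hand. The following is only a sketch: make_runas_alias is a hypothetical helper, and user1 and user2 are the placeholder names used above.

```shell
# Hypothetical helper: join the given account names into a Runas_Alias line.
make_runas_alias() {
    list=""
    for u in "$@"; do
        if [ -z "$list" ]; then
            list="$u"
        else
            list="$list, $u"
        fi
    done
    printf 'Runas_Alias GLOBUSUSERS = %s\n' "$list"
}

# Example use (prints the line; it does not edit the sudoers file):
# make_runas_alias user1 user2
```

After editing the sudoers file, running visudo -c as root checks its syntax before you rely on it.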

Set up the web services Globus gatekeeper for your batch system

In order to work with your batch system, the web services Globus gatekeeper (GRAM) must have additional packages installed that are specific to your batch system. To set up any of these, first make sure the batch system's command line tools are in your path. Next, install the appropriate package.
> pacman -get http://vdt.cs.wisc.edu/vdt_1101_cache:Globus-WS-Condor-Setup
    or
> pacman -get http://vdt.cs.wisc.edu/vdt_1101_cache:Globus-WS-LSF-Setup
    or
> pacman -get http://vdt.cs.wisc.edu/vdt_1101_cache:Globus-WS-PBS-Setup
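
Before running pacman, you may want to confirm that the batch system's tools really are in your path. A minimal sketch, assuming a POSIX shell; have_tools is a hypothetical helper, and the tool names in the comments are the usual client commands for each batch system, which may differ at your site.

```shell
# Hypothetical helper: succeed only if every named command is found in PATH.
have_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "not in PATH: $tool" >&2
            return 1
        fi
    done
    return 0
}

# Example use (tool names are assumptions about a typical installation):
# have_tools condor_submit condor_q    # Condor
# have_tools bsub bjobs                # LSF
# have_tools qsub qstat                # PBS
```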

Set up the pre-web services Globus gatekeeper for your batch system

In order to work with your batch system, the pre-web services Globus gatekeeper (GRAM) must have additional packages installed that are specific to your batch system. To set up any of these, first make sure the batch system's command line tools are in your path. Next, install the appropriate package.
> pacman -get http://vdt.cs.wisc.edu/vdt_1101_cache:Globus-Condor-Setup
    or
> pacman -get http://vdt.cs.wisc.edu/vdt_1101_cache:Globus-LSF-Setup
    or
> pacman -get http://vdt.cs.wisc.edu/vdt_1101_cache:Globus-PBS-Setup
    or
> pacman -get http://vdt.cs.wisc.edu/vdt_1101_cache:Globus-SGE-Setup

Configure MonaLisa

During installation the VDT configures MonaLisa with workable, yet inaccurate information. If you plan on running a MonaLisa server you'll want to run $VDT_LOCATION/vdt/setup/configure_monalisa.sh first.

Get a host certificate

If you plan on running any grid services you'll need a host certificate for your machine. See our instructions for more information.

Get service certificates

If you use authenticated MDS, you will need an 'ldap' service certificate. If you use DRM or a service that depends on Apache (VOMS or jClarens) you will need an 'http' service certificate. These are stored in subdirectories of /etc/grid-security/. For instance:
> ls -l /etc/grid-security/ldap
total 12
-rw-r-----    1 daemon   daemon       1194 Feb  2 13:34 ldapcert.pem
-rw-r-----    1 daemon   daemon       1351 Feb  2 13:34 ldapcert_request.pem
-r--------    1 daemon   daemon        887 Feb  2 13:34 ldapkey.pem

> ls -l /etc/grid-security/http
total 12
-rw-rw-r--    1 daemon   daemon       1193 Feb  2 13:34 httpcert.pem
-rw-r--r--    1 daemon   daemon       1379 Feb  2 13:34 httpcert_request.pem
-r--------    1 daemon   daemon        887 Feb  2 13:34 httpkey.pem
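
In the listings above, the private key files (ldapkey.pem and httpkey.pem) are readable only by their owner. The sketch below checks that property for a given file by inspecting the mode string printed by ls; key_is_private is a hypothetical helper, not a VDT tool.

```shell
# Hypothetical helper: succeed if group and others have no access to the file.
key_is_private() {
    # Characters 5-10 of the ls mode string are the group and other bits.
    perms=$(ls -ld "$1" 2>/dev/null | cut -c5-10)
    [ "$perms" = "------" ]
}

# Example use:
# key_is_private /etc/grid-security/http/httpkey.pem || echo "key is too open"
```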

Set up a Globus gridmap file

The gridmap file -- located at /etc/grid-security/grid-mapfile -- is also required to run any grid services. It contains a mapping of users' certificate subjects to local UNIX accounts. A user must be in your gridmap file in order for them to take advantage of any grid services you're running. See the Globus documentation for more information.
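
Each entry in the gridmap file is a quoted certificate subject followed by a local account name. As a sketch, the function below looks up the account mapped to a subject; gridmap_lookup is a hypothetical helper, and the subject and account in the example are invented.

```shell
# Hypothetical helper: print the local account mapped to a certificate subject.
gridmap_lookup() {
    # $1: gridmap file, $2: certificate subject (DN)
    # Splitting on double quotes puts the DN in field 2 and the account in field 3.
    awk -F'"' -v dn="$2" '$2 == dn { sub(/^[ \t]+/, "", $3); print $3 }' "$1"
}

# Example use, for an entry such as:
#   "/DC=org/DC=example/CN=Jane Doe" jdoe
# gridmap_lookup /etc/grid-security/grid-mapfile "/DC=org/DC=example/CN=Jane Doe"
```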

Set up Simple CA

Most people do not need to set up a certificate authority because they already have access to one. However, you might want to set up a CA for testing purposes and you can use the Globus Simple CA included in the VDT. You will need to run a couple of commands in order to set it up:
$GLOBUS_LOCATION/setup/globus/setup-simple-ca
(This will ask you several questions, including the name of your CA
and your passphrase)

$GLOBUS_LOCATION/setup/globus_simple_ca_HASH_setup/setup-gsi -default
(The HASH will be replaced with the hash for your CA--the first 
command will print it out)
For more details, see: Globus's Simple CA directions

Adding VDT services to TCP wrappers

The VDT runs several services via inetd/xinetd. If you're running TCP wrappers you'll need to modify your access policies to include these new servers. See our TCP wrappers documentation for more details.

Try out Nest

If you want to try out Nest, just install the Nest package. Note that it will reconfigure the GridFTP server to use Nest for access to all storage. You can disable this by editing GridFTP's configuration file.

Configure the Generic Information Provider

The VDT includes the Generic Information Provider (GIP), which can provide information via MDS (the GRIS) that matches the GLUE Schema. If you don't understand this, you can probably skip it. If you want your site to be able to accept LCG jobs or be part of Open Science Grid, you almost certainly want it. The VDT and VDT-Gatekeeper packages install the GIP by default. If you don't use those packages or the OSG installation, then install the Generic-Information-Provider package. In either case, run the configure_gip script to set it up.

Working with PBS and LSF queues

If you are running PBS or LSF and your site has multiple queues, users can specify which queue they wish to use when they submit jobs. For GRAM 2 (pre-web services) jobs, they specify it in their RSL with a string like:

...(queue=longjobs)...
Globus verifies that users specify correct queue names and rejects jobs if they do not. If you add or delete queues, you need to tell Globus to rebuild its list of queues.

For PBS:
You can rebuild the list of queues with the following commands:

> cd $VDT_LOCATION
> . setup.sh
> globus/setup/globus/setup-globus-job-manager-pbs

You can verify that the queues are listed by looking at:

$VDT_LOCATION/globus/share/globus_gram_job_manager/pbs.rvf
You do not need to restart any processes after you do this.

For LSF:
You can rebuild the list of queues with the following commands:

> cd $VDT_LOCATION
> . setup.sh
> globus/setup/globus/setup-globus-job-manager-lsf

You can verify that the queues are listed by looking at:

$VDT_LOCATION/globus/share/globus_gram_job_manager/lsf.rvf
You do not need to restart any processes after you do this.
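
Rather than reading the .rvf file by eye, you can search it for a queue name. This is only a sketch: queue_listed is a hypothetical helper, it assumes nothing about the file's layout beyond the queue name appearing as a whole word, and longjobs is the example queue name used above.

```shell
# Hypothetical helper: succeed if the queue name appears as a whole word in the file.
queue_listed() {
    # $1: .rvf file, $2: queue name
    grep -qwF -- "$2" "$1"
}

# Example use:
# queue_listed "$VDT_LOCATION/globus/share/globus_gram_job_manager/pbs.rvf" longjobs \
#     || echo "longjobs is not listed; rerun the setup script"
```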

Rate Limit the Globus Jobmanager

The problem

GRAM 2 (a.k.a. pre-web services GRAM) has a known problem: every request to submit a job will create a new job manager process, and this process will poll the underlying batch system at ten-second intervals. When there are a lot of job managers, the computer can be overwhelmed.
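
To get a feel for the load, the polling rate can be worked out with simple arithmetic. The sketch below assumes only what is stated above, one poll per job manager every ten seconds; polls_per_minute is a hypothetical helper.

```shell
# Polls per minute for a given number of job managers,
# at one poll every 10 seconds per job manager.
polls_per_minute() {
    echo $(( $1 * 60 / 10 ))
}

# Example use:
# polls_per_minute 500
```

With 500 job managers this prints 3000, that is, 3000 batch system queries every minute.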

One good solution for this problem is for clients to use Condor-G to submit jobs, and to make sure the Condor Grid Monitor is turned on. If users have installed Condor-G from the VDT, the Grid Monitor is already turned on. See the Grid Monitor details in the Condor 6.8 manual.

However, the Condor-G grid monitor is only a partial solution, for several reasons. First, it requires participation from all clients. Any single client that doesn't use Condor-G or doesn't use the grid monitor can bring GRAM 2 to its knees. Second, there is a single grid monitor per user, so if a lot of individual clients submit jobs there is still a problem. Third, there are rare occasions where the grid monitor fails to work correctly. Fourth, the grid monitor still relies on the job manager to work correctly, and restarts it when a job finishes. If many jobs finish at the same time, there can be a lot of job managers running.

A solution

A solution is to limit how many GRAM 2 job managers can be running at a time. Unfortunately, Globus does not provide a way to do this. We have patched Globus in the VDT to give you a method. Before we explain it, we will explain how a job manager is created.

When a user submits a job, the Globus gatekeeper handles the authentication and authorization of the user. The gatekeeper is started by the standard xinetd process: one gatekeeper is started for each connection. As soon as the gatekeeper has authorized the user successfully, the gatekeeper starts a new process which becomes the job manager. (For you developer types, the gatekeeper uses fork() and exec().) After the job manager is started, the gatekeeper exits.

Xinetd has facilities to limit how many processes are created. However, because the gatekeeper exits after the job manager is created, xinetd will not limit the total number of job managers.
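
The role of the process ID can be illustrated with a small shell experiment. This is an analogy only, not the gatekeeper's code: a process that replaces itself with exec keeps its PID, so whoever started it can still track it, whereas a process that forks a child and exits leaves the child running under a different PID. The function name exec_pids is hypothetical.

```shell
# The outer shell prints its PID, then replaces itself with a second shell via exec.
# The second shell prints its own PID, which is the same number.
exec_pids() {
    sh -c 'echo $$; exec sh -c "echo \$\$"'
}

# Example use (prints the same PID on two lines):
# exec_pids
```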

Our patch to the gatekeeper allows xinetd to control how many job managers are created. To use the new behavior, you need to do two things: edit the gatekeeper configuration and edit the xinetd configuration.

Edit the gatekeeper configuration

To edit the gatekeeper configuration, you need to add a single line to $VDT_LOCATION/globus/etc/globus-gatekeeper.conf. There are three options, and the first two will allow you to rate-limit the job managers with xinetd.

Option: -launch_method dont_fork
Meaning: After authorization, the gatekeeper becomes the job manager, so you can rate limit the job managers with xinetd. (Technically speaking, the gatekeeper just does an exec() instead of a fork()/exec() combination.) This does have one interesting side effect. The globus-gatekeeper.log file will no longer have the following message:
    PID: 19792 -- Notice: 0: Child 19793 started
but will instead have the following message:
    Starting child 12345

Option: -launch_method fork_and_wait
Meaning: This option leaves the gatekeeper running as long as the job manager runs. This is slightly safer than the dont_fork option because it does not change the gatekeeper's log file at all. The downside is that there is a gatekeeper process for each job manager, but these should have a small impact because they are not doing anything interesting.

Option: -launch_method fork_and_exit
Meaning: This is the original behavior that does not let you rate-limit the job managers. This is the default.
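
Adding the line can be scripted. The sketch below appends the chosen option to a configuration file and refuses to act if a -launch_method line is already present; set_launch_method is a hypothetical helper, and it assumes the option goes on a line of its own, as described above.

```shell
# Hypothetical helper: append a -launch_method line to the gatekeeper configuration.
set_launch_method() {
    # $1: gatekeeper config file, $2: dont_fork, fork_and_wait or fork_and_exit
    case "$2" in
        dont_fork|fork_and_wait|fork_and_exit) ;;
        *) echo "unknown launch method: $2" >&2; return 1 ;;
    esac
    if grep -q -- '-launch_method' "$1"; then
        echo "$1 already contains a -launch_method line; edit it by hand" >&2
        return 1
    fi
    printf '%s\n' "-launch_method $2" >> "$1"
}

# Example use:
# set_launch_method "$VDT_LOCATION/globus/etc/globus-gatekeeper.conf" fork_and_wait
```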

Edit the xinetd configuration

Edit the xinetd configuration in three steps:

  1. Disable the Globus gatekeeper. (Note, this will not stop running gatekeepers or job managers.)
    vdt-control --off globus-gatekeeper
    
  2. Edit $VDT_LOCATION/etc/services/xinetd-globus-gatekeeper. Change the following line:
    instances = UNLIMITED
    to something like this:
    instances = 100
  3. Re-enable the Globus gatekeeper:
    vdt-control --on globus-gatekeeper
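
Step 2 can also be done with sed instead of an editor. A minimal sketch follows; set_instances is a hypothetical helper, and 100 is just the example limit from above. It rewrites the value on any line of the form "instances = ..." and leaves the rest of the file alone.

```shell
# Hypothetical helper: set the value of the "instances" line in an xinetd service file.
set_instances() {
    # $1: xinetd service file, $2: new limit (a number or UNLIMITED)
    tmp=$(mktemp) || return 1
    # Keep everything up to and including "instances = ", then replace the value.
    sed "s/^\([[:space:]]*instances[[:space:]]*=[[:space:]]*\).*/\1$2/" "$1" > "$tmp" &&
        cat "$tmp" > "$1"    # write back in place so ownership and mode are kept
    rc=$?
    rm -f "$tmp"
    return $rc
}

# Example use (run between steps 1 and 3):
# set_instances "$VDT_LOCATION/etc/services/xinetd-globus-gatekeeper" 100
```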