Guided Tours: EASY SP
In this section you will find information on how to start using the EASY SP Scheduler on the IBM SP system. (Most of the information, but not all, is valid on the other systems as well.) The EASY scheduler on
Lucidor is described on
PDC EASY - an abstract description
The EASY SP Scheduler originates from Argonne
National Laboratory (ANL), USA. It has been modified here at PDC to better
suit our needs. Most of the modifications make the
scheduler work within our particular SP-2
configuration and take into account that we use AFS and
Kerberos. These changes should be largely transparent to the user,
but to cope with an inhomogeneous machine such as
the SP-2 Strindberg, additional flags have been added to some of
the commands. This document is intended as a supplement to the
EASY User's Guide
(260kB compressed PostScript) from Argonne and covers the PDC local
This document follows the printed section numbering of the
EASY User's Guide found at Argonne as closely as possible. Hence the
somewhat odd numbering in the table of contents of the printed
PDC EASY - Rules
A job is started during one of the following periods: ``Day'',
``Night'' or ``NightWE'', or it is ``Running''. The period
given to a job is decided on the basis of what kind of resources
However, please note that these limits are subject to operator control.
You can list the period limits using the command "spq -L".
EASY has a cycle time of one week. During the cycle each period is
scanned at least once. Time limits are given in GMT. In Sweden,
7 AM GMT therefore corresponds to eight o'clock local time in winter
and nine o'clock in summer.
Jobs are only started if they can complete within
the period. Please note that if the time you request fits
the remaining period exactly, to the minute, EASY will not
start your job if it is delayed by even a few seconds!
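The effect of a request that exactly fills the remaining period can be illustrated with a small shell sketch. It assumes, purely for illustration, that the scheduler compares the expected finish time (start time plus requested time) with the end of the period. The function name `fits_in_period`, the use of epoch seconds and the unit of minutes are hypothetical and not part of EASY.

```shell
#!/bin/sh
# Hypothetical model of the "must complete within the period" rule.
# This is NOT EASY's actual code, only a sketch of the arithmetic.

# fits_in_period START END MINUTES
#   START, END: seconds since the epoch (assumed representation)
#   MINUTES:    requested run time in minutes (assumed unit)
# Exit status 0 if START + MINUTES*60 does not pass END, 1 otherwise.
fits_in_period() {
    finish=$(( $1 + $3 * 60 ))
    [ "$finish" -le "$2" ]
}

period_end=3600   # the period ends one hour from "time zero"

# A 60-minute request that starts exactly on time just fits.
if fits_in_period 0 "$period_end" 60; then
    echo "on time, 60 min: fits"
fi

# The same request delayed by 5 seconds no longer fits.
if ! fits_in_period 5 "$period_end" 60; then
    echo "5 s late, 60 min: does not fit"
fi

# A 55-minute request still fits despite the delay.
if fits_in_period 5 "$period_end" 55; then
    echo "5 s late, 55 min: fits"
fi
```

Under this model, a request slightly shorter than the full period leaves a margin for small start-up delays.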
Jobs will only be able to write their results to disk if they have
valid Kerberos tickets when the write operation occurs. Normally
a ticket expires after 10 hours. For jobs that run longer than
that, tickets with longer lifetimes must be obtained. See the
Guided Tour for details.
For interactive development we recommend the
interactive nodes, see below.
PDC EASY - Commands
The following parts of the ANL Easy command set are supported at PDC. You
get access to them by typing ``module add sp easy local'', which you
should always do anyway.
The following commands are extensions to the ANL Easy command set and
are described below in this text.
The following commands of the ANL Easy command set are not available
PDC EASY - Extensions
Spattach is aimed at interactive users who have MPL, MPI, HPF or similar
codes. However, nothing prevents a batch or PVM user from
making use of it. You run your parallel or serial program either as a
command or in a sub-shell when using spattach.
It is all very similar to running an ordinary Unix program.
Spattach exports information such as the number of nodes and
the names of the nodes to its sub-shell or sub-program. This makes
it quite convenient to use.
Spattach is completely silent unless you tell it otherwise.
Among the switches are help, number of nodes, how
long to run, whether to send mail, how verbose to be and a
By adding the option `-i' you attach to the interactive
pool. Do not ask for more interactive nodes than the
ones which already have been set up. Otherwise you might
have to wait for quite some time.
Be a good neighbor: don't use the interactive nodes in
exclusive mode. You will have to share them with others.
I want a mix of nodes
Since there are different brands of nodes, it is possible to
request a certain mix of nodes for your job. This applies to both
spsubmit and spattach.
This might be useful if you have a code working in a master/slave
fashion and need more memory or larger disks on one of your nodes.
# I need one W node and eight T nodes
strindberg% spsubmit -p 1W8T -t 60 -j mpi ./a.out
The example will give you 9 nodes, with the `W' node being first.
Specifying -p1W1T1Z1T will give you four nodes, the first being
of brand W, the second a T, a Z and a T again.
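To see which node brands a given mix specification asks for, and in what order, the following shell sketch expands a specification such as `1W8T` into one brand letter per line. The helper `expand_mix` is hypothetical and not part of EASY. It only mirrors the count-then-letter notation used in the examples above, and it assumes that every letter is preceded by a count.

```shell
#!/bin/sh
# Hypothetical helper, not an EASY command: expand a node mix
# specification (as passed to -p) into one brand letter per line.

expand_mix() {
    spec=$1
    while [ -n "$spec" ]; do
        # Leading digits form the count: strip everything from the
        # first non-digit character onwards.
        count=${spec%%[!0-9]*}
        rest=${spec#"$count"}
        # The first character after the count is the brand letter.
        brand=${rest%"${rest#?}"}
        if [ -z "$count" ]; then
            echo "bad mix specification: $1" >&2
            return 1
        fi
        case $brand in
            [A-Za-z]) ;;
            *) echo "bad mix specification: $1" >&2
               return 1 ;;
        esac
        # Print the brand letter once per requested node.
        i=0
        while [ "$i" -lt "$count" ]; do
            echo "$brand"
            i=$((i + 1))
        done
        # Continue with whatever follows the brand letter.
        spec=${rest#?}
    done
}

# The two specifications from the text.
expand_mix 1W8T
expand_mix 1W1T1Z1T
```

For `1W8T` the sketch prints nine lines, with `W` first, matching the description of the spsubmit example.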
PDC Supplement to ANL EASY
Spsummary displays a short user summary, which includes the number of jobs
started, the number of dequeued jobs and the aggregate allocation time in minutes.
Spwhen gives an estimate of when a job might start. The estimate
is based upon the current situation: how many nodes are considered
up and running, the end time of the jobs currently running and what
the jobs waiting in line look like.
The results of spwhen change if a job that is ahead in line
terminates sooner than expected. Consider spwhen to show
a current best guess.
Suppose, for example, that there are three jobs in the machine: a small
one currently running, a big one second in line, and a third, small
one that could be backfilled. Spwhen might estimate that
the third job will be able to start immediately.
However, if the job currently running terminates sooner than
expected, it is the big job's turn (big as in many nodes), not the
third one's, which is then not backfilled.
In other words, spwhen estimates are fragile.
The Spwhen applet is a Java applet you run from your browser. It gives
you a graphical representation of the job queue which you can use
to get an estimate of when your job will start. To run the applet
and for further details on how to use it, see the spwhen
Spjobstatistic displays statistics about a job. The statistics
include CPU utilization, memory usage, process history, disk
usage and network usage, all of which are collected during execution of
The statistics should, like all statistics, be taken with a grain
of salt. They are meant as a tool for locating errors and
improving efficiency, not for evaluating different programs against
Spjobstatistic has three levels of verbosity.
The first, when given only a job id, will display an overview of
all aspects of the job.
strindberg% spjobstatistic 1998120106311635
The second, when given a job id and a category (`cpu', `net',
`unix' or `disk'), will display an overview of all aspects of the
strindberg% spjobstatistic 1998113010004999 cpu net
The third, when given a job id, a nodename and a category,
will display detailed statistics for category on the node
strindberg% spjobstatistic 1998112808473926 r01n05 cpu
The following variables can be used within a batch-script submitted
SP_JID          Job ID given by EASY.
SP_EASY_HOME    Home directory of EASY.
SP_ARGS         From spsubmit's ``Command Line Arguments''.
SP_INITIALDIR   From spsubmit's ``Initial directory''.
SP_SUBMIT_HOST  The node from which the job was submitted.
SP_PROCS        Number of allocated nodes.
SP_NODES        Allocated nodes.
SP_HOSTFILE     The file that contains all allocated
                host names, one host per row.
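As an illustration of how these variables might be used, the sketch below shows a function that could be placed at the top of a batch script to log what was allocated. The function name `report_allocation` and the output format are hypothetical. Every variable is guarded with a default, on the assumption that none of them are set when the script is run outside EASY.

```shell
#!/bin/sh
# Sketch of a batch script fragment that uses the variables above.
# report_allocation is a hypothetical helper, not part of EASY.

report_allocation() {
    echo "job ${SP_JID:-unknown} submitted from ${SP_SUBMIT_HOST:-unknown}"
    echo "nodes allocated: ${SP_PROCS:-0}"
    if [ -r "${SP_HOSTFILE:-}" ]; then
        # The host file holds one allocated host per row.
        n=0
        while IFS= read -r host || [ -n "$host" ]; do
            [ -n "$host" ] || continue   # skip empty rows
            n=$((n + 1))
            echo "host $n: $host"
        done < "$SP_HOSTFILE"
    else
        echo "SP_HOSTFILE is not set or not readable" >&2
    fi
}

report_allocation
```

Run outside EASY, the fragment only reports the defaults and a warning on standard error. Inside a job it would list the hosts found in the host file, one per line.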