|
- How do I get an account on the Beowulf?
- How do I connect to the machine and what's its name?
- Connecting without password to asgard.
- How do I compile an MPI program on the machine?
- And running the compiled program?
- One or two processors per node (ppn=2) due to memory or multiple threads?
- What do I do if I exceed my walltime limit?
- How can I get more information about my submitted jobs?
- More about the queueing system?
- Distributing the data before starting the computations?
Questions and Answers
- How do I get an account on the Beowulf?
- Fill in this form and send it to your local Beowulf
coordinator (see "Group reference people" at the bottom) by mail.
Account Form for Hreidar (GMS 060830)
========================
Institution (mark one of following)
math (D-MATH)
phys (D-PHYS)
werk (D-MATL)
geop (D-ERDW)
biol (D-BIOL)
bio2 (D-BIOL Prof. Ban)
bio3 (D-BIOL Prof. Pelkmans)
other
Group:
Name:
First Name:
Phone:
E-Mail:
Account validity:
In absence of a limiting date, a length of time for which the
account should remain valid or the word "indefinite", the
validity of the account will be restricted to 6 months.
Login Name (nethz-username):
Initial Password (first 4 letters):
The finishing characters of the password will be sent to the
applicant with the confirmation of the account creation. The
resulting initial password must be changed on the 1st login.
Shell (mark one of following)
bash
tcsh
To be e-mailed to the System Management by the Group Contact
Your local Beowulf coordinator will pass this form to the system
administrator and you will be informed when and if your account is
created.
- How do I connect to the machine and what's its name?
- Access to the machine is allowed only through secure shell
(ssh). That goes for login as well as file transfer. On special
occasions file transfer might also be done by ftp connection
initiated FROM Asgard towards the external machine.
The name of the machine is asgard[.ethz.ch], so that the login
will usually be effected by the command "ssh asgard".
On Unix machines ssh is usually available or can be installed by
the (local) system manager. The sources are available e.g. at the
Sunsite mirror of Switch. For PCs and Macs there is a site license
available at the ETHZ (search for "secure").
It is recommended to log out if you are not using the machine over
a period of several hours or over night (otherwise you will miss
the news in the motd).
The machine is configured to use ssh 2 (there is no fall back to ssh 1).
- Connecting without password to asgard.
- Generate a key pair on your normal account with ssh-keygen. This key pair consists of a
private and a public key. The private key can be protected with a
passphrase. Since the file containing the private key is protected
with your normal login password, this is not really necessary: you
can give an empty passphrase (i.e. hit enter). Then copy the
public key (for example $HOME/.ssh/identity.pub; the exact file name
depends on the key type and the ssh version) to asgard and append it to the
file $HOME/.ssh/authorized_keys. Now
you should be able to log into asgard without giving your asgard
password. If you left the passphrase empty, you don't need to
type anything.
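The "append the public key to authorized_keys" step can be sketched as a small shell function, meant to be run on asgard once the public key file has been transferred there. The function name add_authorized_key and its arguments are made up for this illustration, and tightening the permissions assumes that the ssh server insists on private key files and directories, which is the usual default.

```shell
# Hypothetical helper: append a public key to an authorized_keys file.
# $1 = public key file
# $2 = authorized_keys file (default: $HOME/.ssh/authorized_keys)
add_authorized_key() {
    pubkey=$1
    authfile=${2:-$HOME/.ssh/authorized_keys}
    authdir=$(dirname "$authfile")

    [ -r "$pubkey" ] || { echo "cannot read $pubkey" >&2; return 1; }

    # Create directory and file if missing and keep them private
    # (assumption: the ssh server rejects group/world-writable key files).
    mkdir -p "$authdir" && chmod 700 "$authdir" || return 1
    touch "$authfile" && chmod 600 "$authfile" || return 1

    # Append the key only if the identical line is not already present.
    if ! grep -qxF -f "$pubkey" "$authfile"; then
        cat "$pubkey" >> "$authfile"
    fi
}
```

Called as "add_authorized_key mykey.pub" (mykey.pub being whatever name you gave the transferred file), it can be repeated safely, since a key that is already listed is not added twice.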
If you are nevertheless asked "Are you sure you want to
continue connecting (yes/no)?" because the host is not known
to your system, your ssh configuration needs fixing.
If you answer yes to the above question, ssh tries to
add the host key of the remote host to the file known_hosts. If this
file is not found or not writable, the operation fails and you are
asked over and over again. You can fix this by specifying
UserKnownHostsFile $HOME/.ssh/known_hosts
in the file $HOME/.ssh/config.
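As a sketch of that configuration step, the following function appends the setting to a per-user ssh configuration file unless one is already there. The function name set_known_hosts_file is hypothetical, and wrapping the option in a "Host *" block is my assumption, made so that the line does not accidentally end up inside an existing host-specific section.

```shell
# Hypothetical helper: make sure the ssh client configuration names a
# known_hosts file.
# $1 = configuration file (default: $HOME/.ssh/config)
set_known_hosts_file() {
    config=${1:-$HOME/.ssh/config}
    mkdir -p "$(dirname "$config")" || return 1

    # Leave the file alone if a UserKnownHostsFile line already exists.
    if ! grep -qs '^[[:space:]]*UserKnownHostsFile' "$config"; then
        # $HOME is expanded here, so the file contains an absolute path.
        printf 'Host *\n    UserKnownHostsFile %s/.ssh/known_hosts\n' "$HOME" >> "$config"
    fi
    chmod 600 "$config"
}
```

Run it once on the machine you connect from, then answer yes to the host key question one last time.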
- How do I compile an MPI program on the machine?
- Log into the machine and compile the program using mpicc (for LAM) or /usr/local/apli/mpich/bin/mpicc (for
MPICH)
instead of gcc. This automatically adds
the necessary include paths, libraries and library paths.
- And running the compiled program?
- You need to submit a job to the queueing system. This is done with
the qsub command. The method is somewhat
different for LAM and MPICH, so I have set up an example for each of
them: LAM example and MPICH example.
Submitting the job is done using
qsub -l nodes=4 pbs.script where 4 is the
number of requested processors.
When you enter the given qsub
command, a job identifier is returned. It may look like this:
$ qsub -l nodes=4 pbs.mpich
39717.asgard01.ethz.ch
When the job has finished, two new files should be in the
directory: pbs.mpich.o39717 (contains
the output of the pbs.mpich script to
standard out) and pbs.mpich.e39717 (contains
the output of the pbs.mpich script to
standard error).
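The naming rule for the two files can be written down as a short shell function. This is only a sketch: the function name job_output_files is made up, and it assumes that the job name is simply the file name of the submitted script, as in the example above.

```shell
# Hypothetical helper: print the names of the stdout and stderr files
# that appear after the job has finished.
# $1 = script name given to qsub, $2 = job identifier returned by qsub
job_output_files() {
    script=$(basename "$1")
    jobid=$2
    seq=${jobid%%.*}        # "39717.asgard01.ethz.ch" -> "39717"
    echo "$script.o$seq"    # standard output of the script
    echo "$script.e$seq"    # standard error of the script
}

# Typical use on the cluster (not executed here):
#   jobid=$(qsub -l nodes=4 pbs.mpich)
#   job_output_files pbs.mpich "$jobid"
```

Capturing the identifier in a variable at submission time, as in the commented lines, saves you from copying it by hand later.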
- One or two processors per node (ppn=2) due to memory or multiple threads?
- All Asgard nodes are (at the moment) dual-CPU machines with 1 GB of
memory and 1 GB of swap space per node. This imposes certain
requirements on the usage of the nodes. Some of the main points
are described in the following paragraphs.
- Single-process jobs
- If your process uses two processors (threading), you should
reserve both of them by using ppn=2
in the qsub call. If you don't do that, you are potentially
allowing another job to run on the same node. This job will
then take away from you the resources that you expect to be
using.
If your process needs between 1/2 and 1 GB of memory, you
should reserve the whole node, by specifying ppn=2. If you don't do that, you are
risking that either another user or even another of your own
jobs using between 1/2 and 1 GB memory will be run on the same
node. The two jobs together could then need more than the
available physical memory and start swapping. In such a case
the walltime can potentially increase by a factor of 10 or
more, causing the job to be aborted when it hits its time limit.
If your process needs between 1 and 2 GB of memory, it is
paramount to use the ppn=2
parameter. Even so, it is practically guaranteed that your job
will swap and the ratio of CPU time to wallclock time will become pretty
bad. You should start wondering if Asgard is the right machine
for you (or if you have money to buy more memory for it).
If your process needs more than 2 GB of memory, you should run
it somewhere else. I have seen jobs saying they used more than
that, but I would attribute it more to bad accounting or good
luck than to anything else.
- Parallel jobs
- If you are running parallel jobs (i.e. using more than 1
node), you should practically ALWAYS use the ppn=2 parameter. That will guarantee that
you are not sharing the node with another user who is doing
strange things, it will make scheduling easier (the scheduler
does NOT guarantee that two of your ppn=1 jobs will
run on the same nodes), and it will help with the automatic
process deletion at the end of your job.
With MPI you do not have to reprogram your code to be able to
run two processes on one node, but depending on the MPI
version you use, your batch script might need a slight
adjustment.
Of course, all the comments made about the single-process job
memory still apply (i.e. "1/2 to 1 GB: one process per node,
1-2 GB: you will be SLOW, more than 2 GB: forget it").
If you want to use both CPUs on a node, add ppn=2 as shown in this example:
qsub -l nodes=4:ppn=2 pbs.lam
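The rules of thumb for a single process can be condensed into a small shell function. It is only a sketch of the advice above: the name suggest_ppn is made up, memory is given in MB, and the treatment of the exact boundary values (512, 1024 and 2048 MB) is my own choice, since the text does not define them.

```shell
# Hypothetical helper encoding the memory/thread advice for one process.
# $1 = memory needed in MB, $2 = number of threads (default 1)
suggest_ppn() {
    mem_mb=$1
    threads=${2:-1}
    if [ "$mem_mb" -gt 2048 ]; then
        echo "run elsewhere"            # more than 2 GB
    elif [ "$mem_mb" -gt 1024 ]; then
        echo "ppn=2 (expect swapping)"  # between 1 and 2 GB
    elif [ "$mem_mb" -gt 512 ] || [ "$threads" -ge 2 ]; then
        echo "ppn=2"                    # 1/2 to 1 GB, or two threads
    else
        echo "ppn=1"                    # small, single-threaded process
    fi
}
```

For instance, "suggest_ppn 800" reports ppn=2, which corresponds to reserving the whole node for a process that needs between 1/2 and 1 GB.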
- What do I do if I exceed my walltime limit?
- An empty pbs.mpich.e39717 means no
errors. If you get
=>> PBS: job killed: walltime 654 exceeded limit 600
your job was running too long. Try using
qsub -l nodes=4,walltime=1200 pbs.mpich
where 1200 is the number of seconds your job is allowed to
run. See the man page of qsub for more
information.
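Since qstat reports time limits as hh:mm:ss while the example above uses seconds, a small conversion function may help when comparing the two. The name walltime_hms is made up for this sketch. As far as I know, PBS also accepts the hh:mm:ss form in the walltime request, but check the qsub man page on the machine to be sure.

```shell
# Hypothetical helper: convert a number of seconds to hh:mm:ss.
# $1 = time in seconds
walltime_hms() {
    secs=$1
    printf '%02d:%02d:%02d\n' \
        $((secs / 3600)) $((secs % 3600 / 60)) $((secs % 60))
}

# The default limit of 600 seconds and the raised limit of 1200 seconds:
walltime_hms 600     # prints 00:10:00
walltime_hms 1200    # prints 00:20:00
```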
- How can I get more information about my submitted jobs?
- You can get a list of all submitted jobs at a specific node using
qstat:
$ qstat -a
asgard01.ethz.ch:
                                                            Req'd  Req'd   Elap
Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
37949.asgard01. parcom04 small    job_s3         --   4  --     -- 00:10 R    --
39717.asgard01. pfrauenf small    pbs         30267   4  --     -- 00:10 R 00:01
39720.asgard01. parcom14 small    job_a1      30424   8  --     -- 00:10 R 00:00
$ qstat -a @n69
n69.asgard.net:
                                                            Req'd  Req'd   Elap
Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
47323.asgard01. avoellmy large    transims.b  18184  96  --  500mb 08:00 R 00:04
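If the listing gets long, a short awk filter can summarize it. The following sketch adds up the NDS column per user for running jobs. The function name nodes_per_user is made up, and the column positions are taken from the listings above, so it assumes that job names contain no blanks.

```shell
# Hypothetical helper: total number of requested nodes per user,
# counting only jobs in state R. Reads "qstat -a" output on stdin.
nodes_per_user() {
    awk '$1 ~ /^[0-9]+\./ && $10 == "R" { n[$2] += $6 }   # job lines only
         END { for (u in n) print u, n[u] }' | sort
}

# Typical use on the cluster (not executed here):
#   qstat -a | nodes_per_user
```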
With qstatall you can get information
about the jobs on all queueing servers, but beware, there may be
quite a lot of them, so it is better to use qstatall -u
username.
Or you can get more information about a particular job:
$ qstat -f 39717.asgard01.ethz.ch
Job Id: 39717.asgard01.ethz.ch
Job_Name = pbs
Job_Owner = pfrauenf@gate01.asgard.net
resources_used.cput = 00:00:39
resources_used.mem = 6668kb
resources_used.vmem = 23444kb
resources_used.walltime = 00:01:39
job_state = R
queue = small
server = asgard01.ethz.ch
Checkpoint = u
ctime = Fri May 12 10:25:11 2000
Error_Path = asgard01.ethz.ch:/asgard/home/math/pfrauenf/mpich/latbw/pbs.e3
9717
exec_host = n77/1+n76/1+n75/1+n74/1
Hold_Types = n
Join_Path = n
Keep_Files = n
Mail_Points = a
mtime = Fri May 12 10:25:12 2000
Output_Path = asgard01.ethz.ch:/asgard/home/math/pfrauenf/mpich/latbw/pbs.o
39717
Priority = 0
qtime = Fri May 12 10:25:11 2000
Rerunable = True
Resource_List.nodect = 4
Resource_List.nodes = 4
Resource_List.walltime = 00:10:00
session_id = 30267
Variable_List = PBS_O_HOME=/asgard/home/math/pfrauenf,
PBS_O_LANG=de_CH.ISO-8859-1,PBS_O_LOGNAME=pfrauenf,
PBS_O_PATH=/usr/local/apli/KAI/KCC.flex-3.4g-1/KCC_BASE/bin:/usr/local
/apli/bin:/usr/pbs/bin:/asgard/home/math/pfrauenf/bin:/usr/local/bin:/u
sr/bin:/usr/X11R6/bin:/bin:/usr/games/bin:/usr/games:/opt/gnome/bin:.,
PBS_O_MAIL=/var/spool/mail/pfrauenf,PBS_O_SHELL=/bin/bash,
PBS_O_HOST=asgard01.ethz.ch,
PBS_O_WORKDIR=/asgard/home/math/pfrauenf/mpich/latbw,
PBS_O_QUEUE=feed01
comment = Job started on Fri May 12 at 10:25
etime = Fri May 12 10:25:11 2000
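To pick a single attribute out of this long listing, a filter along the following lines can be used. It is a sketch only: the function name qstat_field is made up, and values that qstat wraps over two lines (such as Error_Path above) are returned only up to the line break.

```shell
# Hypothetical helper: print the value of one attribute from the
# "qstat -f" output read on standard input.
# $1 = attribute name, e.g. job_state or exec_host
qstat_field() {
    awk -v key="$1" '
        { sub(/^[ \t]+/, "") }          # drop leading indentation
        index($0, key " = ") == 1 {     # line starts with "name = "
            print substr($0, length(key) + 4)
            exit
        }'
}

# Typical use on the cluster (not executed here):
#   qstat -f 39717.asgard01.ethz.ch | qstat_field exec_host
```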
- More about the queueing system?
- Have a look at the man pages of qalter,
qdel, qhold,
qmove, qmsg,
qrerun, qrls,
qselect, qstat
and qsub.
qservers shows how many jobs are
submitted to the different queue servers:
$ qservers
Server            Max  Tot  Que  Run  Hld  Wat  Trn  Ext Status
---------------- ---- ---- ---- ---- ---- ---- ---- ---- ----------
asgard01.ethz.ch    0    3    0    3    0    0    0    0 Active
n1.asgard.net       0    0    0    0    0    0    0    0 Idle
n23.asgard.net      0   84   36   48    0    0    0    0 Active
n46.asgard.net      0  141    5  136    0    0    0    0 Active
n69.asgard.net      0    1    0    1    0    0    0    0 Active
qstatall -Q shows the distribution of the
jobs in the queues:
$ qstatall -Q
Queue             Max  Tot  Ena  Str  Que  Run  Hld  Wat  Trn  Ext Type
---------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----------
creeper_r           0    0  yes  yes    0    0    0    0    0    0 Route
feed01              0    0  yes  yes    0    0    0    0    0    0 Route
small               0    1  yes  yes    0    1    0    0    0    0 Execution
medium              0    1  yes  yes    1    0    0    0    0    0 Execution
midnight            0    0  yes   no    0    0    0    0    0    0 Execution
l1                  0    0  yes  yes    0    0    0    0    0    0 Route
l2                  0    0  yes  yes    0    0    0    0    0    0 Route
large_r             0    0  yes  yes    0    0    0    0    0    0 Route
runner_r            0    0  yes  yes    0    0    0    0    0    0 Route
mystery             0    0  yes   no    0    0    0    0    0    0 Route
huge                0    0  yes   no    0    0    0    0    0    0 Route
feed1               0    0  yes  yes    0    0    0    0    0    0 Route
huge_a              0    0  yes   no    0    0    0    0    0    0 Execution
feed23              0    0  yes  yes    0    0    0    0    0    0 Route
creeper            48   83  yes  yes   35   48    0    0    0    0 Execution
feed46              0    0  yes  yes    0    0    0    0    0    0 Route
runner            138  141  yes  yes    5  136    0    0    0    0 Execution
feed69              0    0  yes  yes    0    0    0    0    0    0 Route
large               0    1  yes  yes    0    1    0    0    0    0 Execution
- Distributing the data before starting the computations?
- In the Modus operandi, sections 'Disk space'
and 'Efficiency tips', users are advised to copy their data (and
programs) to the local scratch disk of each compute node before
starting the actual computation.
The following predefined variables are available:
- HOME
- Your home directory.
- HOME_SRV
- Server for the HOME directory.
- WORK
- Your working directory. By default it is identical to HOME. If you need a REALLY big space you can
apply for it separately (through your contact person).
- WORK_SRV
- Server for the WORK directory.
- ARCH
- Your archive directory. By default it is identical to HOME. If some of your data is not used on a
daily production basis but eats up your HOME or WORK space,
you should consider using the archive server.
CAUTION! This directory is NOT to be used by batch jobs.
CAUTION! This directory should be limited to files which
change only sporadically (if at all).
You can apply for ARCH space through
your contact person.
- ARCH_SRV
- Server for the ARCH directory.
- LOCAL_SCR
- Your scratch directory on the compute node. If it doesn't exist,
you can create it yourself (mkdir
$LOCAL_SCR). The files in this space should exist only
while you are using the compute node (your responsibility).
Examples of setting up the local scratch directory and copying a
data file (simply called file here) from the file server:
- bash Shell
-
if [ ! -d "$LOCAL_SCR" ] ; then mkdir "$LOCAL_SCR" ; fi
rcp "$WORK_SRV:$WORK/file" "$LOCAL_SCR"
- csh and tcsh Shells
-
if (! -d $LOCAL_SCR) mkdir $LOCAL_SCR
rcp ${WORK_SRV}:$WORK/file $LOCAL_SCR
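Since cleaning up the local scratch space is your responsibility, a batch script can create a per-job subdirectory and remove it again when the script ends. The following is a sketch under several assumptions: the fallback value for LOCAL_SCR only exists so that the lines can be tried outside Asgard, PBS_JOBID is the job identifier that PBS sets inside a running job, and my_program and result are hypothetical names.

```shell
#!/bin/sh
# LOCAL_SCR is predefined on Asgard; the fallback is an assumption for
# trying this sketch on another machine.
LOCAL_SCR=${LOCAL_SCR:-${TMPDIR:-/tmp}/scratch-$(id -un)}

# One subdirectory per job, so that two jobs sharing a node do not
# delete each other's files.
JOB_SCR=$LOCAL_SCR/${PBS_JOBID:-manual.$$}
mkdir -p "$JOB_SCR" || exit 1

# Remove the job directory when the script exits.
trap 'rm -rf "$JOB_SCR"' EXIT

# Stage in, compute, stage out (commented out: needs the file servers).
#   rcp "$WORK_SRV:$WORK/file" "$JOB_SCR"
#   cd "$JOB_SCR" && ./my_program file > result
#   rcp "$JOB_SCR/result" "$WORK_SRV:$WORK/"
```

A job that is killed outright may not get the chance to run the trap, so it is still worth checking the scratch directory by hand from time to time.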
|