Working Efficiently on Abacus4
The status of the queue can be inspected using llq.
To see why a job is waiting and does not start, use the option -s.
Information about the LoadLeveler classes which are available can be
obtained with llclass. In particular, the maximum CPU time which a job
in the class can consume is shown.
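The queries described above might be wrapped as follows. This is only a sketch: it assumes the standard LoadLeveler commands llq and llclass are on your PATH, and the function names job_status and class_overview are hypothetical helpers, not part of LoadLeveler.

```shell
# Sketch only: assumes the standard LoadLeveler commands llq and llclass.
# job_status is a hypothetical helper name, not part of LoadLeveler.
job_status() {
    if ! command -v llq >/dev/null 2>&1; then
        echo "llq not found: this is not a LoadLeveler system" >&2
        return 1
    fi
    if [ -n "${1:-}" ]; then
        llq -s "$1"                    # explain why this job has not started
    else
        llq -u "${USER:-$(id -un)}"    # list only your own jobs
    fi
}

# Hypothetical helper: show the available classes and their limits.
class_overview() {
    llclass
}
```

Calling job_status without arguments lists your own jobs; calling it with a job ID asks LoadLeveler why that job is still waiting.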
If, because of maintenance, no more jobs should be started on a
specific node, the class
will be drained first, then
. For this reason, there are occasionally more
resources available in the classes
than there are
In summary, the actual requirements for CPU time should be estimated
as exactly as possible, so that a job which would run for a rather
short time does not have to compete for resources with longer-running jobs.
On Abacus4, there are several commands, such as df, which are
available in multiple versions: from AIX as well as from a different
source, e.g. GNU. If the environment variable
is set, then man
returns only the non-AIX pages. The AIX pages can be obtained
as shown in the last two rows below:
man df | Manpage for GNU
man C df | Manpage for AIX
man -M/ df | Manpage for AIX
Parallel jobs on Abacus4 should in general be submitted so that a
small number of tasks is run on each of several nodes, rather than a
large number of tasks on a single node. Sufficient ConsumableResources
are more likely to be available for such jobs and the memory
utilisation for the cluster as a whole will be improved.
Thus, to run a job requiring 8 tasks, the following would be sensible choices:
# @ node = 2
# @ tasks_per_node = 4

or

# @ node = 4
# @ tasks_per_node = 2
Because the InfiniBand connection between the nodes is both very fast
and has a very low latency, the impact on performance of using
multiple nodes is minimised.
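To see which node and tasks_per_node combinations exist for a given total number of tasks, a small helper such as the following can be used. It is a sketch only: list_layouts is a hypothetical function name, not a LoadLeveler command, and it simply lists every exact divisor of the task count.

```shell
# Print every node / tasks_per_node combination for a total task count.
# list_layouts is a hypothetical helper, not a LoadLeveler command.
list_layouts() {
    total=$1
    nodes=1
    while [ "$nodes" -le "$total" ]; do
        # Only keep splits where every node gets the same number of tasks.
        if [ $((total % nodes)) -eq 0 ]; then
            echo "node = $nodes, tasks_per_node = $((total / nodes))"
        fi
        nodes=$((nodes + 1))
    done
}
```

For example, list_layouts 8 prints four combinations. Following the advice above, the ones that spread a small number of tasks over several nodes are preferable to the one that puts all 8 tasks on a single node.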
Access to the HPC systems is only available via secure methods such as SSH and SCP.
Please note that the following will only work within the FU network. If
you are outside the network, you first need to set up a
VPN connection (the instructions are in German).
To connect to one of the HPC systems, the following command is used:
$ ssh <username>@<system name>.zedat.fu-berlin.de
$ ssh email@example.com
You will then be asked for your ZEDAT password.
If you want to start a program on the remote system which opens a window on your
Linux computer under X, use the option -X. This also works on Mac OS X, but
from version 10.8 on, the package XQuartz must be installed.
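A login with X11 forwarding could be wrapped as in the sketch below. It assumes the OpenSSH client and its -X option for X11 forwarding. The function names hpc_address and hpc_login_x11 are hypothetical, and the user and system names are placeholders that you replace with your own.

```shell
# Build the login address from a user name and a system name.
# hpc_address and hpc_login_x11 are hypothetical helper names.
hpc_address() {
    printf '%s@%s.zedat.fu-berlin.de\n' "$1" "$2"
}

# Log in with X11 forwarding enabled (-X), so that graphical programs
# started on the remote system open their windows on the local display.
hpc_login_x11() {
    ssh -X "$(hpc_address "$1" "$2")"
}
```

You would call it as hpc_login_x11 <username> <system name> and then start the graphical program from the remote shell.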
With the Linux command sshfs,
you can set up a local directory that refers to a directory on a remote machine.
First you create a local directory on your own Linux machine, e.g.
$ mkdir my_remote_dir
You then enter the following:
$ sshfs <username>@<system name>.zedat.fu-berlin.de: my_remote_dir
To close the connection, use:
$ fusermount -u my_remote_dir
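The mount and unmount steps can be combined into two small helpers, as in the following sketch. It assumes sshfs and fusermount are installed on your Linux machine. The function names mount_remote and unmount_remote are hypothetical, and the check for an empty directory is an added precaution rather than something the commands above require.

```shell
# mount_remote and unmount_remote are hypothetical helper names.
mount_remote() {
    remote=$1    # e.g. <username>@<system name>.zedat.fu-berlin.de:
    dir=$2       # local mount point, e.g. my_remote_dir
    mkdir -p "$dir" || return 1
    # Refuse to mount over a directory that already contains files,
    # because the mount would hide them until it is released again.
    if [ -n "$(ls -A "$dir")" ]; then
        echo "$dir is not empty, refusing to mount" >&2
        return 1
    fi
    sshfs "$remote" "$dir"
}

unmount_remote() {
    fusermount -u "$1"
}
```

A session would then consist of mount_remote "<username>@<system name>.zedat.fu-berlin.de:" my_remote_dir, followed later by unmount_remote my_remote_dir.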