The MCC computer cluster of LDM at PSI

Accounts

Log in: To log in to the cluster's master node, open a shell/console and type

ssh username@mcc

You will then be prompted for your password. Once in mcc, you can access all other nodes by doing:

ssh nodename

where nodename is one of the following: mcc20, mcc21, mcc22, mcc23, mcc24, mcc25, mcc26, mcc27, mcc28, mcc29, mcc30, mcc31, mcc32, mcc01, mcc02, mcc03, mcc04. Depending on the account that you have, certain differences apply regarding the place (i.e., folder) from which you can run your jobs and the nodes that you can use. Those differences are discussed below.
Should you have any questions or problems regarding your account, contact U. Filges and E. Rantsiou.

Account:
  • l_mc01 (LNS group users)
  • This is a common account with many users, and its use is only for LNS group members. Users of this account submit their jobs and store their files, executables, etc., within the /home/l_mc01/mpi directory. Each user with access to this account has either his/her own subdirectory in /home/l_mc01/mpi and/or access to existing 'project' directories there.
    Jobs submitted under that account can make use of the LNS part of the cluster, i.e., nodes mcc01 - mcc04.
    ATTENTION LNS MATLAB USERS: LNS group members who run extensive, memory-demanding jobs should prefer running them on node mcc04, as it comes with 192 GB of memory!
  • l_yourlastname (LDM/GFA group users)
  • Those are individual accounts (only one user per account) for the LDM and GFA group members. Users of these accounts submit their jobs and store their files, executables, etc., within the /home/l_yourlastname directory. The user can create his/her own folders within that directory and submit jobs from anywhere within /home/l_yourlastname.
    Jobs submitted under those accounts can make use of nodes mcc20-mcc32, i.e., only the LDM/GFA part of the cluster (read the section Running Parallel Jobs for details).

Software

A variety of software and compilers are available on MCC. Some of them are available as modules, which means that the user has to 'load' them before being able to use them. The command module avail shows a list of the available modules. Then, by doing module load software (where 'software' is one of the items in the list that module avail gave), the desired software is loaded. A module unload software will unload the specific software. Finally, module list shows the software that is currently loaded.
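For example, a typical module session on MCC might look like the following sketch (the module name mcstas is only an assumed example; use whatever names module avail actually lists):

module avail            # list all modules available on the cluster
module load mcstas      # load a module ('mcstas' is an assumed example name)
module list             # show the modules currently loaded
module unload mcstas    # unload the module again when no longer needed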
Here is some information regarding some of the available software and their use on the cluster.
  • McStas
  • How to use McStas @ PSI:

    The McStas Linux version can be found on the following clusters: llc.psi.ch and lnsl15.psi.ch with any AFS account, and on MCC (the parallel cluster of LDM, under authorisation of U. Filges).

    McStas can be started from the command line (mcstas *.instr) or, preferably, using the graphical user interface by calling mcgui in a terminal (for Windows users, an X-Windows client such as ReflectionX is required). See the following pdf documentation.

    Version 1.12c of McStas is currently available @ PSI, running with three graphical interfaces (PGPLOT, Matlab, Scilab). If possible, it is recommended to work with PGPLOT.

    McStas can be installed on a single Windows computer on request (please contact U. Filges or E. Rantsiou).

    Most of the neutron beam-lines of the SINQ facility have been modelled; descriptions and instrument files (*.instr, the McStas geometry input) can be found on this page (work in progress).

    General information is available on the Official McStas webpage and on the ILL McStas webpage.

    For details on how to run McStas in parallel, check the Running Parallel Jobs section below.

  • MCNPX
  • The following information is only relevant to users with access to the MCNP source code. Users who simply have an MCNP executable can skip forward to "Running Parallel Jobs".
    Since MCNP comes with individual (personal) licenses, each user who obtains an MCNP license has to install his/her version locally in their own account. The software comes with installation instructions, and if you have been granted access to the source code, you most likely already know how to compile and install it. However, here are some basic instructions for those users who wish to compile their own version of MCNP on MCC:
    Installation instructions for MCNPX

Running Parallel Jobs

As of July 2012, the LDM/GFA part of MCC is equipped with the batch system SLURM (v2.4.1). In practice, this means that all runs on nodes mcc20-mcc32 must be submitted using a submission script.

Submission Script Basics

The submission script used to start a run should look something like this: Example submission script (if you are a McStas or MCNPX user, make sure to read the section "Submission scripts for McStas and MCNPX").
You can copy this file and use it as your submission script (you can rename it to anything you like) after making the necessary alterations. For that, take a look at the instructions below and at the explanations provided within the file.

Of all the lines contained in the submission script, five must be altered so that they correspond to your current run (a complete example sketch follows this list):

#SBATCH -J job_name
job_name should be replaced by the name you want your run to have when it appears in the queue list.
#SBATCH -N 6
6 should be replaced with the number of nodes * you wish your run to occupy.
#SBATCH --time=00:30:00
The duration of your run in hh:mm:ss. Always give an accurate estimate of your run time (plus some margin) to make sure that your job is not terminated by the batch system before it completes.
#SBATCH --partition=short
short should be replaced with the right partition name. Read below for the description of the available partitions.
mpirun -np #ofcores your_executable
And this is the actual "run" command, where #ofcores should be replaced with the number of cores * you wish your run to use (see below for the limitations on the number of cores you can use per run, depending on the partition you are running in).

(*) Keep in mind that each node has 24 cores, as described at the beginning of this page.
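To illustrate, a minimal sketch of a complete submission script built from these five lines might look as follows (the job name, node count, time limit, partition and executable name are all placeholders to adapt to your run):

#!/bin/bash
#SBATCH -J my_simulation        # name shown in the queue list (placeholder)
#SBATCH -N 6                    # number of nodes (each node has 24 cores)
#SBATCH --time=00:30:00         # expected duration in hh:mm:ss
#SBATCH --partition=short       # one of: short, medium, long, test

# 6 nodes * 24 cores = 144 cores; replace your_executable with your program
mpirun -np 144 your_executable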

Choosing Partition for Your Run

There are currently 4 available partitions (typing "sinfo" in your terminal will give you the following info):

short : Maximum run duration of 1 hr (no upper limit on amount of cores to use).
medium : Maximum run duration of 2 days. Maximum number of available cores for a medium run is 144 ( = 6 nodes).
long : Maximum run duration of 7 days. Maximum number of available cores for a long run is 168 ( = 7 nodes).
test : Maximum run duration of 2 hr. Maximum number of available cores for a test run is 24 ( = 1 node).
The test partition is reserved only for running short tests (e.g., performance-related checks). Test jobs will always run on node mcc32.

Based on the above characteristics of the partitions, you should choose the partition that is right for your job and define it in your submission script.
As an example: your run needs 3 hours to complete when running on 48 cores. The relevant lines in the submission script should be altered to:

#SBATCH -N 2
#SBATCH --time=03:00:00
#SBATCH --partition=medium
mpirun -np 48 your_executable


Submitting a Job

Now that your submission script is in order (let's say you've named it "submission_script"), you can submit the job by doing:

sbatch submission_script


Submission Scripts for McStas and MCNPX

The submission scripts provided here are identical to the one given above, with one difference: the actual "run" command. Both MCNPX and McStas require a few more parameters added to the mpirun command, other than -np and the name of the executable to run.

        --MCNPX
When running mcnpx, one needs to specify an input file and an output file, something like this:

mpirun -np 36 mcnpx i=input n=output
Here's an example submission script for MCNPX.
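For illustration, here is a sketch of what such a script might look like (the job name, node/core counts, time limit and file names are placeholders):

#!/bin/bash
#SBATCH -J mcnpx_run            # job name (placeholder)
#SBATCH -N 2                    # 2 nodes = 48 cores
#SBATCH --time=24:00:00         # adjust to the length of your run
#SBATCH --partition=medium      # medium allows up to 2 days and 144 cores

# i= specifies the MCNPX input file and n= the output, as described above
mpirun -np 48 mcnpx i=input n=output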


        --McStas
In order to run your something.instr file in McStas, you need to create the executable first, which is usually named something.out.
As a first step, you need to translate your instrument file into C,
a)
mcstas -I /full/path/of/directory/with/instrument/file -t -o something.c something.instr

and then create the executable of the C code:
b)
mpicc -O2 -w -ax -o something.out something.c -lm -DUSE_MPI

You can now use the mpirun command to start the run:
c)
mpirun -np 24 something.out --ncount=1000000 lambda_min=3 lambda_max=10

where --ncount is the number of neutrons you want to use, and in place of lambda_min=3 lambda_max=10 you should write down all the parameters that your instrument file has.
Note: It is up to you to decide whether steps a) and b) above are included in your submission script or not (for example, you may already have the .out file from a previous compilation, so there is no need to create it again). If not, you can simply execute those two steps from within your log-in shell before submitting your submission script, which should then only include step c). The example script provided below contains all three steps for completeness.
Here's an example submission script for McStas.

IMPORTANT: If you are submitting McStas jobs as an l_mc01 user, make sure to use the full path for the mpirun command (/afs/psi.ch/project/sinq/sl6-64/bin/mpirun) in your submission script (as is done in the example file given above). This will save you from serious trouble.
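For completeness, a sketch of a McStas submission script containing all three steps might look like this (the instrument name 'something', the directory path, the core count, the time limit and the instrument parameters are placeholders; l_mc01 users should replace mpirun with the full path given above):

#!/bin/bash
#SBATCH -J mcstas_run           # job name (placeholder)
#SBATCH -N 1                    # 1 node = 24 cores
#SBATCH --time=02:00:00         # expected duration in hh:mm:ss
#SBATCH --partition=medium      # see the partition descriptions above

# a) translate the instrument file into C (omit if something.c already exists)
mcstas -I /full/path/of/directory/with/instrument/file -t -o something.c something.instr

# b) compile the C code into an MPI-enabled executable
mpicc -O2 -w -ax -o something.out something.c -lm -DUSE_MPI

# c) run on 24 cores; replace the parameters with those of your instrument file
mpirun -np 24 something.out --ncount=1000000 lambda_min=3 lambda_max=10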


Monitoring Your Jobs and the Cluster

In order to monitor the status and progress of your jobs, here is a list of commands that you can use in your log-in shell (a short example session follows the list):
  • sinfo
  • Information on the available partitions
  • squeue
  • Gives a list of all jobs currently running or waiting to run. By using squeue -u your_username you can get a list that will include only your jobs.
  • scontrol
  • scontrol will give you a prompt that will look like:
    scontrol:


    --Typing show jobs will give a detailed list of all jobs. Typing show job id instead (where id is the ID number of the job) will give you detailed information on a specific job.
    --Typing show partitions will give a detailed description of the available partitions (more detail than sinfo gives).
    --Type exit or quit to exit scontrol.
  • scancel ID
  • will cancel the specified job, whether it's already running or pending.
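For example, a short monitoring session might look like this (the username and the job ID 1234 are placeholders):

squeue -u l_yourlastname        # list only your own jobs
scontrol show job 1234          # detailed information on job 1234 (scontrol also accepts commands directly)
scancel 1234                    # cancel job 1234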

You can also get a graphical view of the cluster status (present and past) by launching a web browser after having logged in to mcc10 and visiting: http://localhost/ganglia

FAQs