Examples

1. Simple 'hostname' job

Setting up some folders to perform experiments

We'll set up a root directory called 'examples', where we'll store all the input files, scripts and results for our tests. Inside it, each example gets its own subdirectory; this one is called 'hostname'.

ijimenez@login:~/examples$ mkdir hostname
ijimenez@login:~/examples$ cd hostname
ijimenez@login:~/examples/hostname$

The easiest example: where do you run?

To illustrate that we don't know in advance which node is going to execute our tasks, we'll write a little script that reports the hostname of the node it runs on. The script simply echoes the environment variable $HOSTNAME. The hostname.sh file has only two lines:

ijimenez@login:~/examples/hostname$ vi hostname.sh
#!/bin/bash
echo "Hello, this is your script, running on $HOSTNAME"

And now we'll set up the job file for hostname.sh. The options are explained in the comments of the file:

ijimenez@login:~/examples/hostname$ vi hostname.sub
#!/bin/bash
##########################################
# Options and parameters for SGE:
##########################################
# (1) Name of the job to identify (flag -N)
# The parameter passed to -N is an alias to the script
# -----------------------------------------
#$ -N The-easiest-job
#
# (2) We'll redirect the output files to
# our working directory (flags -cwd, -e, -o)
# ---------------------------------------
#$ -cwd
#$ -o __hostname.out
#$ -e __hostname.err
#
# (3) Finally, we call the script
# ------------------------------
sh hostname.sh

And now we simply submit the job, passing the .sub file to the scheduler:

ijimenez@login:~/examples/hostname$ qsub hostname.sub
Your job 153605 ("The-easiest-job") has been submitted
ijimenez@login:~/examples/hostname$ 

Finally, we check the contents of the output file:

    ijimenez@login:~/examples/hostname$ more __hostname.out 
    Hello, this is your script, running on node09
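
Since the job file calls the script with sh, it may be worth not relying on $HOSTNAME being set by the shell. A minimal sketch of a more defensive hostname.sh, assuming the standard uname utility is available on the nodes (the report_host function name is ours, not part of SGE):

```shell
#!/bin/bash
# Sketch: report the node name, falling back to `uname -n`
# when $HOSTNAME is unset or empty.
# report_host is an illustrative name, not an SGE command.
report_host() {
    local host="${HOSTNAME:-$(uname -n)}"
    echo "Hello, this is your script, running on ${host}"
}

report_host
```

The message is the same as before; only the way the node name is obtained changes.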

2. Mandelbrot with Matlab

In this example, we'll create a Mandelbrot fractal with Matlab. First, we'll test our code interactively in a qrsh session, and then we'll create the Matlab and SGE scripts to run the process in batch.

1. Interactive session:

Xshell:\> ssh -X ijimenez@hpc.dtic.upf.edu
Looking up host 'hpc.dtic.upf.edu'...
Host 'hpc.dtic.upf.edu' resolved to 193.145.51.37.
Connecting to 193.145.51.37:22...
Connection established.
To escape to local shell, press 'Ctrl+Alt+]'.
Linux login 3.2.0-4-amd64 #1 SMP Debian 3.2.46-1 x86_64
[ ... ]
ijimenez@login:~$ qrsh
Linux node02 3.2.0-4-amd64 #1 SMP Debian 3.2.51-1 x86_64
[ ... ]
Last login: Thu Dec 12 13:26:11 2013 from 192.168.7.6
ijimenez@node02:~$ matlab & 

This is the source for the Mandelbrot set we're going to create: http://www.mathworks.com/help/distcomp/examples/illustrating-three-approaches-to-gpu-computing-the-mandelbrot-set.html

The Mandelbrot code depends on four variables set at the beginning, which control the number of iterations, the grid resolution and the region of the complex plane to explore:

>> maxIterations = 500;
gridSize = 1000;
xlim = [-0.748766713922161, -0.748766707771757];
ylim = [ 0.123640844894862,  0.123640851045266];

The full Matlab code is shown on the next figure:

If we execute the code, we'll obtain this figure:

 

2. Mandelbrot set in serial batch: 

Let's create the usual directory structure:

ijimenez@node04:~/Matlab$ mkdir Mandelbrot
ijimenez@node04:~/Matlab$ cd Mandelbrot
ijimenez@node04:~/Matlab/Mandelbrot$ mkdir data
ijimenez@node04:~/Matlab/Mandelbrot$ mkdir script
ijimenez@node04:~/Matlab/Mandelbrot$ mkdir job-out
ijimenez@node04:~/Matlab/Mandelbrot$ mkdir out
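
The same layout can be created in one go. A small sketch, where make_experiment_dirs is a hypothetical helper name of our own:

```shell
#!/bin/bash
# Sketch: create the usual experiment layout under a given root directory.
# make_experiment_dirs is an illustrative name; the subdirectory names
# are the ones used in this example.
make_experiment_dirs() {
    local root="$1" sub
    for sub in data script job-out out; do
        # -p creates missing parents and does not fail if the directory exists
        mkdir -p "$root/$sub" || return 1
    done
}

# Typical use:
#   make_experiment_dirs ~/Matlab/Mandelbrot
```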

Now let's create the Matlab script:

ijimenez@node04:~/Matlab/Mandelbrot$ cd script 
ijimenez@node04:~/Matlab/Mandelbrot/script$ matlab

We can write it using Matlab's own editor. We'll modify the display part of the code, since in batch mode we are not going to draw a figure on screen but write an image file instead:

% Show
cpuTime = toc( t );
%set( gcf, 'Position', [200 200 600 600] );
% imagesc( x, y, count );
% axis image
% colormap( [jet();flipud( jet() );0 0 0] )
% title( sprintf( '%1.2fsecs (without GPU)', cpuTime ) );
imwrite((count/max(count(:)))*255,jet,'/scratch/Mandelbrot/out/mandelbrot.png');

The full code of our simulation is shown below:

Some notes on the script:

a) We measure the elapsed time with the tic and toc commands.

b) The 'imwrite' command takes three arguments: the image matrix, the colormap and the file where we want to write the data. We rescale every element of the matrix to the range 0-255 to 'colorize' the value, and we use Matlab's built-in colormap 'jet'.

c) We're going to write the output file to /scratch/Mandelbrot/out rather than to $HOME, and we'll move the data back with the submission script:

#!/bin/bash
 
# Load modules directive
. /etc/profile.d/modules.sh
 
# Copy sources to the SSD:
#
 
# First, make sure to delete previous versions of the sources:
# -----------------------------------------------------------
if [ -d /scratch/Mandelbrot ]; then
        rm -Rf /scratch/Mandelbrot
fi
 
# Second, replicate the structure of the experiment's folder:
# -----------------------------------------------------------
mkdir /scratch/Mandelbrot
mkdir /scratch/Mandelbrot/data
mkdir /scratch/Mandelbrot/error
mkdir /scratch/Mandelbrot/script
mkdir /scratch/Mandelbrot/out
 
# Third, copy the experiment's data:
# ----------------------------------
cp -rp /homedtic/ijimenez/Matlab/Mandelbrot/data/* /scratch/Mandelbrot/data
cp -rp /homedtic/ijimenez/Matlab/Mandelbrot/script/* /scratch/Mandelbrot/script
 
# Fourth, prepare the submission parameters:
# Remember SGE options are marked up with '#$':
# ---------------------------------------------
# Requested resources:
#
# Simulation name
# ----------------
#$ -N "Mandelbrot"
#
# Expected walltime: ten minutes maximum
# ---------------------------------------
#$ -l h_rt=00:10:00
#
# Shell
# -----
#$ -S /bin/bash
#
# Output and error files go on the user's home:
# -------------------------------------------------
#$ -o /homedtic/ijimenez/Matlab/Mandelbrot/job-out/Mandelbrot.out
#$ -e /homedtic/ijimenez/Matlab/Mandelbrot/job-out/Mandelbrot.err
#
# Send me a mail when the job begins, ends or aborts:
# ------------------------------------------------
#$ -m bea
#$ -M my.email@upf.edu
# Start script
# --------------------------------
#
# Print some informational data:
# ------------------------------
printf "Starting execution of job Mandelbrot with ID: $JOB_ID from user: $SGE_O_LOGNAME\n"
printf "Starting at `date`\n"
printf "Calling Matlab now\n"
printf "_____________________\n"
#
# Execute the Matlab script
# -------------------------
/soft/MATLAB/R2013b/bin/matlab -nosplash -nojvm -nodesktop -r "run /scratch/Mandelbrot/script/Mandelbrot.m"
#
# Copy data back, if any
# ----------------------
printf "Matlab processing done. Moving data back\n"
cp -rf /scratch/Mandelbrot/out/* /homedtic/ijimenez/Matlab/Mandelbrot/out
printf "_________________\n"
#
# Clean up:
# ---------
printf "Removing local scratch directories...\n"
if [ -d /scratch/Mandelbrot ]; then
        rm -Rf /scratch/Mandelbrot
fi
printf "Job done. Ending at `date`\n"
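
The script above uses a fixed /scratch/Mandelbrot directory, so two jobs landing on the same node would overwrite each other's files. A hedged sketch of one way around this, assuming a per-job directory named after $JOB_ID and a trap for cleanup. stage_and_run and its arguments are hypothetical names, and the command passed in stands in for the Matlab call:

```shell
#!/bin/bash
# Sketch: run a command inside a per-job scratch directory and copy results back.
# stage_and_run, its arguments and the "manual" fallback are illustrative names,
# not part of SGE. $JOB_ID is set by SGE inside a running job.
stage_and_run() {
    local src_dir="$1" scratch_root="$2"
    shift 2
    local work_dir="${scratch_root}/Mandelbrot.${JOB_ID:-manual}"
    (
        # Remove the scratch copy whenever this subshell exits
        trap 'rm -rf "$work_dir"' EXIT
        mkdir -p "$work_dir/script" "$work_dir/out" || exit 1
        # "dir/." copies the directory contents, even when it is empty
        cp -rp "$src_dir/script/." "$work_dir/script/" || exit 1
        # Run the given command (e.g. the Matlab call) from the scratch directory
        (cd "$work_dir" && "$@")
        status=$?
        # Copy results back to the source tree, then report the command's status
        mkdir -p "$src_dir/out" && cp -rf "$work_dir/out/." "$src_dir/out/"
        exit "$status"
    )
}

# Typical use inside a job file (paths as in this example):
#   stage_and_run /homedtic/ijimenez/Matlab/Mandelbrot /scratch some_command
```

Because the trap is set inside the subshell, the scratch directory is removed however that subshell exits, including when the copy step fails.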

That's it! Submit the job:

ijimenez@login:~/Matlab/Mandelbrot$ qsub mandelbrot.sub 

and wait for it to end:

ijimenez@login:~/Matlab/Mandelbrot$ qstat
job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID 
-----------------------------------------------------------------------------------------------------------------
 154071 0.47500 Mandelbrot ijimenez     r     01/07/2014 11:47:59 short.q@node06                     1  
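
To check on one job from a script, the state column of the qstat listing can be picked out with awk. A minimal sketch, assuming the column layout shown above (job ID in the first column, state in the fifth); job_state is a hypothetical helper name:

```shell
#!/bin/bash
# Sketch: read a qstat listing on standard input and print the state
# of the job whose ID is given as the first argument.
# Prints nothing when the job is not listed.
# job_state is an illustrative name, not an SGE command.
job_state() {
    awk -v id="$1" '$1 == id { print $5 }'
}

# Typical use on the cluster (requires qstat):
#   qstat | job_state 154071
```

In the listing, r means the job is running and qw means it is queued and waiting. Once the job has finished, it no longer appears and the helper prints nothing.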

Once it finishes, we'll find the job log, including the elapsed time, in ./job-out/Mandelbrot.out:

Starting execution of job Mandelbrot with ID: 154071 from user: ijimenez
Starting at Tue Jan  7 11:47:59 CET 2014
Calling Matlab now
_____________________
Warning: No display specified.  You will not be able to display graphics on the screen.
Warning: No window system found.  Java option 'MWT' ignored.
 
                            < M A T L A B (R) >
                  Copyright 1984-2013 The MathWorks, Inc.
                    R2013b (8.2.0.701) 64-bit (glnxa64)
                              August 13, 2013
 
 
To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.
 
Elapsed time is 109.338394 seconds.
Matlab processing done. Moving data back.
_________________
Removing local scratch directories...
Job done. Ending at Tue Jan  7 11:49:53 CET 2014

and the output image file will be in ./out/mandelbrot.png. We can use the gpicview utility to view the file:

ijimenez@login:~/Matlab/Mandelbrot/out$ gpicview mandelbrot.png 

 

 

3. Monte Carlo with Matlab and MDCS

An illustrative example of configuring and running jobs on Matlab Distributed Computing Server (MDCS).

MDCS is already installed in the cluster. You only need to configure Matlab on your workstation in order to run Matlab code on the cluster.

The example runs a Monte Carlo simulation first on the workstation, then on the cluster.

 
 
 
 
Running Monte Carlo on the local Matlab (workstation)

Configuring and running Monte Carlo using MDCS
 
 
1- Download the archive from the cluster's /SOFT/Matlab/MDCS/ directory: snow.remote.r2013b.zip or snow.remote.r2014b.zip (depending on your Matlab version) for Windows systems, or snow.remote.r2013b.tar or snow.remote.r2014b.tar for Linux/Mac systems. You can download it to your workstation using any SFTP client.

2- Uncompress the download and place its contents in "MATLABROOT/toolbox/local", where MATLABROOT refers to the local Matlab installation directory.

Refer to the following table to find the default Matlab directory.

Operating system    Matlab directory
Windows             C:\Program Files\MATLAB\R20xx\
Linux               /usr/local/MATLAB/R20xx/
Mac                 /Applications/Matlab/R20xx/

R20xx stands for your Matlab release (for example, R2013b).

The download will contain four files and one folder as shown in the image.

 

3- Start Matlab on the local machine and configure it to submit jobs directly to the cluster by calling configCluster, then set your cluster user name.

You only have to do this once per Matlab version.

 

Depending on the job type, the submission method differs slightly. Basically, there are two types of jobs:

  • Serial: The job is atomic (uses only one core) and is not parallelized.

  • Communicating: The job is split internally into parallel tasks that communicate with each other.

Additionally, you have to set up a secure communication channel between your workstation and the cluster, to be able to send your scripts and source files over the network and to retrieve the results.

4- Submit jobs to the cluster (batch command).

Serial Job handling   

 

c = parcluster;

j = c.batch(@filename, 1, {});
 

 

The first time you submit a job, you will be prompted to select the cluster authentication method. You can either use your cluster password or set up password-less authentication.

If you decide to use your cluster password, press “No” in the User Credentials box, then enter your password in the dialog box that appears.

Otherwise, if you decide to use password-less authentication, select “Yes” in the User Credentials box, then indicate the directory where the private key is stored.

You are also asked whether the key has a passphrase. Select “Yes” and enter the passphrase if one was provided when generating the key pair; otherwise select “No”.

 
 

Parallel job handling  

           c = parcluster;

           j = c.batch(@filename, 1, {}, 'matlabpool', number_workers);

The procedure is the same as for serial jobs, with two main differences:

  • The for loop in the code has to be replaced with a parfor loop.

  • The 'matlabpool' option has to be added to the batch call, with the number of workers needed.

The 'matlabpool' option marks the job as parallel. number_workers cannot be higher than 31: the cluster limits a job to 32 workers, and one of them is used to run the batch function itself, leaving at most 31 for the pool.

 
 
 

Recover results

 

fetchOutputs is the function used to recover the result of a submitted job.

 

j.fetchOutputs{:}


 

Other important functions are submit, wait, diary and load.

 

Configure jobs

 

Additional information can be specified before submitting a job.


- Email Notification (when the job is running, exiting, or aborting).

     ClusterInfo.setEmailAddress('noko@upf.edu')


- Memory Usage.

     ClusterInfo.setMemUsage('2G')


- Queue name.

     ClusterInfo.setQueueName('default.q')


- Walltime.

     ClusterInfo.setWallTime('00:30:00')

 
 

4. Password-less connection

UNIX/Linux.

  • Generate a private/public key pair on your local machine with the ssh-keygen command. Do not enter any passphrase, just press Enter.

  nicul@pc1nico:~$ ssh-keygen -t rsa
  Enter passphrase (empty for no passphrase):
  Enter same passphrase again:
  Your identification has been saved in /homedtic/noko/.ssh/id_rsa.
  Your public key has been saved in /homedtic/noko/.ssh/id_rsa.pub.


 

  • Copy the public key to the cluster with ssh-copy-id, which appends it to your authorized_keys file there.

 nicul@pc1nico:~/.ssh$ ssh-copy-id -i id_rsa.pub user@hpc.dtic.upf.edu

 

  • Connect to the cluster. You should no longer be asked for a password.

 

  ssh hpc.dtic.upf.edu

 

Windows

  • Generate a private/public key pair with the ssh-keygen command, just as you did for Linux, but on the cluster rather than on your local machine.

 

 noko@login:~$ ssh-keygen -t rsa

 

  • Append the contents of the public key to the authorized_keys file in the /homedtic/user/.ssh/ directory of the cluster.

 noko@login:~/.ssh$ cat id_rsa.pub >> /homedtic/noko/.ssh/authorized_keys

 

  • If the authorized_keys file does not exist, create it.

 

 noko@login:~/.ssh$ vi authorized_keys

 

  • Copy the private key to any directory of your choice on the local machine. This is the directory you will point Matlab to if you select the password-less authentication method.
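
The step that appends the public key to authorized_keys can also be scripted so that it is safe to repeat. A minimal sketch, assuming the conventional restrictive permissions for SSH files (directory 700, file 600); install_pubkey is a hypothetical helper name:

```shell
#!/bin/bash
# Sketch: append a public key to an authorized_keys file only if it is
# not already there, creating the directory and file when missing.
# install_pubkey is an illustrative name, not an OpenSSH command.
install_pubkey() {
    local pub_file="$1" auth_file="$2"
    local ssh_dir
    ssh_dir=$(dirname "$auth_file")
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" || return 1
    touch "$auth_file" && chmod 600 "$auth_file" || return 1
    # -F: fixed string, -x: match the whole line, -q: no output
    if ! grep -qxF -- "$(cat "$pub_file")" "$auth_file"; then
        cat "$pub_file" >> "$auth_file"
    fi
}

# Typical use on the cluster:
#   install_pubkey ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
```

Running it twice with the same key leaves a single entry in the file.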

5. Performance AMD and Intel

We use a simple Matlab function to compare the performance of the two processor models available in the cluster.

Processor    Sockets    Cores    Speed (GHz)    Year    Technology
Intel        2          4        2.0            2009    Dedicated
AMD          4          16       2.4            2012    Shared

 

The simulation code simply calculates the signal power received by a user at some distance from the antenna. User and antenna positions are randomly chosen from two sets of randomly generated values.

We ran the simulation on both types of processor queue and found that the Intel processors are slightly faster than the AMD processors: they complete the simulation in less time.

 

Conclusion

Despite the difference in manufacturing year, both processors perform well. For floating-point work that needs a single core, the Intel nodes are advisable. However, for tasks that need several cores, the AMD nodes are preferable, because the Intel nodes have few cores, which are quickly exhausted.