4.6. Submitting CUDA jobs

In this section, we submit a basic job to the CUDA queue using the qsub command.

CUDA is a parallel computing platform and programming model invented by NVIDIA. It enables dramatic increases in computing performance by harnessing the power of the Graphics Processing Unit (GPU).

First, we create the job script as explained before: all Grid Engine options are preceded by the string #$, and all other lines in the file are executed by the shell (the default shell is /bin/bash):

#!/bin/bash

#$ -N CudaSample
#$ -l gpu=1
#$ -q default.q
#$ -e $HOME/logs/$JOB_NAME-$JOB_ID.err
#$ -o $HOME/logs/$JOB_NAME-$JOB_ID.out
module load cuda/7.5
/soft/cuda/NVIDIA_CUDA-7.5_Samples/0_Simple/vectorAdd/vectorAdd

IMPORTANT: This sample qsub script requests a CUDA resource (gpu=1), which means your job will execute on a single CUDA card (all CUDA cores of that card will be available to your job). Please do not request more than one GPU: each CUDA host owns only one physical CUDA card, so a job requesting more will wait in the queue forever.

Script options:

-N Job name; in this script the job is called CudaSample.

-l Resource request; gpu=1 asks for one CUDA card.

-q Queue to submit to; in this script it is set to default.q.

-e Path where the standard error file is created ($HOME/logs).

-o Path where the standard output file is created ($HOME/logs).


To send the job through the CUDA queue, it is necessary to load the cuda/7.5 module:

module load cuda/7.5

The script then executes a precompiled sample binary:

/soft/cuda/NVIDIA_CUDA-7.5_Samples/0_Simple/vectorAdd/vectorAdd

If you want to run some tests, you can find other samples in the following path:

/soft/cuda/NVIDIA_CUDA-7.5_Samples/

Once the file is saved, we create the log folder referenced by the -e and -o options:
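For example, the directory used by the -e and -o options in the script above can be created as follows (adjust the path if you changed those options):

```shell
# Create the directory referenced by -e and -o in the job script;
# -p makes this a no-op if the directory already exists.
mkdir -p "$HOME/logs"
```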

Finally, we launch the job with qsub and monitor it with the qstat command:
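A typical session might look like this (a sketch: the script filename CudaSample.sh is an assumption, and the commands require a Grid Engine installation):

```shell
# Submit the job script; Grid Engine replies with the assigned job id
qsub CudaSample.sh

# List our jobs: state 'qw' means waiting in the queue, 'r' means running;
# an empty listing means the job has already finished
qstat -u "$USER"
```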

In this case, the job ran successfully; you can check the output directory to see the results.
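For example, with the -o path used above, the job's standard output can be inspected as follows (the job id 1234 is a placeholder; replace it with the id reported by qsub):

```shell
# Print the job's standard output file written by the -o option
cat "$HOME/logs/CudaSample-1234.out"
```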