Parallel Processing

Submitting big jobs on Summit

The real power of supercomputing comes from parallel processing, which means using many cores for each job. By requesting more cores, we can split a job across multiple CPUs.

To use multiple cores, you need to use software and programs that are designed for parallel processing.

For example, I used the program bowtie to align some short sequence reads to the genome.

Not all software is designed to run on multiple cores. Look for options such as -p, -n, or -T, and for terms such as "parallel", "multiple cores", or "multiple threads" in the program's documentation.

To use multiple cores, I matched --ntasks with the number of threads my program used.
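One quick way to check is to search a program's help text for thread-related options. The sketch below is a rough filter, not a definitive test: the helper name find_thread_options is hypothetical, and it assumes the program prints its usage when called with --help.

```shell
#!/usr/bin/bash
# Hypothetical helper: print the lines of a program's help text that
# mention threads, cores, or parallel execution.
# Assumes the program prints its usage when called with --help.
find_thread_options() {
    local program="$1"
    # Some programs print help to stderr, so merge both streams.
    # "|| true" keeps the exit status at 0 when nothing matches.
    "$program" --help 2>&1 | grep -i -E 'thread|parallel|cores' || true
}

# Example (only works where bowtie is installed):
#   find_thread_options bowtie
```

If the helper prints nothing, the program may still support threads under a different name, so it is worth reading the manual as well.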

###################
#!/usr/bin/bash
#SBATCH --job-name=bowtie
#SBATCH --nodes=1
#SBATCH --ntasks=4

# Run bowtie with 4 threads (-p 4), matching --ntasks above
bowtie -p 4 /pathToIndex/ce11 seqFile1.fastq outFileName.sam

###################

In the example above, the value of --ntasks matches the number after -p in the bowtie command.
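One way to keep the two numbers in sync is to read the thread count from the environment instead of typing it twice. The sketch below reuses the bowtie command from the example above and assumes the job is submitted through Slurm, which sets the SLURM_NTASKS variable when --ntasks is given. The threads variable and the check for bowtie on the PATH are additions for illustration.

```shell
#!/usr/bin/bash
#SBATCH --job-name=bowtie
#SBATCH --nodes=1
#SBATCH --ntasks=4

# Slurm sets SLURM_NTASKS inside the job when --ntasks is given.
# Fall back to 1 thread if the script is run outside of Slurm.
threads="${SLURM_NTASKS:-1}"

# Same bowtie command as before, with -p taken from the job header.
if command -v bowtie >/dev/null 2>&1; then
    bowtie -p "${threads}" /pathToIndex/ce11 seqFile1.fastq outFileName.sam
else
    echo "bowtie not found on PATH; would have used ${threads} threads"
fi
```

With this pattern, changing --ntasks in the header is enough; the -p value follows automatically.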

By using multiple cores, you can really speed up your jobs, do many jobs in parallel, and make your computing much more efficient.

This is what happened when I used 1 - 24 cores for the bowtie command.

:!: Quick Tip: Match --ntasks in your #SBATCH header to the number of threads or cores you request in the command part of your script!

:!: Quick Tip: You can't request more than 24 cores on a single Haswell node. Therefore, ntasks divided by nodes should never exceed 24 on the shas partition.
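The arithmetic behind this tip can be checked before submitting a job. The sketch below is a minimal example; the helper name check_tasks_per_node and the MAX_CORES_PER_NODE variable are hypothetical, and the limit of 24 is taken from the tip above.

```shell
#!/usr/bin/bash
# Hypothetical helper: check that ntasks / nodes does not exceed the
# 24 cores available on a single Haswell (shas) node.
MAX_CORES_PER_NODE=24

check_tasks_per_node() {
    local ntasks="$1"
    local nodes="$2"
    # Round up so that an uneven split is counted against the fullest node.
    local per_node=$(( (ntasks + nodes - 1) / nodes ))
    if [ "$per_node" -le "$MAX_CORES_PER_NODE" ]; then
        echo "OK: ${per_node} tasks per node"
    else
        echo "Too many: ${per_node} tasks per node (limit ${MAX_CORES_PER_NODE})"
        return 1
    fi
}

check_tasks_per_node 24 1   # 24 tasks on one node
check_tasks_per_node 48 2   # 24 tasks on each of 2 nodes
```

A request such as 25 tasks on 1 node makes the helper return a non-zero status, which signals that the request needs more nodes or fewer tasks.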

Upcoming Summit Workshop

“Getting Started on Summit” Workshop
Sept 26-27

wiki/supercomputingdemo.txt · Last modified: 2018/08/30 09:38 by erin