Starting and Managing Jobs
Monday, November 23, 2009

Scheduling and managing jobs on Saguaro

Saguaro uses the Moab scheduler on top of the Torque resource manager to handle its users' workload. The cluster is currently configured with three main queues for user jobs.

Queues

Serial: for single- or dual-processor jobs that only need to run on a single node.
Medium: for jobs that require 3 to 128 processors.
Large: for jobs that require more than 128 processors.
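
The queue limits above can be sketched as a small shell function. This is only an illustration: pick_queue is a hypothetical helper, and the lowercase names "medium" and "large" are an assumption based on the -q "serial" form used in this guide. Confirm the real names with qstat -q.

```shell
# Sketch: map a processor count to a queue name, following the limits
# listed in this guide. pick_queue is a hypothetical helper, and the
# lowercase queue names for medium and large are assumed, not confirmed.
pick_queue() {
    procs=$1
    case $procs in
        ''|*[!0-9]*)
            echo "pick_queue: expected a positive integer" >&2
            return 1
            ;;
    esac
    if [ "$procs" -lt 1 ]; then
        echo "pick_queue: expected a positive integer" >&2
        return 1
    elif [ "$procs" -le 2 ]; then
        echo serial     # 1 or 2 processors; the job must also fit on one node
    elif [ "$procs" -le 128 ]; then
        echo medium     # 3 to 128 processors
    else
        echo large      # more than 128 processors
    fi
}

pick_queue 64    # prints: medium
```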
To see all available* queues, use the command:
qstat -q
*Please note that you may not have access to all queues.

Here is a simple guide to interfacing with the scheduler. To submit jobs to the cluster for scheduling, it is best to use the command:
qsub (options) (job_script)
The options can be included in the job script or given on the command line. A simple job script could look like:
	#PBS -N Sleep_test
	#PBS -q "serial"
	#PBS -l nodes=1:ppn=1
	#PBS -l advres=bmaxwell.525
	#PBS -l walltime=00:10:00

	sleep 600

In this simple example, the job script asks for one node and one processor on that node (ppn, processors per node), using an advance reservation named bmaxwell.525 (reservations are set up by request), with a wall time of ten minutes. The -N option sets the name of the job, and -q specifies the queue name.
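
As a rough sketch, a job script like the one above can also be generated from a few parameters. Both helper names below (walltime_from_minutes and make_job_script) are hypothetical, and the advance reservation line is left out because reservations exist only by request. The example asks for 15 minutes for a 10-minute sleep, on the assumption that a job which reaches its wall time limit is stopped by the scheduler, so some headroom is useful.

```shell
# Hypothetical helper: convert whole minutes to the HH:MM:SS form
# expected by "-l walltime".
walltime_from_minutes() {
    printf '%02d:%02d:00\n' $(( $1 / 60 )) $(( $1 % 60 ))
}

# Hypothetical helper: print a single-node job script to stdout.
# Arguments: job name, queue, processors per node, minutes, command.
make_job_script() {
    cat <<EOF
#PBS -N $1
#PBS -q $2
#PBS -l nodes=1:ppn=$3
#PBS -l walltime=$(walltime_from_minutes "$4")

$5
EOF
}

# A 10-minute sleep with a 15-minute limit, leaving some headroom.
make_job_script Sleep_test serial 1 15 "sleep 600"
```

The output can be redirected to a file, for example sleep_test.pbs, and that file passed to qsub.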

Once a job has been submitted to the scheduler, there are a few ways to keep track of what's going on. To see jobs that are running, pending, or blocked, use the command:
qstat
or
showq -u
Both will show the jobs that you have submitted to the cluster. To see all jobs that have been submitted, use the command:
showq
Used by itself, it shows the jobs of all users. If your job is blocked or has been pending for a long time, type the command:
checkjob -v (job_id)
or
qstat -f (job_id)
This gives very verbose output, including the reasons why the job cannot currently start. If your job is pending and you would like to see when it might start, or at least when the scheduler estimates it will start, use the command:
showstart (job_id)
If you decide that you want to stop your job immediately, use the command:
mjobctl -c (job_id)
or
qdel (job_id)
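
The commands in this guide can be combined into a small submit-and-check routine. The sketch below is illustrative only: job_number and submit_and_report are hypothetical helper names, and it assumes that qsub prints an identifier of the form number.server and that the Moab commands accept the numeric part.

```shell
# qsub prints the new job's identifier, assumed here to have the form
# <number>.<server>. This hypothetical helper keeps only the number.
job_number() {
    echo "${1%%.*}"
}

# Hypothetical wrapper: submit a job script, then show its status and
# the scheduler's start estimate. Needs qsub, checkjob and showstart
# on the PATH, so it only works on the cluster itself.
submit_and_report() {
    job_id=$(qsub "$1") || return 1
    echo "submitted: $job_id"
    checkjob "$(job_number "$job_id")"
    showstart "$(job_number "$job_id")"
}
```

On a login node this could be called as submit_and_report sleep_test.pbs, where sleep_test.pbs stands for any job script. The job can later be removed with qdel and the printed identifier.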

HPC
Goldwater Center, 650 E Tyler St
Tempe, AZ 85287-5206
hpc@asu.edu
480.727.0536

Copyright © 2000 - 2006 Arizona State University Ira A. Fulton School of Engineering