PBS Queue Commands
on Thunder
Table of Contents
1. Introduction
2. Anatomy of a Batch Script
2.1. Specify Your Shell
2.2. Required PBS Directives
2.2.1. Number of Nodes and Processes Per Node
2.2.2. How Long to Run
2.2.3. Which Queue to Run In
2.2.4. Your Project ID
2.3. The Execution Block
3. Submitting Your Job
4. Simple Batch Script Example
5. Job Management Commands
6. Optional PBS Directives
6.1. Job Identification Directives
6.1.1. Application Name
6.1.2. Job Name
6.2. Job Environment Directives
6.2.1. Interactive Batch Shell
6.2.2. Export All Variables
6.2.3. Export Specific Variables
6.3. Reporting Directives
6.3.1. Redirecting Stdout and Stderr
6.3.2. Setting up Email Alerts
6.4. Job Dependency Directives
7. Environment Variables
7.1. PBS Environment Variables
7.2. Other Important Environment Variables
8. Example Scripts
8.1. MPI Script
8.2. MPI Script (accessing more memory per process)
8.3. OpenMP Script
8.4. SHMEM Script
8.5. Hybrid MPI/OpenMP Script
1. Introduction
On large-scale computers, many users must share available resources. Because of this, you cannot just
log on to one of these systems, upload your programs, and start running them. Essentially, your programs
(called batch jobs) have to "get in line" and wait their turn. And, there is more than one of these lines
(called queues) from which to choose. Some queues have a higher priority than others (like the express
checkout at the grocery store). The queues available to you are determined by the projects that you are
involved with.
The jobs in the queues are managed and controlled by a batch queuing system, without which, users could
overload systems, resulting in tremendous performance degradation. The queuing system will run your job
as soon as it can while still honoring the following:
- Meeting your resource requests
- Not overloading systems
- Running higher-priority jobs first
- Maximizing overall throughput
At AFRL, we use the PBS Professional queuing system. The PBS module should be loaded automatically for
you at login, allowing you access to the PBS commands.
2. Anatomy of a Batch Script
A batch script is simply a small text file that can be created with a text editor such as vi or notepad.
You may create your own from scratch, or start with one of the sample batch scripts available in
$SAMPLES_HOME. Although the specifics of a batch script will differ slightly from system to system, a
basic set of components is always required, and a few components are just always good ideas. The basic
components of a batch script must appear in the following order:
- Specify Your Shell
- Required PBS Directives
- The Execution Block
IMPORTANT: Not all applications on Linux systems can read DOS-formatted text files. PBS does not handle
^M characters well, nor do some compilers. To avoid complications, please remember to convert all
DOS-formatted ASCII text files with the dos2unix utility before use on any HPC system. Users are also
cautioned against relying on ASCII transfer mode to strip these characters, as some file transfer tools
do not perform this function.
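The check described above can be sketched as a short shell snippet. This is only an illustration: the file name run.pbs is hypothetical, the sample file is created in a temporary directory purely for demonstration, and tr is used as a fallback on the assumption that dos2unix may not be installed where you edit your scripts.

```shell
#!/bin/bash
# Sketch: detect and strip DOS carriage returns (^M) from a batch script.
# The file name run.pbs is hypothetical; a temporary copy is used here.

workdir=$(mktemp -d)
script="${workdir}/run.pbs"

# Create a sample script with DOS (CRLF) line endings for demonstration.
printf '#!/bin/bash\r\n#PBS -q debug\r\necho hello\r\n' > "${script}"

# A literal carriage return character, built portably.
cr=$(printf '\r')

# has_cr FILE: succeed if FILE contains at least one carriage return.
has_cr() {
    grep -q "${cr}" "$1"
}

if has_cr "${script}"; then
    echo "DOS line endings found; converting ${script}"
    # dos2unix is preferred when installed; tr is a portable fallback.
    if command -v dos2unix >/dev/null 2>&1; then
        dos2unix -q "${script}"
    else
        tr -d '\r' < "${script}" > "${script}.tmp" && mv "${script}.tmp" "${script}"
    fi
fi
```

Running the same check just before qsub is a cheap way to catch files that were edited on a Windows machine.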
2.1. Specify Your Shell
First of all, remember that your batch script is a script. It's a good idea to specify which shell your
script is written in. Unless you specify otherwise, PBS will use your default login shell to run your
script. To tell PBS which shell to use, start your script with a line similar to the following, where
shell is either bash, sh, ksh, csh, tcsh, or zsh:
#!/bin/shell
2.2. Required PBS Directives
The next block of your script will tell PBS about the resources that your job needs by including PBS
directives. These directives are actually a special form of comment, beginning with "#PBS". As you might
suspect, the # character tells the shell to ignore the line, but PBS reads these directives and uses
them to set various values. IMPORTANT!! All PBS directives MUST come before the first line of executable
code in your script, otherwise they will be ignored.
Every script must include directives for the following:
- The number of nodes and processes per node you are requesting
- The maximum amount of time your job should run
- Which queue you want your job to run in
- Your Project ID
PBS also provides additional optional directives. These are discussed in Optional PBS Directives, below.
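To illustrate the ordering rule, the sketch below lists the #PBS lines that appear before the first executable line of a script. It is only an approximation of how PBS scans a script, written for illustration; the function name list_directives and the sample script contents are hypothetical.

```shell
#!/bin/bash
# list_directives FILE: print the #PBS lines that appear before the first
# executable line of FILE. Blank lines and ordinary comments are skipped.
list_directives() {
    while IFS= read -r line || [ -n "${line}" ]; do
        case "${line}" in
            '#PBS'*) printf '%s\n' "${line}" ;;
            '#'*|'') ;;        # shebang, ordinary comment, or blank line
            *) break ;;        # first executable line ends the scan
        esac
    done < "$1"
}

# Demonstration script: the last directive comes after executable code,
# so it is not reported.
demo=$(mktemp)
cat > "${demo}" <<'EOF'
#!/bin/bash
#PBS -q debug
#PBS -l walltime=00:30:00
cd /tmp
#PBS -N too_late
EOF

list_directives "${demo}"
```

In this demonstration only the queue and walltime directives are printed; the job name directive placed after the cd command is the kind of line that would be ignored.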
2.2.1. Number of Nodes and Processes Per Node
Before PBS can schedule your job, it needs to know how many nodes you want. Before your job can be run,
it will also need to know how many processes you want to run on each of those nodes. In general, you
would specify one process per core, but you might want more or fewer processes depending on the
programming model you are using. See Example Scripts (below) for alternate use cases.
Both the number of nodes and processes per node are specified using the same directive as follows, where
N1 is the number of nodes you are requesting and N2 is the number of processes per node (must be 1, 2, 4,
6, 9, 18, or 36):
#PBS -l select=N1:ncpus=36:mpiprocs=N2
The value of ncpus refers to the number of physical cores available on each node, and must always be set
to 36 for standard compute nodes.
GPU nodes will require ncpus=28, plus the extra argument of ngpus=1:
#PBS -l select=N1:ncpus=28:mpiprocs=N2:ngpus=1
Large-memory nodes will require ncpus=36, plus the extra argument of bigmem=1:
#PBS -l select=N1:ncpus=36:mpiprocs=N2:bigmem=1
PHI nodes will require ncpus=28, plus the extra argument of nmics=2:
#PBS -l select=N1:ncpus=28:mpiprocs=N2:nmics=2
An exception to this rule is the transfer queue, which uses the directive below:
#PBS -l select=1:ncpus=1
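As a rough illustration of how N1 follows from the total number of processes you want, the sketch below computes the node count by ceiling division and prints the matching select directive for standard compute nodes. The function name select_line is hypothetical and the snippet is a planning aid, not part of PBS.

```shell
#!/bin/bash
CORES_PER_NODE=36   # standard compute nodes, per the text above

# select_line TOTAL_PROCS PROCS_PER_NODE: print a select directive with
# enough nodes to hold TOTAL_PROCS processes (ceiling division).
select_line() {
    total=$1
    per_node=$2
    nodes=$(( (total + per_node - 1) / per_node ))
    echo "#PBS -l select=${nodes}:ncpus=${CORES_PER_NODE}:mpiprocs=${per_node}"
}

select_line 288 36   # fully populated nodes
select_line 8 1      # one process per node
```

Both calls request 8 nodes; they differ only in how many processes are placed on each node.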
2.2.2. How Long to Run
Next, PBS needs to know how long your job will run. For this, you will have to make an estimate. There
are three things to keep in mind.
1. Your estimate is a limit. If your job hasn't completed within your estimate, it will be terminated.
2. Your estimate will affect how long your job waits in the queue. In general, shorter jobs will run
before longer jobs.
3. Each queue has a maximum time limit. You cannot request more time than the queue allows.
To specify how long your job will run, include the following directive:
#PBS -l walltime=HHH:MM:SS
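If you think of your estimate in minutes, a small helper can produce the HHH:MM:SS form. The sketch below is illustrative only; the function name walltime_from_minutes is hypothetical.

```shell
#!/bin/bash
# walltime_from_minutes MINUTES: print the value in HHH:MM:SS form.
walltime_from_minutes() {
    printf '%d:%02d:00\n' $(( $1 / 60 )) $(( $1 % 60 ))
}

# Example: an estimate of 12 hours and 30 minutes.
echo "#PBS -l walltime=$(walltime_from_minutes 750)"
```

Because the estimate is a hard limit, it is usually worth rounding up a little before converting.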
2.2.3. Which Queue to Run In
Now, PBS needs to know which queue you want your job to run in. Your options here are determined by your
project. Most users only have access to the debug, standard, and background queues. Other queues exist,
but access to these queues is restricted to projects that have been granted special privileges due to
urgency or importance, and they will not be discussed here. As their names suggest, the standard and
debug queues should be used for normal day-to-day and debugging jobs. The background queue, however, is
a bit special because although it has the lowest priority, jobs that run in this queue are not charged
against your project allocation. Users may choose to run in the background queue for several reasons:
1. You don't care how long it takes for your job to begin running.
2. You are trying to conserve your allocation.
3. You have used up your allocation.
To see the list of queues available on the system, use the show_queues command. To specify the queue you
want your job to run in, include the following directive:
#PBS -q queue_name
2.2.4. Your Project ID
PBS now needs to know which project ID to charge for your job. You can use the show_usage command to
find the projects that are available to you and their associated project IDs. In the show_usage output,
project IDs appear in the column labeled "Subproject." Note: Users with access to multiple projects
should remember that the project they specify may limit their choice of queues.
To specify the Project ID for your job, include the following directive:
#PBS -A Project_ID
2.3. The Execution Block
Once the PBS directives have been supplied, the execution block may begin. This is the section of your
script that contains the actual work to be done. A well-written execution block will generally contain
the following stages:
- Environment Setup: This might include setting environment variables, loading modules, creating
  directories, copying files, initializing data, etc. As the last step in this stage, you will generally
  cd to the directory that you want your script to execute in. Otherwise, your script would execute by
  default in your home directory. Most users use "cd $PBS_O_WORKDIR" to run the batch script from the
  directory where they typed qsub to submit the job.
- Compilation: You may need to compile your application if you don't already have a pre-compiled
  executable available.
- Launching: If your application uses Intel MPI, launch it with the mpirun command. If it uses SGI MPT,
  launch it with the mpiexec_mpt command.
- Cleanup: This usually includes archiving your results and removing temporary files and directories.
3. Submitting Your Job
Once your batch script is complete, you will need to submit it to PBS for execution using the qsub
command. For example, if you have saved your script into a text file named run.pbs, you would type
"qsub run.pbs".
Occasionally you may want to supply one or more directives directly on the qsub command line. Directives
supplied in this way override the same directives if they are already included in your script. The
syntax to supply directives on the command line is the same as within a script except that #PBS is not
used. For example:
qsub -l walltime=HHH:MM:SS run.pbs
4. Simple Batch Script Example
The batch script below contains all of the required directives and common script components discussed
above. This example starts 72 processes. Each Thunder node has 36 cores, so 72 processes require 2
nodes. The job is submitted to the standard queue to run for at most 12 hours.
#!/bin/bash
## Required PBS Directives
#PBS -A Project_ID
#PBS -q standard
#PBS -l select=2:ncpus=36:mpiprocs=36
#PBS -l walltime=12:00:00
#PBS -j oe
## Execution Block
# Environment Setup
# cd to your scratch directory
# $JOBDIR is a directory that is created when your job runs.
# Files within $JOBDIR are protected from our file scrubber
# for the duration of the job, plus 30 days after it finishes.
# Files NOT within a $JOBDIR are susceptible to the normal
# 30-day file lifespan.
cd ${JOBDIR}
# Launching
# copy executable from $HOME and submit it
cp ${HOME}/my_prog.exe .
mpiexec_mpt -n 72 ./my_prog.exe > my_prog.out
# Cleanup
# archive your results
# Using the "here document" syntax, create a job script
# for archiving your data.
cd ${WORKDIR}
rm -f archive_job
cat > archive_job << END
#!/bin/bash
#PBS -l walltime=12:00:00
#PBS -q transfer
#PBS -A Project_ID
#PBS -l select=1:ncpus=1
#PBS -j oe
#PBS -S /bin/bash
cd ${WORKDIR}
# Single files are easier and faster to retrieve from the
# archive than multiple files, so create a .tar file of your
# results.
tar cvf ${PBS_JOBID}.tar ${JOBDIR}
# Create a directory in $ARCHIVE_HOME named after this
# PBS job id and place your .tar file there.
archive mkdir -C ${ARCHIVE_HOME} ${PBS_JOBID}
archive put -C ${ARCHIVE_HOME}/${PBS_JOBID} ${PBS_JOBID}.tar
archive ls ${ARCHIVE_HOME}/${PBS_JOBID}
# Remove scratch directory from the file system.
cd ${WORKDIR}
rm -rf ${JOBDIR}
END
# Submit the archive job script.
qsub archive_job
5. Job Management Commands
The table below contains commands for managing your jobs in PBS.
Job Management Commands
Command       Description
qsub          Submit a job.
qstat         Check the status of a job.
qview         A more user-friendly version of qstat.
qstat -q      Display the status of all PBS queues.
show_queues   A more user-friendly version of "qstat -q".
qdel          Delete a job.
qhold         Place a job on hold.
qrls          Release a job from hold.
tracejob      Display job accounting data from a completed job.
pbsnodes      Display host status of all PBS batch nodes.
qpeek         Lets you peek at the stdout and stderr of your running job.
qhist         Display a detailed history of a specific job.
6. Optional PBS Directives
In addition to the required directives mentioned above, PBS has many other directives, but most users
will only use a few of them. Some of the more useful directives are listed below.
6.1. Job Identification Directives
Job identification directives allow you to identify characteristics of your jobs. These directives are
voluntary, but strongly encouraged. The following table contains some useful job identification
directives.
Job Identification Directives
Directive        Options            Description
-l application   application_name   Identify the application being used.
-N               job_name           Name your job.
6.1.1. Application Name
The "-l application" directive allows you to identify the application being used by your job. This helps
the program to accurately assess application usage and to ensure that adequate software licenses and
appropriate software are purchased. To use this directive, add a line in the following form to your
batch script:
#PBS -l application=application_name
Or to your qsub command:
qsub -l application=application_name
where application_name is chosen from a list of acceptable names that is maintained in
$SAMPLES_HOME/Application_Name/application_names on each system.
6.1.2. Job Name
The "-N" directive allows you to designate a name for your job. In addition to being easier to remember
than a numeric job ID, the PBS environment variable, $PBS_JOBNAME, inherits this value and can be used
instead of the job ID to create job-specific output directories. To use this directive, add a line in
the following form to your batch script:
#PBS -N job_20
Or to your qsub command:
qsub -N job_20 ...
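A job-specific output directory based on the job name might be created as in the sketch below. It assumes the directory should live under $WORKDIR; the fallback values exist only so the snippet also runs outside a PBS job, and the variable names job_name, base_dir, and out_dir are hypothetical.

```shell
#!/bin/bash
# Outside a PBS job these variables are unset, so fall back to sample values.
job_name=${PBS_JOBNAME:-job_20}
base_dir=${WORKDIR:-$(mktemp -d)}

# One directory per job name keeps the output of different jobs apart.
out_dir="${base_dir}/${job_name}"
mkdir -p "${out_dir}"
echo "Results for ${job_name} will be written to ${out_dir}"
```

Note that two jobs submitted with the same name would share this directory, so combine the name with $PBS_JOBID if each run needs its own.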
6.2. Job Environment Directives
Job environment directives allow you to control the environment in which your script will operate. The
following table contains a few useful job environment directives.
Job Environment Directives
Directive   Options         Description
-I                          Request an interactive batch shell.
-V                          Export all environment variables to the job.
-v          variable_list   Export specific environment variables to the job.
6.2.1. Interactive Batch Shell
The "-I" directive allows you to request an interactive batch shell. Within that shell, you can perform
normal Unix commands, including launching parallel jobs. To use "-I", append it to the end of your qsub
request. For example, the qsub command below requests 2 nodes (total of 72 cores) for 1 hour.
qsub -A Project_ID -q debug -l select=2:ncpus=36:mpiprocs=36 -l walltime=1:00:00 -I
6.2.2. Export All Variables
The "-V" directive tells PBS to export all of the environment variables from your login environment into
your batch environment. To use this directive, add a line in the following form to your batch script:
#PBS -V
Or to your qsub command:
qsub -V ...
6.2.3. Export Specific Variables
The "-v" directive tells PBS to export specific environment variables from your login environment into
your batch environment. To use this directive, add a line in the following form to your batch script:
#PBS -v my_variable
Or to your qsub command:
qsub -v my_variable
Using either of these methods, multiple comma-separated variables can be included. It is also possible
to set values for variables exported in this way, as follows:
qsub -v my_variable=my_value, ...
6.3. Reporting Directives
Reporting directives allow you to control what happens to standard output and standard error messages
generated by your script. They also allow you to specify email options to be executed at the beginning
and end of your job.
6.3.1. Redirecting Stdout and Stderr
By default, messages written to stdout and stderr are captured for you in files named x.ojob_id and
x.ejob_id, respectively, where x is either the name of the script or the name specified with the "-N"
directive, and job_id is the ID of the job. If you want to change this behavior, the "-o" and "-e"
directives allow you to redirect stdout and stderr messages to different named files. The "-j" directive
allows you to combine stdout and stderr into the same file.
Redirection Directives
Directive   Options    Description
-e          Filename   Define standard error file.
-o          Filename   Define standard output file.
-j          oe         Merge stderr and stdout into stdout.
-j          eo         Merge stderr and stdout into stderr.
6.3.2. Setting up Email Alerts
Many users want to be notified when their jobs begin and end. The "-m" directive makes this possible. If
you use this directive, you will also need to supply the "-M" directive with one or more email addresses
to be used.
Email Directives
Directive   Options             Description
-m          b                   Send email when the job begins.
-m          e                   Send email when the job ends.
-M          Email address(es)   Send mail to address(es).
For example:
#PBS -m be
#PBS -M [email protected],[email protected]
6.4. Job Dependency Directives
Job dependency directives allow you to specify dependencies that your job may have on other jobs. This
allows users to control the order jobs run in. These directives will generally take the following form:
#PBS -W depend=dependency_expression
where dependency_expression is a comma-delimited list of one or more dependencies, and each dependency
is of the form:
type:jobids
where type is one of the directives listed below, and jobids is a colon-delimited list of one or more
job IDs that your job is dependent upon.
Job Dependency Directives
Directive     Description
after         Execute this job after listed jobs have begun.
afterok       Execute this job after listed jobs have terminated without error.
afternotok    Execute this job after listed jobs have terminated with an error.
afterany      Execute this job after listed jobs have terminated for any reason.
before        Listed jobs may be run after this job begins execution.
beforeok      Listed jobs may be run after this job terminates without error.
beforenotok   Listed jobs may be run after this job terminates with an error.
beforeany     Listed jobs may be run after this job terminates for any reason.
For example, run a job after completion (success or failure) of job ID 1234:
#PBS -W depend=afterany:1234
Or, run a job after successful completion of job ID 1234:
#PBS -W depend=afterok:1234
For more information about job dependencies, see the qsub man page.
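When a job depends on several others, the type:jobids expression can be assembled with a small helper. The sketch below only builds the directive text; the function name depend_directive and the job IDs are hypothetical.

```shell
#!/bin/bash
# depend_directive TYPE JOBID...: join the job IDs with colons, as the
# type:jobids form requires, and print the resulting directive.
depend_directive() {
    dep_type=$1
    shift
    ids=$(IFS=:; echo "$*")
    echo "#PBS -W depend=${dep_type}:${ids}"
}

# Example: wait for two earlier jobs to finish without error.
depend_directive afterok 1234 1235
```

Since qsub prints the ID of the job it submits, the same idea works on the command line: capture the ID with something like first=$(qsub step1.pbs) and then submit the next step with qsub -W depend=afterok:${first} step2.pbs (both script names are placeholders).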
7. Environment Variables
7.1. PBS Environment Variables
While there are many PBS environment variables, you only need to know a few important ones to get
started using PBS. The table below lists the most important PBS environment variables and how you might
generally use them.
Frequently Used PBS Environment Variables
PBS Variable       Description
$PBS_JOBID         Job identifier assigned to job or job array by the batch system.
$PBS_O_WORKDIR     The absolute path of directory where qsub was executed.
$PBS_JOBNAME       The job name supplied by the user.
The following additional PBS variables may be useful to some users.
Other PBS Environment Variables
PBS Variable       Description
$PBS_ARRAY_INDEX   Index number of subjob in job array.
$PBS_ENVIRONMENT   Indicates job type: PBS_BATCH or PBS_INTERACTIVE
$PBS_NODEFILE      Filename containing a list of vnodes assigned to the job.
$PBS_O_HOST        Host name on which the qsub command was executed.
$PBS_O_PATH        Value of PATH from submission environment.
$PBS_O_SHELL       Value of SHELL from submission environment.
$PBS_QUEUE         The name of the queue from which the job is executed.
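One common use of these variables is to summarize what a job was given. The sketch below counts entries in $PBS_NODEFILE, assuming the file lists one host name per line and repeats a host once for each process placed on it. Outside a PBS job it builds a stand-in file with made-up host names so that the snippet still runs.

```shell
#!/bin/bash
# Outside a PBS job, build a stand-in node file (host names are made up).
if [ -z "${PBS_NODEFILE:-}" ]; then
    PBS_NODEFILE=$(mktemp)
    printf 'node001\nnode001\nnode002\nnode002\n' > "${PBS_NODEFILE}"
fi

# Total lines approximate the process count; unique lines give the node count.
total_procs=$(wc -l < "${PBS_NODEFILE}")
unique_nodes=$(sort -u "${PBS_NODEFILE}" | wc -l)

echo "Job ${PBS_JOBID:-unknown}: $((total_procs)) entries on $((unique_nodes)) nodes"
```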
7.2. Other Important Environment Variables
In addition to the PBS environment variables, the table below lists a few other variables which are not
generally required, but may be important depending on your job.
Other Important Environment Variables
Variable              Description
$OMP_NUM_THREADS      The number of OpenMP threads per process.
$MPI_DSM_DISTRIBUTE   Ensures that memory is assigned closest to the physical core where each MPI
                      process is running.
$BC_CORES_PER_NODE    The number of cores per node for the compute node on which a job is running.
$BC_MEM_PER_NODE      The approximate maximum user-accessible memory per node (in integer MBytes) for
                      the compute node on which a job is running.
$BC_MPI_TASKS_ALLOC   The number of MPI tasks allocated for a job.
$BC_NODE_ALLOC        The number of nodes allocated for a job.
8. Example Scripts
All of the script examples shown below contain a "Cleanup" section which demonstrates how to
automatically archive your data using the transfer queue and clean up your $WORKDIR after your job
completes. Using this method helps to avoid data loss, and ensures that your allocation is not charged
for idle cores while performing file transfer operations.
8.1. MPI Script
The following script is for a 288-core MPI job running for 20 hours in the standard queue. To run a
288-core job, we need 8 nodes with 36 cores each.
Note the use of the $BC_MPI_TASKS_ALLOC variable to define the number of MPI processes to start.
#!/bin/ksh
## Required Directives
#PBS -l select=8:ncpus=36:mpiprocs=36
#PBS -l walltime=20:00:00
#PBS -q standard
#PBS -A Project_ID
## Optional Directives
#PBS -N testjob
#PBS -j oe
#PBS -M [email protected]
#PBS -m be
## Execution Block
# Environment Setup
# cd to your scratch directory
# $JOBDIR is a directory that is created when your job runs.
# Files within $JOBDIR are protected from our file scrubber
# for the duration of the job, plus 30 days after it finishes.
# Files NOT within a $JOBDIR are susceptible to the normal
# 30-day file lifespan.
cd ${JOBDIR}
# stage input data from archive
archive get -C ${ARCHIVE_HOME}/my_data_dir "*.dat"
# copy the executable from $HOME
cp ${HOME}/my_prog.exe .
# Launching
mpiexec_mpt -n ${BC_MPI_TASKS_ALLOC} ./my_prog.exe > my_prog.out
# Cleanup
# archive your results
# Using the "here document" syntax, create a job script
# for archiving your data.
cd ${WORKDIR}
rm -f archive_job
cat > archive_job << END
#!/bin/bash
#PBS -l walltime=12:00:00
#PBS -q transfer
#PBS -A Project_ID
#PBS -l select=1:ncpus=1
#PBS -j oe
cd ${WORKDIR}
# Single files are easier and faster to retrieve from the
# archive than multiple files, so create a .tar file of your
# results.
tar cvf ${PBS_JOBID}.tar ${JOBDIR}
# Create a directory in $ARCHIVE_HOME named after this
# PBS job id and place your .tar file there.
archive mkdir -C ${ARCHIVE_HOME} ${PBS_JOBID}
archive put -C ${ARCHIVE_HOME}/${PBS_JOBID} ${PBS_JOBID}.tar
archive ls ${ARCHIVE_HOME}/${PBS_JOBID}
# Remove scratch directory from the file system.
cd ${WORKDIR}
rm -rf ${JOBDIR}
END
# Submit the archive job script.
qsub archive_job
8.2. MPI Script (accessing more memory per process)
By default, an MPI job runs one process per core, with all processes sharing the available memory on the
node. If you need more memory per process, then your job needs to run fewer MPI processes per node.
The following script requests 8 nodes (288 cores) but uses only one core per node. This starts 8 MPI
processes, each with access to about 126 GBytes of memory. The job runs for 20 hours in the standard
queue.
Note the use of the $BC_MPI_TASKS_ALLOC environment variable to define the number of MPI processes to
start.
#!/bin/ksh
## Required Directives
#PBS -l select=8:ncpus=36:mpiprocs=1
#PBS -l walltime=20:00:00
#PBS -q standard
#PBS -A Project_ID
## Optional Directives
#PBS -N testjob
#PBS -j oe
#PBS -M [email protected]
#PBS -m be
## Execution Block
# Environment Setup
# cd to your scratch directory
# $JOBDIR is a directory that is created when your job runs.
# Files within $JOBDIR are protected from our file scrubber
# for the duration of the job, plus 30 days after it finishes.
# Files NOT within a $JOBDIR are susceptible to the normal
# 30-day file lifespan.
cd ${JOBDIR}
# stage input data from archive
archive get -C ${ARCHIVE_HOME}/my_data_dir "*.dat"
# copy the executable from $HOME
cp ${HOME}/my_prog.exe .
# Launching
mpiexec_mpt -n ${BC_MPI_TASKS_ALLOC} ./my_prog.exe > my_prog.out
# Cleanup
# archive your results
# Using the "here document" syntax, create a job script
# for archiving your data.
cd ${WORKDIR}
rm -f archive_job
cat > archive_job << END
#!/bin/bash
#PBS -l walltime=12:00:00
#PBS -q transfer
#PBS -A Project_ID
#PBS -l select=1:ncpus=1
#PBS -j oe
#PBS -S /bin/bash
cd ${WORKDIR}
# Single files are easier and faster to retrieve from the
# archive than multiple files, so create a .tar file of your
# results.
tar cvf ${PBS_JOBID}.tar ${JOBDIR}
# Create a directory in $ARCHIVE_HOME named after this
# PBS job id and place your .tar file there.
archive mkdir -C ${ARCHIVE_HOME} ${PBS_JOBID}
archive put -C ${ARCHIVE_HOME}/${PBS_JOBID} ${PBS_JOBID}.tar
archive ls ${ARCHIVE_HOME}/${PBS_JOBID}
# Remove scratch directory from the file system.
cd ${WORKDIR}
rm -rf ${JOBDIR}
END
# Submit the archive job script.
qsub archive_job
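The trade-off between processes per node and memory per process can be estimated with simple integer division. The sketch below is a rough planning aid only: the function name mem_per_process is hypothetical, and the fallback of 126000 MBytes is an assumed stand-in for $BC_MEM_PER_NODE (based on the "about 126 GBytes" figure above) so that the snippet runs outside a job.

```shell
#!/bin/bash
# mem_per_process MEM_PER_NODE_MB PROCS_PER_NODE: integer MBytes available
# to each process when PROCS_PER_NODE processes share one node.
mem_per_process() {
    echo $(( $1 / $2 ))
}

# Assumed sample value when not running inside a job.
node_mem=${BC_MEM_PER_NODE:-126000}

for procs in 36 18 9 1; do
    share=$(mem_per_process "${node_mem}" "${procs}")
    echo "mpiprocs=${procs}: about ${share} MBytes per process"
done
```

Halving mpiprocs roughly doubles the memory available to each process, at the cost of leaving cores idle.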
8.3. OpenMP Script
The following script is for an OpenMP job using one thread per core on a single node and running for 20
hours in the standard queue. The number of OpenMP threads is set using the $OMP_NUM_THREADS environment
variable, which is set by PBS using the ompthreads option in your select statement:
#PBS -l select=1:ncpus=36:mpiprocs=36:ompthreads=36
To start your application with fewer than 36 threads, set ompthreads to a number less than 36:
#PBS -l select=1:ncpus=36:mpiprocs=36:ompthreads=N
N is the number of threads (up to 36) that you wish to run on.
#!/bin/ksh
## Required Directives
#PBS -l select=1:ncpus=36:mpiprocs=36:ompthreads=36
#PBS -l walltime=20:00:00
#PBS -q standard
#PBS -A Project_ID
## Optional Directives
#PBS -N testjob
#PBS -j oe
#PBS -M [email protected]
#PBS -m be
## Execution Block
# Environment Setup
# cd to your scratch directory
# $JOBDIR is a directory that is created when your job runs.
# Files within $JOBDIR are protected from our file scrubber
# for the duration of the job, plus 30 days after it finishes.
# Files NOT within a $JOBDIR are susceptible to the normal
# 30-day file lifespan.
cd ${JOBDIR}
# stage input data from archive
archive get -C ${ARCHIVE_HOME}/my_data_dir "*.dat"
# copy the executable from $HOME
cp ${HOME}/my_prog.exe .
# Launching
mpiexec_mpt -n ${BC_MPI_TASKS_ALLOC} ./my_prog.exe > my_prog.out
# Cleanup
# archive your results
# Using the "here document" syntax, create a job script
# for archiving your data.
cd ${WORKDIR}
rm -f archive_job
cat > archive_job << END
#!/bin/bash
#PBS -l walltime=12:00:00
#PBS -q transfer
#PBS -A Project_ID
#PBS -l select=1:ncpus=1
#PBS -j oe
#PBS -S /bin/bash
cd ${WORKDIR}
# Single files are easier and faster to retrieve from the
# archive than multiple files, so create a .tar file of your
# results.
tar cvf ${PBS_JOBID}.tar ${JOBDIR}
# Create a directory in $ARCHIVE_HOME named after this
# PBS job id and place your .tar file there.
archive mkdir -C ${ARCHIVE_HOME} ${PBS_JOBID}
archive put -C ${ARCHIVE_HOME}/${PBS_JOBID} ${PBS_JOBID}.tar
archive ls ${ARCHIVE_HOME}/${PBS_JOBID}
# Remove scratch directory from the file system.
cd ${WORKDIR}
rm -rf ${JOBDIR}
END
# Submit the archive job script.
qsub archive_job
8.4. SHMEM Script
The following script is for a 36-core SHMEM job running for 20 hours in the standard queue.
The script requests 1 node, with 36 cores. Since each SHMEM thread requires access to the same pool of
memory as each other thread, these jobs are limited to a single node of Thunder, which is 36 cores.
Note the use of the $BC_CORES_PER_NODE environment variable to set the number of processes to start. To
start your application with fewer than 36 threads, replace $BC_CORES_PER_NODE with a lower value, like
so:
export BC_CORES_PER_NODE=N
N is the number of threads (fewer than 36) that you wish to run on.
#!/bin/ksh
## Required Directives
#PBS -l select=1:ncpus=36:mpiprocs=36
#PBS -l walltime=20:00:00
#PBS -q standard
#PBS -A Project_ID
## Optional Directives
#PBS -N testjob
#PBS -j oe
#PBS -M [email protected]
#PBS -m be
## Execution Block
# Environment Setup
# cd to your scratch directory
# $JOBDIR is a directory that is created when your job runs.
# Files within $JOBDIR are protected from our file scrubber
# for the duration of the job, plus 30 days after it finishes.
# Files NOT within a $JOBDIR are susceptible to the normal
# 30-day file lifespan.
cd ${JOBDIR}
# stage input data from archive
archive get -C ${ARCHIVE_HOME}/my_data_dir "*.dat"
# copy the executable from $HOME
cp ${HOME}/my_prog.exe .
# Launching
mpiexec_mpt -n ${BC_CORES_PER_NODE} ./my_prog.exe > my_prog.out
# Cleanup
# archive your results
# Using the "here document" syntax, create a job script
# for archiving your data.
cd ${WORKDIR}
rm -f archive_job
cat > archive_job << END
#!/bin/bash
#PBS -l walltime=12:00:00
#PBS -q transfer
#PBS -A Project_ID
#PBS -l select=1:ncpus=1
#PBS -j oe
#PBS -S /bin/bash
cd ${WORKDIR}
# Single files are easier and faster to retrieve from the
# archive than multiple files, so create a .tar file of your
# results.
tar cvf ${PBS_JOBID}.tar ${JOBDIR}
# Create a directory in $ARCHIVE_HOME named after this
# PBS job id and place your .tar file there.
archive mkdir -C ${ARCHIVE_HOME} ${PBS_JOBID}
archive put -C ${ARCHIVE_HOME}/${PBS_JOBID} ${PBS_JOBID}.tar
archive ls ${ARCHIVE_HOME}/${PBS_JOBID}
# Remove scratch directory from the file system.
cd ${WORKDIR}
rm -rf ${JOBDIR}
END
# Submit the archive job script.
qsub archive_job
8.5. Hybrid MPI/OpenMP Script
The following script uses 8 nodes (288 cores) with one MPI task per node and one thread per core. The
number of threads per MPI process will equal the number of cores per node.
Note the use of the select statement below to set both $BC_MPI_TASKS_ALLOC (mpiprocs=1) and
$OMP_NUM_THREADS (ompthreads=36).
To start your application with fewer than 36 threads, set ompthreads to a number less than 36:
#PBS -l select=1:ncpus=36:mpiprocs=1:ompthreads=N
N is the number of threads (up to 36) that you wish to run on.
#!/bin/ksh
## Required Directives
#PBS -l select=8:ncpus=36:mpiprocs=1:ompthreads=36
#PBS -l walltime=20:00:00
#PBS -q standard
#PBS -A Project_ID
## Optional Directives
#PBS -N testjob
#PBS -j oe
#PBS -M [email protected]
#PBS -m be
## Execution Block
# Environment Setup
# cd to your scratch directory
# $JOBDIR is a directory that is created when your job runs.
# Files within $JOBDIR are protected from our file scrubber
# for the duration of the job, plus 30 days after it finishes.
# Files NOT within a $JOBDIR are susceptible to the normal
# 30-day file lifespan.
cd ${JOBDIR}
# stage input data from archive
archive get -C ${ARCHIVE_HOME}/my_data_dir "*.dat"
# copy the executable from $HOME
cp ${HOME}/my_prog.exe .
# Launching
mpiexec_mpt -n ${BC_MPI_TASKS_ALLOC} omplace ./my_prog.exe > my_prog.out
# Cleanup
# archive your results
# Using the "here document" syntax, create a job script
# for archiving your data.
cd ${WORKDIR}
rm -f archive_job
cat > archive_job << END
#!/bin/bash
#PBS -l walltime=12:00:00
#PBS -q transfer
#PBS -A Project_ID
#PBS -l select=1:ncpus=1
#PBS -j oe
#PBS -S /bin/bash
cd ${WORKDIR}
# Single files are easier and faster to retrieve from the
# archive than multiple files, so create a .tar file of your
# results.
tar cvf ${PBS_JOBID}.tar ${JOBDIR}
# Create a directory in $ARCHIVE_HOME named after this
# PBS job id and place your .tar file there.
archive mkdir -C ${ARCHIVE_HOME} ${PBS_JOBID}
archive put -C ${ARCHIVE_HOME}/${PBS_JOBID} ${PBS_JOBID}.tar
archive ls ${ARCHIVE_HOME}/${PBS_JOBID}
# Remove scratch directory from the file system.
cd ${WORKDIR}
rm -rf ${JOBDIR}
END
# Submit the archive job script.
qsub archive_job