buildtest.scheduler.pbs
Module Contents
Classes
The PBSJob models a PBS job with helper methods to retrieve the job state and check whether the job is running, pending, or suspended.
Attributes
- buildtest.scheduler.pbs.logger
- class buildtest.scheduler.pbs.PBSJob(jobID)
Bases: buildtest.scheduler.job.Job
The PBSJob models a PBS job and provides helper methods to retrieve the job state and to check whether the job is running, pending, or suspended. It also provides methods to poll the job state, gather job results upon completion, and cancel the job.
- is_suspended()
Return True if the job is suspended, that is, if its state is one of H, U, or S.
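The check described above amounts to a membership test on the one-letter job state. The following is a minimal sketch, not the buildtest implementation; the names SUSPENDED_STATES and is_suspended_state are hypothetical and exist only for illustration.

```python
# States that the documentation above treats as "suspended".
SUSPENDED_STATES = frozenset({"H", "U", "S"})


def is_suspended_state(job_state):
    """Return True if the one-letter PBS job state is H, U, or S.

    Hypothetical helper for illustration; PBSJob.is_suspended() may be
    implemented differently.
    """
    return job_state in SUSPENDED_STATES


print(is_suspended_state("H"))
print(is_suspended_state("R"))
```

Keeping the states in a single set means the list only has to be updated in one place if it ever changes.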
- success()
Determine whether the job completed successfully. Returns True if the exit code is 0. According to section 14.9, Job Exit Status Codes, of the PBS Professional Administrator's Guide (https://help.altair.com/2021.1.3/PBS%20Professional/PBSAdminGuide2021.1.3.pdf), exit codes are interpreted as follows:
Exit Code: X < 0 - Job could not be executed
Exit Code: 0 <= X < 128 - Exit value of shell or top-level process
Exit Code: X >= 128 - Job was killed by signal
Exit Code: X == 0 - Job executed successfully
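The exit code ranges listed above can be expressed as a small classifier. This is a sketch under the assumption that the exit code is already available as an integer; classify_exit_code and is_success are hypothetical names, not part of buildtest.

```python
def classify_exit_code(exit_code):
    """Map a PBS job exit code to the categories listed above.

    Hypothetical helper for illustration only.
    """
    if exit_code < 0:
        return "job could not be executed"
    if exit_code >= 128:
        return "job was killed by signal"
    if exit_code == 0:
        return "job executed successfully"
    # Remaining case: 0 < exit_code < 128
    return "exit value of shell or top-level process"


def is_success(exit_code):
    """Mirror the rule stated for success(): only exit code 0 counts."""
    return exit_code == 0


for code in (-1, 0, 1, 137):
    print(code, classify_exit_code(code), is_success(code))
```

Note that a nonzero code below 128 is still a failure for the purposes of success(), even though it is a normal process exit value.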
- poll()
Poll the PBS job by running
qstat -x -f -F json <jobid>
which reports job data in JSON format that can be parsed to extract the job state. In PBS, the active job state can be retrieved by reading the job_state property. An example of the output is shown below:
[pbsuser@pbs tests]$ qstat -x -f -F json 157.pbs
{
    "timestamp":1630683518,
    "pbs_version":"19.0.0",
    "pbs_server":"pbs",
    "Jobs":{
        "157.pbs":{
            "Job_Name":"pbs_hold_job",
            "Job_Owner":"pbsuser@pbs",
            "job_state":"H",
            "queue":"workq",
            "server":"pbs",
            "Checkpoint":"u",
            "ctime":"Fri Aug 20 23:14:08 2021",
            "Error_Path":"pbs:/tmp/GitHubDesktop/buildtest/var/tests/generic.pbs.workq/hold/pbs_hold_job/da6d5b57/stage/pbs_hold_job.e157",
            "Hold_Types":"u",
            "Join_Path":"n",
            "Keep_Files":"n",
            "Mail_Points":"a",
            "mtime":"Fri Aug 20 23:14:08 2021",
            "Output_Path":"pbs:/tmp/GitHubDesktop/buildtest/var/tests/generic.pbs.workq/hold/pbs_hold_job/da6d5b57/stage/pbs_hold_job.o157",
            "Priority":0,
            "qtime":"Fri Aug 20 23:14:08 2021",
            "Rerunable":"True",
            "Resource_List":{
                "ncpus":1,
                "nodect":1,
                "nodes":1,
                "place":"scatter",
                "select":"1:ncpus=1",
                "walltime":"00:02:00"
            },
            "substate":20,
            "Variable_List":{
                "PBS_O_HOME":"/home/pbsuser",
                "PBS_O_LOGNAME":"pbsuser",
                "PBS_O_PATH":"/tmp/GitHubDesktop/buildtest/bin:/tmp/github/buildtest/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/pbs/bin:/home/pbsuser/.local/bin:/home/pbsuser/bin",
                "PBS_O_MAIL":"/var/spool/mail/pbsuser",
                "PBS_O_SHELL":"/bin/bash",
                "PBS_O_WORKDIR":"/tmp/GitHubDesktop/buildtest/var/tests/generic.pbs.workq/hold/pbs_hold_job/da6d5b57/stage",
                "PBS_O_SYSTEM":"Linux",
                "PBS_O_QUEUE":"workq",
                "PBS_O_HOST":"pbs"
            },
            "Submit_arguments":"-q workq /tmp/GitHubDesktop/buildtest/var/tests/generic.pbs.workq/hold/pbs_hold_job/da6d5b57/stage/pbs_hold_job.sh",
            "project":"_pbs_project_default"
        }
    }
}
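The polling step can be sketched as two small functions: one that runs the qstat command quoted in this section and one that reads job_state from its JSON output. This is an illustrative sketch, not the code used by PBSJob.poll(); run_qstat and extract_job_state are hypothetical names, and run_qstat assumes qstat is available on PATH.

```python
import json
import subprocess


def run_qstat(job_id):
    """Run `qstat -x -f -F json <jobid>` and return its raw output.

    Hypothetical helper; requires a PBS installation with qstat on PATH.
    """
    result = subprocess.run(
        ["qstat", "-x", "-f", "-F", "json", job_id],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if qstat fails
    )
    return result.stdout


def extract_job_state(qstat_output, job_id):
    """Return the job_state value for job_id from qstat JSON output."""
    data = json.loads(qstat_output)
    return data["Jobs"][job_id]["job_state"]


# Trimmed copy of the example output, so the parsing step can be tried
# without a PBS cluster.
sample = '{"Jobs": {"157.pbs": {"Job_Name": "pbs_hold_job", "job_state": "H"}}}'
print(extract_job_state(sample, "157.pbs"))  # prints H
```

On a system with PBS, the two steps would be combined as extract_job_state(run_qstat("157.pbs"), "157.pbs"). Separating the command invocation from the parsing keeps the parsing testable against saved output.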