buildtest.scheduler.lsf
Module Contents¶
Attributes¶
- buildtest.scheduler.lsf.logger¶
- class buildtest.scheduler.lsf.LSFJob(jobID)[source]¶
Bases: buildtest.scheduler.job.Job
This is a base class for holding job-level data and common methods used for batch job submission.
- is_pending()[source]¶
Check if the job is pending, which LSF reports as PEND. Returns True if there is a match, otherwise returns False.
- is_running()[source]¶
Check if the job is running, which LSF reports as RUN. Returns True if there is a match, otherwise returns False.
- is_complete()[source]¶
Check if the job is complete, which LSF reports as the DONE state. Returns True if there is a match, otherwise returns False.
- is_suspended()[source]¶
Check if the job is suspended, which means it is in any of the following states: [PSUSP, USUSP, SSUSP]. Returns True if the job is in one of these states, otherwise returns False.
- is_failed()[source]¶
Check if the job failed. Returns True if the job is in the EXIT state, otherwise returns False.
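The state checks above amount to comparing the job's LSF state string against a fixed set of values. A minimal sketch follows, assuming the state is stored as a plain string; the class name LSFStateSketch and its attributes are hypothetical and are not part of buildtest.

```python
# Hypothetical illustration of the state checks; not the buildtest implementation.
class LSFStateSketch:
    """Hold an LSF job state string and expose the checks described above."""

    # LSF reports three distinct suspended states.
    SUSPENDED_STATES = ("PSUSP", "USUSP", "SSUSP")

    def __init__(self, state):
        self.state = state

    def is_pending(self):
        return self.state == "PEND"

    def is_running(self):
        return self.state == "RUN"

    def is_complete(self):
        return self.state == "DONE"

    def is_suspended(self):
        return self.state in self.SUSPENDED_STATES

    def is_failed(self):
        return self.state == "EXIT"


job = LSFStateSketch("USUSP")
print(job.is_suspended(), job.is_running())  # prints: True False
```

Keeping the suspended states in one tuple means a single membership test covers all three values.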
- poll()[source]¶
Given a job ID, we poll the LSF job by retrieving its job state, output file, error file and exit code. We run the following commands to retrieve the job state and exit code:
Job State:
bjobs -noheader -o 'stat' <JOBID>
Exit Code:
bjobs -noheader -o 'EXIT_CODE' <JOBID>
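The two queries above could be issued from Python roughly as follows. This is a sketch under stated assumptions, not the actual poll() implementation: the function names and the runner hook are hypothetical, and the canned values in fake_runner are placeholders standing in for a real cluster.

```python
import subprocess


def build_bjobs_query(job_id, field):
    """Return the argument list for: bjobs -noheader -o '<field>' <job_id>."""
    return ["bjobs", "-noheader", "-o", field, str(job_id)]


def query_field(job_id, field, runner=None):
    """Run one bjobs query and return its stripped standard output.

    `runner` is a hypothetical hook that takes the argument list and returns
    the command's stdout as a string, so the sketch can run without LSF.
    """
    cmd = build_bjobs_query(job_id, field)
    if runner is None:
        # Requires an LSF installation with bjobs on PATH.
        completed = subprocess.run(cmd, capture_output=True, text=True, check=True)
        return completed.stdout.strip()
    return runner(cmd).strip()


def poll_sketch(job_id, runner=None):
    """Fetch the job state and exit code with two separate bjobs calls."""
    return {
        "state": query_field(job_id, "stat", runner),
        "exit_code": query_field(job_id, "EXIT_CODE", runner),
    }


def fake_runner(cmd):
    # Canned placeholder answers keyed by the requested field (cmd[3]).
    return {"stat": "RUN\n", "EXIT_CODE": "-\n"}[cmd[3]]


print(poll_sketch(70910, runner=fake_runner))
```

Passing the command as an argument list avoids shell quoting issues around the format string.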
- gather()[source]¶
This method will retrieve the output and error file for a given jobID using the following commands.
$ bjobs -noheader -o 'output_file' 70910
hold_job.out
$ bjobs -noheader -o 'error_file' 70910
hold_job.err
We gather the job record at the onset of job completion by running bjobs -o '<format1> <format2>' <jobid> -json. The following format fields are extracted from the job:
“job_name”
“stat”
“user”
“user_group”
“queue”
“proj_name”
“pids”
“exit_code”
“from_host”
“exec_host”
“submit_time”
“start_time”
“finish_time”
“nthreads”
“exec_home”
“exec_cwd”
“output_file”
“error_file”
Shown below is the output format. We retrieve the job records defined in the RECORDS property.
$ bjobs -o 'job_name stat user user_group queue proj_name pids exit_code from_host exec_host submit_time start_time finish_time nthreads exec_home exec_cwd output_file error_file' 58652 -json
{
  "COMMAND":"bjobs",
  "JOBS":1,
  "RECORDS":[
    {
      "JOB_NAME":"hold_job",
      "STAT":"PSUSP",
      "USER":"shahzebsiddiqui",
      "USER_GROUP":"GEN014ECPCI",
      "QUEUE":"batch",
      "PROJ_NAME":"GEN014ECPCI",
      "PIDS":"",
      "EXIT_CODE":"",
      "FROM_HOST":"login1",
      "EXEC_HOST":"",
      "SUBMIT_TIME":"May 28 12:45",
      "START_TIME":"",
      "FINISH_TIME":"",
      "NTHREADS":"",
      "EXEC_HOME":"",
      "EXEC_CWD":"",
      "OUTPUT_FILE":"hold_job.out",
      "ERROR_FILE":"hold_job.err"
    }
  ]
}
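Reading the job record out of this JSON can be sketched with the standard json module. The helper names below are hypothetical and the field list is shortened for brevity; the sample string is a trimmed copy of the output shown above, so the sketch runs without an LSF cluster.

```python
import json

# A shortened subset of the format fields listed above.
FIELDS = ["job_name", "stat", "queue", "output_file", "error_file"]


def build_gather_command(job_id, fields=FIELDS):
    """Return the argument list for: bjobs -o '<fields>' <job_id> -json."""
    return ["bjobs", "-o", " ".join(fields), str(job_id), "-json"]


def first_record(bjobs_json):
    """Return the first entry of RECORDS, or an empty dict if there is none."""
    records = json.loads(bjobs_json).get("RECORDS", [])
    return records[0] if records else {}


# Trimmed copy of the sample bjobs output.
SAMPLE = """
{
  "COMMAND": "bjobs",
  "JOBS": 1,
  "RECORDS": [
    {
      "JOB_NAME": "hold_job",
      "STAT": "PSUSP",
      "QUEUE": "batch",
      "OUTPUT_FILE": "hold_job.out",
      "ERROR_FILE": "hold_job.err"
    }
  ]
}
"""

record = first_record(SAMPLE)
print(record["JOB_NAME"], record["STAT"], record["OUTPUT_FILE"])
# prints: hold_job PSUSP hold_job.out
```

Note that the keys in the JSON output are the upper-case forms of the requested format fields.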