buildtest.executors.cobalt

This module implements the CobaltExecutor class, which defines how the Cobalt executor submits jobs to the Cobalt scheduler.

Module Contents

Classes

CobaltExecutor

The CobaltExecutor class is responsible for submitting jobs to Cobalt Scheduler.

CobaltJob

The CobaltJob class performs operations on a Cobalt job after submission, such as polling the job, gathering the job record, and cancelling the job.

Attributes

logger

buildtest.executors.cobalt.logger
class buildtest.executors.cobalt.CobaltExecutor(name, settings, site_configs, account=None, max_pend_time=None)[source]

Bases: buildtest.executors.base.BaseExecutor

The CobaltExecutor class is responsible for submitting jobs to Cobalt Scheduler. The class implements the following methods:

  • load: load Cobalt executors from configuration file

  • dispatch: submit Cobalt job to scheduler

  • poll: poll Cobalt job via qstat and retrieve job state

  • gather: gather job record including output, error, exit code

Initialize a base executor: we provide a name (also held by the BuildExecutor base) and the loaded dictionary of configuration options to parse.

Parameters
type = cobalt
launcher = qsub
load(self)[source]

Load a Cobalt executor configuration from buildtest settings.

launcher_command(self, numprocs, numnodes)[source]
dispatch(self, builder)[source]

This method dispatches a job to the Cobalt scheduler by invoking builder.run(), which runs the build script. If the job is submitted to the scheduler, we get the JobID and pass it to the CobaltJob class. At job submission, Cobalt reports the output and error files, which can be retrieved using qstat. We retrieve the Cobalt job record using builder.job.gather().

Parameters

builder (buildtest.buildsystem.base.BuilderBase) – An instance object of BuilderBase type

poll(self, builder)[source]

This method polls a Cobalt job by invoking builder.job.poll(). We check the job state and the existence of the output file. If the file exists or the job is complete, we gather the results and return from the function. If the job is pending, we check whether its pending time exceeds the max_pend_time limit and, if so, cancel the job.
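The decision made on each poll can be sketched as a small pure function. This is only an illustration of the logic described above, not buildtest's implementation: the function name next_action, its arguments, and the returned labels are hypothetical.

```python
def next_action(is_pending, is_complete, output_exists, pending_seconds, max_pend_time):
    """Return 'gather', 'cancel', or 'wait' for a single poll of a job.

    All names here are illustrative assumptions; the real executor works
    with a builder object and a CobaltJob instance instead.
    """
    # Results are gathered once the output file exists or the job is complete.
    if output_exists or is_complete:
        return "gather"
    # A job that has been pending longer than the limit is cancelled.
    if is_pending and pending_seconds > max_pend_time:
        return "cancel"
    # Otherwise keep polling.
    return "wait"


print(next_action(True, False, False, pending_seconds=120, max_pend_time=60))
```

Keeping the decision separate from the scheduler calls makes the timeout rule easy to check without a Cobalt system.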

Parameters

builder (buildtest.buildsystem.base.BuilderBase) – An instance object of BuilderBase type

gather(self, builder)[source]

This method moves the output and error files into the run directory. We read the <JOBID>.cobaltlog file, which records the exit code, and extract the code with the regular expression (exit code of.)(\d+)(\;). The Cobalt log file contains a line such as: task completed normally with an exit code of 0; initiating job cleanup and removal
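As a minimal sketch of the exit code extraction, the pattern quoted above can be applied to the sample log line with Python's re module. The helper name parse_exit_code is hypothetical, and the real method reads the text from the <JOBID>.cobaltlog file rather than from a string literal.

```python
import re

# Sample line taken from the description above.
log_text = (
    "task completed normally with an exit code of 0; "
    "initiating job cleanup and removal"
)

# Same pattern as quoted in the docstring; group 2 captures the digits.
EXIT_CODE_PATTERN = re.compile(r"(exit code of.)(\d+)(\;)")


def parse_exit_code(text):
    """Return the exit code as an int, or None if no match is found."""
    match = EXIT_CODE_PATTERN.search(text)
    if match is None:
        return None
    return int(match.group(2))


print(parse_exit_code(log_text))
```

Returning None when the line is missing is a choice made for this sketch; how buildtest handles a log without an exit code is not stated here.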

Parameters

builder (buildtest.buildsystem.base.BuilderBase) – An instance object of BuilderBase type

class buildtest.executors.cobalt.CobaltJob(jobID)[source]

Bases: buildtest.executors.job.Job

The CobaltJob class performs operations on a Cobalt job after submission, such as polling the job, gathering the job record, and cancelling the job. We also retrieve the job state and determine whether the job is pending, running, complete, or suspended.

is_pending(self)[source]

Return True if the job is pending, otherwise return False. When Cobalt receives a job, it is first in the starting state and then in the queued state. We check whether the job is in either state.

is_running(self)[source]

Return True if the job is running, otherwise return False. The Cobalt job state for a running job is running.

is_complete(self)[source]

Return True if the job is complete, otherwise return False. The Cobalt job state for a completed job is exiting.

is_suspended(self)[source]

Return True if the job is suspended, otherwise return False. The Cobalt job state for a suspended job is user_hold.

is_cancelled(self)[source]

Return True if the job is cancelled, otherwise return False. The job state is set to cancelled by the class's cancel method.
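The state checks described above can be summarized in a small stand-in class. This is a sketch under stated assumptions: JobStateSketch is a hypothetical name, it stores the state as a plain string, and its cancel method only sets the state, whereas the real CobaltJob obtains the state from qstat and cancels with qdel.

```python
class JobStateSketch:
    """Hypothetical stand-in that mirrors the documented state checks."""

    def __init__(self, state):
        self._state = state

    def is_pending(self):
        # A newly received job is 'starting' and then 'queued'.
        return self._state in ("starting", "queued")

    def is_running(self):
        return self._state == "running"

    def is_complete(self):
        # Cobalt marks a completed job as 'exiting'.
        return self._state == "exiting"

    def is_suspended(self):
        return self._state == "user_hold"

    def is_cancelled(self):
        return self._state == "cancelled"

    def cancel(self):
        # Sketch only: no qdel call is made here.
        self._state = "cancelled"


job = JobStateSketch("queued")
print(job.is_pending(), job.is_running())
job.cancel()
print(job.is_cancelled())
```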

cobalt_log(self)[source]

Return job cobalt.log file

output_file(self)[source]

Return job output file

error_file(self)[source]

Return job error file

exitcode(self)[source]

Return job exit code

poll(self)[source]

Poll the job by running qstat -l --header State <jobid>, which retrieves the job state.

gather(self)[source]

Gather the job record by running qstat -lf <jobid>, which retrieves all fields. The output is plain text, which is parsed into key/value pairs and stored in a dictionary. This method returns a dict containing the job record.

$ qstat -lf 347106
   JobID: 347106
       JobName           : hold_job
       User              : shahzebsiddiqui
       WallTime          : 00:10:00
       QueuedTime        : 00:13:14
       RunTime           : N/A
       TimeRemaining     : N/A
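One way to turn the output above into a dictionary is to split each line on its first colon. This is a sketch, not necessarily how buildtest parses the record; the function name parse_qstat_record is hypothetical, and the sample text is copied from the listing above.

```python
SAMPLE_OUTPUT = """\
JobID: 347106
    JobName           : hold_job
    User              : shahzebsiddiqui
    WallTime          : 00:10:00
    QueuedTime        : 00:13:14
    RunTime           : N/A
    TimeRemaining     : N/A
"""


def parse_qstat_record(text):
    """Parse 'key : value' lines into a dict of stripped strings."""
    record = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as 00:10:00 stay intact.
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines without a key/value separator
        record[key.strip()] = value.strip()
    return record


job_record = parse_qstat_record(SAMPLE_OUTPUT)
print(job_record["JobName"], job_record["WallTime"])
```

Splitting on the first colon matters here because time fields contain colons of their own.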
cancel(self)[source]

Cancel the job by running qdel <jobid>. This method is called if the job is pending and its timer exceeds max_pend_time.