Overview

We assume you are familiar with the general concepts presented in getting started; your next step is to configure buildtest to run at your site. This guide presents the steps necessary to get you started.

When you clone buildtest, we provide a default configuration that can be used to run on your laptop or workstation running Linux or Mac. The buildtest configuration is validated against the JSON schema file settings.schema.json. We have published a schema guide for the settings schema which you can find here.

Which configuration file does buildtest read?

buildtest will read configuration files in the following order:

  • Command line buildtest -c <config>.yml build

  • User Configuration - $HOME/.buildtest/config.yml

  • Default Configuration - $BUILDTEST_ROOT/buildtest/settings/config.yml
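
For example, the command-line option is checked first, so you can point buildtest at an alternate configuration for a single invocation. The configuration path and buildspec below are purely illustrative:

$ buildtest -c /tmp/my_config.yml build -b tutorials/vars.yml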

Default Configuration

buildtest comes with a default configuration that can be found at buildtest/settings/config.yml relative to the root of the repository. At the start of execution, buildtest will load the configuration file and validate it against the JSON schema settings.schema.json. If it fails to validate, buildtest will raise an error.

We recommend you copy the default configuration as a template to configure buildtest for your site. To get started, copy the file to $HOME/.buildtest/config.yml by running the following command:

$ cp $BUILDTEST_ROOT/buildtest/settings/config.yml $HOME/.buildtest/config.yml
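
Once copied, you can confirm that your copy still validates against the settings schema. Assuming your buildtest version provides the buildtest config validate command, a quick check looks like this (otherwise, any buildtest command will validate the configuration at startup):

$ buildtest config validate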

Shown below is the default configuration provided by buildtest.

$ cat $BUILDTEST_ROOT/buildtest/settings/config.yml
system:
  generic:
    # specify list of hostnames where buildtest can run for given system record
    hostnames: [".*"]

    # system description
    description: Generic System
    # specify module system used at your site (environment-modules, lmod)
    moduletool: N/A
    # boolean to determine if buildspecs provided in buildtest repo should be loaded in buildspec cache
    load_default_buildspecs: True

    executors:
      # define local executors for running jobs locally
      local:
        bash:
          description: submit jobs on local machine using bash shell
          shell: bash

        sh:
          description: submit jobs on local machine using sh shell
          shell: sh

        csh:
          description: submit jobs on local machine using csh shell
          shell: csh

        zsh:
          description: submit jobs on local machine using zsh shell
          shell: zsh

        python:
          description: submit jobs on local machine using python shell
          shell: python

    # compiler block
    compilers:
      # regular expression to search for compilers based on module pattern. Used with 'buildtest config compilers find' to generate compiler instance
      # find:
      #  gcc: "^(gcc)"
      #  intel: "^(intel)"
      #  cray: "^(craype)"
      #  pgi: "^(pgi)"
      #  cuda: ^(cuda)"
      #  clang: "^(clang)"

      # declare compiler instance which can be site-specific. You can let 'buildtest config compilers find' generate compiler section
      compiler:
        gcc:
          builtin_gcc:
            cc: gcc
            fc: gfortran
            cxx: g++

    # location of log directory
    # logdir: /tmp/

    # specify location where buildtest will write tests
    # testdir: /tmp

    # specify one or more directory where buildtest should load buildspecs
    # buildspec_roots: []

    cdash:
      url: https://my.cdash.org/
      project: buildtest
      site: generic
      buildname: tutorials

As you can see, the layout of the configuration starts with the keyword system which is used to define one or more systems. Your HPC site may contain more than one cluster, so you should define your clusters with meaningful names since the system name becomes part of the executor reference in your buildspecs. In this example, we define one cluster called generic which is a dummy cluster used for running tutorial examples. The required fields in the system scope are the following:

"required": ["executors", "moduletool", "load_default_buildspecs","hostnames", "compilers"]

The hostnames field is a list of nodes that belong to the cluster where buildtest should be run. Generally, these hosts should be the login nodes of your cluster. buildtest will process the hostnames field across all system entries using re.match until a hostname is found; if none is found, buildtest reports an error.

In this example we define two systems, machine1 and machine2, with the following hostnames.

system:
  machine1:
    hostnames:  ['loca$', '^1DOE']
  machine2:
    hostnames: ['BOB|JOHN']

In this example, none of the host entries match the hostname DOE-7086392.local, so we get an error since buildtest needs to detect a system before proceeding.

buildtest.exceptions.BuildTestError: "Based on current system hostname: DOE-7086392.local we cannot find a matching system  ['machine1', 'machine2'] based on current hostnames: {'machine1': ['loca$', '^1DOE'], 'machine2': ['BOB|JOHN']} "

Let’s assume we have a system named mycluster that should run on nodes login1, login2, and login3. You can specify hostnames as follows.

system:
  mycluster:
    hostnames: ["login1", "login2", "login3"]

Alternately, you can use a regular expression to condense this list:

system:
  mycluster:
    hostnames: ["login[1-3]"]

Configuring Module Tool

You should set the moduletool property to the module system installed at your site. Valid options are the following:

# environment-modules
moduletool: environment-modules

# for lmod
moduletool: lmod

# specify N/A if you don't have modules
moduletool: N/A

buildspec roots

buildtest can discover buildspecs using the buildspec_roots keyword. This field is a list of directory paths to search for buildspecs. For example, if we clone the repo https://github.com/buildtesters/buildtest-cori at $HOME/buildtest-cori, we can assign this path to buildspec_roots as follows:

buildspec_roots:
  - $HOME/buildtest-cori

This field is used with the buildtest buildspec find command. If you rebuild your buildspec cache via the --rebuild option, buildtest will search for all buildspecs in the directories specified by the buildspec_roots property. buildtest will recursively find all files with a .yml extension and validate each buildspec with the appropriate schema.
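
For example, after adding a new directory to buildspec_roots you can rebuild the cache as follows:

$ buildtest buildspec find --rebuild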

Load Default Buildspecs

By default buildtest will add $BUILDTEST_ROOT/tutorials and $BUILDTEST_ROOT/general_tests to the search path when searching for buildspecs with the buildtest buildspec find command. This can be configured via the load_default_buildspecs property which expects a boolean value.

This property is enabled by default; in practice you will want to set load_default_buildspecs: False if you only care about running your facility tests.
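
For example, building on the mycluster system shown earlier, a site configuration that only runs facility tests would disable the default buildspecs as follows (other required fields omitted for brevity):

system:
  mycluster:
    load_default_buildspecs: False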

What is an executor?

An executor is responsible for running a test and capturing its output/error files and return code. An executor can be a local executor, which runs tests on the local machine, or a batch executor, which can be modelled as a partition/queue. A batch executor is responsible for dispatching the job, polling the job until it finishes, and gathering job metrics from the scheduler.

Executor Declaration

The executors section is a JSON object that defines one or more executors. The executors are grouped by their type followed by the executor name. In this example we define two local executors, bash and sh, and one slurm executor called regular:

system:
  generic:
    executors:
      local:
        bash:
          shell: bash
          description: bash shell
        sh:
          shell: sh
          description: sh shell
      slurm:
        regular:
          queue: regular

The LocalExecutors are defined in the section local where each executor must have a unique name. They are referenced in a buildspec using the executor field in the following format:

executor: <system>.<type>.<name>

For instance, if a buildspec wants to reference the LocalExecutor bash from the generic cluster, you would specify the following in the buildspec:

executor: generic.local.bash
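
As an illustration, here is a minimal buildspec sketch that references this executor. The test name hello_world and its fields are only examples; consult the buildspec documentation for the full schema:

version: "1.0"
buildspecs:
  hello_world:
    type: script
    executor: generic.local.bash
    description: run a hello world command via the bash local executor
    run: echo "hello world"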

In our example configuration, we defined a bash executor as follows:

executors:
  # define local executors for running jobs locally
  local:
    bash:
      description: submit jobs on local machine using bash shell
      shell: bash

Local executors require the shell key which takes the pattern "^(/bin/bash|/bin/sh|/bin/csh|/bin/tcsh|/bin/zsh|sh|bash|csh|tcsh|zsh|python).*". Any buildspec that references this executor will submit jobs using the bash shell.

You can pass options to the shell which will get passed into each job submission. For instance, if you want all bash scripts to run in a login shell you can specify bash --login:

executors:
  local:
    login_bash:
      shell: bash --login

Then you can reference this executor as executor: generic.local.login_bash and your tests will be submitted via bash --login /path/to/test.sh.

Once you define your executors, you can query them via the buildtest config executors command.
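
With the default configuration, this command reports the fully qualified executor names. The output below is a sketch; the exact formatting may differ between buildtest versions:

$ buildtest config executors
generic.local.bash
generic.local.sh
generic.local.csh
generic.local.zsh
generic.local.python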

Configuring test directory

The default location where tests are written is $BUILDTEST_ROOT/var/tests where $BUILDTEST_ROOT is the root of the buildtest repo. You may specify testdir in your configuration to control where tests are written. For instance, if you want to write tests in /tmp you can set the following:

testdir: /tmp

Alternately, you can specify the test directory via buildtest build --testdir <path>, which has the highest precedence and overrides both the configuration and the default value.
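
For example, assuming a buildspec at tutorials/vars.yml (an illustrative path), the command-line option directs tests to a custom location:

$ buildtest build -b tutorials/vars.yml --testdir /tmp/mytests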

Configuring log path

You can configure where buildtest writes logs using the logdir property. For example, in the configuration below buildtest will write log files to $HOME/Documents/buildtest/var/logs. buildtest will resolve variable expansion to get the real path on the filesystem.

# location of log directory
logdir: $HOME/Documents/buildtest/var/logs

logdir is not a required field in the configuration; if it’s not specified, buildtest will write logs to a location determined by the tempfile library, which may vary based on platform (Linux, Mac).

The buildtest log file name will start with buildtest_ followed by a random identifier and a .log extension.

buildtest will also write the same log to $BUILDTEST_ROOT/buildtest.log, which can be used to fetch the last build log. This can be convenient if you don’t remember the directory path to the log file.

before_script for executors

Often, you may want to run a set of commands for a group of tests before running each test. You can do this using the before_script field, which is defined per executor; it is of string type and expects bash commands.

This can be demonstrated with an executor named local.e4s responsible for building the E4S Testsuite:

local:
  e4s:
    description: "E4S testsuite locally"
    shell: bash
    before_script: |
      cd $SCRATCH
      git clone https://github.com/E4S-Project/testsuite.git
      cd testsuite
      source /global/common/software/spackecp/luke-wyatt-testing/spack/share/spack/setup-env.sh
      source setup.sh

The e4s executor attempts to clone the E4S Testsuite in $SCRATCH, activate a spack environment, and run the initialization script via source setup.sh. buildtest will write a before_script.sh for every executor. This can be found in the var/executors directory as shown below:

$ tree var/executors/
var/executors/
|-- local.bash
|   |-- before_script.sh
|-- local.e4s
|   |-- before_script.sh
|-- local.python
|   |-- before_script.sh
|-- local.sh
|   |-- before_script.sh


4 directories, 4 files

The before_script field is available for all executors; if it’s not specified, the file will be empty. Every test will source these scripts for the appropriate executor.

Cori @ NERSC

Shown below is the configuration file used at Cori.

$ wget -q -O - https://raw.githubusercontent.com/buildtesters/buildtest-cori/devel/config.yml 2>&1
system:
  gerty:
    hostnames:
    - gert01.nersc.gov
    load_default_buildspecs: false
    moduletool: environment-modules
    executors:
      local:
        bash:
          description: submit jobs on local machine using bash shell
          shell: bash
        sh:
          description: submit jobs on local machine using sh shell
          shell: sh
        csh:
          description: submit jobs on local machine using csh shell
          shell: csh
        python:
          description: submit jobs on local machine using python shell
          shell: python
    compilers:
      compiler:
        gcc:
          builtin_gcc:
            cc: /usr/bin/gcc
            cxx: /usr/bin/g++
            fc: /usr/bin/gfortran
    cdash:
      url: https://my.cdash.org
      project: buildtest-cori
      site: gerty

  perlmutter:
    hostnames:
    - login*
    load_default_buildspecs: false
    moduletool: lmod
    executors:
      defaults:
        pollinterval: 60
        launcher: sbatch
        max_pend_time: 90
      local:
        bash:
          description: submit jobs on local machine using bash shell
          shell: bash
        sh:
          description: submit jobs on local machine using sh shell
          shell: sh
        csh:
          description: submit jobs on local machine using csh shell
          shell: csh
        python:
          description: submit jobs on local machine using python shell
          shell: python
    compilers:
      find:
        gcc: ^(gcc)
      compiler:
        gcc:
          builtin_gcc:
            cc: /usr/bin/gcc
            cxx: /usr/bin/g++
            fc: /usr/bin/gfortran

    cdash:
      url: https://my.cdash.org
      project: buildtest-cori
      site: perlmutter
  cori:
    hostnames:
    - cori*
    load_default_buildspecs: false
    moduletool: environment-modules
    cdash:
      url: https://my.cdash.org
      project: buildtest-cori
      site: cori
    executors:
      defaults:
        pollinterval: 30
        launcher: sbatch
        max_pend_time: 300
      local:
        bash:
          description: submit jobs on local machine using bash shell
          shell: bash
        sh:
          description: submit jobs on local machine using sh shell
          shell: sh
        csh:
          description: submit jobs on local machine using csh shell
          shell: csh
        python:
          description: submit jobs on local machine using python shell
          shell: python
        e4s:
          description: E4S testsuite locally
          shell: bash
          before_script: |
            module load e4s/20.10
            cd $SCRATCH/testsuite
            source setup.sh

      slurm:
        haswell_debug:
          qos: debug
          cluster: cori
          options:
          - -C haswell
          description: debug queue on Haswell partition
        haswell_shared:
          qos: shared
          cluster: cori
          options:
          - -C haswell
          description: shared queue on Haswell partition
        haswell_regular:
          qos: normal
          cluster: cori
          options:
          - -C haswell
          description: normal queue on Haswell partition
        haswell_premium:
          qos: premium
          cluster: cori
          options:
          - -C haswell
          description: premium queue on Haswell partition
        knl_flex:
          description: overrun queue on KNL partition
          qos: overrun
          cluster: cori
          options:
          - -C knl
        bigmem:
          description: bigmem jobs
          cluster: escori
          qos: bigmem
          max_pend_time: 300
        xfer:
          description: xfer qos jobs
          qos: xfer
          cluster: escori
          options:
          - -C haswell
        compile:
          description: compile qos jobs
          qos: compile
          cluster: escori
          options:
          - -N 1
        knl_debug:
          qos: debug
          cluster: cori
          options:
          - -C knl,quad,cache
          description: debug queue on KNL partition
        knl_regular:
          qos: normal
          cluster: cori
          options:
          - -C knl,quad,cache
          description: normal queue on KNL partition
        knl_premium:
          qos: premium
          cluster: cori
          options:
          - -C knl,quad,cache
          description: premium queue on KNL partition
        knl_low:
          qos: low
          cluster: cori
          options:
          - -C knl,quad,cache
          description: low queue on KNL partition
        knl_overrun:
          description: overrun queue on KNL partition
          qos: overrun
          cluster: cori
          options:
          - -C knl
          - --time-min=01:00:00
        gpu:
          description: submit jobs to GPU partition
          options:
          - -C gpu
          cluster: escori
          max_pend_time: 300
        e4s:
          description: E4S runner
          cluster: cori
          max_pend_time: 20000
          options:
          - -q regular
          - -C knl
          - -t 10
          - -n 4
          before_script:  |
            module load e4s/20.10
            cd $SCRATCH/testsuite
            source setup.sh

    compilers:
      find:
        gcc: ^(gcc|PrgEnv-gnu)
        cray: ^(PrgEnv-cray)
        intel: ^(intel|PrgEnv-intel)
        cuda: ^(cuda/)
        upcxx: ^(upcxx)
      compiler:
        gcc:
          builtin_gcc:
            cc: /usr/bin/gcc
            cxx: /usr/bin/g++
            fc: /usr/bin/gfortran
          PrgEnv-gnu/6.0.5:
            cc: gcc
            cxx: g++
            fc: gfortran
            module:
              load:
              - PrgEnv-gnu/6.0.5
              purge: false
          PrgEnv-gnu/6.0.7:
            cc: gcc
            cxx: g++
            fc: gfortran
            module:
              load:
              - PrgEnv-gnu/6.0.7
              purge: false
          PrgEnv-gnu/6.0.9:
            cc: gcc
            cxx: g++
            fc: gfortran
            module:
              load:
              - PrgEnv-gnu/6.0.9
              purge: false
          gcc/6.1.0:
            cc: gcc
            cxx: g++
            fc: gfortran
            module:
              load:
              - gcc/6.1.0
              purge: false
          gcc/7.3.0:
            cc: gcc
            cxx: g++
            fc: gfortran
            module:
              load:
              - gcc/7.3.0
              purge: false
          gcc/8.1.0:
            cc: gcc
            cxx: g++
            fc: gfortran
            module:
              load:
              - gcc/8.1.0
              purge: false
          gcc/8.2.0:
            cc: gcc
            cxx: g++
            fc: gfortran
            module:
              load:
              - gcc/8.2.0
              purge: false
          gcc/8.3.0:
            cc: gcc
            cxx: g++
            fc: gfortran
            module:
              load:
              - gcc/8.3.0
              purge: false
          gcc/9.3.0:
            cc: gcc
            cxx: g++
            fc: gfortran
            module:
              load:
              - gcc/9.3.0
              purge: false
          gcc/10.1.0:
            cc: gcc
            cxx: g++
            fc: gfortran
            module:
              load:
              - gcc/10.1.0
              purge: false
          gcc/6.3.0:
            cc: gcc
            cxx: g++
            fc: gfortran
            module:
              load:
              - gcc/6.3.0
              purge: false
          gcc/8.1.1-openacc-gcc-8-branch-20190215:
            cc: gcc
            cxx: g++
            fc: gfortran
            module:
              load:
              - gcc/8.1.1-openacc-gcc-8-branch-20190215
              purge: false
        cray:
          PrgEnv-cray/6.0.5:
            cc: cc
            cxx: CC
            fc: ftn
            module:
              load:
              - PrgEnv-cray/6.0.5
              purge: false
          PrgEnv-cray/6.0.7:
            cc: cc
            cxx: CC
            fc: ftn
            module:
              load:
              - PrgEnv-cray/6.0.7
              purge: false
          PrgEnv-cray/6.0.9:
            cc: cc
            cxx: CC
            fc: ftn
            module:
              load:
              - PrgEnv-cray/6.0.9
              purge: false
        intel:
          PrgEnv-intel/6.0.5:
            cc: icc
            cxx: icpc
            fc: ifort
            module:
              load:
              - PrgEnv-intel/6.0.5
              purge: false
          PrgEnv-intel/6.0.7:
            cc: icc
            cxx: icpc
            fc: ifort
            module:
              load:
              - PrgEnv-intel/6.0.7
              purge: false
          PrgEnv-intel/6.0.9:
            cc: icc
            cxx: icpc
            fc: ifort
            module:
              load:
              - PrgEnv-intel/6.0.9
              purge: false
          intel/19.0.3.199:
            cc: icc
            cxx: icpc
            fc: ifort
            module:
              load:
              - intel/19.0.3.199
              purge: false
          intel/19.1.2.254:
            cc: icc
            cxx: icpc
            fc: ifort
            module:
              load:
              - intel/19.1.2.254
              purge: false
          intel/16.0.3.210:
            cc: icc
            cxx: icpc
            fc: ifort
            module:
              load:
              - intel/16.0.3.210
              purge: false
          intel/17.0.1.132:
            cc: icc
            cxx: icpc
            fc: ifort
            module:
              load:
              - intel/17.0.1.132
              purge: false
          intel/17.0.2.174:
            cc: icc
            cxx: icpc
            fc: ifort
            module:
              load:
              - intel/17.0.2.174
              purge: false
          intel/18.0.1.163:
            cc: icc
            cxx: icpc
            fc: ifort
            module:
              load:
              - intel/18.0.1.163
              purge: false
          intel/18.0.3.222:
            cc: icc
            cxx: icpc
            fc: ifort
            module:
              load:
              - intel/18.0.3.222
              purge: false
          intel/19.0.0.117:
            cc: icc
            cxx: icpc
            fc: ifort
            module:
              load:
              - intel/19.0.0.117
              purge: false
          intel/19.0.8.324:
            cc: icc
            cxx: icpc
            fc: ifort
            module:
              load:
              - intel/19.0.8.324
              purge: false
          intel/19.1.0.166:
            cc: icc
            cxx: icpc
            fc: ifort
            module:
              load:
              - intel/19.1.0.166
              purge: false
          intel/19.1.1.217:
            cc: icc
            cxx: icpc
            fc: ifort
            module:
              load:
              - intel/19.1.1.217
              purge: false
          intel/19.1.2.275:
            cc: icc
            cxx: icpc
            fc: ifort
            module:
              load:
              - intel/19.1.2.275
              purge: false
          intel/19.1.3.304:
            cc: icc
            cxx: icpc
            fc: ifort
            module:
              load:
              - intel/19.1.3.304
              purge: false
        cuda:
          cuda/9.2.148:
            cc: nvcc
            cxx: nvcc
            fc: None
            module:
              load:
              - cuda/9.2.148
              purge: false
          cuda/10.0.130:
            cc: nvcc
            cxx: nvcc
            fc: None
            module:
              load:
              - cuda/10.0.130
              purge: false
          cuda/10.1.105:
            cc: nvcc
            cxx: nvcc
            fc: None
            module:
              load:
              - cuda/10.1.105
              purge: false
          cuda/10.1.168:
            cc: nvcc
            cxx: nvcc
            fc: None
            module:
              load:
              - cuda/10.1.168
              purge: false
          cuda/10.1.243:
            cc: nvcc
            cxx: nvcc
            fc: None
            module:
              load:
              - cuda/10.1.243
              purge: false
          cuda/10.2.89:
            cc: nvcc
            cxx: nvcc
            fc: None
            module:
              load:
              - cuda/10.2.89
              purge: false
          cuda/11.0.2:
            cc: nvcc
            cxx: nvcc
            fc: None
            module:
              load:
              - cuda/11.0.2
              purge: false
          cuda/shifter:
            cc: nvcc
            cxx: nvcc
            fc: None
            module:
              load:
              - cuda/shifter
              purge: false
        upcxx:
          upcxx/2019.9.0:
            cc: upcxx
            cxx: upcxx
            fc: None
            module:
              load:
              - upcxx/2019.9.0
              purge: false
          upcxx/2020.3.0:
            cc: upcxx
            cxx: upcxx
            fc: None
            module:
              load:
              - upcxx/2020.3.0
              purge: false
          upcxx/2020.3.2:
            cc: upcxx
            cxx: upcxx
            fc: None
            module:
              load:
              - upcxx/2020.3.2
              purge: false
          upcxx/2020.3.8-snapshot:
            cc: upcxx
            cxx: upcxx
            fc: None
            module:
              load:
              - upcxx/2020.3.8-snapshot
              purge: false
          upcxx/2020.10.0:
            cc: upcxx
            cxx: upcxx
            fc: None
            module:
              load:
              - upcxx/2020.10.0
              purge: false
          upcxx/2020.11.0:
            cc: upcxx
            cxx: upcxx
            fc: None
            module:
              load:
              - upcxx/2020.11.0
              purge: false
          upcxx/bleeding-edge:
            cc: upcxx
            cxx: upcxx
            fc: None
            module:
              load:
              - upcxx/bleeding-edge
              purge: false
          upcxx/nightly:
            cc: upcxx
            cxx: upcxx
            fc: None
            module:
              load:
              - upcxx/nightly
              purge: false
          upcxx-bupc-narrow/2019.9.0:
            cc: upcxx
            cxx: upcxx
            fc: None
            module:
              load:
              - upcxx-bupc-narrow/2019.9.0
              purge: false
          upcxx-bupc-narrow/2020.3.0:
            cc: upcxx
            cxx: upcxx
            fc: None
            module:
              load:
              - upcxx-bupc-narrow/2020.3.0
              purge: false
          upcxx-bupc-narrow/2020.3.2:
            cc: upcxx
            cxx: upcxx
            fc: None
            module:
              load:
              - upcxx-bupc-narrow/2020.3.2
              purge: false
          upcxx-bupc-narrow/2020.3.8-snapshot:
            cc: upcxx
            cxx: upcxx
            fc: None
            module:
              load:
              - upcxx-bupc-narrow/2020.3.8-snapshot
              purge: false
          upcxx-bupc-narrow/2020.11.0:
            cc: upcxx
            cxx: upcxx
            fc: None
            module:
              load:
              - upcxx-bupc-narrow/2020.11.0
              purge: false
          upcxx-bupc-narrow/bleeding-edge:
            cc: upcxx
            cxx: upcxx
            fc: None
            module:
              load:
              - upcxx-bupc-narrow/bleeding-edge
              purge: false
          upcxx-cs267/2020.10.0:
            cc: upcxx
            cxx: upcxx
            fc: None
            module:
              load:
              - upcxx-cs267/2020.10.0
              purge: false
          upcxx-extras/2020.3.0:
            cc: upcxx
            cxx: upcxx
            fc: None
            module:
              load:
              - upcxx-extras/2020.3.0
              purge: false
          upcxx-extras/2020.3.8:
            cc: upcxx
            cxx: upcxx
            fc: None
            module:
              load:
              - upcxx-extras/2020.3.8
              purge: false
          upcxx-extras/master:
            cc: upcxx
            cxx: upcxx
            fc: None
            module:
              load:
              - upcxx-extras/master
              purge: false

Default Executor Settings

We can define default configurations for all executors using the defaults property.

executors:
  defaults:
    pollinterval: 10
    launcher: sbatch
    max_pend_time: 90
    account: nstaff

The launcher field is applicable to batch executors; in this case, launcher: sbatch sets sbatch as the job launcher for all slurm executors.

The account: nstaff setting instructs buildtest to charge all jobs submitted through the Slurm executors to the nstaff account. The account option can be set in the defaults field to apply to all executors, or defined per executor instance, which overrides the default value.
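
For instance, here is a sketch of overriding the default account for a single executor; the executor and account names are illustrative:

executors:
  defaults:
    account: nstaff
  slurm:
    haswell_debug:
      qos: debug
      account: myproject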

Poll Interval

The pollinterval field is used to poll jobs at a set interval in seconds while a job is active in the queue. The poll interval can be configured on the command line using buildtest build --poll-interval, which overrides the configuration value.
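
For example, the following invocation polls jobs every 30 seconds regardless of the configured value (the buildspec path is illustrative):

$ buildtest build -b tutorials/vars.yml --poll-interval 30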

Note

pollinterval, launcher and max_pend_time have no effect on local executors.

Max Pend Time

The max_pend_time is the maximum time a job can be pending within an executor; if the job exceeds this limit, buildtest will cancel it.

The max_pend_time option can be overridden at the executor level; for example, the section below overrides the default to 300 seconds:

bigmem:
  description: bigmem jobs
  cluster: escori
  qos: bigmem
  max_pend_time: 300

The max_pend_time is used to cancel a job only if the job is pending in the queue; it has no impact once the job is running. buildtest starts a timer at job submission and, at every poll interval (the pollinterval field), checks whether a pending job has exceeded max_pend_time. If the pend time exceeds the max_pend_time limit, buildtest will cancel the job using the appropriate scheduler command (scancel, bkill, qdel). buildtest will remove cancelled jobs from the poll queue; in addition, cancelled jobs won’t be reported in the test report.

For more details on max_pend_time click here.

Specifying QoS (Slurm)

At Cori, jobs are submitted via qos instead of partition, so we model each slurm executor after a qos. The qos field instructs which Slurm QOS to use when submitting a job. For example, we defined a slurm executor named haswell_debug which will submit jobs to the debug qos on the haswell partition as follows:

executors:
  slurm:
    haswell_debug:
      qos: debug
      cluster: cori
      options:
      - -C haswell

The cluster field specifies which slurm cluster to use (i.e. sbatch --clusters=<string>). In order to use the bigmem, xfer, or gpu qos at Cori, we need to specify the escori cluster (i.e. sbatch --clusters=escori).

buildtest will detect the slurm configuration and check that the qos, partition, and cluster match the buildtest configuration. In addition, buildtest supports multi-cluster job submission and monitoring from a remote cluster. This means if you specify the cluster field, buildtest will poll jobs using sacct with the cluster name as follows: sacct -M <cluster>.

The options field is used to pass any additional options to the launcher (sbatch) on the command line. For instance, for the slurm.gpu executor, we use options: -C gpu to submit to the Cori GPU cluster, which requires sbatch -M escori -C gpu. Any additional #SBATCH options are defined in the buildspec; for more details see batch scheduler support.

PBS Executors

buildtest supports the PBS scheduler, which can be defined in the executors section. Shown below is an example configuration using one pbs executor named workq. The property queue: workq defines the name of the PBS queue that is available on your system.

system:
  generic:
    hostnames: ['.*']

    moduletool: N/A
    load_default_buildspecs: True
    executors:
      defaults:
         pollinterval: 10
         launcher: qsub
         max_pend_time: 30
      pbs:
        workq:
          queue: workq
    compilers:
      compiler:
        gcc:
          default:
            cc: /usr/bin/gcc
            cxx: /usr/bin/g++
            fc: /usr/bin/gfortran

buildtest will detect the PBS queues on your system and determine if queues are active and enabled before submitting jobs to the scheduler. buildtest will run the qstat -Q -f -F json command to check the queue state, which is reported in JSON format, and check if the queue has the fields enabled: "True" or started: "True" set in the queue definition. If these values are not set, buildtest will raise an exception.

Shown below is an example with one queue workq that is enabled and started.

$ qstat -Q -f -F json
{
    "timestamp":1615924938,
    "pbs_version":"19.0.0",
    "pbs_server":"pbs",
    "Queue":{
        "workq":{
            "queue_type":"Execution",
            "total_jobs":0,
            "state_count":"Transit:0 Queued:0 Held:0 Waiting:0 Running:0 Exiting:0 Begun:0 ",
            "resources_assigned":{
                "mem":"0kb",
                "ncpus":0,
                "nodect":0
            },
            "hasnodes":"True",
            "enabled":"True",
            "started":"True"
        }
    }
}

PBS Limitation

Note

Please note that buildtest PBS support relies on job history being enabled, because buildtest needs to query jobs after completion using qstat -x. This can be configured using qmgr by setting server job_history_enable=True. For more details see section 13.15.5.1 Enabling Job History in the PBS 2020.1 Admin Guide.
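
A sketch of enabling job history via qmgr, to be run by a PBS administrator (the exact invocation may vary by site):

$ qmgr -c "set server job_history_enable=True"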

CDASH Configuration

buildtest can be configured to push test results to CDASH. The default configuration file provides the following CDASH configuration for the buildtest project.

cdash:
  url: https://my.cdash.org/
  project: buildtest
  site: generic
  buildname: tutorials

The cdash section can be summarized as follows:

  • url: URL to CDASH server

  • project: Project Name in CDASH server

  • site: Site name that shows up in the CDASH entry. This should be the name of your system.

  • buildname: Build name that shows up in CDASH; this can be any name you want.

The cdash settings can be used with the buildtest cdash command. For more details see CDASH Integration.