Overview¶
We assume you are familiar with the general concepts presented in getting started and that your next step is to configure buildtest to run at your site. This guide presents the necessary steps to get you started.
When you clone buildtest, we provide a default configuration that can be used on a laptop or workstation running Linux or Mac. The buildtest configuration is validated with the JSON Schema file settings.schema.json. We have published the schema guide for the settings schema, which you can find here.
Which configuration file does buildtest read?¶
buildtest will read configuration files in the following order:
Command line - buildtest -c <config>.yml build
Environment variable - BUILDTEST_CONFIGFILE
User Configuration - $HOME/.buildtest/config.yml
Default Configuration - $BUILDTEST_ROOT/buildtest/settings/config.yml
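For example, the following shows the two explicit ways of pointing buildtest at a configuration file; the paths and buildspec are illustrative:

$ buildtest -c /path/to/config.yml build -b <buildspec>
$ export BUILDTEST_CONFIGFILE=$HOME/.buildtest/config.yml
$ buildtest build -b <buildspec>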
Default Configuration¶
buildtest comes with a default configuration that can be found at buildtest/settings/config.yml relative to the root of the repo. At the start of execution, buildtest will load the configuration file and validate it with the JSON Schema settings.schema.json. If validation fails, buildtest will raise an error.
We recommend you copy the default configuration as a template to configure buildtest for your site.
Shown below is the default configuration provided by buildtest.
system:
  generic:
    # specify list of hostnames where buildtest can run for given system record
    hostnames: [".*"]

    # system description
    description: Generic System

    # specify module system used at your site (environment-modules, lmod)
    moduletool: N/A

    executors:
      # define local executors for running jobs locally
      local:
        bash:
          description: submit jobs on local machine using bash shell
          shell: bash

        sh:
          description: submit jobs on local machine using sh shell
          shell: sh

        csh:
          description: submit jobs on local machine using csh shell
          shell: csh

        zsh:
          description: submit jobs on local machine using zsh shell
          shell: zsh

    # compiler block
    compilers:
      # regular expression to search for compilers based on module pattern. Used with 'buildtest config compilers find' to generate compiler instance
      # find:
      #   gcc: "^(gcc)"
      #   intel: "^(intel)"
      #   cray: "^(craype)"
      #   pgi: "^(pgi)"
      #   cuda: "^(cuda)"
      #   clang: "^(clang)"

      # declare compiler instance which can be site-specific. You can let 'buildtest config compilers find' generate compiler section
      compiler:
        gcc:
          builtin_gcc:
            cc: gcc
            fc: gfortran
            cxx: g++

    # location of log directory
    # logdir: /tmp/

    # specify location where buildtest will write tests
    # testdir: /tmp

    # specify one or more directory where buildtest should load buildspecs
    # buildspec_roots: []

    cdash:
      url: https://my.cdash.org/
      project: buildtest
      site: generic
      buildname: tutorials
As you can see, the layout of the configuration starts with the keyword system, which is used to define one or more systems. Your HPC site may contain more than one cluster, so you should define your clusters with meaningful names, since these names are used when you reference executors in buildspecs. In this example, we define one cluster called generic, which is a dummy cluster used for running tutorial examples. The required fields in the system scope are the following:
"required": ["executors", "moduletool", "hostnames", "compilers"]
The hostnames field is a list of nodes that belong to the cluster where buildtest should run. Generally, these hosts should be the login nodes of your cluster. buildtest will process the hostnames field across all system entries using re.match until a match is found; if none is found, buildtest reports an error.

In this example we define two systems, machine1 and machine2, with the following hostnames:
system:
  machine1:
    hostnames: ['loca$', '^1DOE']
  machine2:
    hostnames: ['BOB|JOHN']
In this example, none of the host entries match the hostname DOE-7086392.local, so we get an error since buildtest needs to detect a system before proceeding.
buildtest.exceptions.BuildTestError: "Based on current system hostname: DOE-7086392.local we cannot find a matching system ['machine1', 'machine2'] based on current hostnames: {'machine1': ['loca$', '^1DOE'], 'machine2': ['BOB|JOHN']} "
Let’s assume we have a system named mycluster that should run on nodes login1, login2, and login3. You can specify hostnames as a list of strings:
system:
  mycluster:
    hostnames: ["login1", "login2", "login3"]
Alternatively, you can use a regular expression to condense this list:
system:
  mycluster:
    hostnames: ["login[1-3]"]
Configuring Module Tool¶
If your system supports environment-modules or Lmod for managing the user environment, you can configure buildtest to use the module tool. This is defined via the moduletool property.
# environment-modules
moduletool: environment-modules
# for lmod
moduletool: lmod
# specify N/A if you don't have modules
moduletool: N/A
The moduletool property is used for detecting compilers when you run buildtest config compilers find.
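For example, once moduletool is set you can let buildtest detect compilers from your module stack; the output will depend on the modules available at your site:

$ buildtest config compilers find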
buildspec roots¶
buildtest can discover buildspecs using the buildspec_roots keyword. This field is a list of directory paths to search for buildspecs. For example, we clone the repo https://github.com/buildtesters/buildtest-cori at $HOME/buildtest-cori and assign this path to buildspec_roots as follows:
buildspec_roots:
  - $HOME/buildtest-cori
This field is used with the buildtest buildspec find command. If you rebuild your buildspec cache via the --rebuild option, buildtest will search all directories specified by the buildspec_roots property, recursively find all files with a .yml extension, and validate each buildspec with the appropriate schema.
By default, buildtest will add $BUILDTEST_ROOT/tutorials and $BUILDTEST_ROOT/general_tests to the search path when searching for buildspecs with the buildtest buildspec find command. This is only true if no root buildspec directory is specified via buildspec_roots or the --root option.
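For example, the following commands rebuild the buildspec cache; the repository path is the one cloned above:

$ buildtest buildspec find --rebuild
$ buildtest buildspec find --root $HOME/buildtest-cori --rebuild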
Configuring Executors¶
An executor is responsible for running the test and capturing the output/error files and the return code. An executor can be a local executor, which runs tests on the local machine, or a batch executor, which can be modelled as a partition/queue. A batch executor is responsible for dispatching a job, polling the job until it finishes, and gathering job metrics from the scheduler.
Executor Declaration¶
executors is a JSON object that defines one or more executors. Executors are grouped by their type followed by the executor name. In this example we define two local executors, bash and sh, and one slurm executor called regular:
system:
  generic:
    executors:
      local:
        bash:
          shell: bash
          description: bash shell
        sh:
          shell: sh
          description: sh shell
      slurm:
        regular:
          queue: regular
The LocalExecutors are defined in the local section, where each executor must have a unique name. Executors are referenced in a buildspec using the executor field in the following format:
executor: <system>.<type>.<name>
For instance, if a buildspec wants to reference the LocalExecutor bash from the generic cluster, you would specify the following in the buildspec:
executor: generic.local.bash
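Shown below is a minimal buildspec sketch that references this executor; the test name, description, and run command are illustrative and assume the script schema:

version: "1.0"
buildspecs:
  hello_world:
    type: script
    executor: generic.local.bash
    description: illustrative test that runs under the bash executor
    run: echo "hello world"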
In our example configuration, we defined a bash executor as follows:
executors:
  # define local executors for running jobs locally
  local:
    bash:
      description: submit jobs on local machine using bash shell
      shell: bash
The local executors require the shell key, which must be one of the supported shells on your system. On Linux/Mac systems you can find all supported shells in the file /etc/shells. Any buildspec that references this executor will submit jobs using the bash shell.
You can pass options to the shell, which will be applied to each job submission. For instance, if you want all bash scripts to run in a login shell, you can specify bash --login:
executors:
  local:
    login_bash:
      shell: bash --login
Then you can reference this executor as executor: generic.local.login_bash and your tests will be submitted via bash --login /path/to/test.sh.
Once you define your executors, you can query them via the buildtest config executors command.
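For example, with the generic system defined above you would see something like the following (output is illustrative):

$ buildtest config executors
generic.local.bash
generic.local.sh
generic.local.csh
generic.local.zsh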
Disable an executor¶
buildtest will run checks for every executor instance depending on the executor type; for instance, local executors such as the bash, sh, and csh executors will be checked to see if the shell is valid by checking its path. If the shell doesn’t exist, buildtest will raise an error. You can circumvent this issue by disabling the executor via the disable property. A disabled executor won’t serve any jobs, which means any buildspec that references the executor won’t create a test.

In this next example, the zsh executor is disabled, which is useful if you don’t have zsh on your system:
executors:
  local:
    zsh:
      shell: zsh
      disable: true
Default commands run per executors¶
You can configure an executor to run a set of commands before every test that uses it. This is done via the before_script field, a string that can contain arbitrary shell commands.

In the example below, the bash executor defines shell code that will run whenever this executor is used. The content of before_script is inserted into a shell script that is sourced by all tests.
local:
  bash:
    description: submit jobs on local machine using bash shell
    shell: bash
    before_script: |
      today=$(date "+%D")
      echo "Today is $today, running test with user: $(whoami)"
buildtest will write a before_script.sh for every executor. These scripts can be found in the $BUILDTEST_ROOT/var/executor directory as shown below.
$ find $BUILDTEST_ROOT/var/executor -type f
/Users/siddiq90/Documents/GitHubDesktop/buildtest/var/executor/generic.local.bash/before_script.sh
/Users/siddiq90/Documents/GitHubDesktop/buildtest/var/executor/generic.local.csh/before_script.sh
/Users/siddiq90/Documents/GitHubDesktop/buildtest/var/executor/generic.local.zsh/before_script.sh
/Users/siddiq90/Documents/GitHubDesktop/buildtest/var/executor/generic.local.sh/before_script.sh
If you run a test using this executor, you will see the code inserted from before_script.sh, which is sourced for every test.
$ cat $BUILDTEST_ROOT/var/executor/generic.local.bash/before_script.sh
#!/bin/bash
today=$(date "+%D")
echo "Today is $today, running test with user: $(whoami)"
Specifying Modules¶
You can configure executors to load modules, purge modules, or restore a module collection, which will be run for all tests that use the executor. This can be achieved via the module property defined in the executor definition. In this next example, we create a bash executor that will purge modules and load gcc. The purge property is a boolean; if set to True, buildtest will run module purge before the load commands. The load property is a list of modules to module load.
executors:
  local:
    bash:
      shell: bash
      module:
        purge: True
        load: ["gcc"]
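Conceptually, every test that uses this executor will be preceded by module commands equivalent to the following (the exact mechanism may differ):

module purge
module load gcc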
Specifying QoS (Slurm)¶
At Cori, jobs are submitted via QOS instead of partition, so we model a slurm executor named by QOS. The qos field instructs which Slurm QOS to use when submitting the job. For example, we define a slurm executor named haswell_debug which will submit jobs to the debug QOS on the haswell partition as follows:
executors:
  slurm:
    haswell_debug:
      qos: debug
      cluster: cori
      options: ["-C haswell"]
The cluster field specifies which slurm cluster to use (i.e. sbatch --clusters=<string>). buildtest will detect the slurm configuration and check that the qos, partition, and cluster match the buildtest configuration. In addition, buildtest supports multi-cluster job submission and monitoring from a remote cluster. This means that if you specify the cluster field, buildtest will poll jobs using sacct with the cluster name as follows: sacct -M <cluster>.
The options field is used to specify any additional command line options to the launcher (sbatch). Any additional #SBATCH options are defined in the buildspec; for more details see batch scheduler support.
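As a sketch, a buildspec could reference this executor and add extra #SBATCH directives via the sbatch field; the system name, test name, and directives below are illustrative:

version: "1.0"
buildspecs:
  partition_check:
    type: script
    executor: mycluster.slurm.haswell_debug
    sbatch: ["-N 1", "-t 10"]
    run: sinfo -p haswell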
Specify Slurm Partitions¶
You can specify slurm partitions instead of QOS if your slurm cluster requires jobs to be submitted by partition. This can be done via the partition property. In this next example, we define an executor named regular_hsw which maps to the slurm partition regular_hsw.
executors:
  slurm:
    regular_hsw:
      partition: regular_hsw
      description: regular haswell queue
buildtest will check if the slurm partition is in the up state before adding the executor. buildtest performs these checks when validating the configuration file, which avoids creating tests that reference a partition in the down state. Internally, buildtest runs the following command for every defined partition:
$ sinfo -p regular_hsw -h -O available
up
Specifying Project Account¶
Batch jobs require a project account to charge jobs against, and depending on your site this may be required in order to submit a job. Some schedulers like Slurm can detect your default project account, in which case you don’t need to specify it on the command line.

In your configuration file you can specify the account property, which will be inherited by all executors. You can also specify the account property within an executor, which will override the default section.

In this example, we have two pbs executors, testing and development. All pbs jobs will use the project account development because it is defined in the defaults section; however, jobs using the testing executor will charge to qa_test.
executors:
  defaults:
    pollinterval: 10
    maxpendtime: 90
    account: development
  pbs:
    testing:
      queue: test
      account: qa_test
    development:
      queue: dev
Alternatively, you can override this setting via the buildtest build --account command, which will be applied to all batch jobs.
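For example, the following would charge all batch jobs in this invocation to an illustrative account:

$ buildtest build -b <buildspec> --account qa_test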
Poll Interval¶
The pollinterval field is used to poll jobs at a set interval in seconds while a job is active in the queue. The poll interval can be configured on the command line using buildtest build --pollinterval, which overrides the configuration value.
Note

pollinterval and maxpendtime have no effect on local executors.
Max Pend Time¶
maxpendtime is the maximum time, in seconds, that a job can be pending within an executor; if it exceeds this limit, buildtest will cancel the job.

The maxpendtime option can be overridden per executor; for example, the section below overrides the default to 300 seconds:
bigmem:
  description: bigmem jobs
  cluster: escori
  qos: bigmem
  maxpendtime: 300
maxpendtime is used to cancel a job only if the job is pending in the queue; it has no impact if the job is running. buildtest starts a timer at job submission and, at every poll interval (the pollinterval field), checks whether a pending job has exceeded maxpendtime. If the pend time exceeds the maxpendtime limit, buildtest will cancel the job using the appropriate scheduler command (scancel, bkill, qdel). buildtest will remove cancelled jobs from the poll queue; in addition, cancelled jobs won’t be reported in the test report.
For more details on maxpendtime click here.
PBS Executors¶
Note
buildtest PBS support relies on job history being enabled because buildtest needs to query the job after completion using qstat -x. This can be configured using qmgr by setting set server job_history_enable=True. For more details see section 13.15.5.1 Enabling Job History in the PBS 2020.1 Admin Guide.
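For example, a PBS administrator could enable job history as follows (requires admin privileges on the PBS server):

$ qmgr -c "set server job_history_enable=True"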
buildtest supports the PBS scheduler, which can be defined in the executors section. Shown below is an example configuration using one pbs executor named workq. The property queue: workq defines the name of the PBS queue that is available on your system.
system:
  generic:
    hostnames: ['.*']

    moduletool: N/A
    executors:
      defaults:
        pollinterval: 10
        max_pend_time: 30
      pbs:
        workq:
          queue: workq
    compilers:
      compiler:
        gcc:
          default:
            cc: /usr/bin/gcc
            cxx: /usr/bin/g++
            fc: /usr/bin/gfortran
buildtest will detect the PBS queues on your system and determine if queues are active and enabled before submitting jobs to the scheduler. buildtest will run the qstat -Q -f -F json command to check the queue state, which is reported in JSON format, and check whether the queue has the fields enabled: "True" or started: "True" set in the queue definition. If these values are not set, buildtest will raise an exception. Shown below is an example with one queue, workq, that is enabled and started.
$ qstat -Q -f -F json
{
    "timestamp":1615924938,
    "pbs_version":"19.0.0",
    "pbs_server":"pbs",
    "Queue":{
        "workq":{
            "queue_type":"Execution",
            "total_jobs":0,
            "state_count":"Transit:0 Queued:0 Held:0 Waiting:0 Running:0 Exiting:0 Begun:0 ",
            "resources_assigned":{
                "mem":"0kb",
                "ncpus":0,
                "nodect":0
            },
            "hasnodes":"True",
            "enabled":"True",
            "started":"True"
        }
    }
}
Configuring test directory¶
The default location where tests are written is $BUILDTEST_ROOT/var/tests, where $BUILDTEST_ROOT is the root of the buildtest repo. You may specify testdir in your configuration to control where tests are written. For instance, if you want to write tests to /tmp you can set the following:
testdir: /tmp
Alternatively, you can specify the test directory via buildtest build --testdir <path>, which has the highest precedence and overrides both the configuration and the default value.
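For example, the following writes tests to /tmp regardless of the configuration value:

$ buildtest build -b <buildspec> --testdir /tmp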
Configuring log path¶
You can configure where buildtest writes logs using the logdir property. For example, in the configuration below buildtest will write log files to $HOME/Documents/buildtest/var/logs. buildtest will resolve variable expansion to get the real path on the filesystem.
# location of log directory
logdir: $HOME/Documents/buildtest/var/logs
logdir is not a required field in the configuration; if it’s not specified, buildtest will write logs to a location determined by the tempfile library, which may vary by platform (Linux, Mac). The buildtest logs will start with buildtest_ followed by a random identifier and a .log extension.
CDASH Configuration¶
buildtest can be configured to push test results to CDASH. The default configuration file provides a CDASH configuration for the buildtest project, shown below:
cdash:
  url: https://my.cdash.org/
  project: buildtest
  site: generic
  buildname: tutorials
The cdash section can be summarized as follows:

url: URL to the CDASH server
project: Project name in the CDASH server
site: Site name that shows up in the CDASH entry. This should be the name of your system.
buildname: Build name that shows up in CDASH; this can be any name you want.
The cdash settings can be used with the buildtest cdash command. For more details see CDASH Integration (buildtest cdash).
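For example, test results can be pushed to the configured CDASH server via the upload subcommand; the exact arguments may differ, so consult the CDASH Integration page:

$ buildtest cdash upload <buildname>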