Buildtest Tutorial on Perlmutter
This tutorial will be conducted on the Perlmutter system at NERSC. If you do not already have access, please obtain a NERSC user account first.
Setup
Once you have a NERSC account, you can connect to any NERSC system. Open a terminal client and ssh into Perlmutter as follows:
ssh <user>@perlmutter-p1.nersc.gov
To get started, please load the python module, since buildtest requires Python 3.8 or higher. This can be done by running:
module load python
Next, install buildtest by cloning the repository into your HOME directory:
git clone https://github.com/buildtesters/buildtest.git $HOME/buildtest
Note
Please make sure you create a python virtual environment before you proceed with this tutorial.
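If you need a starting point, here is a minimal setup sketch. It assumes you want the virtual environment in $HOME/buildtest-venv (any location works) and uses the setup.sh script at the root of the buildtest repository, which sets BUILDTEST_ROOT and adds buildtest to your PATH:
python3 -m venv $HOME/buildtest-venv      # create a virtual environment (path is an example)
source $HOME/buildtest-venv/bin/activate  # activate the virtual environment
source $HOME/buildtest/setup.sh           # set BUILDTEST_ROOT and add buildtest to PATH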
Once you have buildtest setup, please clone the following repository into your home directory:
git clone https://github.com/buildtesters/buildtest-nersc $HOME/buildtest-nersc
You will need to set the environment variable BUILDTEST_CONFIGFILE, which points to the configuration file required to use buildtest on Perlmutter:
export BUILDTEST_CONFIGFILE=$HOME/buildtest-nersc/config.yml
Once you are done, please navigate back to the root of buildtest by running:
cd $BUILDTEST_ROOT
The exercises can be found in the directory buildtest/perlmutter_tutorial, where you will have several exercises to complete. You can navigate to this directory by running:
cd $BUILDTEST_ROOT/perlmutter_tutorial
If you get stuck on any exercise, you can see its solution in the file .solution.txt.
Note
For exercises 2 and 3, you can check the solution by running the shell script: bash .solution.sh
Exercise 1: Performing Status Check
In this exercise, you will check the version of Lmod via the environment variable LMOD_VERSION and validate the output using a regular expression. We will first run the test with an invalid regular expression, see that the test fails, and then rerun the test until it passes. Shown below is the example buildspec; please fix the broken lines (such as the FIXME placeholder) in the test.
buildspecs:
  test_lmod_version:
    type: FIXME
    executors: 'perlmutter.local.bash'
    run: echo $LMOD_VERSION
Todo
Run the test via buildtest build -b $BUILDTEST_ROOT/perlmutter_tutorial/ex1/module_version.yml and you will notice a validation failure
Validate the buildspec via buildtest buildspec validate -b $BUILDTEST_ROOT/perlmutter_tutorial/ex1/module_version.yml to determine the error
Fix the buildspec and rerun buildtest buildspec validate until you have a valid buildspec
Add a regular expression on the stdout stream and make sure the test fails (a sketch of a fixed buildspec is shown after this list)
Check the output of the test via buildtest inspect query
Update the regular expression to match the value of $LMOD_VERSION reported in the test output and rerun the test until it passes
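To check your reasoning, here is a minimal sketch of a fixed buildspec. The version string in the regular expression is a placeholder; substitute the value of $LMOD_VERSION reported on Perlmutter:
buildspecs:
  test_lmod_version:
    type: script
    executor: 'perlmutter.local.bash'
    run: echo $LMOD_VERSION
    status:
      regex:
        stream: stdout
        exp: '^8\.7\.30$'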
Exercise 2: Querying Buildspec Cache
In this exercise you will learn how to use the Buildspecs Interface. Let’s build the cache by running the following:
buildtest buildspec find --directory $HOME/buildtest-nersc/buildspecs --rebuild -q
Todo
Find all tags
List all filters and format fields
Format tables via the fields name, description
Filter buildspecs by tag e4s
List all invalid buildspecs
Validate all buildspecs by tag e4s
Show the content of test hello_world_openmp
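If you get stuck, the following commands are one plausible way to work through the tasks above; the option names come from buildtest's buildspec interface, so double-check them against buildtest buildspec find --help:
buildtest buildspec find --tags
buildtest buildspec find --helpfilter
buildtest buildspec find --helpformat
buildtest buildspec find --format name,description
buildtest buildspec find --filter tags=e4s
buildtest buildspec find invalid
buildtest buildspec validate -t e4s
buildtest buildspec show hello_world_openmp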
Exercise 3: Query Test Report
In this exercise you will learn how to query the test report, which is done by running buildtest report.
Before you start, please run the following command (bd is a shorthand alias for buildtest build):
buildtest bd -b $HOME/buildtest-nersc/buildspecs/apps/spack/
Todo
List all filters and format fields
Query all tests by returncode 0
Query all tests by tag
e4s
Print the total count of all failed tests
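If you need a starting point, one plausible set of commands is below; flag names can vary between buildtest versions (in particular --row-count), so verify against buildtest report --help:
buildtest report --helpfilter
buildtest report --helpformat
buildtest report --filter returncode=0
buildtest report --filter tags=e4s
buildtest report --filter state=FAIL --row-count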
Let’s upload the tests to CDASH by running the following:
buildtest cdash upload $USER-buildtest-tutorial
Buildtest's CDASH integration via buildtest cdash upload pushes test results to a CDASH server. The test results are read from the report file, the same data typically shown via buildtest report. CDASH allows one to easily browse the test results in a web interface.
If the above command ran successfully, you should see a link to the CDASH server https://my.cdash.org pointing to your test results. Please click on the link to view and briefly analyze your test results. Shown below is an example output:
buildtest cdash upload $USER-buildtest-tutorial
Reading report file: /Users/siddiq90/Documents/github/buildtest/var/report.json
Uploading 110 tests
Build Name: siddiq90-buildtest-tutorial
site: generic
MD5SUM: a589c72bcdabdab9038600a2789e429f
You can view the results at: https://my.cdash.org//viewTest.php?buildid=2278337
Exercise 4: Specifying Performance Checks
In this exercise, you will run the STREAM benchmark and use comparison operators to determine whether the test passes based on the performance results. Shown below is the stream test that we will use for this exercise:
buildspecs:
  stream_test:
    type: script
    executor: perlmutter.local.bash
    description: Run stream test
    env:
      OMP_NUM_THREADS: 4
    run: |
      wget https://raw.githubusercontent.com/jeffhammond/STREAM/master/stream.c
      gcc -fopenmp -o stream stream.c
      ./stream
    metrics:
      copy:
        type: float
        regex:
          exp: 'Copy:\s+(\S+)\s+.*'
          stream: stdout
          item: 1
      scale:
        type: float
        regex:
          exp: 'Scale:\s+(\S+)\s+.*'
          stream: stdout
          item: 1
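For reference, the regular expressions above match lines like the following in STREAM's stdout; capture group 1 (item: 1) extracts the bandwidth in MB/s. The numbers here are illustrative:
Copy:           23000.0     0.014762     0.013924     0.015974
Scale:          16000.0     0.021210     0.020020     0.022580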
Todo
Run the stream test via buildtest build -b $BUILDTEST_ROOT/perlmutter_tutorial/ex4/stream.yml
Check the output of the metrics copy and scale by running buildtest inspect query -o stream_test
Use the assert_ge (Greater Equal) check with the metrics copy and scale, specifying a reference value of 50000 for both (a sketch is shown after this list)
Run the same test and examine the output
Next, try a different reference value such as 5000, rerun the test, and examine the output
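Here is a sketch of the status section you would add to the buildspec, assuming the comparisons layout used by recent buildtest releases (older releases accept the name/ref entries directly under assert_ge, so check the schema documentation for your version):
status:
  assert_ge:
    comparisons:
      - name: copy
        ref: 50000
      - name: scale
        ref: 50000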
Exercise 5: Running a Batch Job
In this exercise, you will submit a batch job that runs hostname on the Slurm cluster. Shown below is the example buildspec:
buildspecs:
  hostname_perlmutter:
    description: run hostname on perlmutter
    type: script
    executor: 'perlmutter.slurm.debug'
    tags: ["queues", "jobs"]
    sbatch: ["-t 5", "-n 1", "-N 1", "-C cpu"]
    run: hostname
Take note that the test will run on the executor perlmutter.slurm.debug, which corresponds to the Slurm debug queue on Perlmutter. The sbatch options specify the batch directives for running the job.
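For reference, these sbatch options end up as #SBATCH directives in the test script that buildtest generates, roughly equivalent to:
#SBATCH -t 5
#SBATCH -n 1
#SBATCH -N 1
#SBATCH -C cpu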
In this exercise you are requested to do the following:
Todo
Run the test $BUILDTEST_ROOT/perlmutter_tutorial/ex5/hostname.yml with a poll interval of 10 seconds and take note of the output; you should see the job being submitted to the batch scheduler. Refer to buildtest build --help for the complete list of options
Check the output of the test via buildtest inspect query
Update the test to make use of Multiple Executors so that it runs on both the regular and debug queues (see the sketch after this list)
Rerun the same test; you should see two test runs for hostname_perlmutter, one for each executor
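One way to run the test on both queues is a regular expression in the executor field, which buildtest expands to every matching executor. This assumes the configuration file defines both perlmutter.slurm.debug and perlmutter.slurm.regular:
executor: 'perlmutter.slurm.(debug|regular)'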
If you have completed this exercise, you should expect the following output from buildtest build:
Test Summary
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━┓
┃ builder ┃ executor ┃ status ┃ checks (ReturnCode, Regex, Runtime) ┃ returncode ┃ runtime ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━┩
│ hostname_perlmutter/80e317c1 │ perlmutter.slurm.regular │ PASS │ N/A N/A N/A │ 0 │ 45.324512│
├───────────────────────────────────────┼─────────────────────────────┼────────┼─────────────────────────────────────┼────────────┼──────────┤
│ hostname_perlmutter/b1d7b318 │ perlmutter.slurm.debug │ PASS │ N/A N/A N/A │ 0 │ 75.54278 │
└───────────────────────────────────────┴─────────────────────────────┴────────┴─────────────────────────────────────┴────────────┴──────────┘