Benchmarking: simple quality assurance tests for TorXakis (sqatt)¶
Introduction¶
It is important to keep an eye on the performance of TorXakis.
Currently there is no continuous benchmarking process in place, however
you can check locally the performance by running the benchmarks for the
(current) develop version, and the version that you are working on.
To run the benchmarks and generate the reports on Unix systems you can
run:
cd test/sqatt
stack bench --ba "--output  `date +%s`-report-`git rev-parse HEAD`.html --csv `date +%s`-report-`git rev-parse HEAD`.csv"
On Windows PowerShell you can run:
cd test\sqatt
stack bench --ba "--output
$(get-date).ToString("yyyyMMdd-hhmmss")-report-$((git rev-parse
HEAD).Substring(0,8)).html --csv $(get-date).ToString("yyyyMMdd-hhmmss")-report-$((git rev-parse HEAD).Substring(0,8)).csv"
See the README in the sqatt folder for
more information.
Integration tests¶
To run the integration tests execute:
stack test
Individual development acceptance tests can be selected by passing the
match flag to hspec via stack:
stack test --test-arguments=--match=PATTERN
where PATTERN is a pattern is used to select the name of the test to
run. For instance, to run only the stimulus response tests the
following command can be used:
stack test --test-arguments=--match=Stimulus
Adding new tests¶
sqatt assumes that the TorXakis models and SUT source-code (if any)
are located in the examps folder, and the commands to be input to
TorXakis during the tests are stored in the test/examps folder. The
Examples.Paths module contains functions to build paths to the test
files using these conventions.
To add a new test case to be included in the development acceptance tests the following steps are needed:
- Create a new Haskell file in the - src/Examplesfolder, which will describe the test cases.
- Add tests cases by specifying: - The example name. 
- The paths to the .txs files that contain the TorXakis model. 
- The path to the file that contains TorXakis commands. 
- Command line arguments, if any, for the - TorXakisserver command (- txserver).
- If needed: the path to the SUT source code (currently only Java is supported), or the path to the TorXakis commands that will be used to run TorXakis as a SUT simulator. 
 
- Define a function named - exampleSetcontaining the list of examples that will be tested.
- Update the - src/Examples/All.hsfile to include the new example set in the list of all examples.
- Update the - sqatt.cabalfile to account for the the new file.
Some variation is of course allowed in these steps, but for the sake of maintainability it is recommended to follow them as close as possible.
When adding a new test use camel case folder names. The rationale behind this is that we want to be consistent with Haskell naming conventions regarding module folders.
We will illustrate the process of adding new development-acceptance
tests by using an example. Suppose we want to create a new test, named
MyTest, to test a new TorXakis model that is stored in
examps/MyTest/MyTestModel.txs. This model will be used to test a toy
SUT whose source code is located in examps/MyTest/MySUT.java.
Finally assume the commands to be input to TorXakis during the tests are
in test/examps/MyTest/MyTestCmds.txscmd. Note that these files do
not exist in the code-base, and are used as illustration only.
To define a new development-acceptance test that uses these files, start
by creating a file in the src/Examples folder named MyTest.hs.
Make sure that you enable the OverloadedStrings compiler extension,
and import Examples.Paths and Sqatt, while hiding Prelude’s
FilePath (since it conflicts with the one used by Sqatt):
{-# LANGUAGE OverloadedStrings #-}
module Examples.MyTest (exampleSet) where
import           Examples.Paths
import           Prelude        hiding (FilePath)
import           Sqatt
Next, we add a test case by defining a new TxsExample value:
test0 :: TxsExample
test0 = TxsExample
  { exampleName = "What the test is checking"
  , txsModelFiles = [txsFilePath "MyTest" "MyTestModel"]
  , txsCommandsFile = [txsCmdPath "MyTest" "MyTestCmds"]
  , txsServerArgs = []
  , sutExample =
      Just (JavaExample (javaFilePath "MyTest" "MySUT") ["Some", "SUT", "Args"])
  , expectedResult = Pass
  }
The record labels should be self explanatory. For more information
please check the existing sqatt tests (defined in src/Examples)
. Also note that we are repeating the string "MyTest", which could
be avoided. See the src/Examples folder to see how this is currently
done.
In the example above we specify a Java SUT. It is also possible to
specify no SUT or a TorXakis simulated one. The src/Examples folder
contains examples of these cases.
Once the example test0 is defined, the next step is to add it to a
TxsExampleSet:
exampleSet :: TxsExampleSet
exampleSet = TxsExampleSet "My Test" [test0]
If additional examples are needed, define them as explained above, and
add them to the list of the exampleSet value above.
To have this example set tested with sqatt add it to the
Examples.All module:
-- ...
module Examples.All
  ( allExamples
  )
where
import qualified Examples.Adder            as Adder
-- ...
import qualified Examples.MyTest as MyTest
-- ...
allExamples :: [TxsExampleSet]
allExamples = [ Adder.exampleSet
              -- ...
              , MyTest.exampleSet
              -- ...
              ]
Finally, update the sqatt.cabal file by adding Examples.MyTest
to the exposed-modules section, and run:
stack test --test-arguments="--match=MyTest"
Notice that we’re using the match flag to run only the newly added
test.
The log files will be written in the TorXakis root directory in a folder
named sqatt-logs (currently this cannot be configured). The logs are
structured hierarchically as follows: test date and time, test set name,
test name. To avoid problems in Windows systems, in the log folders
spaces will be replaced by underscores (_) and colons (:) will
be replaced by dashes (-).
Long-running tests¶
Some example cases are known to take longer than others. We run these tests in parallel with the rest, in order to prevent them from timing the whole build out. These tests have “#long” tag in their names. If you add a test that you expect to take longer, you can have them run in the Long Tests group by adding #long to their name.
Customers and Orders test is an example to such tests.
test = TxsExample
  { exampleName = "Customers & Orders Test #long"
  , txsModelFiles = [txsFilePath exampDir customersOrdersText]
...
The parallelization of these tests are handled by our CI provider by running two separate test jobs. Long running tests are run with “–match=#long” test parameter while faster tests are run with “–skip=#long”.
These jobs are scripted in test.sh and test_long.sh respectively.
Benchmarks¶
To run the benchmarks execute:
stack bench
Individual benchmarks can be selected by passing a pattern to the benchmark program.
stack bench --ba "-m glob GLOB_PATTERN"
where GLOB_PATTERN is a Unix style
glob-pattern
that is used to select the benchmark names. Benchmark names have the
form <benchmark set>/<single benchmark name>, for instance:
- Sequence/100 actions
- Sequence/100 data actions
- Parallel/100 data actions
So, for instance, to run only the data actions we can run stack as
follows:
stack bench --ba "-m glob */*data*"
If on the other hand, we would have wanted to run only the benchmark
named "Sequence/100 data actions" we could have run:
stack bench --ba "-m glob Sequence/*data*"
or even:
stack bench --ba "-m glob Seq*/*data*"
Note that globbing on the benchmarks works as though they were
filepaths, that’s why "*" won’t match any benchmarks.
To generate HTML output use the --output flag, for example:
stack bench --ba "-m glob Seq*/*data* --output report.html"
The benchmark program uses Criterion’s default main method, which supports a wide range of options. Run:
stack bench --ba "--help"
for the full list of options.
Tests for the benchmarks¶
Since it is desirable that the benchmarks do not fail, sqatt
includes a test suite for the benchmarks, which check for their sanity.
To run only this test suite use:
stack test sqatt:benchmarks-sanity-test
This is also useful when debugging a failed benchmark, since when running the benchmarks no files are logged.