Yesterday I had the pleasure of spending a few extra hours with John Hughes. As usual, it has been an impressive learning experience, so I decided - once again - to share some of the highlights from our discussions.
We modeled one of our internal Erlang applications, using a Quickcheck statem. The application was relatively small (i.e. <1000 LOC), it offered a pretty straightforward API and its code was already covered by a fair amount of unit tests, with code coverage surpassing 90%.
As already happened to someone else, I didn’t expect to find too many bugs, if not in the Quickcheck model itself. Once again, Quickcheck proved me to be very wrong, finding five or six corner cases which caused the tests to fail. It also revealed some interesting behaviour for the application which, even if ineherently correct, was not clearly documented.
It took me a couple of days to get familiar with the application and its API, to implement a decent model for it and to wrap up everything into a pull request and into these notes.
So, let’s start go through a couple of learnt lessons.
Is ‘zero’ a natural number?
There is no universal agreement about whether to include zero in the
set of natural numbers.
In Erlang QuickCheck, zero is considered a natural number, so it can
be generated by the eqc_gen:nat/0
generator. In fact, a nat()
normally
shrinks to 0
. This is easily verifiable by means of the
eqc_gen:sampleshrink/1
function which, given a generator G
,
prints a value generated by G
, followed by one way of shrinking
it:
> eqc_gen:sampleshrink(eqc_gen:nat()).
3
--> [0,2]
ok
> eqc_gen:sampleshrink(eqc_gen:nat()).
8
--> [0,2,6,7]
ok
If you want to exclude the zero from the possible outcomes, you can define something like:
non_zero_nat() ->
?SUCHTHAT(N, eqc_gen:nat(), N > 0).
Code coverage for Erlang Quickcheck
Quickcheck does not provide code coverage information out of the box, so you need to use an external coverage analysis tool such as cover to obtain that piece of information.
If you use the rebar build tool to compile and test your Erlang applications, there is an easy solution to the problem. In fact, even if rebar’s cover support is currently limited to the EUnit testing framework, that is going to be enough in most of the cases. The trick is to wrap your Quickcheck properties into EUnit test suites.
Simply add the EUnit header file after the Quickcheck one:
-include_lib("eqc/include/eqc.hrl").
...
-include_lib("eunit/include/eunit.hrl").
And wrap each Quickcheck property into a EUnit test case. The result should look something like this, where the timeout is expressed in seconds:
prop_sample_test_() ->
{ "prop_sample", {timeout, 10, ?_assert(eqc:quickcheck(?MODULE:prop_sample()))}}.
That’s it. You can now verify your Quickcheck properties via Eunit:
./rebar skip_deps=true eunit
And get your coverage reports in .eunit/index.html
.
Generic servers, the application controller and the terminate/1
function
One of the most interesting issues which emerged while implementing
our Quickcheck statem model was related to the
semantics
of the terminate/1
callback function for the
gen_server
OTP
behaviour.
In fact, while adding commands to our statem, we noticed a rarely occurring issue. I say “rarely” since it only appeared after increasing the number of generated test cases from the 100 (default value) to 1000:
eqc:quickcheck(eqc:numtests(1000, sample_eqc:sample_prop())).
The issue seemed to happen on an application restart (i.e. a stop
command immediately followed by a start
command). Both commands were
using the
application
module
to start and stop the Erlang application.
Some basic tracing revealed that the
application was not cleaning up things properly upon termination. The
guilty process seemed to be a gen_server
. We immediately noticed
that the process was not trapping exits, which prevented the
terminate/1
from executing. So far,
nothing too interesting. We enabled the trap_exit
flag for the
gen_server
and we re-run the tests.
At this point the tests passed, but they were taking very long time,
which was a bit
surprising. The terminate/2
function, containing the cleanup
code, seemed harmless at a first glance. Nonethless, the gen_server
process was hanging somewhere, so that the supervisor had to kill
it after the
configured shutdown
time, which was set to 5 seconds.
A second look at the cleanup function pointed to a suspicious
application:set_env/3
call. The server was trying to set an
application environment variable before dying. A rapid exploration of
the OTP for the
application_controller
module revealed the mystery.
In OTP, the application environment is handled via a ETS table, which
is accessed via the application_controller
process. Whilst reading
an environment variable is implemented as a mere lookup operation, writes
are serialized through the application controller. In other words,
it is not possible to set an environment variable from a
terminate/1
callback function in a gen_server which traps exits,
since in case of application termination the application controller is
busy shutting down the application itself and cannot handle other
requests, causing a deadlock. A minimal example showing the deadlock
can be found in the following
gist.
Even if the terminate/1
was covered by unit tests, the issue never
emerged since in that case the terminate
function was not called via
the application controller, but as the consequence of a normal exit signal.
Yet another example of a buggy function 100% covered by unit tests has been found. Praises to Quickcheck!
Show states
When implementing the model for a Quickcheck statem, you often end up
with counter-examples which are a direct consequence of bugs in your
properties, missing preconditions and postconditions or incorrect
state updates. In such cases, it’s a good idea to include model states
into the standard Quickcheck output. You can customize Quickcheck
pretty-printing by using eqc_gen:with_parameters/3
and enabling the
show_states
flag:
prop_sample() ->
?FORALL(Cmds, commands(?MODULE),
begin
{H, S, Res} = run_commands(?MODULE, Cmds),
with_parameters(
[{show_states, true}],
pretty_commands(
...
Alternatively, you can display the model states by using the
eqc_statem:show_states/1
function, useful if you are quick-checking
your properties from the shell:
eqc:quickcheck(eqc_statem:show_states(Prop)).
By including model states into the output you’ll tend to have some quite verbose output, so the suggestion is to keep the flag disabled by default and to enable it only when needed.
Generating unique lists
Sometimes you have a complex data structure (let’s call it an item) and you want to generate a list of items, ensuring that the list does not contain duplicates. You’re going to add the generated list to your state and you want to ensure that any element of the list already exists in the state.
The trick to solve the above in a way that ensures a decent shrinking behaviour for your generator is to split the existing problem into smaller ones. For the sake of simplicity, let’s assume that a complex item is a 2-elements tuple and that the first element of the tuple is an id which identifies a tuple. Let’s also assume that an id is a natural number.
What you can do is to generate a list of unique ids. You can use the
lists:usort/1
function to ensure there are not duplicates. The
condition ensures that the generated element is not already part of
the state.
uids(S) ->
?LET(L, list(nat()),
[X || X <- lists:usort(L), not lists:keymember(X, 1, S#state.items)]).
Starting from the generated list of unique ids, you can then construct a list of complex items:
items(S) ->
?LET(Ids, uids(S), [item(Id) || Id <- Ids]).
Where the item/1
function creates a complex data
structure starting from an identifier”
item(Id) ->
?LET(Value, some_value_generator(), {Id, Value}).
No spikes in time
One of the properties you may want to verify is that a single API request
doesn’t take longer than a certain amount of time. A simple way to
check this is to use the
timer:tc/1
function. There are a couple of important
considerations to be made about this function. For example, it is advised to run
it from a newly spawned process and to call erlang:garbage_collect/1
just before using it, to lower the possibilities the Erlang garbage collector
starts its job during the measurement. Nonethless, if you don’t need
extremely accurate measurements, the following approach may be good
enough for you.
Let’s assume you have an exec command in your Quickcheck statem:
exec_command(S) ->
Args = ...
{call, ?MODULE, exec, [Args]}.
Your ?MODULE:exec/1
function should then look something like:
exec(Args) ->
timer:tc(server, exec, [Args]).
And the postcondition would look something like:
exec_post(S, [Args], {Time, Result}) ->
Time < expected_time(exec, Args) andalso Result ...
Where:
expected_time(exec, Args) -> 10 * 1000. %% 10 ms
As a plus, you may want collect some statistics about times (which are
now part of the command results) in your property, via the
aggregate/2,3
and collect/2,3
functions.
Dealing with race conditions
In our case, it proved to be really hard to provoke the race condition
which caused our property to fail. This is why we wrapped our property into a
?ALWAYS(N, Prop)
to repeat the same test several time:
prop_sample() ->
?FORALL(Attempts, ?SHRINK(1, [100]),
?FORALL(Cmds, commands(?MODULE),
?ALWAYS(Attempts,
begin
...
end))).
Notice how we abused the ?SHRINK
macro to force Quickcheck to test
each case once while searching for a failure and then 100 times for
each shrinking attempt.
Amending the counterexample
In case of a failing property, Quickcheck stores a copy of the
counter-example found. The last counter-example can be retrieved by
the eqc:counterexample/0
function. Once fetched, it can be used to
repeat a test or to test a different property in the same case, for
example using the eqc:check(Prop)
function.
The eqc:counterexample/0
function returns a counterexample()
abstract data type. With the current version of Quickcheck, it’s
possible to decompose the return value into a list of two elements:
[N,C] = eqc:counterexample().
The second element (C
) is the actual counter-example, whilst the
first element (N
) represents the number of attempts you want to
re-try that test. If you want to re-try the counterexample a certain
amount of times, you can simply do:
eqc:check(sample_eqc:prop_sample(), [1000, B]).