In SPARK 2014, we translate SPARK source code to the intermediate language WhyML, from which we generate our verification conditions (VCs). The automatic theorem provers used by the SPARK tool-set are SMT solvers, which share a common input language - SMTLIB - thanks to a longstanding standardisation effort by the academic community. Most years, the SMT community organises a public competition for SMT solvers; it is a friendly competition in which solvers are benchmarked on a number of problems of varying complexity.
Usually the first thing you do in an SMTLIB problem is declare the logic, which constrains the problem: for example, you might declare the logic QF_BV if your problem contains only quantifier-free bitvectors, or AUFNIRA if it contains quantifiers, functions, arrays, and non-linear integer and real arithmetic. The idea is that many solvers specialise in a certain kind of problem, and so may participate in only one track of the competition. For example, there exist solvers that only deal with Boolean problems and bitvectors (Boolector), solvers that specialise in quantifier-free properties (MathSAT), and general-purpose solvers that implement many theories and logics (CVC4, Alt-Ergo, Z3). SPARK uses solvers from this last category, and most of them are benchmarked in this competition.
It is a significant event when new theories and logics are added to the SMTLIB standard. The most recent addition is a theory of floating-point numbers, which the SPARK tools will eventually use to reason efficiently about floating-point problems. This year the SMT competition will for the first time feature a floating-point track, in which a number of solvers will compete.
However, these competitions are only as good as the benchmarks they contain, so to help this community effort I have assembled a set of floating-point benchmarks and submitted them for consideration. These benchmarks are intended to represent the typical issues we or our users have come across:
- I have converted all the official challenges from the Proof-In-Use project.
- I have gone through some real (pun not intended) world source code and produced representative examples that capture the essence of some difficult issues. I have focused on instances where the answer is surprising.
- I have looked at a body of existing user-written rules for SPARK (these were based on real arithmetic, not floating point arithmetic) and distilled some interesting problems from them.
- And one benchmark actually made us notice the compiler issue relating to exponentiation that was described in a recent blog post.
So what do these benchmarks look like? Here is an example of a division function for non-negative floats that guards against division by zero and overflow. This is how the function might look in SPARK:
function Defensive_Div (X, Y : Float) return Float
is (if Y = 0.0 or Y < X / Float'Last
    then Float'Last
    else X / Y)
with Pre => X >= 0.0 and Y >= 0.0;
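The guard Y < X / Float'Last is a rearrangement of the overflow condition Float'Last < X / Y. As a quick sanity check that this rearrangement really is an equivalence over exact (real) arithmetic - the subtlety we will run into later comes only from rounding - here is a small Python sketch using rationals (the function name and sample values are mine):

```python
from fractions import Fraction

def rearrangement_holds(x, y, last):
    """Over exact rationals, last < x / y is equivalent to
    y < x / last, provided y and last are both positive."""
    return (last < x / y) == (y < x / last)

# An exact rational standing in for Float'Last (~3.4028235e38).
last = Fraction(34028235) * 10**31

# Spot-check the equivalence on a few positive rationals,
# including very large and very small (subnormal-sized) values.
for x, y in [(Fraction(1), Fraction(2)),
             (Fraction(10**45), Fraction(1, 10**10)),
             (Fraction(1, 10**45), Fraction(3, 10**45))]:
    assert rearrangement_holds(x, y, last)
```

With exact arithmetic the two conditions always agree; the benchmark below asks whether the same still holds once every operation rounds to 32 bits.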
There are two obvious cases to deal with: division by zero and overflow (where we divide by a very small number and thus get infinity). The reason behind the overflow check is the following: we want to test whether Float'Last < X / Y. If we rearrange this inequality by first multiplying both sides by Y (which we know to be positive) and then dividing both sides by Float'Last (which we also know to be positive), we end up with the overflow check in the above code. In both of these cases (division by zero and overflow) the function returns the largest floating-point number. Below we show the floating-point benchmark derived from this function. First, some preliminaries (the logic used, and some information about the benchmark itself):
(set-info :smt-lib-version 2.5)
(set-logic QF_FP)
(set-info :category crafted)
We need a helper function to check whether a given floating-point number is a valid SPARK floating-point number - since in SPARK we exclude infinity and not-a-number. If you know LISP then the below should be somewhat familiar; if not, read it as follows: we define a function, is_spark_float, with one argument (f, a Float32), returning a Bool, and then provide its definition. A function call is just a list whose first member is the function; for example (not a) calls the function 'not' on the argument 'a'.
(define-fun is_spark_float ((f Float32)) Bool
  (not (or (fp.isNaN f) (fp.isInfinite f))))
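For readers more at home in a mainstream language, the same predicate might look like this in Python (Python's float is a 64-bit double rather than a Float32, but the idea is identical; this is just an illustrative sketch, not part of the benchmark):

```python
import math

def is_spark_float(f: float) -> bool:
    """A valid SPARK float is any IEEE value that is
    neither NaN nor an infinity."""
    return not (math.isnan(f) or math.isinf(f))

print(is_spark_float(1.5))           # True
print(is_spark_float(float("inf")))  # False
print(is_spark_float(float("nan")))  # False
```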
We then define a useful constant for the largest floating point number. The number in hex is the bit-pattern in IEEE-754 binary interchange format for the largest positive floating point number that is not infinity.
(define-const float__last Float32 ((_ to_fp 8 24) #x7F7FFFFF))
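If you want to convince yourself that this bit-pattern really does denote the largest finite 32-bit float, you can decode it with Python's struct module (again an illustrative sketch, not part of the benchmark):

```python
import struct

# Reinterpret the bit-pattern 0x7F7FFFFF as an IEEE-754 binary32 value.
float_last = struct.unpack("<f", (0x7F7FFFFF).to_bytes(4, "little"))[0]
print(float_last)  # 3.4028234663852886e+38

# The next bit-pattern up, 0x7F800000, is positive infinity.
print(struct.unpack("<f", (0x7F800000).to_bytes(4, "little"))[0])  # inf
```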
We now introduce two variables, x and y. SMTLIB does not have variables as such; instead we have functions with no arguments, and a function without arguments (with an otherwise unconstrained value) can be thought of as a variable. The declare-const command below is simply shorthand for declaring such a zero-argument function.
(declare-const x Float32)
(declare-const y Float32)
We now add assertions which reflect our preconditions to constrain the values of x and y. We have some implicit preconditions (the values must be valid SPARK floats), and some explicit ones.
(assert (is_spark_float x))
(assert (is_spark_float y))
(assert (fp.geq x (_ +zero 8 24)))
(assert (fp.geq y (_ +zero 8 24)))
We are interested in particular in the else branch (where the division takes place), so we will further constrain our inputs accordingly. Note that the division mentions RNE, which specifies the rounding mode to use (round to nearest, ties to even). SPARK requires a compiler to use RoundNearestEven and IEEE floats (see Section 9.8 of the SPARK user manual), hence we have hard-coded this here.
(assert (not (fp.eq y (_ +zero 8 24))))
(assert (not (fp.lt y (fp.div RNE x float__last))))
Finally we want to check that performing the division is safe - in other words, we want to prove that x / y is still a valid SPARK float. To do this we assert its negation and then ask the solver to find an assignment for x and y that satisfies the entire formula. If it succeeds and finds a satisfying assignment, we know the original conjecture is not true, and we have a counter-example. If it fails (the formula is unsatisfiable), we know that the original conjecture is true.
(assert (not (is_spark_float (fp.div RNE x y))))
We can now give this to an SMT solver. In this case we get a SAT answer, meaning the original code missed a check! We also get a counter-example. There are many possible counter-examples, but in this case the solver returned this one: x = 0x35D67FFF (1.598149e-06f) and y = 0x00000003 (4.203895e-45f). This is indeed a counter-example that passes the defensive code, but still causes an overflow when we try to divide x by y! One way to fix the original function would be to add a check in the if statement that makes sure y is not a subnormal number. This is an excellent reminder that obvious arithmetic (such as the re-arranging of terms that resulted in this overflow check) that is valid with reals, is often not valid for floats, especially when subnormal numbers are involved.
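We can also replay this counter-example outside the solver. The Python sketch below emulates binary32 arithmetic by rounding the result of each operation back to 32 bits with the struct module (Python's own floats are 64-bit doubles; for these particular values the intermediate double rounding does not change the outcome). The helper names are mine:

```python
import struct

def f32(v: float) -> float:
    """Round a double to the nearest binary32 value; struct raises
    OverflowError when the result exceeds the binary32 range."""
    return struct.unpack("<f", struct.pack("<f", v))[0]

def from_bits(b: int) -> float:
    """Reinterpret a 32-bit pattern as a binary32 value."""
    return struct.unpack("<f", b.to_bytes(4, "little"))[0]

x = from_bits(0x35D67FFF)       # ~1.598149e-06
y = from_bits(0x00000003)       # ~4.203895e-45, a subnormal
float_last = from_bits(0x7F7FFFFF)

# The guard y < x / Float'Last, computed in binary32: x / Float'Last
# rounds down to exactly y, so the defensive check does not fire...
print(y < f32(x / float_last))  # False

# ...yet the division itself overflows binary32.
overflowed = False
try:
    f32(x / y)
except OverflowError:
    overflowed = True
print(overflowed)               # True
```

The rounding in the guard computation is precisely where the real-arithmetic rearrangement breaks down for subnormal values of y.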
We have compiled around 60 such examples, some valid, some invalid (like the above); this lets solver writers know what kinds of problems we and the users of SPARK are facing, and where it may be useful to focus future efforts.