Introduction

The goal of this project is to implement a relational abstract domain able to represent relationships between program variables.

The domain will conform to the Domain.S module signature provided in the skeleton library. Optionally, it could also be integrated into another group’s static analyzer (from the static analyzer project) in place of simpler, non-relational domains such as intervals. Relational domains are necessary to discover invariants that involve several variables simultaneously (e.g., x = 2*y + 1), which is impossible with non-relational domains.

You must implement one of the two following domains:

Each of these domains are properly described in the papers linked in the references. You are expected to read the corresponding article, and understand it in order to implement properly the abstract domain.

Organization

The project must be delivered by email to me, in the form of a compressed archive (with tar or zip), called yournames{.tgz, .zip} which must contain a directory called yournames (example: dupont-durand).

This directory must contain the source files of your abstract domain implementation.

You are not allowed to use any other language than OCaml! However, you are allowed to modify parts of the skeleton library as long as your domain ultimately conforms to the Domain.S signature. You can also add any dependency to the project.

The directory should also contain:

Library skeleton

The sources of the skeleton library are available here.

The library is built with dune. You can build the project using dune build and run the tests using dune test.

The central file is lib/domain.ml, which defines the Domain.S module signature that your implementation must satisfy.

The Bottom exception

The Bottom exception signals that the abstract state is empty (no concrete environment is represented). Any operation may raise Bottom to indicate that the result is unreachable. This is used in place of a dedicated bottom value.

The ctx and t types

The signature revolves around two abstract types:

Side-effect convention

Most abstract operators work by side-effect: they receive a mutable abstract state t and modify it in place. For instance, join_with ctx a b computes the join and stores the result into a; the value b is not modified. This convention applies to the lattice operators (join_with, widen_with, meet_with, narrow_with), the transfer functions (const_int, unop_minus, binop_add, …), and the guards (guard_le, guard_lt, …).

The copy function is provided to duplicate an abstract state when the original must be preserved.

Overview of the operations

Lattice operations. leq ctx a b tests inclusion. join_with, meet_with, widen_with, and narrow_with compute the corresponding lattice operations and store the result into their first t argument. top ctx returns a fresh state representing all concrete environments.

Transfer functions. const_int ctx a v c models the assignment v := c. unop_minus ctx a v w models v := -w. The binary operators binop_add, binop_sub, binop_mul, binop_div, and binop_mod model v := w1 op w2. Variables are identified by their index (type var), numbered from 0 to n-1 where n is the number passed to init.

Guards. guard_le, guard_lt, guard_eq, and guard_neq refine the abstract state by assuming a condition holds between two variables. They raise Bottom if the condition is unsatisfiable.

Expected work

Difference Bound Matrices

The DBM domain represents sets of constraints of the form x - y <= c where x and y are program variables and c is an integer constant. These constraints are stored in a square matrix of dimension n+1 (where n is the number of variables), the extra row/column representing the constant 0. The entry m[i][j] holds the tightest known bound for x_i - x_j.

You must implement:

Hints

Use +∞ (or a dedicated representation) for unconstrained entries. After closure, emptiness is detected by a negative diagonal entry. The hardest part is correctly handling assignments involving expressions more complex than simple differences; you may need to fall back to weaker constraints (remove and re-derive) for operations that do not fit the x - y <= c template.

Karr’s linear equalities

The Karr domain represents sets of affine equalities between program variables, e.g., x = 2*y + 3*z + 1. The abstract state is a system of affine equalities stored as a matrix in row echelon form.

You must implement:

Hints

Use exact arithmetic (the Zarith library) to avoid rounding issues. Since widening is not needed for a domain of finite height (the lattice of affine subspaces satisfies the ascending chain condition), widen_with can simply behave as join_with. The main difficulty lies in implementing a correct and efficient join (affine hull); a good reference is Karr’s original algorithm.

Extensions

If you want, you are allowed to implement one (or some) of the following extensions. This optional work will be taken into account in the grade.

Octagon abstract domain

Extend the DBM domain to the octagon abstract domain, which represents constraints of the form ± x ± y <= c. The octagon domain is strictly more expressive than DBM while retaining polynomial-time algorithms. It encodes constraints in a matrix of dimension 2n by introducing, for each variable x, a positive form x+ and a negative form x- (representing -x).

You must adapt the closure algorithm to enforce coherence between the positive and negative forms of each variable (strong closure). You should provide examples showing invariants that the octagon domain can discover but the DBM domain cannot, and discuss the cost difference.

Linear congruences

Extend Karr’s linear equalities domain with linear congruences, representing constraints of the form a1*x1 + ... + an*xn = b [mod m]. This allows discovering modular relationships between variables (e.g., x is always even, or x = y mod 3).

You should provide examples illustrating invariants that require congruence information, and discuss how congruences interact with the equality domain.

Resources and Bibliography

Software:

On abstract domains: