Introduction

The goal of this project is to implement a relational abstract domain able to represent relationships between program variables.

The domain will conform to the Domain.S module signature provided in the skeleton library. Optionally, it could also be integrated into another group’s static analyzer (from the static analyzer project) in place of simpler, non-relational domains such as intervals. Relational domains are necessary to discover invariants that involve several variables simultaneously (e.g., x = 2*y + 1), which is impossible with non-relational domains.

You must implement one of the two following domains:

Difference Bound Matrices (DBM): represents constraints of the form x - y <= c, using a matrix encoding and shortest-path closure.
Karr’s linear equalities: represents affine equalities between variables (e.g., x = 2*y + 1), using algorithms from linear algebra (Gauss elimination).

Each of these domains are properly described in the papers linked in the references. You are expected to read the corresponding article, and understand it in order to implement properly the abstract domain.

Organization

The project must be delivered by email to me, in the form of a compressed archive (with tar or zip), called yournames{.tgz, .zip} which must contain a directory called yournames (example: dupont-durand).

This directory must contain the source files of your abstract domain implementation.

You are not allowed to use any other language than OCaml! However, you are allowed to modify parts of the skeleton library as long as your domain ultimately conforms to the Domain.S signature. You can also add any dependency to the project.

The directory should also contain:

A report (between 2 and 4 pages) discussing your abstract domain, your implementation choices, your difficulties, and your experiments.
A set of tests demonstrating the features, precision and correctness of your domain.

Library skeleton

The sources of the skeleton library are available here.

The library is built with dune. You can build the project using dune build and run the tests using dune test.

The central file is lib/domain.ml, which defines the Domain.S module signature that your implementation must satisfy.

The `Bottom` exception

The Bottom exception signals that the abstract state is empty (no concrete environment is represented). Any operation may raise Bottom to indicate that the result is unreachable. This is used in place of a dedicated bottom value.

The `ctx` and `t` types

The signature revolves around two abstract types:

ctx: a domain-specific context created once by init and threaded to every subsequent operation. It typically holds the number of variables and any auxiliary data structures shared across abstract states (e.g., the matrix dimension for DBM, or the number of columns in the equation system for Karr).
t: a mutable abstract state (abstract element). Most operations modify it in place.

Side-effect convention

Most abstract operators work by side-effect: they receive a mutable abstract state t and modify it in place. For instance, join_with ctx a b computes the join and stores the result into a; the value b is not modified. This convention applies to the lattice operators (join_with, widen_with, meet_with, narrow_with), the transfer functions (const_int, unop_minus, binop_add, …), and the guards (guard_le, guard_lt, …).

The copy function is provided to duplicate an abstract state when the original must be preserved.

Overview of the operations

Lattice operations. leq ctx a b tests inclusion. join_with, meet_with, widen_with, and narrow_with compute the corresponding lattice operations and store the result into their first t argument. top ctx returns a fresh state representing all concrete environments.

Transfer functions. const_int ctx a v c models the assignment v := c. unop_minus ctx a v w models v := -w. The binary operators binop_add, binop_sub, binop_mul, binop_div, and binop_mod model v := w1 op w2. Variables are identified by their index (type var), numbered from 0 to n-1 where n is the number passed to init.

Guards. guard_le, guard_lt, guard_eq, and guard_neq refine the abstract state by assuming a condition holds between two variables. They raise Bottom if the condition is unsatisfiable.

Expected work

Difference Bound Matrices

The DBM domain represents sets of constraints of the form x - y <= c where x and y are program variables and c is an integer constant. These constraints are stored in a square matrix of dimension n+1 (where n is the number of variables), the extra row/column representing the constant 0. The entry m[i][j] holds the tightest known bound for x_i - x_j.

You must implement:

Closure. After every operation that modifies the matrix, you must compute the shortest-path closure (Floyd-Warshall or Johnson’s algorithm). The closure ensures that all implied constraints are made explicit and that emptiness is detected (a negative cycle on the diagonal).
Lattice operations. Join is the pointwise maximum; meet is the pointwise minimum (followed by closure). Inclusion is the pointwise comparison. You must design a suitable widening (e.g., setting entries that increase to +∞) and narrowing.
Transfer functions. Assignments and arithmetic operations must be expressed as constraint manipulations on the matrix. For instance, v := w + c can be encoded by removing all constraints on v and then adding v - w <= c and w - v <= -c.
Guards. A guard v - w <= c translates directly into a matrix entry update followed by closure. Other comparisons (<, =, !=) must be expressed in terms of difference constraints.

Hints

Use +∞ (or a dedicated representation) for unconstrained entries. After closure, emptiness is detected by a negative diagonal entry. The hardest part is correctly handling assignments involving expressions more complex than simple differences; you may need to fall back to weaker constraints (remove and re-derive) for operations that do not fit the x - y <= c template.

Karr’s linear equalities

The Karr domain represents sets of affine equalities between program variables, e.g., x = 2*y + 3*z + 1. The abstract state is a system of affine equalities stored as a matrix in row echelon form.

You must implement:

Gaussian elimination. All operations on the equality system must maintain the matrix in row echelon form.
Lattice operations. The join of two systems of equalities is the set of equalities implied by both (intersection of the affine spaces, computed via a combined matrix reduction). Meet is the conjunction of both systems (union of constraints, followed by re-reduction). Inclusion is checked by verifying that every equality in the smaller element is implied by the larger.
Transfer functions. An assignment v := expr is modeled by substituting the expression for v in all equalities and then re-reducing. A non-invertible assignment (e.g., v := w * w) requires eliminating v from the system and adding any new equality that can be derived.
Guards. An equality guard v = w adds an equation to the system. Inequality guards (<=, <, !=) generally cannot be exploited by this domain and should be treated conservatively (no refinement).

Hints

Use exact arithmetic (the Zarith library) to avoid rounding issues. Since widening is not needed for a domain of finite height (the lattice of affine subspaces satisfies the ascending chain condition), widen_with can simply behave as join_with. The main difficulty lies in implementing a correct and efficient join (affine hull); a good reference is Karr’s original algorithm.

Extensions

If you want, you are allowed to implement one (or some) of the following extensions. This optional work will be taken into account in the grade.

Octagon abstract domain

Extend the DBM domain to the octagon abstract domain, which represents constraints of the form ± x ± y <= c. The octagon domain is strictly more expressive than DBM while retaining polynomial-time algorithms. It encodes constraints in a matrix of dimension 2n by introducing, for each variable x, a positive form x+ and a negative form x- (representing -x).

You must adapt the closure algorithm to enforce coherence between the positive and negative forms of each variable (strong closure). You should provide examples showing invariants that the octagon domain can discover but the DBM domain cannot, and discuss the cost difference.

Linear congruences

Extend Karr’s linear equalities domain with linear congruences, representing constraints of the form a1*x1 + ... + an*xn = b [mod m]. This allows discovering modular relationships between variables (e.g., x is always even, or x = y mod 3).

You should provide examples illustrating invariants that require congruence information, and discuss how congruences interact with the equality domain.

Resources and Bibliography

Software:

OCaml language.
Zarith arbitrary precision number library
Graphviz graph visualization

On abstract domains:

Antoine Miné. A New Numerical Abstract Domain based on Difference-Bound Matrices
Antoine Miné. The octagon abstract domain
Michael Karr. Affine relationships among variables of a program
Philippe Granger. Static Analysis of Linear Congruence Equalities among Variables of a Program