A Theory of the State

There are two classes of programming language whose mathematical foundations have been well understood for many years. The first consists of variations on the simple imperative language of while-programs with global variables. Since the work of Floyd and Hoare nearly 40 years ago, there has been a huge amount of work on using program logics to specify and verify programs written in languages of this class, usually under the heading of "formal methods". The second comprises more-or-less pure functional languages, featuring higher-order functions and different choices of type system (including none) and different evaluation orders. These languages have been studied extensively in the "theory and semantics" community, using operational and denotational techniques.

However, essentially no useful programming language really fits into either of these classes. C, ML, Java and C# all feature dynamically allocated mutable storage and higher-order features. Really good models and reasoning principles for this combination have proved extremely elusive, despite a huge amount of research. Even proving elementary properties of first-order code with dynamically allocated pointer structures was impractically hard prior to recent work in Separation Logic. It is easy to write a set of rules which specify precisely how any individual program executes, but seems very difficult to show that two bits of program always behave the same, or to express specifications that involve the way in which bits of mutable state are kept (partially) hidden (encapsulated, abstracted) within, say, a C# object or an ML closure. Worrying about the finer details of how to establish subtle contextual equivalences involving first-class functions and private state in ML-like languages may appear to be a rather esoteric pursuit. But it is actually a central issue in understanding sequential computation. Much applied programming-related research can be seen to involve the same basic issues, approached from different application-specific perspectives. For example

These topics are certainly distinct. In particular, the engineering issues are utterly different in the different cases. But the conceptual problems are very similar, despite the tendency for each area to develop "just enough" of its own theory to state some kind of correctness result for each particular system proposed.

Leviathan is an umbrella name for a line of research into semantics and program logics for a range of languages, type systems, analyses and program transformations, all of which involve state. The pipe dream is to design a unified, low-level, machine-verified foundation into which many different type systems, logics and analyses may be expressed, verified, generalized and combined. Such a foundation could, for example, form the basis for a next-generation, secure, multilanguage execution environment. But attacking that head-on looks a bit of a tall order, to put it mildly. Instead, over the last few years, with various collaborators, I've been sneaking up on it and trying to get it surrounded:

As you can see, we've used almost everything in our attempts to get to grips with state, generativity and encapsulation. Operational, concrete denotational, abstract denotational, and axiomatic semantics. High-level, medium-level and low-level languages. Manual and machine-assisted proof. Predicates, relations and parametric relations. There's an enormous amount still to do, but it's all starting to hang together in a most encouraging fashion...

...mathematics is always concerned with extensions rather than intensions. - Whitehead and Russell, Principia Mathematica