Research

I'm a CNRS researcher at LIP6, where I am a member of the Whisper team. My work revolves around making safer Systems software through domain-specific languages (DSLs) and formal verification (and, in particular, using and abusing of interactive theorem provers).

Prior to joining LIP6, I was a postdoctoral researcher in the Gallium team at INRIA Paris. There, I have worked with François Pottier and Didier Rémy. This involved some Mezzo, and some ornamentation.

And before all that, I was Conor McBride's PhD student. During my PhD, I've worked on the theory behind the intensional fragment of Epigram. I've studied some universes of data-types and the possibilities offered by such an approach. If Epigram were to be (eventually) implemented, it might contain the cool stuff I've modelled with polynomial functors.

Publications

A Formally Verified Compiler for Lustre

Timothy Bourke, Lélio Brun, Pierre-Évariste Dagand, Xavier Leroy, Marc Pouzet, Lionel Rieg

The correct compilation of block diagram languages like Lustre, Scade, and a discrete subset of Simulink is important since they are used to program critical embedded control software. We describe the specification and verification in an Interactive Theorem Prover of a compilation chain that treats the key aspects of Lustre: sampling, nodes, and delays. Building on CompCert, we show that repeated execution of the generated assembly code faithfully implements the dataflow semantics of source programs.

We resolve two key technical challenges. The first is the change from a synchronous dataflow semantics, where programs manipulate streams of values, to an imperative one, where computations manipulate memory sequentially. The second is the verified compilation of an imperative language with encapsulated state to C~code where the state is realized by nested records. We also treat a standard control optimization that eliminates unnecessary conditional statements.

The Essence of Ornaments

Pierre-Evariste Dagand

Functional programmers from all horizons strive to use, and sometimes abuse, their favorite type system in order to capture the invariants of their programs. A widely-used tool in that trade consists in defining finely-indexed datatypes. Operationally, these types classify the programmer's data, following the ML tradition. Logically, these types enforce the program invariants in a novel manner. This new programming pattern, by which one programs over inductive definitions to account for some invariants, is the motivation behind the theory of ornaments.

However, ornaments originate from dependent type theory and may thus appear rather daunting to a functional programmer of the non dependent kind. Ornaments may also appear quite specific to a particular type theory whilst, in fact, they are a very general notion. This article aims at presenting the notion of ornament from first-principles and, in particular, decluttered from syntactic considerations.

To do so, we shall give a sufficiently abstract model of indexed datatypes by means of many-sorted signatures. In this process, we formalise our intuition that an indexed datatype is the combination of a data-structure and a data-logic. Unlike a purely categorical presentation, this model will appeal to our programmer's intuition: we shall be manipulating (type-theoretic) functions, remaining close to our functional comfort zone.

Over this abstraction of datatypes, we shall recast the definition of ornaments, effectively giving a model of ornaments. Benefiting both from the operational and abstract nature of many-sorted signatures, we cast ornaments under an operational and an abstract light. Finally, the versatility of our model has been instrumental in the development of a genuine calculus of data-structures: with a few combinators, new datatypes can be engineered by composing off-the-shelf data-structures with special-purpose data-logics. In this article, we shall illustrate these techniques by way of examples.

As a result, ornaments should appear applicable and, one hopes, of interest beyond the type-theoretic circles, case in point being languages with generalized abstract datatypes or refinement types.

Verifying clock-directed modular code generation for Lustre (fr)

Timothy Bourke, Pierre-Évariste Dagand, Marc Pouzet, Lionel Rieg

This paper addresses the compilation pass that transforms synchronous dataflow programs into imperative programs. The challenge is to move from a dataflow semantics, where programs manipulate streams of values, to an imperative one, where programs perform stateful computations. We specify and verify in the Coq interactive theorem prover a code generator that handles the key aspects of Lustre including nodes and delays.

Partial Type Equivalences for Verified Dependent Interoperability

Pierre-Évariste Dagand, Eric Tanter, Nicolas Tabareau

Full-spectrum dependent types promise to enable the development of correct-by-construction software. However, even certified software needs to interact with simply-typed or untyped programs, be it to perform system calls, or to use legacy libraries. Trading static guarantees for runtime checks, the dependent interoperability framework provides a mechanism by which simply-typed values can safely be coerced to dependent types and, conversely, dependently-typed programs can defensively be exported to a simply-typed application.

In this paper, we give a semantic account of dependent interoperability. Our presentation relies on and is guided by a pervading notion of type equivalence, whose importance has been emphasized in recent works on homotopy type theory. Specifically, we develop the notion of partial type equivalences as a key foundation for dependent interoperability. Our framework is developed in Coq; it is thus constructive and verified in the strictest sense of the terms.

Using our library, users can specify domain-specific partial equivalences between data structures. Our library then takes care of the (sometimes, heavy) lifting that leads to interoperable programs. It thus becomes possible, as we shall illustrate, to internalize and hand-tune the extraction of dependently-typed programs to interoperable OCaml programs within Coq itself.

From Sets to Bits in Coq

Arthur Blot, Pierre-Evariste Dagand, Julia Lawall

Computer Science abounds in folktales about how -- in the early days of computer programming -- bit vectors were ingeniously used to encode and manipulate finite sets. Algorithms have thus been developed to minimize memory footprint and maximize efficiency by taking advantage of microarchitectural features. With the development of automated and interactive theorem provers, finite sets have also made their way into the libraries of formalized mathematics. Tailored to ease proving, these representations are designed for symbolic manipulation rather than computational efficiency.

This paper aims at bridging this gap. In the Coq proof assistant, we implement a bitset library and prove its correctness with respect to a formalization of finite sets. Our library enables a seamless interaction of sets for computing -- bitsets -- and sets for proving -- finite sets.

Ornaments in Practice

Thomas Williams, Pierre-Evariste Dagand, Didier Rémy

Ornaments have been introduced as a way to describe some changes in datatype definitions that preserve their recursive structure, reorganizing, adding, or dropping some pieces of data. After a new data structure has been described as an ornament of an older one, some functions operating on the bare structure can be partially or sometimes totally lifted into functions operating on the ornamented structure. We explore the feasibility and the interest of using ornaments in practice by applying these notions in an ML-like programming language. We propose a concrete syntax for defining ornaments of datatypes and the lifting of bare functions to their ornamented counterparts, describe the lifting process, and present several interesting use cases of ornaments.

Coq: The world's best macro assembler?

Andrew Kennedy, Nick Benton, Jonas Jensen, Pierre-Evariste Dagand

We describe a Coq formalization of a subset of the x86 architecture. One emphasis of the model is brevity: using dependent types, type classes and notation we give the x86 semantics a makeover that counters its reputation for baroqueness. We model bits, bytes, and memory concretely using functions that can be computed inside Coq itself; concrete representations are mapped across to mathematical objects in the SSReflect library (naturals, and integers modulo 2n) to prove theorems. Finally, we use notation to support conventional assembly code syntax inside Coq, including lexically-scoped labels. Ordinary Coq definitions serve as a powerful ``macro'' feature for everything from simple conditionals and loops to stack-allocated local variables and procedures with parameters. Assembly code can be assembled within Coq, producing a sequence of hex bytes. The assembler enjoys a correctness theorem relating machine code in memory to a separation-logic formula suitable for program verification.

Transporting Functions across Ornaments

Pierre-Evariste Dagand, Conor McBride
Extended version of the ICFP'12 paper

Programming with dependent types is a blessing and a curse. It is a blessing to be able to bake invariants into the definition of data-types: we can finally write correct-by-construction software. However, this extreme accuracy is also a curse: a data-type is the combination of a structuring medium together with a special purpose logic. These domain-specific logics hamper any effort of code reuse among similarly structured data.

In this paper, we exorcise our data-types by adapting the notion of ornament to our universe of inductive families. We then show how code reuse can be achieved by ornamenting functions. Using these functional ornament, we capture the relationship between functions such as the addition of natural numbers and the concatenation of lists. With this knowledge, we demonstrate how the implementation of the former informs the implementation of the latter: the user can ask the definition of addition to be lifted to lists and she will only be asked the details necessary to carry on adding lists rather than numbers.

Our presentation is formalised in a type theory with a universe of data-types and all our constructions have been implemented as generic programs, requiring no extension to the type theory.

A Cosmology of Datatypes
(PhD thesis)

Pierre-Evariste Dagand

With special thanks to:

  • Clément!
  • Fredrik!
  • Guillaume (A.)!
  • Guillaume (M.-M.)!
  • Hank!
  • Lorenzo!
  • Matteo!
  • Pedro!
  • Stevan!
  • Vincent!

A Categorical Treatment of Ornaments

Pierre-Evariste Dagand, Conor McBride

Ornaments aim at taming the multiplication of special-purpose datatypes in dependently typed programming languages. In type theory, purpose is logic. By presenting datatypes as the combination of a structure and a logic, ornaments relate these special-purpose datatypes through their common structure. In the original presentation, the concept of ornament was introduced concretely for an example universe of inductive families in type theory, but it was clear that the notion was more general.

This paper digs out the abstract notion of ornaments in the form of a categorical model. As a necessary first step, we abstract the universe of datatypes using the theory of polynomial functors. We are then able to characterise ornaments as cartesian morphisms between polynomial functors. We thus gain access to powerful mathematical tools that shall help us understand and develop ornaments. We shall also illustrate the adequacy of our model. Firstly, we rephrase the standard ornamental constructions into our framework. Thanks to its conciseness, this process gives us a deeper understanding of the structures at play. Secondly, we develop new ornamental constructions, by translating categorical structures into type theoretic artefacts.

Elaborating Inductive Definitions

Pierre-Evariste Dagand, Conor McBride

We present an elaboration of inductive definitions down to a universe of datatypes. The universe of datatypes is an internal presentation of strictly positive types within type theory. By elaborating an inductive definition -- a syntactic artifact -- to its code -- its semantics -- we obtain an internalized account of inductives inside the type theory itself: we claim that reasoning about inductive definitions could be carried in the type theory, not in the meta-theory as it is usually the case.

Besides, we give a formal specification of that elaboration process. It is therefore amenable to formal reasoning too. We prove the soundness of our translation and hint at its completeness with respect to Coq's Inductive definitions. The practical benefits of this approach are numerous. For the type theorist, this is a small step toward bootstrapping, ie. implementing the inductive fragment in the type theory itself. For the programmer, this means better support for generic programming: we shall present a lightweight deriving mechanism, entirely definable by the programmer and therefore not requiring any extension to the type theory.

Fully-Abstract Compilation to JavaScript

Cédric Fournet, Nikhil Swamy, Juan Chen, Pierre-Evariste Dagand, Pierre-Yves Strub, Ben Livshits

Many tools allow programmers to develop applications in high-level languages and deploy them in web browsers via compilation to JavaScript. While practical and widely used, these compilers are ad-hoc: no guarantee is provided on their correctness for whole programs, nor their security for programs executed within arbitrary JavaScript contexts.

This paper presents a compiler with such positive guarantees. We compile an ML-like language with higher-order functions and references to JavaScript, while preserving all source program properties. Relying on type-based invariants and applicative bisimilarity, we show full abstraction: two programs are equivalent in all source contexts if and only if their wrapped translations are equivalent in all JavaScript contexts. We evaluate our compiler on sample programs, including a series of secure libraries.

Transporting Functions across Ornaments

Pierre-Evariste Dagand, Conor McBride

Programming with dependent types is a blessing and a curse. It is a blessing to be able to bake invariants into the definition of data-types: we can finally write correct-by-construction software. However, this extreme accuracy is also a curse: a data-type is the combination of a structuring medium together with a special purpose logic. These domain-specific logics hamper any effort of code reuse among similarly structured data.

In this paper, we exorcise our data-types by adapting the notion of ornament to our universe of inductive families. We then show how code reuse can be achieved by ornamenting functions. Using these functional ornament, we capture the relationship between functions such as the addition of natural numbers and the concatenation of lists. With this knowledge, we demonstrate how the implementation of the former informs the implementation of the latter: the user can ask the definition of addition to be lifted to lists and she will only be asked the details necessary to carry on adding lists rather than numbers.

Our presentation is formalised in a type theory with a universe of data-types and all our constructions have been implemented as generic programs, requiring no extension to the type theory.

Ornaments, organising the Zoo of data-types

Pierre-Evariste Dagand, Conor McBride

When programming in type theory, we discover a whole zoo of data-types. We have all met unary natural numbers, lists, finite sets, vectors and other similarly structured yet often exotic breeds. Despite their specificities, we notice a striking similarity between these data-types, with natural numbers appearing here as a common ancestor. In type theory, such a diverse evolution comes out of necessity: to be fit for purpose, our types have to be as precise as possible. From a reusability perspective, this is a disaster: these purpose-built data-types are too specific to fit into or benefit from a global library.

In this work, we study an organisation principle for the zoo of data-types. McBride's Ornaments define a general principle for creating new data-types from old. We shall give a categorical presentation of this construction and show that it exactly captures our intuition of a ``lineage''. Our abstract treatment greatly simplifies the type-theoretic definition. Hence, we can explain the standard constructions (e.g., ornamental algebra, algebraic ornament) in simpler terms. Further, we are able to tap into the rich categorical structure of ornaments to uncover new programming artifacts.

Besides giving a mathematical basis for our classification work, this presentation gives a hint on how one could semi-automatically lift operations from ancestor to descendant. Whilst lifting algebras is still work in progress, we have obtained some promising results for restricted classes of morphisms of data-types.

This work has been formalised in Agda, using Interaction Structures to model inductive families.

The Gentle Art of Levitation

James Chapman, Pierre-Evariste Dagand, Conor McBride, Peter Morris

We present a closed dependent type theory whose inductive datatypes are presented not by a scheme for generative datatype declaration, but by encoding in a "universe". Each inductive datatype is thus given by interpreting its description" -- an ordinary first-class value in a datatype of descriptions. Moreover, the latter itself has a description.

By presenting datatypes via such a self-encoding universe, datatype-generic programming becomes ordinary programming. We show some of the resulting generic operations and deploy them in particular, useful ways on the datatype of datatype descriptions itself. Surprisingly this apparently self-supporting setup is achievable without paradox or infinite regress.

Filet-o-Fish: practical and dependable domain-specific languages

Pierre-Evariste Dagand, Andrew Baumann, Timothy Roscoe

We address a persistent problem with using domain-specific languages to write operating systems: the effort of implementing, checking, and debugging the DSL usually outweighs any of its benefits. Because these DSLs generate C by templated string concatenation, they are tedious to write, fragile, and incompatible with automated verification tools.

We present Filet-o-Fish (FoF), a semantic language to ease DSL construction. Building a DSL using FoF consists of safely composing semantically-rich building blocks. This has several advantages: input files for the DSL are formal specifications of the system's functionality, automated testing of the DSL is possible via existing tools, and we can prove that the C code generated by a given DSL respects the semantics expected by the developer.

Early experience has been good: FoF is in daily use as part of the tool chain of the Barrelfish multicore OS, which makes extensive use of domain-specific languages to generate low-level OS code. We have found that the ability to rapidly generate DSLs we can rely on has changed how we have designed the OS.

The Multikernel: A new OS architecture for scalable multicore systems

Andrew Baumann, Paul Barham, Pierre-Evariste Dagand, Tim Harris, Rebecca Isaacs, Simon Peter,
Timothy Roscoe, Adrian Schupbach, Akhilesh Singhania

Commodity computer systems contain more and more processor cores and exhibit increasingly diverse architectural tradeoffs, including memory hierarchies, interconnects, instruction sets and variants, and IO configurations. Previous high-performance computing systems have scaled in specific cases, but the dynamic nature of modern client and server workloads, coupled with the impossibility of statically optimizing an OS for all workloads and hardware variants pose serious challenges for operating system structures.

We argue that the challenge of future multicore hardware is best met by embracing the networked nature of the machine, rethinking OS architecture using ideas from distributed systems. We investigate a new OS structure, the multikernel, that treats the machine as a network of independent cores, assumes no inter-core sharing at the lowest level, and moves traditional OS functionality to a distributed system of processes that communicate via message-passing.

We have implemented a multikernel OS to show that the approach is promising, and we describe how traditional scalability problems for operating systems (such as memory management) can be effectively recast using messages and can exploit insights from distributed systems and networking. An evaluation of our prototype on multicore systems shows that, even on present-day machines, the performance of a multikernel is comparable with a conventional OS, and can scale better to support future hardware.

Opis: reliable distributed systems in OCaml

Pierre-Evariste Dagand, Dejan Kostic, Viktor Kuncak

Concurrency and distribution pose algorithmic and implementation challenges in developing reliable distributed systems, making the field an excellent testbed for evaluating programming language and verification paradigms. Several specialized domain-specific languages and extensions of memory-unsafe languages were proposed to aid distributed system development. We present an alternative to these approaches, showing that modern, higher-order, strongly typed, memory safe languages provide an excellent vehicle for developing and debugging distributed systems.

We present Opis, a functional-reactive approach for developing distributed systems in Objective Caml. An Opis protocol description consists of a reactive function (called event function) describing the behavior of a distributed system node. The event functions in Opis are built from pure functions as building blocks, composed using the Arrow combinators. Such architecture aids reasoning about event functions both informally and using interactive theorem provers. For example, it facilitates simple termination arguments.

Given a protocol description, a developer can use higher-order library functions of Opis to 1) deploy the distributed system, 2) run the distributed system in a network simulator with full-replay capabilities, 3) apply explicit-state model checking to the distributed system, detecting undesirable behaviors, and 4) do performance analysis on the system. We describe the design and implementation of Opis, and present our experience in using Opis to develop peer-to-peer overlay protocols, including the Chord distributed hash table and the Cyclon random gossip protocol. We found that using Opis results in high programmer productivity and leads to easily composable protocol descriptions. Opis tools were effective in helping identify and eliminate correctness and performance problems during distributed system development.

Contact

email:
pierre-evariste.dagand --at-- lip6.fr
office:
Corridor 25-26, 3rd floor, door 327.
phone:
01 44 27 88 77
address:
LIP6
Boîte courrier 169
4 place Jussieu
75252 Paris Cedex 05