I am a distributed systems researcher and software engineer.
I was affiliated with Inria / UPMC until March 2015.
Now, I am a software engineer at
Google Zürich.
My research interests evolve around distributed systems and database replication,
in particular highly available systems with weak consistency models.
In my PhD thesis, I demonstrated how to minimize the metadata cost of
eventually-consistent replication systems based on (Conflict-free) Replicated Data Types (CRDTs)
and on causal consistency.
More generally, I am interested in maximimzing consistency guarantees, programmability and
efficiency / scalability of eventually-consistent systems, as well as in understanding the
limits and trade-offs of their design space.
- Specification and Complexity of Collaborative Text Editing
Hagit Attiya, Sebastian Burckhardt, Alexey Gotsman, Adam Morrison, Hongseok Yang, Marek Zawirski.
PODC'16: Symposium on Principles of Distributed Computing, Chicago, IL, USA. To appear.
[PDF] [abstract] [bib] [extended version PDF]
Collaborative text editing systems allow users to concurrently edit a shared document, inserting and deleting elements (e.g., characters or lines). There are a number of protocols for collaborative text editing, but so far there has been no precise specification of their desired behavior, and several of these protocols have been shown not to satisfy even basic expectations. This paper provides a precise specification of a replicated list object, which models the core functionality of replicated systems for collaborative text editing. We define a strong list specification, which we prove is implemented by an existing protocol, as well as a weak list specification, which admits additional protocol behaviors.
A major factor determining the efficiency and practical feasibility of a collaborative text editing protocol is the space overhead of the metadata that the protocol must maintain to ensure correctness. We show that for a large class of list protocols, implementing either the strong or the weak list specification requires a metadata overhead that is at least linear in the number of elements deleted from the list. The class of protocols to which this lower bound applies includes all list protocols that we are aware of, and we show that one of these protocols almost matches the bound.
@InProceedings{list-podc16,
author = {Hagit Attiya and Sebastian Burckhardt and Alexey Gotsman and
Adam Morrison and Hongseok Yang and Marek Zawirski},
title = {{Specification and Complexity of Collaborative Text Editing}},
booktitle = {{Symposium on Principles of Distributed Computing (PODC)}},
year = 2016,
}
- Eventually Consistent Register Revisited
Marek Zawirski, Carlos Baquero, Annette Bieniusa, Nuno Preguiça, Marc Shapiro.
Technical report arXiv:1511.05010. November 2015.
PaPoC'16: 2nd Workshop on the Principles and Practice of Consistency for Distributed Data. April 2016.
[PDF] [abstract] [bib] [arXiv]
In order to converge in the presence of concurrent updates, modern eventually consistent replication systems rely on causality information and operation semantics. It is relatively easy to use semantics of high-level operations on replicated data structures, such as sets, lists, etc. However, it is difficult to exploit semantics of operations on registers, which store opaque data. In existing register designs, concurrent writes are resolved either by the application, or by arbitrating them according to their timestamps. The former is complex and may require user intervention, whereas the latter causes arbitrary updates to be lost. In this work, we identify a register construction that generalizes existing ones by combining runtime causality ordering, to identify concurrent writes, with static data semantics, to resolve them. We propose a simple conflict resolution template based on an application-predefined order on the domain of values. It eliminates or reduces the number of conflicts that need to be resolved by the user or by an explicit application logic. We illustrate some variants of our approach with use cases, and how it generalizes existing designs.
@TechReport{register-revisited,
author = {Marek Zawirski and Carlos Baquero and Annette Bieniusa
and Nuno Pregui{\c c}a and Marc Shapiro},
title = {Eventually Consistent Register Revisited},
institution = {\url{arXiv.org}},
year = 2015,
number = {arXiv:1511.05010 [cs.DC]},
month = nov,
url = {http://arxiv.org/abs/1511.05010}
}
- Write Fast, Read in the Past: Causal Consistency for Client-side Applications
Marek Zawirski, Nuno Preguiça, Sérgio Duarte, Annette Bieniusa, Valter Balegas, Marc Shapiro.
Middleware'15: 16th International Middleware Conference, Vancouver, BC, Canada. December 2015.
Extended technical report: Rapport de recherche RR-8729, Inria. May 2015.
[PDF] [abstract] [bib] [tech report PDF (extended version)] [HAL] [github]
Client-side apps (e.g., mobile or in-browser) need cloud data to be available in a local cache, for both reads and updates. For optimal user experience and developer support, the cache should be consistent and fault-tolerant. In order to scale to high numbers of unreliable and resource-poor clients, and large database, the system needs to use resources sparingly. The SwiftCloud distributed object database is the first to provide fast reads and writes via a causally-consistent client-side local cache backed by the cloud. It is thrifty in resources and scales well, thanks to consistent versioning provided by the cloud, using small and bounded metadata. It remains available during faults, switching to a different data centre when the current one is not responsive, while maintaining its consistency guarantees. This paper presents the SwiftCloud algorithms, design, and experimental evaluation. It shows that client-side apps enjoy the high performance and availability, under the same guarantees as a remote cloud data store, at a small cost.
@InProceedings{swiftcloud,
author = {Zawirski, Marek and Pregui{\c c}a, Nuno and Duarte,
S{\'e}rgio and Bieniusa, Annette and Balegas, Valter
and Shapiro, Marc},
title = {Write Fast, Read in the Past: Causal Consistency for
Client-side Applications},
booktitle = Middleware,
year = 2015,
month = dec,
address = {Vancouver, BC, Canada},
organization = {ACM/IFIP/Usenix},
Keywords = {geo-replication; causal consistency; eventual
consistency; fault tolerance; client-side
applications},
}
@TechReport{swiftcloud-tr15,
author = {Zawirski, Marek and Pregui{\c c}a, Nuno and Duarte,
S{\'e}rgio and Bieniusa, Annette and Balegas, Valter
and Shapiro, Marc},
title = {Write Fast, Read in the Past: Causal Consistency for Client-side Applications},
institution = {Inria},
year = 2015,
number = {RR-8729}
}
- Dependable Eventual Consistency with Replicated Data Types
Marek Zawirski.
PhD thesis, Inria/UPMC. January 2015.
[PDF] [abstract] [bib] [HAL]
Web applications rely on replicated databases to place data close to their users, and to tolerate
failures. Many replication systems opt for eventual consistency, which offers excellent responsiveness
and availability, but may expose applications to the complexity of concurrency and failures.
To alleviate this problem, recent replication systems have begun to strengthen their interface
with additional guarantees, such as causal consistency, which protects the application from
ordering anomalies, and Replicated Data Types (RDTs), which encapsulate concurrent updates
via high level object interface and ensure convergent semantics. However, dependable algorithms
for these abstractions come at a high cost in metadata size. This thesis studies three related
topics: (i) design of RDT implementations with minimized metadata; (ii) design of consistency
algorithms with minimized metadata; and (iii) the limits of the design space.
Our first contribution is a study of metadata complexity of RDTs. RDTs need metadata
to provide rich semantics under concurrency and failures. Several existing implementations
incur high overhead in storage space, in the number of updates, or polynomial in the number of
replicas. Minimizing metadata is nontrivial without impacting their semantics or fault-tolerance.
We design optimized set and register RDTs with metadata overhead reduced to the number of
replicas. We also demonstrate metadata lower bounds for six RDTs, thereby proving optimality of
four implementations. As a result, RDT designers and users can better navigate the design space.
Our second contribution is the design of SwiftCloud, a replicated causally-consistent RDT
object database for client-side applications, e.g., mobile or in-browser apps. We devise algorithms
to support high numbers of client-side partial replicas backed by the cloud, in a fault-tolerant
manner, with small and bounded metadata. We demonstrate how to support available and
consistent reads and updates, at the expense of some slight data staleness; i.e., our approach
trades freshness for scalability (small and bounded metadata, parallelism), and availability
(ability to switch between data centers on failure). Our experiments with thousands of client
replicas confirm the design goals were achieved at a low staleness cost.
@PhdThesis{thesis-zawirski,
author = {Marek Zawirski},
title = {{Dependable Eventual Consistency with Replicated Data Types}},
school = {L'université Pierre et Marie Curie (UPMC)},
year = 2015,
month = jan,
address = {Paris, France},
}
- Replicated Data Types: Specification, Verification, Optimality
Sebastian Burckhardt, Alexey Gotsman, Hongseok Yang, Marek Zawirski.
POPL'14: 41st Symposium on Principles of Programming Languages, San Diego, CA, USA. January 2014.
[PDF] [abstract] [bib] [extended version PDF]
Geographically distributed systems often rely on replicated eventually
consistent data stores to achieve availability and performance. To resolve
conflicting updates at different replicas, researchers and practitioners have
proposed specialized consistency protocols, called replicated data types, that
implement objects such as registers, counters, sets or lists. Reasoning about
replicated data types has however not been on par with comparable work on
abstract data types and concurrent data types, lacking specifications,
correctness proofs, and optimality results.
To fill in this gap, we propose a framework for specifying replicated data types
using relations over events and verifying their implementations using
replication-aware simulations. We apply it to 7 existing implementations of 4
data types with nontrivial conflict-resolution strategies and optimizations
(last-writer-wins register, counter, multi-value register and observed-remove
set). We also present a novel technique for obtaining lower bounds on the
worst-case space overhead of data type implementations and use it to prove
optimality of 4 implementations. Finally, we show how to specify consistency of
replicated stores with multiple objects axiomatically, in analogy to prior work
on weak memory models. Overall, our work provides foundational reasoning tools
to support research on replicated eventually consistent stores.
@InProceedings{rdt-popl14,
author = {Sebastian Burckhardt and Alexey Gotsman and Hongseok Yang and Marek Zawirski},
title = {{Replicated Data Types: Specification, Verification, Optimality}},
booktitle = {{41st Symposium on Principles of Programming Languages (POPL)}},
year = 2014,
}
- SwiftCloud: Fault-Tolerant Geo-Replication Integrated all the Way to the Client Machine
Marek Zawirski, Annette Bieniusa, Valter Balegas, Sérgio Duarte, Carlos Baquero, Marc Shapiro, Nuno Preguiça
Technical report: Rapport de recherche RR-8347, Inria. August 2013. Superseded by the Middleware paper.
[PDF] [abstract] [bib] [poster] [HAL] [arXiv]
Client-side logic and storage are increasingly used in web and mobile
applications to improve response time and availability. Current approaches
tend to be ad-hoc and poorly integrated with the server-side logic.
We present a principled approach to integrate client- and server-side
storage. We support mergeable and strongly consistent transactions that
target either client or server replicas and provide access to
causally-consistent snapshots efficiently. In the presence of infrastructure
faults, a client-assisted failover solution allows client execution to
resume immediately and seamlessly access consistent snapshots without waiting.
We implement this approach in SwiftCloud, the first transactional system to
bring geo-replication all the way to the client machine.
Example applications show that our programming model is useful across a range
of application areas. Our experimental evaluation shows that SwiftCloud
provides better fault tolerance and at the same time can improve both latency
and throughput by up to an order of magnitude, compared to classical
geo-replication techniques.
Please refer to the recent paper on SwiftCloud instead when possible.
@TechReport{swiftcloud-tr13,
author = {Marek Zawirski
and Annette Bieniusa
and Valter Balegas
and S{\'e}rgio Duarte
and Carlos Baquero
and Marc Shapiro
and Nuno Pregui{\c c}a},
title = {{S}wift{C}loud: Fault-Tolerant Geo-Replication Integrated
all the Way to the Client Machine},
institution = {Inria},
year = 2013,
number = {RR-8347}
}
- An Optimized Conflict-free Replicated Set
Annette Bieniusa, Marek Zawirski, Nuno Preguiça, Marc Shapiro, Carlos Baquero, Valter Balegas, Sérgio Duarte.
Technical report: Rapport de recherche RR-8083, Inria. October 2012.
[PDF] [abstract] [bib] [HAL] [arXiv]
Eventual consistency of replicated data supports concurrent updates, reduces
latency and improves fault tolerance, but forgoes strong consistency.
Accordingly, several cloud computing platforms implement eventually-consistent
data types.
The set is a widespread and useful abstraction, and many replicated set designs
have been proposed. We present a reasoning abstraction, permutation
equivalence, that systematizes the characterization of the expected concurrency
semantics of concurrent types. Under this framework we present one of the
existing conflict-free replicated data types, Observed-Remove Set.
Furthermore, in order to decrease the size of meta-data, we propose a new
optimization to avoid tombstones. This approach that can be transposed to other
data types, such as maps, graphs or sequences.
@TechReport{orswot-tr12,
author = {Annette Bieniusa and Marek Zawirski and Nuno
Pregui{\c c}a and Marc Shapiro and Carlos Baquero and Valter
Balegas and S{\'e}rgio Duarte},
title = {An Optimized Conflict-free Replicated Set},
institution = {Inria},
year = 2012,
number = {RR-8083}
}
- Brief Announcement: Semantics of Eventually Consistent Replicated Sets
Annette Bieniusa, Marek Zawirski, Nuno Preguiça, Marc Shapiro, Carlos Baquero, Valter Balegas, Sérgio Duarte.
DISC'12: 26th International Symposium on Distributed Computing, Salvador, Bahia, Brazil. October 2012.
[PDF] [bib]
@InProceedings{crdt-sets-disc12,
author = {Annette Bieniusa and Marek Zawirski and Nuno Pregui{\c c}a
and Marc Shapiro and Carlos Baquero and Valter
Balegas and S{\'e}rgio Duarte},
title = {Brief Announcement: Semantics of Eventually Consistent
Replicated Sets},
booktitle = {{26th International Symposium on Distributed Computing (DISC)}},
year = 2012,
}
- The Space Complexity of Transactional Interactive Reads
Masoud Saeida Ardekani, Marek Zawirski, Pierre Sutra, Marc Shapiro.
HotCDP'12: 1st Workshop on Hot Topics in Cloud Data Processing (in conjunction with EuroSys 2012), Bern, Switzerland. April 2012.
[PDF] [abstract] [bib]
Transactional Web Applications need to perform fast interactive reads while
ensuring reasonable isolation guarantees.
This paper studies the problem of taking consistent snapshots for transactions
with interactive reads. We introduce four levels of freshness, and solutions
to guarantee them. We also explore trade-offs between the space complexity and
the freshness levels.
@InProceedings{interactive-reads-hotcdp12,
author = {Saeida Ardekani, Masoud and Marek Zawirski and
Pierre Sutra and Marc Shapiro},
title = {The Space Complexity of Transactional Interactive
Reads},
booktitle = {{1st Workshop on Hot Topics in Cloud Data Processing (HotCDP)}},
year = 2012
}
- Conflict-free Replicated Data Types
Marc Shapiro, Nuno Preguiça, Carlos Baquero, Marek Zawirski.
SSS'11: 13th International Symposium on Stabilization, Safety, and Security of Distributed Systems, Grenoble, France. October 2011.
[PDF] [abstract] [bib] [tech report PDF (extended version)] [HAL]
Replicating data under Eventual Consistency (EC) allows any replica to accept
updates without remote synchronisation. This ensures performance and
scalability in large-scale distributed systems (e.g., clouds). However,
published EC approaches are ad-hoc and error-prone. Under a formal Strong
Eventual Consistency (SEC) model, we study sufficient conditions for
convergence. A data type that satisfies these conditions is called a
Conflict-free Replicated Data Type (CRDT). Replicas of any CRDT are guaranteed
to converge in a self-stabilising manner, despite any number of failures.
This paper formalises two popular approaches (state- and operation-based) and
their relevant sufficient conditions. We study a number of useful CRDTs, such
as sets with clean semantics, supporting both add and remove operations, and
consider in depth the more complex Graph data type. CRDT types can be composed
to develop large-scale distributed applications, and have interesting
theoretical properties.
@InProceedings{crdts-sss11,
author = {Marc Shapiro and Nuno Pregui{\c c}a and Carlos Baquero
and Marek Zawirski},
title = {Conflict-free Replicated Data Types},
booktitle = {{13th International Symposium on Stabilization, Safety, and Security of Distributed Systems (SSS)}},
year = 2011
}
- Convergent and Commutative Replicated Data Types
Marc Shapiro, Nuno Preguiça, Carlos Baquero, Marek Zawirski.
BEACTS#104: Bulletin of the European Association for Theoretical Computer Science, Number 104. June 2011.
[PDF] [abstract] [bib]
Eventual consistency aims to ensure that replicas of some mutable shared
object converge without foreground synchronisation. Previous approaches
to eventual consistency are ad-hoc and error-prone. We study a principled
approach: to base the design of shared data types on some simple formal
conditions that are sufficient to guarantee eventual consistency. We call these
types Convergent or Commutative Replicated Data Types (CRDTs). This
paper formalises asynchronous object replication, either state based or
operation based, and provides a sufficient condition appropriate for each case.
It describes several useful CRDTs, including container data types supporting
both add and remove operations with clean semantics, and more complex
types such as graphs and monotonic DAGs. It discusses some properties
needed to implement non-trivial CRDTs.
@Article{crdts-beacts11,
author = {Marc Shapiro and Nuno Pregui{\c c}a and Carlos Baquero and
Marek Zawirski},
title = {Convergent and Commutative Replicated Data Types},
journal = {{Bulletin of the European Association for Theoretical
Computer Science (EATCS)}},
year = 2011,
number = 104,
}
- A comprehensive study of Convergent and Commutative Replicated Data Types
Marc Shapiro, Nuno Preguiça, Carlos Baquero, Marek Zawirski.
Technical report: Rapport de recherche RR-7506, Inria. January 2011.
[PDF] [abstract] [bib] [HAL]
Eventual consistency aims to ensure that replicas of some mutable shared
object converge without foreground synchronisation. Previous approaches to
eventual consistency are ad-hoc and error-prone. We study a principled
approach: to base the design of shared data types on some simple formal
conditions that are sufficient to guarantee eventual consistency. We call
these types Convergent or Commutative Replicated Data Types (CRDTs). This paper
formalises asynchronous object replication, either state based or operation
based, and provides a sufficient condition appropriate for each case. It
describes several useful CRDTs, including container data types supporting both
add and remove operations with clean semantics, and more complex types such as
graphs, montonic DAGs, and sequences. It discusses some properties needed to
implement non-trivial CRDTs.
@TechReport{crdts-comprehensive-tr11,
author = {Marc Shapiro and Nuno Pregui{\c c}a and Carlos Baquero
and Marek Zawirski},
title = {A comprehensive study of {C}onvergent and {C}ommutative
{R}eplicated {D}ata {T}ypes},
institution = {{Inria}},
year = 2011,
number = {RR-7506}
}
Subreviewer or coreviewer for: DISC 2013, ICDCN 2014, POPL 2014, Middleware 2014.
Reviewer for W-PSDS 2015.
I have always been passionate about offshore and ocean sailing.
On the active side, I am at sea with my friends or family as much as I can (which always feels too little!).
I keep track of approximate
routes of my trips.
Back at home, I enjoy following
round-the-world races such as
Volvo Ocean Race and
Vendée Globe.
I am also keen of mountain biking.