What is a CRDT?
In distributed computing, a conflict-free replicated data type (abbreviated CRDT) is a specially designed data structure used to achieve strong eventual consistency (SEC) and monotonicity (absence of rollbacks). The concept draws above all on the work of Shapiro, Preguiça, Baquero, and Zawirski. There are two alternative routes to ensuring SEC: operation-based CRDTs and state-based CRDTs. The two alternatives are equivalent, as one can emulate the other, but operation-based CRDTs require additional guarantees from the communication middleware.
CRDTs are used to replicate data across multiple computers in a network, executing updates without the need for remote synchronization. This would lead to merge conflicts in systems using conventional eventual consistency technology, but CRDTs are designed such that conflicts are mathematically impossible. Under the constraints of the CAP theorem, they provide the strongest consistency guarantees for available/partition-tolerant (AP) settings. In contrast, consensus protocols such as Paxos are required for strongly consistent/partition-tolerant (CP) settings.
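As an illustration of the state-based route, here is a minimal sketch of a grow-only counter in Python. The class name GCounter, its fields, and the replica names "a" and "b" are hypothetical and chosen only for this example; they are not taken from any SyncFree library. Each replica increments only its own slot, and merging takes the element-wise maximum, so replicas converge whatever the order or number of merges.

```python
from dataclasses import dataclass, field


@dataclass
class GCounter:
    """Hypothetical grow-only counter: one non-decreasing slot per replica."""
    replica_id: str
    counts: dict = field(default_factory=dict)

    def increment(self, amount: int = 1) -> None:
        # Local update: touches only this replica's own slot, no coordination.
        if amount < 0:
            raise ValueError("a grow-only counter cannot decrease")
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self) -> int:
        # The counter's value is the sum of all per-replica slots.
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Element-wise maximum: commutative, associative, and idempotent.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)


# Two replicas update concurrently, then exchange state in both directions.
a = GCounter("a")
b = GCounter("b")
a.increment(2)
b.increment(3)
a.merge(b)
b.merge(a)
b.merge(a)  # a duplicated merge changes nothing
print(a.value(), b.value())  # prints "5 5"
```

Because the merge is idempotent, a state-based replica tolerates duplicated or reordered messages. An operation-based variant would ship the increment itself instead of the whole state, and would typically need the middleware to deliver each operation exactly once.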
Find out more about CRDTs on Wikipedia.
Who is using CRDTs?
Where are we going?
The goal of SyncFree is to enable large-scale distributed applications without global synchronisation, by building upon the recent concept of Conflict-free Replicated Data Types (CRDTs). CRDTs allow unsynchronised concurrent updates, yet ensure data consistency. This revolutionary approach maximises responsiveness and availability; it enables locating data near its users, in decentralised clouds. SyncFree aims to enable extreme-scale replication by applying eventual consistency techniques to CRDTs and beyond.
Global-scale applications, such as virtual wallets, advertising platforms, social networks, online games, or collaboration networks, require consistency across distributed data items. As networked users, objects, devices, and sensors proliferate, the consistency issue is increasingly acute for the software industry. Both classical alternatives are unsatisfactory: relying on synchronisation to ensure strong consistency, or forfeiting synchronisation and consistency altogether with ad-hoc eventual consistency. The former approach does not scale beyond a single data centre and is expensive. The latter is extremely difficult to understand, and remains error-prone, even for highly skilled programmers.
CRDTs avoid both global synchronisation and the complexities of ad-hoc eventual consistency by leveraging a few simple formal properties. CRDTs are designed so that unsynchronised concurrent updates do not conflict and have well-defined semantics. Combining CRDT objects from a standard library of proven datatypes (counters, sets, graphs, sequences, etc.) makes large-scale distributed programming simpler and less error-prone. CRDTs are a practical and cost-effective approach. The SyncFree project develops both theoretical and practical understanding of large-scale synchronisation-free programming based on CRDTs. Project results are new industrial applications, new application architectures, large-scale evaluation of both, programming models and algorithms for large-scale applications, and advanced scientific understanding.
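To give a feel for how richer datatypes can be built by combining simpler ones, the following Python sketch composes a counter that supports decrements out of two grow-only maps, one recording increments and one recording decrements. The names PNCounter and join, and the replica names "x" and "y", are hypothetical and used only for illustration; this is a sketch under those assumptions, not the project's library.

```python
def join(mine: dict, theirs: dict) -> dict:
    """Element-wise maximum of two per-replica count maps."""
    return {rid: max(mine.get(rid, 0), theirs.get(rid, 0))
            for rid in mine.keys() | theirs.keys()}


class PNCounter:
    """Hypothetical counter with decrements, built from two grow-only maps."""

    def __init__(self, replica_id: str) -> None:
        self.replica_id = replica_id
        self.incs: dict = {}  # per-replica totals of increments (only grow)
        self.decs: dict = {}  # per-replica totals of decrements (only grow)

    def increment(self, amount: int = 1) -> None:
        # Amounts are assumed to be non-negative in this sketch.
        self.incs[self.replica_id] = self.incs.get(self.replica_id, 0) + amount

    def decrement(self, amount: int = 1) -> None:
        # A decrement is recorded as growth of the separate "decs" map.
        self.decs[self.replica_id] = self.decs.get(self.replica_id, 0) + amount

    def value(self) -> int:
        return sum(self.incs.values()) - sum(self.decs.values())

    def merge(self, other: "PNCounter") -> None:
        # Merging the composite is just merging each component.
        self.incs = join(self.incs, other.incs)
        self.decs = join(self.decs, other.decs)


# One replica adds 5 while another concurrently subtracts 2.
x = PNCounter("x")
y = PNCounter("y")
x.increment(5)
y.decrement(2)
x.merge(y)
y.merge(x)
print(x.value(), y.value())  # prints "3 3"
```

The composite inherits convergence from its parts: since each map merges by element-wise maximum, the pair of maps does too, and no extra conflict-resolution logic is needed.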
The SyncFree project advances both the theory and practice of large-scale application architectures, and especially of CRDTs and related mechanisms. As the SyncFree industrial partners already have large user bases and feel the need for increased scalability in their applications, the project aims to include extreme-scale crowd-sourced experiments, pushing the scalability needs of real-world applications. An open-source library of CRDTs, to be used in future scalable distributed applications, will be made available, leaving a lasting and beneficial impact far beyond the end of the project. Beyond CRDTs, the project explores global invariants in an extreme-scale environment to develop programming tools and patterns for extreme-scale replication, and to experiment in vivo with extreme-scale real applications.