CRDTs - Conflict-free Replicated Data Types

Distributed Data Structure C Applications & Practices

Basic Information

  • Name: CRDTs (Conflict-free Replicated Data Types)
  • Formal Proposal: 2011 (Marc Shapiro, Nuno Preguica, Carlos Baquero, Marek Zawirski)
  • Type: Distributed Data Structure
  • Application Fields: Collaborative Editing, Distributed Databases, Local-first Software
  • Academic Resources: https://crdt.tech/

Conceptual Description

CRDTs are a special class of data structures that can be independently replicated across multiple computers. Each replica can be modified simultaneously and independently without coordination with other replicas. Regardless of the order in which modifications arrive, all replicas will eventually converge to the same state automatically, without the need for any special conflict resolution code or user intervention.

Core Features

  • Eventual Consistency: All replicas will eventually converge to the same state
  • Coordination-free Modifications: Any replica can be modified independently without central coordination
  • Automatic Merging: Concurrent modifications are automatically merged without manual conflict resolution
  • Decentralization: No central server is required, supporting P2P networks
  • Offline-friendly: Offline modifications can be automatically merged upon reconnection

Two Types of CRDTs

State-based CRDTs (CvRDTs)

  • Send the complete local state to other replicas on each update
  • The receiver merges the received state with the local state
  • The merge operation must satisfy commutativity, associativity, and idempotence
  • Higher bandwidth consumption but simpler implementation

Operation-based CRDTs (CmRDTs)

  • Send update operations directly to other replicas and apply them
  • More bandwidth-efficient (only transmitting operations rather than the complete state)
  • Requires reliable causal order delivery
  • More complex implementation

Common CRDT Data Types

  • G-Counter: Grow-only Counter
  • PN-Counter: Positive-Negative Counter (supports increment and decrement)
  • G-Set: Grow-only Set
  • OR-Set: Observe-Remove Set (supports addition and removal)
  • LWW-Register: Last-Write-Wins Register
  • RGA: Replicated Growable Array (used for sequences/text)
  • YATA: Text CRDT algorithm used by Yjs

Major Implementations

  • Automerge: JSON Document CRDT (JavaScript/Rust)
  • Yjs: Fastest JavaScript CRDT implementation
  • Loro: Emerging CRDT library
  • Diamond Types: High-performance CRDT (Rust)

Production-level Applications

  • Redis: Uses CRDTs for multi-region replication
  • Riak: Distributed database with built-in CRDTs
  • Azure Cosmos DB: Provides CRDT data types
  • SoundCloud: Uses CRDTs for audio distribution
  • Figma: Uses CRDT-like technology for collaborative design

Challenges and Limitations

  • Some operational semantics are difficult to express in CRDTs
  • Storage overhead of state-based CRDTs grows over time
  • Deletion operations require "tombstone" markers
  • Some conflict resolution strategies may not align with user intuition
  • Garbage Collection (GC) is complex in distributed environments

Relationship with OpenClaw

CRDTs can be used for multi-device synchronization in OpenClaw—when users use OpenClaw simultaneously on their phones and computers, conversation history and configurations can be automatically synchronized without conflicts using CRDTs, eliminating the need for a central server.

Sources

External References

Learn more from these authoritative sources: