Operational transformation & conflict-free resolution for real-time collaboration applications

Table of Contents

I’m currently interviewing with a bunch of companies in the United Kingdom & part of this means I have the opportunity to find out about challenges certain businesses face. One such challenge is that of real-time collaboration within a company’s application.

This has sparked my interest in how real-time collaborative technologies are implemented.

Multiple users being able to edit the same document is fairly common place. It’s so common, in fact, that applications like Google Docs, Microsoft Word, Canva, Figma, Miro, Notion, Linear and many more have support for it out of the box. It’s so, so common that it’s actually weird to see applications without it.

Now all I want to know is how it works.

Techniques #

In exploring this, I’ve found two fairly common techniques that can help.

Operational Transformation (OT) #

This algorithm is most commonly used in real-time collaboration. It processes changes to the current state of the document and applies them server-side in a manner where the order of operations doesn’t affect consistency.

Conflict-free Replicated Data Types (CRDTs) #

Similar to OT, CRDTs ensure that copies of the data eventually converge, regardless of the order of changes. Actions in CRDTs are designed to be commutative, idempotent, and associative, simplifying the merging process. CRDTs are great because they don’t necessarily rely on a server and can work peer-to-peer. This means that users can continue working offline and sync up seamlessly once they reconnect, with no fear of their changes causing inconsistencies.

Considerations #

There are a bunch of other considerations that should be taken into account as well, but I won’t go into too much depth here.

Network latency & disconnections #

Gracefully handling network latency and disconnections on the client side is crucial. Implementing strategies to expire older state changes may help reduce UI inconsistencies and conflict.

Real-time updates #

Instead of polling, using technologies like WebSockets or gRPC can efficiently manage real-time state updates from the server.

Conflict resolution UX #

The user interface should provide intuitive mechanisms for users to understand and resolve conflicts when they occur.

Chronology & age of a state change #

While CRDTs don’t rely on the order of the changes, OTs rely on previous and next state. The distance between chronological changes can cause complexity in conflict resolution. Additionally, the age of a state change could impact future state changes if they aren’t appropriately discarded by the client side after a reasonable TTL. In other words, don’t submit super old state changes from when the network connection was lost, just submit the latest. (Unless you store chronology and use this for server-side conflict resolution later.)

Implementations #

Operational transformation #

In a nutshell, OT is all about ensuring that whatever you do, the end state is consistent, regardless of the order of operations. This is crucial because in a real-world scenario, you never know in what order users’ changes will reach the server. You can collect device timestamps, but synchronising them between many clients is near impossible (but as I mentioned above, could be super useful in conflict resolution if you keep a history of changes).

Concepts #

Idempotency: This is the superhero of concepts in the world of real-time editing. Idempotency means that no matter how many times a particular operation is performed, the result will be the same as if it were performed just once. Imagine if hitting the post button on Facebook kept re-submitting your latest thoughts; you would have no friends! In collaborative applications, idempotency ensures that repeated updates don’t affect the final state beyond the initial application — crucial for maintaining sanity across collaborative sessions.
Associativity: Think of this as the mixer. It means that no matter how you group your operations the end result is going to be the same. Take the example of(a * b) * c = a * (b * c). This is super useful when you have a bunch of changes that need to be merged and you don’t want to spend time untangling what depends on what.
Commutativity: This concept ensures that the order of operations doesn’t matter — a * b = b * a. So, whether user A’s changes are applied before user B’s or vice versa, the end result will be the same. This is gold in environments where changes are flying in from all directions or even made offline.

Basic implementation example #

This is a very basic implementation example using a single string as the state. When mutating the state, we’re simply taking an index and either inserting or deleting text.

For example:

interface Operation {
  id: string;
  index: number;
  length: number;
  text: string;
  type: 'insert' | 'delete';
  timestamp: number;
}

function applyOperation(currentState: string, operation: Operation): string {
  switch (operation.type) {
    case 'insert':
      return currentState.slice(0, operation.index) + operation.text + currentState.slice(operation.index);
    case 'delete':
      return currentState.slice(0, operation.index) + currentState.slice(operation.index + operation.length);
    default:
      return currentState;
  }
}

function transformOperation(baseOperation: Operation, newOperation: Operation): Operation {
  if (newOperation.index > baseOperation.index) {
    switch (baseOperation.type) {
      case 'insert':
        newOperation.index += baseOperation.text.length;
        break;
      case 'delete':
        newOperation.index -= baseOperation.length;
        break;
    }
  }
  return newOperation;
}

function compareAndApplyOT(operations: Operation[], currentState: string): string {
  let appliedOperations = new Map<string, Operation>();
  operations.sort((a, b) => a.timestamp - b.timestamp);

  for (const operation of operations) {
    if (!appliedOperations.has(operation.id)) {
      let transformedOperation = operation;
      appliedOperations.forEach(appliedOp => {
        transformedOperation = transformOperation(appliedOp, transformedOperation);
      });
      currentState = applyOperation(currentState, transformedOperation);
      appliedOperations.set(operation.id, transformedOperation);
    }
  }
  return currentState;
}

Conflict-free Replicated Data Types #

CRDTs are another technique for managing state in real-time collaborative applications. They take a different approach by ensuring that every change can independently converge towards the same end result, no matter the order they are applied. This makes CRDTs particularly robust in environments with significant network disruptions.

CRDTs excel in situations where latency is more than just a minor inconvenience — think working from a train with spotty internet connection. Each action or edit made in a CRDT system is designed to independently merge without needing to reference a central server immediately. This means users can continue working offline and sync up seamlessly once they reconnect, with no fear of their changes causing inconsistencies.

Concepts #

Basic implementation example #

interface CRDTOperation {
  id: string;
  index: number;
  text: string;
  type: 'insert' | 'delete';
}

function applyOperation(currentState: string, operation: Operation): string {
  switch (operation.type) {
    case 'insert':
      return currentState.slice(0, operation.index) + operation.text + currentState.slice(operation.index);
    case 'delete':
      return currentState.slice(0, operation.index) + currentState.slice(operation.index + operation.length);
    default:
      return currentState;
  }
}

function compareAndApplyCRDT(operations: CRDTOperation[], currentState: string): string {
  let operationMap = new Map<string, CRDTOperation>();
  operations.forEach(op => operationMap.set(op.id, op)); // Ensures idempotency by overwriting duplicates

  let sortedOperations = Array.from(operationMap.values());
  // Sort by ID for consistency, but order does not change the result
  sortedOperations.sort((a, b) => a.id.localeCompare(b.id));

  for (const operation of sortedOperations) {
    currentState = applyOperation(currentState, operation); // Re-using the applyOperation function from OT
  }
  return currentState;
}

Conclusion #

I really liked exploring how these concepts work. Although we’re super connected today, it’s still critical applications handle this sort of complexity well in order to improve user experience.

Obviously in a real world scenario I wouldn’t be using a single string field to manage state, but I wanted to show you how they could work in practice.

Additionally, I likely would try to avoid using the application layer to process state changes. Although it’s possible, I’m not sure what impact that would have on database ACID compliance when updates are being submitted frequently from multiple clients - perhaps something a queue could handle.

I had a play with a few tools that helped me understand how they work. I hope you find them useful.

Thanks #

Big shout out to Tuhin Banerjee on Medium whose post helped me get started with my own implementation.