High Availability

Chatto stores all persistent data in NATS JetStream. For high availability, run a multi-node NATS cluster with JetStream replication so that data survives individual node failures.

[Diagram: Chatto replica 1 and Chatto replica 2; NATS cluster (Raft consensus) with nats-1 (leader), nats-2 (follower), nats-3 (follower); replication between the NATS nodes.]

A highly available Chatto deployment consists of:

  • A NATS cluster (3 or more nodes) running JetStream with Raft consensus
  • One or more Chatto server processes connecting to the cluster as clients
  • A load balancer distributing traffic across Chatto server processes

NATS handles leader election and data replication automatically. Chatto server processes are stateless and don’t need to know about the cluster topology — they just connect to a NATS URL.

Chatto’s embedded NATS server is designed for single-node convenience. For high availability, run a dedicated NATS cluster.

Each NATS node needs a configuration like this:

# nats-1.conf
server_name: nats-1
listen: 0.0.0.0:4222

# JetStream persistence and storage limits
jetstream {
  store_dir: /data/jetstream
  max_mem: 1G
  max_file: 50G
}

# Cluster routes: the same list on every node
cluster {
  name: chatto
  listen: 0.0.0.0:6222
  routes: [
    nats-route://nats-1:6222
    nats-route://nats-2:6222
    nats-route://nats-3:6222
  ]
}

# Shared token that clients (including Chatto) present when connecting
authorization {
  token: "your-shared-token"
}

The three nodes can be run with Docker Compose:

services:
  nats-1:
    image: nats:latest
    command: ["--config", "/etc/nats/nats.conf"]
    volumes:
      - ./nats-1.conf:/etc/nats/nats.conf:ro
      - nats1_data:/data
  nats-2:
    image: nats:latest
    command: ["--config", "/etc/nats/nats.conf"]
    volumes:
      - ./nats-2.conf:/etc/nats/nats.conf:ro
      - nats2_data:/data
  nats-3:
    image: nats:latest
    command: ["--config", "/etc/nats/nats.conf"]
    volumes:
      - ./nats-3.conf:/etc/nats/nats.conf:ro
      - nats3_data:/data

# Named volumes must be declared at the top level
volumes:
  nats1_data:
  nats2_data:
  nats3_data:

Each node gets its own config file with a unique server_name but the same routes list.
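
Because the files differ only in server_name, they can be generated from one template. The following is a minimal Python sketch, not part of Chatto or NATS: the names NODES, TEMPLATE, render_config and write_configs are hypothetical, and the template simply mirrors the example configuration shown on this page.

```python
from pathlib import Path

# Hypothetical node names, matching the example cluster on this page.
NODES = ["nats-1", "nats-2", "nats-3"]

# Literal braces are doubled because str.format() is used below.
TEMPLATE = """\
server_name: {name}
listen: 0.0.0.0:4222
jetstream {{
  store_dir: /data/jetstream
  max_mem: 1G
  max_file: 50G
}}
cluster {{
  name: chatto
  listen: 0.0.0.0:6222
  routes: [
{routes}
  ]
}}
authorization {{
  token: "your-shared-token"
}}
"""


def render_config(name: str, nodes=NODES) -> str:
    """Return the config text for one node; only server_name varies."""
    routes = "\n".join(f"    nats-route://{n}:6222" for n in nodes)
    return TEMPLATE.format(name=name, routes=routes)


def write_configs(directory: str = ".") -> None:
    """Write nats-1.conf, nats-2.conf, ... into the given directory."""
    out = Path(directory)
    out.mkdir(parents=True, exist_ok=True)
    for name in NODES:
        (out / f"{name}.conf").write_text(render_config(name))
```

Calling write_configs() in the directory that holds the Compose file would produce the three files that the volume mounts above expect.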

Once your NATS cluster is running, configure Chatto to replicate data across nodes:

CHATTO_NATS_REPLICAS=3

This controls how many copies of each stream, KV bucket, and object store NATS maintains. The value must be an odd number so that a majority quorum can be formed:

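| Replicas | Nodes Required | Tolerates | Use Case |
|----------|----------------|-----------|----------|
| 1 | 1 | No failures | Development, single-node |
| 3 | 3+ | 1 node failure | Production |
| 5 | 5+ | 2 node failures | Critical deployments |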
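
The "Tolerates" column follows from majority-quorum arithmetic: a group of N replicas needs more than half of its members available, so it can lose N minus that majority. A small Python sketch of the calculation (the function names are illustrative only):

```python
def quorum_size(replicas: int) -> int:
    """Smallest majority of a replica group with `replicas` members."""
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """Number of members that can be lost while a majority remains."""
    return replicas - quorum_size(replicas)


for r in (1, 3, 5):
    print(f"replicas={r} quorum={quorum_size(r)} tolerates={tolerated_failures(r)}")
```

By the same arithmetic, 4 replicas would tolerate only one failure, the same as 3, so an even value adds cost without adding fault tolerance.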

All Chatto storage — KV buckets, event streams, and object stores — uses the same replication factor. There are no per-bucket overrides.

All critical data is durably stored and replicated:

  • Server data — users, memberships, configuration, roles, permissions, rooms, message bodies, reactions, threads, read status, call start/join/leave/end facts
  • Assets — avatars, attachments (unless offloaded to S3)
  • Auth tokens — bearer tokens for cross-origin authentication
  • Notifications — user notifications (90-day TTL)

Some data is kept in memory for speed and has short TTLs:

  • Presence — online/offline status (60-second TTL, auto-expires)

This state is replicated across the NATS cluster for consistency, but it is reconstructed automatically if lost.

LiveKit E2EE call keys are stored behind Chatto’s KMS boundary in the ENCRYPTION_KEYS bucket. They are replicated along with the rest of the file-backed NATS state, excluded from normal backups, and shredded when calls end. If key storage is lost during an active call, clients must reconnect into a fresh call.

Disable embedded NATS and point Chatto at your cluster:

CHATTO_NATS_EMBEDDED_ENABLED=false
CHATTO_NATS_CLIENT_URL=nats://nats-1:4222,nats://nats-2:4222,nats://nats-3:4222
CHATTO_NATS_CLIENT_AUTH_METHOD=token
CHATTO_NATS_CLIENT_TOKEN=your-shared-token
CHATTO_NATS_REPLICAS=3

Listing multiple URLs gives the client failover — if one node is down, it connects to another.
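
To illustrate the failover idea, the sketch below splits a comma-separated URL list and picks the first server that accepts a TCP connection. It is only an approximation of what a NATS client does internally (real clients also handle reconnection and server discovery), and the helper names parse_servers, tcp_reachable and first_reachable are hypothetical.

```python
import socket
from urllib.parse import urlsplit

DEFAULT_NATS_PORT = 4222


def parse_servers(url_list: str):
    """Split a comma-separated NATS URL list into (host, port) pairs."""
    servers = []
    for raw in url_list.split(","):
        raw = raw.strip()
        if not raw:
            continue  # ignore empty entries such as a trailing comma
        parts = urlsplit(raw)
        servers.append((parts.hostname, parts.port or DEFAULT_NATS_PORT))
    return servers


def tcp_reachable(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def first_reachable(servers, probe=tcp_reachable):
    """Return the first (host, port) the probe accepts, or None."""
    for host, port in servers:
        if probe(host, port):
            return host, port
    return None
```

For example, first_reachable(parse_servers("nats://nats-1:4222,nats://nats-2:4222")) would return ("nats-2", 4222) when nats-1 is down and nats-2 is up.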

| Scenario | Impact | Recovery |
|----------|--------|----------|
| 1 NATS node down (R3) | No data loss, brief leader election | Automatic |
| Chatto server process down | Load balancer routes to healthy server processes | Automatic |
| Minority of NATS nodes down | Reads and writes continue normally | Automatic |
| Majority of NATS nodes down | Writes rejected (no quorum), reads may work | Restore nodes |
| All NATS nodes down | Complete outage | Restore cluster or restore from backup |
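
The NATS rows of the failure scenarios can be summarized as a small decision function. This Python sketch assumes the replication factor equals the number of NATS nodes (for example R3 on a 3-node cluster); cluster_impact is a hypothetical helper, not a Chatto or NATS API.

```python
def cluster_impact(total_nodes: int, nodes_down: int) -> str:
    """Classify the expected impact of losing `nodes_down` NATS nodes.

    Assumes every stream is replicated to all `total_nodes` nodes.
    """
    if total_nodes < 1 or not 0 <= nodes_down <= total_nodes:
        raise ValueError("nodes_down must be between 0 and total_nodes")

    nodes_up = total_nodes - nodes_down
    majority = total_nodes // 2 + 1

    if nodes_up == 0:
        return "complete outage"
    if nodes_up >= majority:
        return "reads and writes continue"
    return "writes rejected (no quorum)"


for down in range(4):
    print(down, "down:", cluster_impact(3, down))
```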