
High Availability

Chatto stores all persistent data in NATS JetStream. For high availability, run a multi-node NATS cluster with JetStream replication so that data survives individual node failures.

[Diagram: two Chatto replicas connect as clients to a three-node NATS cluster (nats-1 leader, nats-2 and nats-3 followers) that replicates data via Raft consensus.]

A highly available Chatto deployment consists of:

  • A NATS cluster (3 or more nodes) running JetStream with Raft consensus
  • One or more Chatto instances connecting to the cluster as clients
  • A load balancer distributing traffic across Chatto instances

NATS handles leader election and data replication automatically. Chatto instances are stateless and don’t need to know about the cluster topology — they just connect to a NATS URL.

Chatto’s embedded NATS server is designed for single-node convenience. For high availability, run a dedicated NATS cluster.

Each NATS node needs a configuration like this:

server_name: nats-1
listen: 0.0.0.0:4222

jetstream {
  store_dir: /data/jetstream
  max_mem: 1G
  max_file: 50G
}

cluster {
  name: chatto
  listen: 0.0.0.0:6222
  routes: [
    nats-route://nats-1:6222
    nats-route://nats-2:6222
    nats-route://nats-3:6222
  ]
}

authorization {
  token: "your-shared-token"
}
A Docker Compose file can run the three nodes, each with its own config file and a named volume for JetStream data:

services:
  nats-1:
    image: nats:latest
    command: ["--config", "/etc/nats/nats.conf"]
    volumes:
      - ./nats-1.conf:/etc/nats/nats.conf:ro
      - nats1_data:/data
  nats-2:
    image: nats:latest
    command: ["--config", "/etc/nats/nats.conf"]
    volumes:
      - ./nats-2.conf:/etc/nats/nats.conf:ro
      - nats2_data:/data
  nats-3:
    image: nats:latest
    command: ["--config", "/etc/nats/nats.conf"]
    volumes:
      - ./nats-3.conf:/etc/nats/nats.conf:ro
      - nats3_data:/data

volumes:
  nats1_data:
  nats2_data:
  nats3_data:

Each node gets its own config file with a unique server_name but the same routes list.
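Since the per-node files differ only in server_name, one way to keep them in sync is to generate all three from a single template. This is a hypothetical helper, not part of Chatto; the node names and token match the examples above:

```shell
# Generate nats-1.conf .. nats-3.conf from one template.
# Only server_name varies; the routes list is identical on every node.
for i in 1 2 3; do
  cat > "nats-$i.conf" <<EOF
server_name: nats-$i
listen: 0.0.0.0:4222
jetstream {
  store_dir: /data/jetstream
  max_mem: 1G
  max_file: 50G
}
cluster {
  name: chatto
  listen: 0.0.0.0:6222
  routes: [
    nats-route://nats-1:6222
    nats-route://nats-2:6222
    nats-route://nats-3:6222
  ]
}
authorization {
  token: "your-shared-token"
}
EOF
done
```

Regenerating the files after a template change avoids the classic drift bug where one node ends up with a stale routes list.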

Once your NATS cluster is running, configure Chatto to replicate data across nodes:

CHATTO_NATS_REPLICAS=3

This controls how many copies of each stream, KV bucket, and object store NATS maintains. The value must be odd so a Raft quorum can always be established:

Replicas   Nodes Required   Tolerates         Use Case
1          1                No failures       Development, single-node
3          3+               1 node failure    Production
5          5+               2 node failures   Critical deployments
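The table follows from Raft arithmetic: with R replicas, writes need a majority of R/2 + 1 nodes, so the cluster tolerates floor((R-1)/2) failures. A quick sanity check:

```shell
# With R replicas, Raft tolerates floor((R-1)/2) node failures.
# Note an even R=4 also tolerates just 1, same as R=3 -- which is why
# even replica counts buy nothing and odd values are required.
tolerates() { echo $(( ($1 - 1) / 2 )); }
tolerates 1   # 0
tolerates 3   # 1
tolerates 5   # 2
```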

All Chatto storage — KV buckets, event streams, and object stores — uses the same replication factor. There are no per-bucket overrides.

All critical data is durably stored and replicated:

  • Instance data — users, spaces, memberships, configuration, roles, permissions
  • Per-space data — rooms, message bodies, reactions, threads, read status
  • Assets — avatars, attachments (unless offloaded to S3)
  • Auth tokens — bearer tokens for cross-origin authentication
  • Notifications — user notifications (90-day TTL)

Some data is kept in memory for speed and has short TTLs:

  • Presence — online/offline status (60-second TTL, auto-expires)
  • Call state — active call participants (repopulated by LiveKit webhooks)

These are still replicated across the cluster for consistency, but they’re reconstructed automatically if lost.

Disable embedded NATS and point Chatto at your cluster:

CHATTO_NATS_EMBEDDED_ENABLED=false
CHATTO_NATS_CLIENT_URL=nats://nats-1:4222,nats://nats-2:4222,nats://nats-3:4222
CHATTO_NATS_CLIENT_AUTH_METHOD=token
CHATTO_NATS_CLIENT_TOKEN=your-shared-token
CHATTO_NATS_REPLICAS=3

Listing multiple URLs gives the client failover — if one node is down, it connects to another.
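Conceptually, the comma-separated value is just a pool of candidate servers the client works through; losing one node shifts the connection to the next. A small illustration of how the list splits (illustration only — real failover and reconnect logic live in the NATS client):

```shell
# Split the multi-URL value into its candidate servers.
CHATTO_NATS_CLIENT_URL="nats://nats-1:4222,nats://nats-2:4222,nats://nats-3:4222"
for url in $(echo "$CHATTO_NATS_CLIENT_URL" | tr ',' ' '); do
  echo "candidate: $url"   # a real client would attempt a connection here
done
```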

Scenario                       Impact                                         Recovery
1 NATS node down (R=3)         No data loss, brief leader election            Automatic
Chatto instance down           Load balancer routes to healthy instances      Automatic
Minority of NATS nodes down    Reads and writes continue normally             Automatic
Majority of NATS nodes down    Writes rejected (no quorum), reads may work    Restore nodes
All NATS nodes down            Complete outage                                Restore cluster or restore from backup