Zero 1.6
Postgres 17 Replication Failover and Performance
Installation
npm install @rocicorp/zero@1.6
Features
- Postgres 17 Logical Replication Failover: `zero-cache` now creates replication slots with an ordinal naming scheme (e.g. `zero_0_a`, `zero_0_b`) so they can be registered with `synchronized_standby_slots` for logical replication failover. TODO: add docs link.
- Litestream Region: Added `ZERO_LITESTREAM_REGION` for deployments in non-standard AWS partitions like GovCloud (thanks @ericykim!).
- Optional `args` in custom queries/mutators: Custom query and mutator execution functions now treat `args` as optional when the args type already allows `undefined` (thanks @0xcadams!).
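The optional-`args` behavior can be illustrated with a conditional type. The following is only a minimal sketch of the general technique, not Zero's actual type definitions: `Runner`, `defineRunner`, `listIssues`, and `getIssue` are hypothetical names introduced for illustration.

```typescript
// Hypothetical helper type (not part of Zero's API): the argument becomes
// optional only when `undefined` is already assignable to the args type.
type Runner<TArgs, TResult> = undefined extends TArgs
  ? (args?: TArgs) => TResult
  : (args: TArgs) => TResult;

// Hypothetical wrapper that exposes `run` with the signature chosen above.
function defineRunner<TArgs, TResult>(
  run: (args: TArgs) => TResult,
): Runner<TArgs, TResult> {
  const wrapped = (args?: TArgs) => run(args as TArgs);
  // The cast is needed because the conditional type is still unresolved
  // inside this generic function body.
  return wrapped as unknown as Runner<TArgs, TResult>;
}

// The args type allows `undefined`, so callers may omit the argument.
const listIssues = defineRunner(
  (args: {limit: number} | undefined) => args?.limit ?? 50,
);

// The args type does not allow `undefined`, so the argument stays required.
const getIssue = defineRunner((args: {id: string}) => `issue:${args.id}`);

const defaultLimit = listIssues();
const explicitLimit = listIssues({limit: 10});
const issueKey = getIssue({id: 'a1'});
// getIssue(); // compile-time error under strictNullChecks: `args` is required
```

Under `strictNullChecks`, omitting the argument only type-checks for `listIssues`, which is the distinction the feature describes.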
Performance
- Faster `EXISTS` subqueries via the new `Cap` operator, which lets SQLite skip `ORDER BY` for non-flipped `EXISTS` children
- Bulk-insertion optimization in Replicache via `putMany`, speeding up large sync patches (3-5x faster for typical sync batches, up to 53x for construction)
- Batch deletes and upserts in `SQLiteStore` writes (~7-9x faster on 1000-put commits)
- Parallelize I/O during pull and rebase
- Heap-based k-way merge in `fetchMergeSort` (O(log K) per row vs O(K)), with a new `mergeSortedStreams` utility
- Initial sync progress reporting uses `pg_class` estimates instead of full table scans
- De-dupe SQLite requests in flip-join when children want the same parent
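For readers unfamiliar with the heap-based k-way merge mentioned above, here is a minimal sketch of the general technique. It is not the implementation of `fetchMergeSort` or `mergeSortedStreams`, whose signatures are not shown in these notes; `kWayMerge` and its parameters are hypothetical. A binary min-heap holds one pending row per input stream, so each emitted row costs O(log K) comparisons rather than a linear scan over all K stream heads.

```typescript
// Hypothetical sketch of a heap-based k-way merge; not Zero's API.
function* kWayMerge<T>(
  sources: Iterable<T>[],
  compare: (a: T, b: T) => number,
): Generator<T> {
  type Entry = {value: T; iter: Iterator<T>};
  const heap: Entry[] = [];
  const less = (i: number, j: number) =>
    compare(heap[i].value, heap[j].value) < 0;
  const swap = (i: number, j: number) => {
    const tmp = heap[i];
    heap[i] = heap[j];
    heap[j] = tmp;
  };
  // Move the entry at index i up until its parent is not larger.
  const siftUp = (i: number) => {
    while (i > 0) {
      const parent = (i - 1) >> 1;
      if (!less(i, parent)) break;
      swap(i, parent);
      i = parent;
    }
  };
  // Move the entry at index i down until both children are not smaller.
  const siftDown = (i: number) => {
    for (;;) {
      const left = 2 * i + 1;
      const right = left + 1;
      let smallest = i;
      if (left < heap.length && less(left, smallest)) smallest = left;
      if (right < heap.length && less(right, smallest)) smallest = right;
      if (smallest === i) return;
      swap(i, smallest);
      i = smallest;
    }
  };

  // Seed the heap with the first row of every non-empty source.
  for (const source of sources) {
    const iter = source[Symbol.iterator]();
    const first = iter.next();
    if (!first.done) {
      heap.push({value: first.value, iter});
      siftUp(heap.length - 1);
    }
  }

  while (heap.length > 0) {
    const top = heap[0];
    yield top.value;
    const next = top.iter.next();
    if (!next.done) {
      // Reuse the root slot for the same stream's next row.
      top.value = next.value;
    } else {
      // Stream exhausted: move the last entry to the root.
      const last = heap.pop()!;
      if (heap.length === 0) break;
      heap[0] = last;
    }
    siftDown(0);
  }
}

const merged = Array.from(
  kWayMerge([[1, 4, 7], [2, 5, 8], [0, 3, 6, 9], []], (a, b) => a - b),
);
```

Each input must already be sorted according to `compare`; the sketch makes no ordering guarantee between equal rows from different streams.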
Fixes
- Returning to an app after stale-tab GC or CVR purge caused a full page reload; now the Zero instance rotates in place
- "Row already exists" errors after an IVM advance failure could mask the original error and continue through corrupt branch state, also fixed for `IVMBranch.fork()`
- "Row already exists" assertion failures during poke processing caused by `putMany` rebalancing duplicating entries across adjacent BTree children
- Initial sync could fail or take hours on large databases because progress reporting did full `COUNT(*)` and `SUM(pg_column_size(...))` scans
- Deadlock between post-initial-sync `changeLog` reset and a live replication-manager during non-disruptive resync
- Zombie `ViewSyncer`s could accumulate in the `active-client-groups` metric when clients disconnected before `initConnection` resolved
- `ConcurrentModificationException` is now classified as a Rehome so the client reconnects instead of erroring
- `zero-cache` startup errors during change-streamer init were not published to subscribers
- `TypeError: Expected string at context.query. Got null` when handling DDL events with `NULL` `current_query()`
- Repeated initial-sync failures could exhaust the replication-slot name pool; cleanup now runs preemptively under the management lock, and inactive slots are deleted together with their `replicas` row so a stuck slot doesn't keep claiming a name
- Replication slot creation timeouts crashed the server during backfill retries; backfill timeouts now only error after the maximum retry backoff is reached
- Shadow sync threw when a synced table could not be queried by ZQL; it now silently ignores the table to match prod behavior
- WebSocket errors are now logged as warnings instead of errors, since they reflect client or upstream issues rather than server faults
- Inspector now caches AST and metrics for deleted queries so they remain accessible after eviction
Breaking Changes
- Inspector per-query hydration metrics format changed: Per-query hydration metrics (`query-hydration-server-ms`) are now reported as a plain number (most-recent hydration time in ms) instead of a TDigest histogram, and the per-query metrics type was renamed from `ServerMetrics` to `QueryServerMetrics`. If you have custom tooling reading inspector metrics, you'll need to update it. The protocol version was bumped from 50 to 51 to reflect this; `MIN_SERVER_SUPPORTED_SYNC_PROTOCOL` remains at 30, so 1.6 servers remain compatible with older clients.
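Custom tooling that reads inspector metrics may want to tolerate both formats during a rollout. The sketch below is a hedged illustration only: it assumes the metrics arrive as an object keyed by `query-hydration-server-ms`, which these notes do not spell out, and `readHydrationMs` is a hypothetical helper. It makes no attempt to decode the older TDigest payload and simply reports it as unavailable.

```typescript
// Key name taken from the release notes; the surrounding object shape is an
// assumption made for illustration.
const HYDRATION_KEY = 'query-hydration-server-ms';

// Hypothetical helper: returns the most-recent hydration time in ms when the
// value is a plain number (the 1.6 format), otherwise undefined.
function readHydrationMs(metrics: unknown): number | undefined {
  if (typeof metrics !== 'object' || metrics === null) {
    return undefined;
  }
  const value = (metrics as Record<string, unknown>)[HYDRATION_KEY];
  return typeof value === 'number' && Number.isFinite(value)
    ? value
    : undefined;
}

// A plain number is returned as-is.
const fromNewFormat = readHydrationMs({[HYDRATION_KEY]: 12.5});
// Any non-numeric payload (for example an older histogram value) is skipped.
const fromOtherFormat = readHydrationMs({[HYDRATION_KEY]: ['placeholder']});
```

Returning `undefined` instead of throwing lets a dashboard keep rendering while servers on either side of the protocol bump are still in use.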