06. Replication and HA
Monitor streaming replication, replication lag, and high availability status.
This dashboard is currently under development. Replication metrics are collected as part of the health check system, and the full dashboard visualization is coming soon.
Purpose​
Ensure replication health for:
- Disaster recovery readiness
- Read replica performance
- Failover preparedness
When to use​
- Monitoring replica lag during high load
- Investigating replication disconnections
- Validating HA setup
- Capacity planning for replicas
Dashboard status​
In 0.15.0 this dashboard ships as a single placeholder panel ("Coming soon...") and has no data
panels or template variables yet. Replication metrics are still collected by the stack (for
example replication, replication_slots, and pg_stat_replication in the full preset), so
until the visualizations land you can inspect replication health directly via SQL using the
queries below.
Unused replication slots force the primary to retain WAL, which prevents WAL cleanup and can fill the disk.
Replication modes​
Streaming replication​
Standard async or sync replication:
-- on primary
select * from pg_stat_replication;
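Until the dashboard panels exist, a narrower query is usually easier to read than `select *`. The following is a minimal sketch, assuming PostgreSQL 10 or later (where `pg_current_wal_lsn()` and the `*_lag` columns are available). The column aliases are illustrative and not part of this stack.

```sql
-- on primary: one row per connected standby, with lag in bytes and as intervals
select application_name,
       client_addr,
       state,
       sync_state,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) as replay_lag_bytes,
       write_lag,
       flush_lag,
       replay_lag
from pg_stat_replication;

-- on replica: time since the last replayed transaction
select pg_is_in_recovery() as is_replica,
       now() - pg_last_xact_replay_timestamp() as replay_delay;
```

Note that `replay_delay` keeps growing when the primary is idle, because no new transactions arrive to replay, so treat it as an upper bound rather than an exact lag.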
Logical replication​
For selective table replication:
-- check subscriptions
select * from pg_stat_subscription;
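For logical replication it can help to look at both ends. This is a sketch under the assumption of PostgreSQL 10 or later; the alias `unconfirmed_bytes` is illustrative.

```sql
-- on subscriber: a null pid means the apply worker is not running
select subname,
       pid,
       received_lsn,
       latest_end_lsn,
       last_msg_receipt_time,
       latest_end_time
from pg_stat_subscription;

-- on publisher: logical slots and how much WAL the subscriber has not yet confirmed
select slot_name,
       plugin,
       active,
       confirmed_flush_lsn,
       pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) as unconfirmed_bytes
from pg_replication_slots
where slot_type = 'logical';
```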
Related dashboards​
- Primary health — 01. Node overview
- Query load on replica — 02. Query analysis
Troubleshooting​
Replica not connecting​
- Check that the primary allows replication connections:
show max_wal_senders;
select * from pg_stat_replication;
- Verify pg_hba.conf allows replication
- Check network connectivity
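The pg_hba.conf and connectivity checks can be partly done from SQL. The sketch below assumes PostgreSQL 10 or later and a role allowed to read `pg_hba_file_rules` (superuser by default). It only covers physical replication rules, which use the `replication` keyword in the database column.

```sql
-- on primary: pg_hba.conf rules that mention the replication pseudo-database,
-- plus any lines that failed to parse
select line_number, type, database, user_name, address, auth_method, error
from pg_hba_file_rules
where 'replication' = any(database)
   or error is not null;

-- on replica: WAL receiver state; no row means the receiver is not running
select pid, status, last_msg_receipt_time, latest_end_lsn
from pg_stat_wal_receiver;
```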
Replication lag growing​
- Check replica resource usage (CPU, I/O)
- Review long-running queries on replica
- Consider the hot_standby_feedback setting
- Check for replication conflicts:
select * from pg_stat_database_conflicts;
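The checks above for long-running queries and standby settings could look like the following. This is a sketch, assuming PostgreSQL 10 or later (for `backend_type`); the 80-character truncation and the aliases are arbitrary choices.

```sql
-- on replica: open transactions, oldest first
select pid,
       usename,
       state,
       now() - xact_start as xact_age,
       left(query, 80) as query_start
from pg_stat_activity
where backend_type = 'client backend'
  and xact_start is not null
order by xact_start;

-- on replica: settings that govern how replay interacts with standby queries
show hot_standby_feedback;
show max_standby_streaming_delay;
```

Keep in mind that enabling hot_standby_feedback reduces query cancellations on the replica but can delay vacuum cleanup on the primary.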
Replication slot bloat​
Remove unused slots:
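-- List slots with the amount of WAL each one retains (run on the primary)
select slot_name, active,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) as retained_wal
from pg_replication_slots;
-- Drop an unused slot (CAUTION: irreversible; fails if the slot is still active)
select pg_drop_replication_slot('unused_slot');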
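Before dropping anything, it may be worth narrowing the list to inactive slots and checking whether a retention cap is configured. This sketch assumes PostgreSQL 13 or later, where the `wal_status` column and the `max_slot_wal_keep_size` setting exist; on older versions, remove those two references.

```sql
-- on primary: inactive slots, largest WAL retention first
select slot_name,
       slot_type,
       wal_status,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) as retained_wal
from pg_replication_slots
where not active
order by pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) desc nulls last;

-- -1 means slots may retain an unlimited amount of WAL
show max_slot_wal_keep_size;
```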