
Dashboard overview

PostgresAI monitoring includes 14 pre-built Grafana dashboards designed for expert-level PostgreSQL troubleshooting.

Dashboard categories

Triage and overview

| # | Dashboard | Purpose |
|---|---|---|
| 01 | Node overview | High-level node health, wait events, sessions |
| 02 | Query analysis | Top-N queries by various metrics |
| 03 | Single query | Deep-dive into specific queryid |

Wait events and locks

| # | Dashboard | Purpose |
|---|---|---|
| 04 | Wait events | Active session history (ASH-style) |
| 13 | Lock contention | Lock waits and blocking chains |

Storage and maintenance

| # | Dashboard | Purpose |
|---|---|---|
| 05 | Backups | Backup status and WAL archiving |
| 07 | Autovacuum | Vacuum progress and bloat |
| 08 | Table stats | Aggregated table metrics |
| 09 | Single table | Deep-dive into specific table |
| 10 | Index health | Index usage and bloat |
| 11 | Single index | Deep-dive into specific index |
| 12 | SLRU | SLRU cache statistics |

Replication and HA

| # | Dashboard | Purpose |
|---|---|---|
| 06 | Replication | Replication lag and slot status |

Stack health

| # | Dashboard | Purpose |
|---|---|---|
| -- | Self-monitoring | Monitoring stack health |

Common variables

All dashboards share these filter variables:

| Variable | Purpose | Example |
|---|---|---|
| cluster_name | Cluster identifier | production, staging |
| node_name | Node within cluster | primary, replica-1 |
| db_name | Database filter | myapp, All |
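
Because these variables are shared, a link to any dashboard can carry the same filter values. The sketch below builds such a link in Python, relying on Grafana's convention of setting template variables through `var-<name>` URL query parameters. The base URL and the dashboard path (`d/example-uid/node-overview`) are hypothetical placeholders, not values taken from PostgresAI monitoring; substitute the ones from your own Grafana instance.

```python
from urllib.parse import urlencode


def dashboard_url(base_url, dashboard_path, cluster_name, node_name, db_name=None):
    """Build a Grafana dashboard link with the shared filter variables pre-filled.

    base_url and dashboard_path are placeholders for your own Grafana instance.
    """
    # Grafana reads template variable values from "var-<name>" query parameters.
    params = {
        "var-cluster_name": cluster_name,
        "var-node_name": node_name,
    }
    # Leave db_name out to keep the dashboard's own default selection.
    if db_name is not None:
        params["var-db_name"] = db_name
    return f"{base_url.rstrip('/')}/{dashboard_path.strip('/')}?{urlencode(params)}"


# Hypothetical instance URL and dashboard path, for illustration only.
url = dashboard_url(
    "http://localhost:3000",
    "d/example-uid/node-overview",
    cluster_name="production",
    node_name="primary",
    db_name="myapp",
)
print(url)
```

Links built this way can be pasted into incident notes so that everyone opens the dashboard with the same cluster, node, and database selected.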

Incident response

  1. Start with 01. Node overview

    • Check wait event distribution
    • Look for session count anomalies
    • Note TPS/QPS patterns
  2. Identify the bottleneck

    • High CPU wait events: check queries (02. Query analysis)
    • High IO wait events: check disk activity and queries (02. Query analysis)
    • High LWLock wait events: check the specific lock type (13. Lock contention)
  3. Drill down

    • Use 02. Query analysis to find problematic queries
    • Use 03. Single query for detailed metrics on specific queryid
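
The triage steps above can be sketched as a small lookup: find the dominant wait event type, then map it to the dashboard to open next. This is only an illustration of the decision flow, not part of the monitoring stack. The sample counts are made up, the wait event type names are assumed to match what the Node overview dashboard shows, and falling back to 04. Wait events for any other type is an assumption of this sketch.

```python
# Mapping taken from the "Identify the bottleneck" step above.
NEXT_STEP = {
    "CPU": "02. Query analysis",
    "IO": "Disk activity and 02. Query analysis",
    "LWLock": "13. Lock contention",
}

# Assumption of this sketch: anything else is inspected in the wait events dashboard.
FALLBACK = "04. Wait events"


def next_dashboard(wait_event_counts):
    """Return (dominant wait event type, suggested next step), or None if empty."""
    if not wait_event_counts:
        return None
    # The dominant type is the one with the most sampled active sessions.
    dominant = max(wait_event_counts, key=wait_event_counts.get)
    return dominant, NEXT_STEP.get(dominant, FALLBACK)


# Hypothetical session counts per wait event type during an incident.
sample = {"CPU": 12, "IO": 3, "LWLock": 1}
print(next_dashboard(sample))
```

Once the next dashboard is chosen, the drill-down continues with 03. Single query for the specific queryid that stands out.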

Routine monitoring

| Task | Dashboard | What to look for |
|---|---|---|
| Query review | 02. Query analysis | New slow queries, regression |
| Index health | 10. Index health | Unused indexes, bloat |
| Table health | 08. Table stats | Bloat, sequential scans |
| Vacuum status | 07. Autovacuum | Dead tuple accumulation |

Legend options

Most query-related dashboards support multiple legend formats:

| Format | Shows | Use case |
|---|---|---|
| queryid | Numeric ID only | Compact view |
| displayname | Truncated query | Default |
| displayname_long | Full query with context | Debugging |

Select the format using the Query texts variable at the top of each dashboard.

Time range tips

  • Incident investigation: Start with 15m-1h to see recent patterns
  • Trend analysis: Use 24h-7d for capacity planning
  • Comparison: Use the "Compare to" feature for week-over-week analysis
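
The ranges above can also be set in a dashboard link through Grafana's `from` and `to` URL parameters, which accept relative expressions such as `now-1h`. The following is a minimal sketch; the task names and the choice of the widest suggested window for each task are assumptions made for illustration.

```python
from urllib.parse import urlencode

# Widest suggested window per task, expressed as Grafana relative time.
# The task names are hypothetical labels chosen for this sketch.
TIME_RANGES = {
    "incident": "now-1h",  # incident investigation: 15m-1h
    "trend": "now-7d",     # trend analysis: 24h-7d
}


def time_range_query(task):
    """Return the URL query string that selects the time range for a task."""
    return urlencode({"from": TIME_RANGES[task], "to": "now"})


print(time_range_query("incident"))
print(time_range_query("trend"))
```

Append the resulting string to a dashboard URL after `?`, or after `&` if the URL already has query parameters.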

Next steps