01. Single node performance overview

High-level dashboard for quick triage and overall database health assessment.

Figure: Node overview dashboard

Purpose

This dashboard provides a "shallow but wide" view of database performance, ideal for:

  • Incident response: Quickly identify which subsystem is problematic
  • Daily health checks: Spot anomalies at a glance
  • Capacity planning: Track growth trends

When to use

  • First dashboard to check during any performance incident
  • Morning health check routine
  • Before and after maintenance windows
  • When users report "the database is slow"

Key panels

Active session history

Similar to AWS RDS Performance Insights, this panel shows wait event distribution over time.

Figure: Active session history panel

What it shows:

  • Stacked bar chart of active sessions by wait event category
  • Each bar represents a sampling interval

Wait event categories:

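| Category | Color  | Indicates                         |
|----------|--------|-----------------------------------|
| CPU*     | Green  | On-CPU activity (query execution) |
| IO       | Blue   | Disk I/O waits                    |
| LWLock   | Red    | Lightweight lock contention       |
| Lock     | Orange | Row/table lock waits              |
| Timeout  | Gray   | Sleep/timeout events              |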

Healthy state:

  • Mostly green (CPU) with occasional blue (IO)
  • Total height below max_connections * 0.5

Warning signs:

  • Sustained red (LWLock) — Internal contention
  • Sustained orange (Lock) — Application-level locking issues
  • Spikes above normal baseline — Sudden load increase
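
The healthy-state and warning-sign rules above can be sketched as a small check over one sampling interval. This is only an illustration, not how the dashboard itself evaluates anything: the function name `assess_ash_sample` and the `contention_share` cutoff are assumptions made for this example, and a "sustained" condition would mean the same warning repeating over several consecutive intervals.

```python
from typing import Dict, List


def assess_ash_sample(
    sessions_by_category: Dict[str, int],
    max_connections: int,
    contention_share: float = 0.25,  # hypothetical cutoff, not taken from the dashboard
) -> List[str]:
    """Return warnings for one sampling interval of the ASH panel."""
    warnings: List[str] = []
    total = sum(sessions_by_category.values())

    # Healthy state from the text: total height below max_connections * 0.5.
    if total >= max_connections * 0.5:
        warnings.append("total active sessions at or above 50% of max_connections")

    # Red (LWLock) and orange (Lock) are the categories the text flags.
    for category in ("LWLock", "Lock"):
        waiting = sessions_by_category.get(category, 0)
        if total > 0 and waiting / total >= contention_share:
            warnings.append(f"{category} waits are {waiting} of {total} active sessions")
    return warnings


# Example: 60 active sessions on a server with max_connections = 100.
print(assess_ash_sample({"CPU*": 10, "LWLock": 50}, max_connections=100))
```

An empty list means the interval matches the healthy description: mostly CPU and IO, with total height under half of max_connections.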

Sessions

What it shows:

  • Current session count by state:
      • active: Executing queries
      • idle: Connected but not executing
      • idle in transaction: In transaction, not executing

Healthy range:

  • active below a workload-dependent ceiling, typically somewhere between 20 and 50
  • idle in transaction should be minimal (< 5)

Warning signs:

  • High idle in transaction — Connection leaks or long transactions
  • active near max_connections — Connection exhaustion
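
As a rough illustration of the ranges above, the following sketch checks a set of per-state counts against them. The function name `check_session_counts` and the `active_ratio_limit` value (used here to mean "near max_connections") are assumptions for this example. The idle in transaction limit of 5 comes from the healthy range above.

```python
from typing import Dict, List


def check_session_counts(
    counts: Dict[str, int],
    max_connections: int,
    idle_in_tx_limit: int = 5,        # from the healthy range: idle in transaction < 5
    active_ratio_limit: float = 0.8,  # hypothetical definition of "near max_connections"
) -> List[str]:
    """Compare session counts by state against the documented warning signs."""
    findings: List[str] = []

    idle_in_tx = counts.get("idle in transaction", 0)
    if idle_in_tx >= idle_in_tx_limit:
        findings.append(
            f"{idle_in_tx} idle in transaction: possible connection leaks or long transactions"
        )

    active = counts.get("active", 0)
    if active >= max_connections * active_ratio_limit:
        findings.append(f"{active} active: approaching connection exhaustion")

    return findings


print(check_session_counts({"active": 95, "idle": 3, "idle in transaction": 7}, 100))
```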

Non-idle sessions

Focused view of sessions doing actual work.

Healthy state:

  • Stable pattern matching application load
  • No sudden spikes without corresponding application events

TPS (transactions per second)

What it shows:

  • Commit rate
  • Rollback rate (if significant)

Use for:

  • Capacity baseline
  • Detecting throughput drops
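
For reference, commit and rollback rates of this kind are usually derived from cumulative counters such as `xact_commit` and `xact_rollback` in `pg_stat_database`, sampled twice. The sketch below assumes that approach. The exact query the dashboard uses is not shown in this document, and the names `XactSample` and `transaction_rates` are made up for the example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class XactSample:
    taken_at: float      # sample time in seconds (for example time.time())
    xact_commit: int     # cumulative commits, as in pg_stat_database.xact_commit
    xact_rollback: int   # cumulative rollbacks, as in pg_stat_database.xact_rollback


def transaction_rates(prev: XactSample, curr: XactSample) -> Tuple[float, float]:
    """Return (commits per second, rollbacks per second) between two samples."""
    elapsed = curr.taken_at - prev.taken_at
    if elapsed <= 0:
        raise ValueError("samples must be in chronological order")

    commits = curr.xact_commit - prev.xact_commit
    rollbacks = curr.xact_rollback - prev.xact_rollback
    if commits < 0 or rollbacks < 0:
        # Cumulative counters only decrease after a statistics reset.
        raise ValueError("counters went backwards; statistics were probably reset")

    return commits / elapsed, rollbacks / elapsed


# Two samples taken 10 seconds apart: 500 commits and 5 rollbacks in between.
before = XactSample(taken_at=0.0, xact_commit=1000, xact_rollback=10)
after = XactSample(taken_at=10.0, xact_commit=1500, xact_rollback=15)
print(transaction_rates(before, after))
```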

QPS (queries per second)

From pg_stat_statements, showing actual query execution rate.

Note: QPS is typically higher than TPS because one transaction can contain multiple queries.
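
To make the relationship between QPS and TPS concrete, here is a small worked sketch. It assumes QPS is derived from the change in the summed `calls` column of pg_stat_statements between two samples. The function names are hypothetical and the numbers are invented for illustration.

```python
def qps_from_calls(prev_calls: int, curr_calls: int, elapsed_seconds: float) -> float:
    """Queries per second from two readings of sum(calls) in pg_stat_statements."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    delta = curr_calls - prev_calls
    if delta < 0:
        # The sum can drop when statistics are reset or entries are evicted.
        raise ValueError("sum(calls) decreased between samples")
    return delta / elapsed_seconds


def queries_per_transaction(qps: float, tps: float) -> float:
    """Average number of queries executed per transaction."""
    if tps <= 0:
        raise ValueError("tps must be positive")
    return qps / tps


# 3000 additional calls over 10 seconds, alongside 50 transactions per second.
qps = qps_from_calls(120_000, 123_000, 10.0)
print(qps, queries_per_transaction(qps, 50.0))
```

A ratio well above 1 is expected for applications that run several statements inside each transaction.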

Variables

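| Variable     | Purpose         | Options                   |
|--------------|-----------------|---------------------------|
| cluster_name | Cluster filter  | Your cluster names        |
| node_name    | Node filter     | node-01, replica-01, etc. |
| db_name      | Database filter | Database names or All     |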

Troubleshooting

No data in ASH (Active Session History)

  1. Verify pgwatch is collecting metrics:

    docker compose logs pgwatch | grep -i "wait\|session"

  2. Check VictoriaMetrics has data:

    curl 'http://localhost:8428/api/v1/query?query=pg_stat_activity_count'
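
The curl call in step 2 returns JSON in the Prometheus-compatible query format, where an empty `result` array means no series matched. A minimal sketch for interpreting that response follows. The helper name `has_series` is an assumption for this example, and the sample payloads are invented.

```python
import json


def has_series(response_body: str) -> bool:
    """Return True if a /api/v1/query response contains at least one series."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        return False
    data = payload.get("data") or {}
    return len(data.get("result", [])) > 0


# Invented sample responses: one empty, one with a single series.
empty = '{"status":"success","data":{"resultType":"vector","result":[]}}'
with_data = (
    '{"status":"success","data":{"resultType":"vector","result":['
    '{"metric":{"__name__":"pg_stat_activity_count"},"value":[1700000000,"12"]}]}}'
)
print(has_series(empty), has_series(with_data))
```

If the response is successful but empty, the metric has not been written yet, which points back to the collection check in step 1.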

Sessions count doesn't match pg_stat_activity

The dashboard shows values sampled at intervals, so sessions that start and finish between samples may not appear. For a real-time view, query directly:

select state, count(*)
from pg_stat_activity
where backend_type = 'client backend'
group by state;