01. Single node performance overview

High-level dashboard for quick triage and overall database health assessment.

Figure: 01. Node overview dashboard

Purpose

This dashboard provides a "shallow but wide" view of database performance, ideal for:

  • Incident response: Quickly identify which subsystem is problematic
  • Daily health checks: Spot anomalies at a glance
  • Capacity planning: Track growth trends

When to use

  • First dashboard to check during any performance incident
  • Morning health check routine
  • Before and after maintenance windows
  • When users report "the database is slow"

Key panels

Active session history

Similar to AWS RDS Performance Insights, this panel shows wait event distribution over time.

Figure: Active session history panel

What it shows:

  • Stacked bar chart of active sessions by wait event category
  • Each bar represents a sampling interval

Wait event categories:

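| Category | Color  | Indicates                         |
|----------|--------|-----------------------------------|
| CPU*     | Green  | On-CPU activity (query execution) |
| IO       | Blue   | Disk I/O waits                    |
| LWLock   | Red    | Lightweight lock contention       |
| Lock     | Orange | Row/table lock waits              |
| Timeout  | Gray   | Sleep/timeout events              |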
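The source does not say how the panel derives these categories. As a minimal sketch, assume they follow `pg_stat_activity.wait_event_type`, and that `CPU*` stands for active sessions that report no wait event. The function names below are hypothetical, not part of the dashboard or pgwatch.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Tuple


def ash_category(state: str, wait_event_type: Optional[str]) -> Optional[str]:
    """Return the panel category for one sampled session, or None if it is not active."""
    if state != "active":
        return None  # only active sessions appear in active session history
    if wait_event_type is None:
        return "CPU*"  # assumption: active and not waiting is counted as on-CPU
    return wait_event_type  # e.g. "IO", "LWLock", "Lock", "Timeout"


def ash_sample(sessions: Iterable[Tuple[str, Optional[str]]]) -> Dict[str, int]:
    """Count active sessions per category for one sampling interval (one bar)."""
    counts: Counter = Counter()
    for state, wait_event_type in sessions:
        category = ash_category(state, wait_event_type)
        if category is not None:
            counts[category] += 1
    return dict(counts)
```

Each call to `ash_sample` corresponds to one stacked bar: the keys are the colored segments and the values are their heights.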

Healthy state:

  • Mostly green (CPU) with occasional blue (IO)
  • Total bar height (active session count) below max_connections * 0.5

Warning signs:

  • Sustained red (LWLock) — Internal contention
  • Sustained orange (Lock) — Application-level locking issues
  • Spikes above normal baseline — Sudden load increase
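The healthy-state and warning rules above could be checked programmatically along these lines. This is a sketch only: the source does not define "sustained", so the `min_share` and `min_run` parameters (a category holding at least 30% of active sessions for 3 consecutive samples) are illustrative assumptions, as are the function names.

```python
from typing import Dict, List


def total_height_ok(sample: Dict[str, int], max_connections: int) -> bool:
    """Healthy-state rule: total active sessions below max_connections * 0.5."""
    return sum(sample.values()) < max_connections * 0.5


def sustained(
    samples: List[Dict[str, int]],
    category: str,
    min_share: float = 0.3,  # assumed threshold, not from the dashboard
    min_run: int = 3,        # assumed number of consecutive samples
) -> bool:
    """True if `category` holds at least `min_share` of sessions for `min_run` samples in a row."""
    run = 0
    for sample in samples:
        total = sum(sample.values())
        share = sample.get(category, 0) / total if total else 0.0
        run = run + 1 if share >= min_share else 0
        if run >= min_run:
            return True
    return False
```

For example, `sustained(samples, "LWLock")` would flag the "sustained red" case, and `sustained(samples, "Lock")` the "sustained orange" case.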

Sessions

What it shows:

  • Current session count by state:
      • active: Executing queries
      • idle: Connected but not executing
      • idle in transaction: Inside an open transaction, not executing

Healthy range:

  • active below roughly 20 to 50, depending on workload
  • idle in transaction minimal (fewer than 5)

Warning signs:

  • High idle in transaction — Connection leaks or long transactions
  • active near max_connections — Connection exhaustion
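A rough way to apply these session rules to a set of per-state counts is sketched below. The limit of 5 for idle in transaction comes from the healthy range above; what counts as "near" max_connections is not defined in the source, so `near_ratio=0.8` is an assumption, and `session_warnings` is a hypothetical helper.

```python
from typing import Dict, List


def session_warnings(
    counts: Dict[str, int],
    max_connections: int,
    idle_in_tx_limit: int = 5,  # from the healthy range: fewer than 5
    near_ratio: float = 0.8,    # assumed meaning of "near max_connections"
) -> List[str]:
    """Return the warning signs triggered by one snapshot of session counts by state."""
    warnings: List[str] = []
    if counts.get("idle in transaction", 0) >= idle_in_tx_limit:
        warnings.append("high idle in transaction")
    if counts.get("active", 0) >= max_connections * near_ratio:
        warnings.append("active near max_connections")
    return warnings
```

The `counts` argument has the same shape as a `state, count(*)` result from pg_stat_activity, keyed by state name.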

Non-idle sessions

Focused view of sessions doing actual work.

Healthy state:

  • Stable pattern matching application load
  • No sudden spikes without corresponding application events

TPS (transactions per second)

What it shows:

  • Commit rate
  • Rollback rate (if significant)

Use for:

  • Capacity baseline
  • Detecting throughput drops

QPS (queries per second)

Derived from pg_stat_statements, this panel shows the actual query execution rate.

Note: QPS is typically higher than TPS because a single transaction often contains multiple queries.
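Both TPS and QPS are rates computed from cumulative counters. The sketch below shows the arithmetic, assuming TPS is taken from `pg_stat_database.xact_commit` and QPS from the sum of `pg_stat_statements.calls`; the dashboard's exact queries are not shown in the source, and treating a decreasing counter as a reset is a simplification.

```python
def per_second(prev: int, curr: int, interval_s: float) -> float:
    """Rate between two readings of a cumulative counter taken `interval_s` seconds apart."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    delta = curr - prev
    if delta < 0:
        # Counter went backwards (e.g. statistics reset): assume it restarted from zero.
        delta = curr
    return delta / interval_s


# Illustrative numbers only, sampled 60 seconds apart.
tps = per_second(prev=1_000, curr=1_600, interval_s=60)  # commits
qps = per_second(prev=5_000, curr=8_000, interval_s=60)  # statement calls
queries_per_transaction = qps / tps
```

With these made-up readings, TPS is 10, QPS is 50, and each transaction runs about 5 queries on average, which is why the QPS line usually sits above the TPS line.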

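Variables

| Variable     | Purpose         | Options                   |
|--------------|-----------------|---------------------------|
| cluster_name | Cluster filter  | Your cluster names        |
| node_name    | Node filter     | node-01, replica-01, etc. |
| db_name      | Database filter | Database names or All     |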

Troubleshooting

No data in ASH (Active Session History)

  1. Verify pgwatch is collecting metrics:

    docker compose logs pgwatch | grep -i "wait\|session"
  2. Check VictoriaMetrics has data:

    curl 'http://localhost:8428/api/v1/query?query=pg_stat_activity_count'
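To interpret the response from step 2, it may help to check whether the query returned any series. The sketch below assumes the Prometheus-compatible JSON shape (`status`, `data.result`) that VictoriaMetrics returns from `/api/v1/query`; `has_series` is a hypothetical helper and the sample payloads are made up.

```python
import json


def has_series(response_body: str) -> bool:
    """True if a /api/v1/query response succeeded and contains at least one series."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        return False
    return len(payload.get("data", {}).get("result", [])) > 0


# Made-up examples of an empty and a non-empty instant-query response.
empty = '{"status":"success","data":{"resultType":"vector","result":[]}}'
populated = (
    '{"status":"success","data":{"resultType":"vector","result":'
    '[{"metric":{"__name__":"pg_stat_activity_count"},"value":[1700000000,"12"]}]}}'
)
```

An empty `result` list means the metric is not stored at all, which points back to step 1 (collection) rather than to the dashboard.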

Sessions count doesn't match pg_stat_activity

The dashboard samples session counts at intervals, so it can lag behind or miss short-lived sessions. For a real-time view, query pg_stat_activity directly:

select state, count(*)
from pg_stat_activity
where backend_type = 'client backend'
group by state;