Skip to main content
View rawEdit

04. Wait event analysis (Active Session History)

Active Session History (ASH) style dashboard for deep wait event analysis.

ASH by wait type​

ASH by wait type

ASH by wait event​

ASH by wait event

ASH by query​

ASH by query

Purpose​

Similar to AWS RDS Performance Insights or Oracle ASH, this dashboard provides:

  • Detailed breakdown of what sessions are waiting on
  • Historical wait event patterns
  • Correlation between waits and performance issues

When to use​

  • Database is slow but CPU is not high
  • Investigating lock contention
  • Understanding I/O patterns
  • Diagnosing intermittent slowdowns

Prerequisites​

This dashboard uses wait event data collected from pg_stat_activity by pgwatch. No additional extensions are required.

Key panels​

The dashboard has three panels, each an Active Session History view at a different level of detail: Active session history (grouped by wait event type), Active session history by event type (adds the specific wait event), and Active session history by event type and event (further broken down per query_id).

Active session history​

What it shows:

  • Stacked bar chart of active sessions by wait event type over time
  • Each color represents a wait event type

Wait event categories:

CategoryDescriptionCommon events
CPUOn-CPU processingCPU*
IODisk I/O operationsDataFileRead, WALWrite
LWLockInternal PostgreSQL locksBufferContent, LockManager
LockRow/table lockstuple, transactionid
BufferPinBuffer pinningBufferPin
ActivityBackground processesLogicalLauncherMain
IPCInter-process communicationBgWorkerStartup

Interpretation guide for the most common wait events you will see in these panels:

Wait eventMeaningAction
CPU*Query processingNormal if workload-appropriate
DataFileReadReading from diskCheck shared_buffers, add memory
DataFileWriteWriting to diskNormal for writes
WALWriteWAL I/OCheck storage speed
LWLock:BufferContentBuffer contentionScale up, partition tables, reduce bloat
Lock:tupleRow lock waitCheck for lock conflicts
Lock:transactionidTransaction waitLong transactions blocking

Active session history by event type​

What it shows:

  • The same ASH view, additionally split out by the specific wait event (wait_event) within each type, so you can see exactly which event dominates a wait type

Active session history by event type and event​

What it shows:

  • The ASH view further attributed to the query_id responsible, linking waits to specific queries (when captured during sampling)

Limitations:

  • Only captures queries active during sampling
  • Short queries may be missed
  • Use with 02. Query analysis for complete picture

Understanding wait events​

CPU waits​

High CPU wait is normal for compute-bound workloads. Investigate if:

  • CPU wait is disproportionate to actual CPU usage
  • Query throughput is lower than expected

I/O waits​

Common I/O wait events:

EventCauseSolution
DataFileReadReading data not in cacheIncrease shared_buffers, add indexes
DataFileExtendGrowing data filesPre-extend files, faster storage
DataFileFlushFlushing dirty pagesTune checkpoint_* settings
WALSyncSyncing WALFaster WAL storage, adjust commit_delay

Lock waits​

Lock wait events indicate heavyweight lock contention:

EventCauseSolution
Lock:tupleRow-level lock waitAvoid multiple sessions modifying same rows; use FOR UPDATE SKIP LOCKED or NOWAIT
Lock:transactionidWaiting for transactionFind and address long transactions
Lock:relationTable-level lockUse low lock_timeout with retries for DDL. See Zero-downtime schema migrations

LWLock waits​

Lightweight locks are internal to PostgreSQL:

EventCauseSolution
LWLock:BufferContentConcurrent access to same buffer pageScale up instance, partition tables, reduce foreign keys, address bloat with VACUUM FULL or pg_repack
LWLock:LockManagerFast-path lock limit exceeded (16 locks/backend)Enable partition pruning, reduce indexes, use connection pooler. See Postgres Marathon series

Variables​

VariablePurpose
wait_event_typeFilter by wait event type
wait_eventFilter by specific wait event
cluster_nameCluster filter
node_nameNode filter
db_nameDatabase filter

Troubleshooting​

No wait event data​

  1. Verify pgwatch is collecting metrics:

    docker compose logs pgwatch-postgres pgwatch-prometheus | grep -i wait
  2. Check VictoriaMetrics has wait event data (host port 59090, VM basic auth):

    curl -u "$VM_AUTH_USERNAME:$VM_AUTH_PASSWORD" \
    'http://localhost:59090/api/v1/query?query=pgwatch_wait_events_total'
  3. Ensure pg_stat_activity is accessible to the monitoring user

Wait events don't match RDS Performance Insights​

Different collection mechanisms and methodologies may cause variations. PostgresAI monitoring does not use a sub-second sampler. Instead, the wait_events metric is a snapshot count(*) of active sessions in pg_stat_activity (WHERE state = 'active'), grouped by wait_event_type / wait_event. This snapshot is collected on the metric interval — every 15 seconds in the full preset — and then aggregated into time buckets for visualization.

RDS Performance Insights samples at a higher frequency (roughly once per second), so absolute counts and short-lived waits can differ.

"Other" category too large​

The "Other" category contains infrequent or miscellaneous waits. If large:

  1. Check for uncommon extensions
  2. Review background worker activity
  3. Consider if sampling rate is appropriate