postgres_ai monitoring reference documentation

Metrics

Note: This page lists the user-facing metric groups. The pgwatch collector ships with additional groups that are emitted but not yet documented here (for example, pg_stat_slru, pg_statio_all_tables, pg_statio_all_indexes, multixact_size, pg_index_pilot, stats_reset, table_size_detailed). To see the full set of metrics emitted by your monitoring instance, query the Prometheus / VictoriaMetrics endpoint directly (e.g. /api/v1/label/__name__/values).
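As a rough illustration of querying that endpoint, the sketch below builds the label-values URL and extracts metric names from the standard Prometheus JSON response. The base URL `http://localhost:9090` and all function names are assumptions made for this example; substitute the address of your own Prometheus or VictoriaMetrics instance.

```python
import json
import urllib.request

# Assumption: address of a Prometheus-compatible endpoint; adjust to your setup.
BASE_URL = "http://localhost:9090"


def label_values_url(base_url: str, label: str = "__name__") -> str:
    """Build the label-values API URL for the given label."""
    return f"{base_url.rstrip('/')}/api/v1/label/{label}/values"


def parse_metric_names(body: str, prefix: str = "") -> list:
    """Extract metric names from a label-values JSON response.

    Only names starting with `prefix` are kept; the result is sorted.
    """
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    return sorted(name for name in payload["data"] if name.startswith(prefix))


def fetch_metric_names(base_url: str = BASE_URL, prefix: str = "") -> list:
    """Fetch and parse the metric names (performs a network request)."""
    with urllib.request.urlopen(label_values_url(base_url), timeout=10) as resp:
        return parse_metric_names(resp.read().decode("utf-8"), prefix)
```

Calling `fetch_metric_names(prefix="pg_stat_io_")`, for instance, would list only the names in that group, which is a quick way to check whether an undocumented group is being emitted.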

Common labels

Most metrics include these standard labels:

  • cluster - Cluster identifier (e.g., "default")
  • datname - Database name being monitored
  • env - Environment (e.g., "production")
  • instance - pgwatch instance identifier
  • job - Prometheus job name
  • node_name - Database node name
  • sink_type - Metrics sink type (e.g., "prometheus")
  • sys_id - System identifier

Metric-specific labels

Additional labels are available for specific metric types:

  • Query metrics (pg_stat_statements): queryid, datname
  • Table metrics (table_stats, pg_stat_all_tables): schema, table_name, table_full_name, table_size_cardinality_mb
  • Index metrics (pg_stat_all_indexes): schemaname, relname, indexrelname
  • Lock metrics (locks_mode): lockmode
  • Wait events (wait_events): wait_event, wait_event_type
  • Replication metrics: application_name, client_info, usename
  • Settings metrics: setting_name, setting_value, unit, category, vartype
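The common and metric-specific labels above are what you filter on in PromQL. The following sketch assembles an instant-vector selector from a metric name and label values; the helper name `promql_selector` is hypothetical and is not part of pgwatch or postgres_ai.

```python
def promql_selector(metric: str, **labels: str) -> str:
    """Build a PromQL selector such as metric{label="value",...}.

    Labels are emitted in sorted order so the output is deterministic.
    """

    def escape(value: str) -> str:
        # Escape the characters that are special inside a PromQL string literal.
        return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")

    if not labels:
        return metric
    matchers = ",".join(f'{key}="{escape(val)}"' for key, val in sorted(labels.items()))
    return f"{metric}{{{matchers}}}"


# Example: locks of one mode in the "default" cluster.
print(promql_selector("locks_mode_count", cluster="default", lockmode="AccessShareLock"))
```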

Background writer metrics (bgwriter, checkpointer)

Collected every 30 seconds

| Metric | Description | Units |
|---|---|---|
| bgwriter_checkpoints_timed | Number of scheduled checkpoints performed | - |
| bgwriter_checkpoints_req | Number of requested checkpoints performed | - |
| bgwriter_checkpoint_write_time | Time spent writing checkpoint data to disk | Milliseconds |
| bgwriter_checkpoint_sync_time | Time spent syncing checkpoint data to disk | Milliseconds |
| bgwriter_buffers_checkpoint | Buffers written during checkpoints | - |
| bgwriter_buffers_clean | Buffers cleaned by background writer | - |
| bgwriter_maxwritten_clean | Times background writer stopped due to max write limit | - |
| bgwriter_buffers_backend | Buffers written directly by backends | - |
| bgwriter_buffers_backend_fsync | Times backends performed direct fsync | - |
| bgwriter_buffers_alloc | Buffers allocated | - |
| bgwriter_last_reset_s | Seconds since bgwriter stats reset | Seconds |
| checkpointer_num_timed | Number of timed checkpoints performed | - |
| checkpointer_num_requested | Number of requested checkpoints performed | - |
| checkpointer_restartpoints_timed | Number of timed restart points performed | - |
| checkpointer_restartpoints_req | Number of requested restart points performed | - |
| checkpointer_restartpoints_done | Number of restart points completed | - |
| checkpointer_write_time | Time spent writing checkpoint data | Milliseconds |
| checkpointer_sync_time | Time spent syncing checkpoint data | Milliseconds |
| checkpointer_buffers_written | Number of buffers written by checkpointer | - |
| checkpointer_last_reset_s | Seconds since stats reset | Seconds |

Database statistics (db_stats, db_size, pg_stat_activity)

Collected every 15-30 seconds

| Metric | Description | Units |
|---|---|---|
| db_stats_numbackends | Number of active connections to database | - |
| db_stats_xact_commit | Transactions committed | - |
| db_stats_xact_rollback | Transactions rolled back | - |
| db_stats_blks_read | Disk blocks read | - |
| db_stats_blks_hit | Buffer cache hits | - |
| db_stats_tup_returned | Rows returned by queries | - |
| db_stats_tup_fetched | Rows fetched by queries | - |
| db_stats_tup_inserted | Rows inserted | - |
| db_stats_tup_updated | Rows updated | - |
| db_stats_tup_deleted | Rows deleted | - |
| db_stats_conflicts | Recovery conflicts | - |
| db_stats_temp_files | Temporary files created | - |
| db_stats_temp_bytes | Temporary file bytes written | Bytes |
| db_stats_deadlocks | Deadlocks detected | - |
| db_stats_blk_read_time | Time spent reading data file blocks | Milliseconds |
| db_stats_blk_write_time | Time spent writing data file blocks | Milliseconds |
| db_stats_postmaster_uptime_s | Postgres server uptime | Seconds |
| db_stats_backup_duration_s | Current backup duration | Seconds |
| db_stats_in_recovery_int | Whether instance is in recovery mode | Boolean (0/1) |
| db_stats_invalid_indexes | Count of invalid indexes | - |
| db_stats_session_time | Time spent in database sessions | Milliseconds |
| db_stats_active_time | Time spent executing queries | Milliseconds |
| db_stats_idle_in_transaction_time | Time spent idle in transactions | Milliseconds |
| db_stats_sessions | Total number of sessions | - |
| db_stats_sessions_abandoned | Sessions abandoned | - |
| db_stats_sessions_fatal | Sessions ended fatally | - |
| db_stats_sessions_killed | Sessions killed | - |
| db_size_size_b | Database size in bytes | Bytes |
| db_size_catalog_size_b | Catalog schema size in bytes | Bytes |
| pg_stat_activity_count | Count of sessions by state (additional labels: state, application_name) | - |
| pg_stat_activity_max_tx_duration | Maximum transaction duration (additional labels: state, application_name) | Seconds |
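One common derived value from this group is the buffer cache hit ratio. The sketch below assumes that db_stats_blks_hit and db_stats_blks_read behave as cumulative counters (as the underlying pg_stat_database columns do), so it works on the increase of each over the same time window. The function name is hypothetical.

```python
from typing import Optional


def cache_hit_ratio(hit_delta: float, read_delta: float) -> Optional[float]:
    """Return blks_hit / (blks_hit + blks_read) for one time window.

    Both arguments are increases of the counters over the same window.
    Returns None when there was no block access, so callers can tell
    "no data" apart from a ratio of zero.
    """
    total = hit_delta + read_delta
    if total <= 0:
        return None
    return hit_delta / total


# Example: 9,900 hits and 100 disk reads in the window.
ratio = cache_hit_ratio(9_900, 100)
print(f"{ratio:.1%}")
```

Working on increases rather than raw values keeps the ratio representative of recent activity instead of the whole period since the last stats reset.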

Query performance (pg_stat_statements)

Collected every 30 seconds

Additional Labels: queryid (query identifier), datname (database name)

| Metric | Description | Units |
|---|---|---|
| pg_stat_statements_calls | Number of times query executed | - |
| pg_stat_statements_plans_total | Number of times query planned | - |
| pg_stat_statements_exec_time_total | Total execution time | Milliseconds |
| pg_stat_statements_plan_time_total | Total planning time | Milliseconds |
| pg_stat_statements_rows | Total rows returned/affected | - |
| pg_stat_statements_shared_bytes_hit_total | Shared buffer cache hits | Bytes |
| pg_stat_statements_shared_bytes_read_total | Shared buffer reads from disk | Bytes |
| pg_stat_statements_shared_bytes_dirtied_total | Shared buffer blocks dirtied | Bytes |
| pg_stat_statements_shared_bytes_written_total | Shared buffer blocks written | Bytes |
| pg_stat_statements_block_read_total | Time spent reading blocks | Milliseconds |
| pg_stat_statements_block_write_total | Time spent writing blocks | Milliseconds |
| pg_stat_statements_wal_records | WAL records generated | - |
| pg_stat_statements_wal_fpi | WAL full page images generated | - |
| pg_stat_statements_wal_bytes | WAL bytes generated | Bytes |
| pg_stat_statements_temp_bytes_read | Temporary file bytes read | Bytes |
| pg_stat_statements_temp_bytes_written | Temporary file bytes written | Bytes |
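Because these values are cumulative totals per queryid, per-call figures come from differences between two samples. The sketch below derives mean execution time per call from two readings of pg_stat_statements_calls and pg_stat_statements_exec_time_total; the function name and the reset handling are assumptions for illustration, not the way any dashboard is known to compute it.

```python
from typing import Optional


def mean_exec_time_ms(
    calls_before: float,
    calls_after: float,
    exec_ms_before: float,
    exec_ms_after: float,
) -> Optional[float]:
    """Mean execution time per call (ms) between two samples of one queryid.

    Returns None when no calls happened in the window or when a counter
    went backwards (for example after a pg_stat_statements reset).
    """
    calls = calls_after - calls_before
    exec_ms = exec_ms_after - exec_ms_before
    if calls <= 0 or exec_ms < 0:
        return None
    return exec_ms / calls


# Example: 50 new calls that took 600 ms in total.
print(mean_exec_time_ms(100, 150, 2000.0, 2600.0))
```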

Lock statistics (locks_mode)

Collected every 30 seconds

Additional Labels: lockmode (lock type: AccessShareLock, RowExclusiveLock, etc.)

| Metric | Description | Units |
|---|---|---|
| locks_mode_count | Number of locks held by mode type | - |

Lock waits (lock_waits)

Collected every 30 seconds

Detailed blocked / blocking pairs for lock-contention root cause analysis. New in 0.15: the lock-wait metrics carry session PIDs as labels, so the blocker can be identified directly in Grafana / PromQL without running the manual pg_locks join. Used by Dashboard 13 — Lock contention.

Additional Labels: blocked_pid (PID of the waiting backend), blocker_pid (PID of the blocking backend), blocked_user / blocker_user, blocked_appname / blocker_appname, blocked_table / blocker_table (affected relation), blocked_query_id / blocker_query_id, and datname. Note: blocked_mode / blocker_mode and blocked_locktype / blocker_locktype are emitted as plain (non-tag_) value columns in the metric definition, so they are not Prometheus labels.

| Metric | Description | Units |
|---|---|---|
| pgwatch_lock_waits_blocked_ms | Time the blocked backend has been waiting | Milliseconds |
| pgwatch_lock_waits_blocker_tx_ms | Age of the blocking backend's transaction | Milliseconds |

Terminating a blocker

Pass the value of the blocker_pid label to pg_terminate_backend: select pg_terminate_backend(<blocker_pid>); No manual blocking-chain query is required to find the PID.
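To show how the label feeds into that statement, here is a minimal sketch that turns the label set of one lock-wait series into the SQL text. The function name is hypothetical, and the integer conversion is a precaution added for this example: label values arrive as strings, and nothing but a positive PID should reach the statement.

```python
def terminate_blocker_sql(labels: dict) -> str:
    """Build the pg_terminate_backend statement from a lock-wait label set.

    `labels` is the label dictionary of one lock-wait series, for example
    {"blocked_pid": "4310", "blocker_pid": "4242", ...}.
    """
    # int() raises ValueError for anything that is not a plain integer,
    # which prevents arbitrary text from being spliced into the SQL.
    pid = int(labels["blocker_pid"])
    if pid <= 0:
        raise ValueError(f"invalid blocker_pid: {pid}")
    return f"select pg_terminate_backend({pid});"


print(terminate_blocker_sql({"blocked_pid": "4310", "blocker_pid": "4242"}))
```

The statement is only built here, not executed; check blocker_user and blocker_appname before terminating anything, since the PID may belong to a session you want to keep.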

Wait events (wait_events)

Collected every 15 seconds

Additional Labels: wait_event (specific wait event), wait_event_type (wait category), and on PostgreSQL 14+ query_id (associated query)

| Metric | Description | Units |
|---|---|---|
| wait_events_total | Count of processes experiencing wait event | - |

Table statistics (table_stats, pg_stat_all_tables)

Collected every 30 seconds

Additional Labels: schema (table schema), table_name (table name), table_full_name (schema.table), table_size_cardinality_mb (size category)

| Metric | Description | Units |
|---|---|---|
| table_stats_table_size_b | Table size in bytes | Bytes |
| table_stats_total_relation_size_b | Total relation size including indexes | Bytes |
| table_stats_toast_size_b | TOAST table size | Bytes |
| table_stats_seq_scan | Sequential scans performed | - |
| table_stats_seq_tup_read | Rows read by sequential scans | - |
| table_stats_idx_scan | Index scans performed | - |
| table_stats_idx_tup_fetch | Rows fetched by index scans | - |
| table_stats_n_tup_ins | Rows inserted | - |
| table_stats_n_tup_upd | Rows updated | - |
| table_stats_n_tup_del | Rows deleted | - |
| table_stats_n_tup_hot_upd | HOT updates performed | - |
| table_stats_n_live_tup | Estimated live rows | - |
| table_stats_n_dead_tup | Estimated dead rows | - |
| table_stats_vacuum_count | Manual vacuums performed | - |
| table_stats_autovacuum_count | Autovacuums performed | - |
| table_stats_analyze_count | Manual analyzes performed | - |
| table_stats_autoanalyze_count | Autoanalyzes performed | - |
| table_stats_tx_freeze_age | Transaction freeze age | - |
| table_stats_is_part_root | Whether table is a partition root | Boolean (0/1) |
| table_stats_last_seq_scan_s | Seconds since last sequential scan | Seconds |
| table_stats_no_autovacuum | Whether autovacuum is disabled | Boolean (0/1) |
| table_stats_seconds_since_last_analyze | Seconds since last analyze | Seconds |
| table_stats_seconds_since_last_vacuum | Seconds since last vacuum | Seconds |

Index statistics (pg_stat_all_indexes)

Collected every 30 seconds

Additional Labels: schemaname (schema name), relname (table name), indexrelname (index name)

| Metric | Description | Units |
|---|---|---|
| pg_stat_all_indexes_idx_scan | Index scans performed | - |
| pg_stat_all_indexes_idx_tup_read | Index entries returned | - |
| pg_stat_all_indexes_idx_tup_fetch | Table rows fetched via index | - |

WAL and replication metrics (wal, replication, replication_slots, pg_stat_replication, pg_stat_wal_receiver, pg_archiver, archive_lag, pg_xlog_position)

Collected every 15-30 seconds

| Metric | Description | Units |
|---|---|---|
| wal_xlog_location_b | Current WAL location | Bytes |
| wal_in_recovery_int | Whether instance is in recovery mode | Boolean (0/1) |
| wal_postmaster_uptime_s | Postgres server uptime | Seconds |
| wal_timeline | Current timeline ID | - |
| replication_sent_lag_b | Replication sent lag | Bytes |
| replication_write_lag_b | Replication write lag | Bytes |
| replication_flush_lag_b | Replication flush lag | Bytes |
| replication_replay_lag_b | Replication replay lag | Bytes |
| replication_write_lag_ms | Replication write lag | Milliseconds |
| replication_flush_lag_ms | Replication flush lag | Milliseconds |
| replication_replay_lag_ms | Replication replay lag | Milliseconds |
| archive_lag_current_lsn_numeric | Current LSN as numeric value | - |
| archive_lag_archived_wal_finish_lsn_numeric | Archived WAL finish LSN as numeric | - |
| archive_lag_wal_files_behind | Number of WAL files behind archive | - |
| archive_lag_seconds_since_archive | Seconds since last archive | Seconds |
| archive_lag_archived_count | Total archived WAL files | - |
| archive_lag_failed_count | Failed archive attempts | - |
| pg_archiver_pending_wal_count | Number of WAL files pending archive | - |
| pg_wal_size_bytes | Total size of regular files in the pg_wal directory (via pg_ls_waldir(); excludes subdirectories such as pg_wal/archive_status). New in 0.15. Not emitted when pg_wal_size_status_code > 0. | Bytes |
| pg_wal_size_status_code | pg_wal size collection status: 0 = success, 1 = pg_ls_waldir() not available, 2 = monitoring role lacks EXECUTE privilege. New in 0.15. | - |

Interpreting pg_wal_size

pg_wal growth that is not matched by archive or replica progress points to disk-fill risk, typically from a stuck WAL archiver, an inactive replication slot retaining WAL, or sustained high WAL generation. Cross-reference with the archiver and replication-slot metrics above, and see How to troubleshoot a growing pg_wal directory.
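Since pg_wal_size_bytes is absent whenever pg_wal_size_status_code is non-zero, a consumer should read the status code first. The sketch below shows one way to do that, using the three status codes documented in the table; the function name and the message wording are assumptions for this example.

```python
from typing import Optional

# Status codes as documented for pg_wal_size_status_code.
WAL_SIZE_STATUS = {
    0: "success",
    1: "pg_ls_waldir() not available",
    2: "monitoring role lacks EXECUTE privilege",
}


def describe_wal_size(status_code: int, size_bytes: Optional[float] = None) -> str:
    """Return a human-readable summary of the pg_wal size metrics."""
    if status_code == 0:
        if size_bytes is None:
            return "pg_wal size: no sample yet"
        return f"pg_wal size: {size_bytes / 1024 ** 3:.2f} GiB"
    reason = WAL_SIZE_STATUS.get(status_code, f"unknown status code {status_code}")
    return f"pg_wal size unavailable: {reason}"


print(describe_wal_size(0, 3 * 1024 ** 3))
print(describe_wal_size(2))
```

Treating a missing size as "unavailable" rather than zero avoids a false impression that pg_wal is empty when the collector simply could not measure it.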

xmin horizon (xmin_horizon)

Collected every 30 seconds. Instance-level, primary only. New in 0.15.

Tracks the current xmin horizon age split by blocker class and horizon type. Component *_age_tx / *_count columns emit 0 (never NULL) when that component has no active blocker. Used by Dashboard 07 — Autovacuum and xmin horizon.

| Metric | Description | Units |
|---|---|---|
| xmin_horizon_data_horizon_age_tx | Age of the data horizon (worst of client-backend, slot, standby, prepared-xact blockers) | Transactions |
| xmin_horizon_catalog_horizon_age_tx | Age of the catalog horizon (data-horizon blockers plus replication-slot catalog_xmin blockers) | Transactions |
| xmin_horizon_snapshot_xmin | Raw snapshot xmin anchor (txid_snapshot_xmin(txid_current_snapshot())) | - |
| xmin_horizon_pg_stat_activity_age_tx | Oldest client-backend backend_xmin age | Transactions |
| xmin_horizon_pg_stat_activity_count | Number of client backends holding a horizon | - |
| xmin_horizon_pg_replication_slots_age_tx | Oldest replication-slot xmin age | Transactions |
| xmin_horizon_pg_replication_slots_count | Number of slots holding the data horizon | - |
| xmin_horizon_pg_replication_slots_catalog_age_tx | Oldest replication-slot catalog_xmin age | Transactions |
| xmin_horizon_pg_replication_slots_catalog_count | Number of slots holding the catalog horizon | - |
| xmin_horizon_pg_stat_replication_age_tx | Oldest standby-feedback backend_xmin age | Transactions |
| xmin_horizon_pg_stat_replication_count | Number of standbys holding a horizon | - |
| xmin_horizon_pg_prepared_xacts_age_tx | Oldest prepared-transaction age | Transactions |
| xmin_horizon_pg_prepared_xacts_count | Number of prepared transactions holding a horizon | - |
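The per-component ages make it possible to see which blocker class is holding the data horizon back. The sketch below is one reading of the descriptions in this table (the data horizon as the worst of the four data components); it is not the collector's actual SQL, and the function name is hypothetical.

```python
from typing import Optional, Tuple

# Components that the table describes as contributing to the data horizon.
DATA_COMPONENTS = (
    "pg_stat_activity",
    "pg_replication_slots",
    "pg_stat_replication",
    "pg_prepared_xacts",
)


def worst_data_blocker(ages: dict) -> Tuple[Optional[str], int]:
    """Return (component, age_tx) for the oldest data-horizon blocker.

    `ages` maps a component name to its xmin_horizon_<component>_age_tx
    value. Components report 0 when they have no active blocker, so a
    result of (None, 0) means nothing is holding the horizon.
    """
    component = max(DATA_COMPONENTS, key=lambda name: ages.get(name, 0))
    age = ages.get(component, 0)
    return (component, age) if age > 0 else (None, 0)


sample = {
    "pg_stat_activity": 1_200,
    "pg_replication_slots": 950_000,
    "pg_stat_replication": 0,
    "pg_prepared_xacts": 0,
}
print(worst_data_blocker(sample))
```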

xmin horizon blockers (xmin_horizon_blockers)

Collected every 30 seconds. Instance-level, primary only. New in 0.15.

Captures the single oldest (top) blocker for each currently active component as a separate labeled series, with that blocker's xmin age in transactions. Emits one series per active component and no series for an empty component, so cardinality varies between 0 and 5 per scrape. The monitoring role's own sessions are excluded.

Additional Labels: component (pg_stat_activity, pg_replication_slots, pg_replication_slots_catalog, pg_stat_replication, pg_prepared_xacts), horizon_type (data / catalog), blocker_database, blocker_user, blocker_appname, blocker_state, queryid (for activity blockers), slot_name / slot_type / slot_plugin / slot_xmin_source / slot_status / slot_wal_status (for slot blockers), standby_name (for replication blockers), prepared_gid / owner (for prepared-transaction blockers).

| Metric | Description | Units |
|---|---|---|
| xmin_horizon_blockers_age_tx | xmin age of the top blocker for the labeled component | Transactions |

Query text is not a label

Query text is intentionally not emitted as a Prometheus label. Use the queryid label to look up the query text in pgwatch query storage.

I/O statistics (pg_stat_io, PostgreSQL 16+)

Collected every 30 seconds. Instance-level. New in 0.15.

Collects I/O statistics from the PostgreSQL pg_stat_io view (PostgreSQL 16+). On PostgreSQL 15 and earlier this group emits nothing. Values are aggregated by backend type with a total row added via ROLLUP. Used by Dashboard 14 — I/O statistics.

Additional Labels: backend_type (e.g. client backend, autovacuum worker, background writer, checkpointer, walwriter, or total for the rollup row).

| Metric | Description | Units |
|---|---|---|
| pg_stat_io_reads | Read operations | - |
| pg_stat_io_read_bytes_mb | Data read | MiB |
| pg_stat_io_read_time_ms | Time spent reading | Milliseconds |
| pg_stat_io_writes | Write operations | - |
| pg_stat_io_write_bytes_mb | Data written | MiB |
| pg_stat_io_write_time_ms | Time spent writing | Milliseconds |
| pg_stat_io_writebacks | Writeback operations | - |
| pg_stat_io_writeback_bytes_mb | Data written back | MiB |
| pg_stat_io_writeback_time_ms | Time spent on writebacks | Milliseconds |
| pg_stat_io_fsyncs | fsync operations | - |
| pg_stat_io_fsync_time_ms | Time spent on fsyncs | Milliseconds |
| pg_stat_io_extends | Relation extend operations | - |
| pg_stat_io_extend_bytes_mb | Data added by extends | MiB |
| pg_stat_io_hits | Blocks found in shared buffers | - |
| pg_stat_io_evictions | Buffers evicted to make room | - |
| pg_stat_io_reuses | Buffers reused directly (e.g. ring buffers) | - |
| pg_stat_io_stats_reset_s | Seconds since pg_stat_reset_shared('io') | Seconds |

Bloat analysis metrics (pg_table_bloat, pg_btree_bloat, unused_indexes, rarely_used_indexes, redundant_indexes, pg_invalid_indexes)

Collected every 2-3 hours

| Metric | Description | Units |
|---|---|---|
| pg_table_bloat_real_size_mib | Actual table size | MiB |
| pg_table_bloat_extra_size | Extra space due to bloat | Bytes |
| pg_table_bloat_extra_pct | Percentage of space wasted | Percent |
| pg_table_bloat_bloat_size | Estimated bloat size | Bytes |
| pg_table_bloat_bloat_pct | Estimated bloat percentage | Percent |
| pg_btree_bloat_real_size_mib | Actual index size | MiB |
| pg_btree_bloat_extra_size | Extra space due to index bloat | Bytes |
| pg_btree_bloat_extra_pct | Percentage of index space wasted | Percent |
| pg_btree_bloat_bloat_size | Estimated index bloat size | Bytes |
| pg_btree_bloat_bloat_pct | Estimated index bloat percentage | Percent |
| pg_btree_bloat_fillfactor | Index fill factor | - |
| pg_btree_bloat_is_na | Whether bloat calculation is not available | Boolean (0/1) |

Transaction and process metrics (pg_blocked, pg_long_running_transactions, pg_stuck_idle_in_transaction, pg_txid, pg_database_wraparound, pg_vacuum_progress, pg_total_relation_size)

Collected every 30 seconds

| Metric | Description | Units |
|---|---|---|
| pg_blocked_queries | Number of blocked queries | - |
| pg_long_running_transactions_transactions | Number of long-running transactions | - |
| pg_long_running_transactions_age_in_seconds | Age of longest transaction | Seconds |
| pg_database_wraparound_age_datfrozenxid | Age of database frozen transaction ID | - |
| pg_database_wraparound_age_datminmxid | Age of database minimum multixact ID | - |
| pg_stuck_idle_in_transaction_queries | Number of stuck idle-in-transaction queries | - |
| pg_txid_current | Current transaction ID | - |
| pg_txid_xmin | Minimum transaction ID in snapshot | - |
| pg_txid_xmin_age | Age of minimum transaction ID | - |
| pg_total_relation_size_bytes | Total relation size including indexes | Bytes |
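The raw value of pg_database_wraparound_age_datfrozenxid is easier to judge as a share of the transaction ID space. The sketch below expresses it as a percentage of 2^31, the approximate distance at which 32-bit transaction IDs wrap around in PostgreSQL. The function name is hypothetical, and PostgreSQL refuses new transactions somewhat before this point, so treat the result as an approximation rather than an exact safety margin.

```python
# Approximate XID distance at which wraparound occurs (about 2.1 billion).
XID_WRAPAROUND_LIMIT = 2 ** 31


def wraparound_pct(age_datfrozenxid: int) -> float:
    """Return the datfrozenxid age as a percentage of the wraparound limit."""
    if age_datfrozenxid < 0:
        raise ValueError("age must be non-negative")
    return 100.0 * age_datfrozenxid / XID_WRAPAROUND_LIMIT


# Example: an age of 500 million transactions.
print(f"{wraparound_pct(500_000_000):.1f}% of the way to wraparound")
```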

Configuration settings (settings)

Collected every 5 minutes

Additional Labels: setting_name (parameter name), setting_value (parameter value), unit (value unit), category (setting category), vartype (variable type)

| Metric | Description | Units |
|---|---|---|
| settings_numeric_value | Numeric value of configuration setting | - |
| settings_is_default | Whether setting is at default value | Boolean (0/1) |
| settings_configured | Whether setting is configured | Boolean (0/1) |