How to find redundant indexes
Let's discuss today how to find and clean up redundant indexes.
What is a redundant index?
Redundant indexes refer to multiple indexes on a table that serve the same purpose or where one index could satisfy the queries that all others do. Two basic examples:
Duplicate Indexes: Two or more indexes that have exactly the same columns in the same order are clearly redundant.
Overlapping Indexes: When you have a multi-column index, for example on
(a, b)
, and then another index just on(a)
. For queries that only filter on columna
, the index on(a, b)
could suffice, making the index on(a)
potentially unnecessary.
Note that (a)
is redundant to (a, b)
but not to (b, a)
. Accordingly, (b)
is not redundant to (a, b)
.
Further, we'll also assume that:
- unique indexes should not be considered in this kind of analysis because they have special purpose;
- indexes on expression follow the same rules – we just consider each expression similarly as columns, and expression values should be fully matched;
- in case of partial indexes, the conditions should be fully matched (this rule can be adjusted with certain assumptions, but we won't do that);
- we don't consider covering indexes (🎯 TODO: include them as well).
Why to clean up redundant indexes
The same 6 reasons that we discussed for the unused indexes are applicable here.
General algorithm of unused indexes cleanup
Using the query provided below, identify the sets of redundant indexes. It is enough to analyze only one node in a cluster – e.g. the primary. It doesn't matter when stats were reset because this analysis is based only the static information (DB structure).
For each index that is considered redundant, perform a manual analysis to avoid mistakes. If not fully sure, remove such index from consideration.
Optionally, consider the "soft-drop" technique as described in Index maintenance, before actually dropping an index.
After proper analysis and testing, drop one index at a time, closely watching the database performance metrics to ensure there are no unwanted plan changes. To drop indexes, use
DROP INDEX CONCURRENTLY
.
Query to find redundant indexes
Here is a query from
postgres-checkup,
it produces a result in the form of JSON, including the commands to drop indexes using DROP INDEX CONCURRENTLY
.
with const as (
select 0 as min_relpages -- on very large DBs, increase it to, say, 100
), fk_indexes as (
select
n.nspname as schema_name,
ci.relname as index_name,
cr.relname as table_name,
(confrelid::regclass)::text as fk_table_ref,
array_to_string(indclass, ', ') as opclasses
from pg_index i
join pg_class ci on ci.oid = i.indexrelid and ci.relkind = 'i'
join pg_class cr on cr.oid = i.indrelid and cr.relkind = 'r'
join pg_namespace n on n.oid = ci.relnamespace
join pg_constraint cn on cn.conrelid = cr.oid
left join pg_stat_user_indexes si on si.indexrelid = i.indexrelid
where
contype = 'f'
and not i.indisunique
and conkey is not null
and ci.relpages > (select min_relpages from const)
and si.idx_scan < 10
), index_data as ( -- Redundant indexes
select
*,
indkey::text as columns,
array_to_string(indclass, ', ') as opclasses
from pg_index i
join pg_class ci on ci.oid = i.indexrelid and ci.relkind = 'i'
where indisvalid = true and ci.relpages > (select min_relpages from const)
), redundant_indexes as (
select
i2.indexrelid as index_id,
tnsp.nspname AS schema_name,
trel.relname AS table_name,
pg_relation_size(trel.oid) as table_size_bytes,
irel.relname AS index_name,
am1.amname as access_method,
(i1.indexrelid::regclass)::text as reason,
i1.indexrelid as reason_index_id,
pg_get_indexdef(i1.indexrelid) main_index_def,
pg_size_pretty(pg_relation_size(i1.indexrelid)) main_index_size,
pg_get_indexdef(i2.indexrelid) index_def,
pg_relation_size(i2.indexrelid) index_size_bytes,
s.idx_scan as index_usage,
quote_ident(tnsp.nspname) as formated_schema_name,
coalesce(nullif(quote_ident(tnsp.nspname), 'public') || '.', '') || quote_ident(irel.relname) as formated_index_name,
quote_ident(trel.relname) AS formated_table_name,
coalesce(nullif(quote_ident(tnsp.nspname), 'public') || '.', '') || quote_ident(trel.relname) as formated_relation_name,
i2.opclasses
from index_data as i1
join index_data as i2 on
i1.indrelid = i2.indrelid -- the same table
and i1.indexrelid <> i2.indexrelid -- NOT the same index
join pg_opclass op1 on i1.indclass[0] = op1.oid
join pg_opclass op2 on i2.indclass[0] = op2.oid
join pg_am am1 on op1.opcmethod = am1.oid
join pg_am am2 on op2.opcmethod = am2.oid
join pg_stat_user_indexes as s on s.indexrelid = i2.indexrelid
join pg_class as trel on trel.oid = i2.indrelid
join pg_namespace as tnsp on trel.relnamespace = tnsp.oid
join pg_class as irel on irel.oid = i2.indexrelid
where
not i2.indisprimary -- index 1 is not PK
and not ( -- skip if index1 is (primary or uniq) and is NOT (primary and uniq)
i2.indisunique and not i1.indisprimary
)
and am1.amname = am2.amname -- same access type
and i1.columns like (i2.columns || '%') -- index 2 includes all columns from index 1
and i1.opclasses like (i2.opclasses || '%')
-- index expressions is same
and pg_get_expr(i1.indexprs, i1.indrelid) is not distinct from pg_get_expr(i2.indexprs, i2.indrelid)
-- index predicates is same
and pg_get_expr(i1.indpred, i1.indrelid) is not distinct from pg_get_expr(i2.indpred, i2.indrelid)
), redundant_indexes_fk as (
select
ri.*,
(
select count(1)
from fk_indexes fi
where
fi.fk_table_ref = ri.table_name
and fi.opclasses like (ri.opclasses || '%')
) > 0 as supports_fk
from redundant_indexes ri
), redundant_indexes_tmp_num as ( -- Cut recursive links
select row_number() over () num, rig.*
from redundant_indexes_fk rig
), redundant_indexes_tmp_links as (
select
ri1.*,
ri2.num as r_num
from redundant_indexes_tmp_num ri1
left join redundant_indexes_tmp_num ri2 on
ri2.reason_index_id = ri1.index_id
and ri1.reason_index_id = ri2.index_id
), redundant_indexes_tmp_cut as (
select *
from redundant_indexes_tmp_links
where num < r_num or r_num is null
), redundant_indexes_cut_grouped as (
select
distinct(num),
*
from redundant_indexes_tmp_cut
order by index_size_bytes desc
), redundant_indexes_grouped as (
select
index_id,
schema_name,
table_name,
table_size_bytes,
index_name,
access_method,
string_agg(distinct reason, ', ') as reason,
string_agg(main_index_def, ', ') as main_index_def,
string_agg(main_index_size, ', ') as main_index_size,
index_def,
index_size_bytes,
index_usage,
formated_index_name,
formated_schema_name,
formated_table_name,
formated_relation_name,
supports_fk
from redundant_indexes_cut_grouped
group by
index_id,
table_size_bytes,
schema_name,
table_name,
index_name,
access_method,
index_def,
index_size_bytes,
index_usage,
formated_index_name,
formated_schema_name,
formated_table_name,
formated_relation_name,
supports_fk
order by index_size_bytes desc
), redundant_indexes_num as (
select row_number() over () num, rig.*
from redundant_indexes_grouped rig
), redundant_indexes_json as (
select
json_object_agg(coalesce(rin.schema_name, 'public') || '.' || rin.index_name, rin) as json
from redundant_indexes_num rin
), redundant_indexes_total as (
select
sum(index_size_bytes) as index_size_bytes_sum,
sum(table_size_bytes) as table_size_bytes_sum
from redundant_indexes_grouped
), do_lines as (
select
format(
'DROP INDEX CONCURRENTLY %s; -- %s, %s, table %s',
formated_index_name,
pg_size_pretty(index_size_bytes)::text,
reason,
formated_relation_name
) as line
from redundant_indexes_grouped
order by table_name, index_name
), undo_lines as (
select
replace(
format('%s; -- table %s', index_def, formated_relation_name),
'CREATE INDEX',
'CREATE INDEX CONCURRENTLY'
) as line
from redundant_indexes_grouped
order by table_name, index_name
), database_stat as (
select row_to_json(dbstat)
from (
select
sd.stats_reset::timestamptz(0),
age(
date_trunc('minute',now()),
date_trunc('minute',sd.stats_reset)
) as stats_age,
((extract(epoch from now()) - extract(epoch from sd.stats_reset))/86400)::int as days,
(select pg_database_size(current_database())) as database_size_bytes
from pg_stat_database sd
where datname = current_database()
) as dbstat
) -- final result
select
jsonb_pretty(jsonb_build_object(
'redundant_indexes',
(select * from redundant_indexes_json),
'redundant_indexes_total',
(select row_to_json(rit) from redundant_indexes_total as rit),
'do',
(select json_agg(dl.line) from do_lines as dl),
'undo',
(select json_agg(ul.line) from undo_lines as ul),
'database_stat',
(select * from database_stat),
'min_index_size_bytes',
(select min_relpages * current_setting('block_size')::numeric from const)
));