Skip to main content

SQL style guide

Copy as markdown | View raw | Edit

Core philosophy

From PEP8:

  • Consistency with this style guide is important
  • Consistency within a project is more important
  • Consistency within one module or function is the most important
  • However, know when to be inconsistent -- sometimes style guide recommendations just aren't applicable

Core rules

  • Use lowercase SQL keywords (not uppercase)
  • Use snake_case for all identifiers (no CamelCase)
  • Names must begin with a letter and may not end in underscore
  • Only use letters, numbers, and underscores in names
  • Be explicit: always use AS for aliases, specify JOIN types
  • Root keywords on their own line (except with single argument)
  • Multi-line arguments must be indented relative to root keyword
  • Use ISO 8601 date format: yyyy-mm-ddThh:mm:ss.sssss
  • Foreign key naming: user_id to reference users table (singular + _id)
  • Use meaningful aliases that reflect the data (not just single letters)

Formatting

Keywords and alignment

-- Root keywords left-aligned
-- Arguments indented relative to root keyword
select
  client_id,
  submission_date
from main_summary
where
  sample_id = '42'
  and submission_date > '20180101'
limit 10;

Comments

/* Block comments for multi-line descriptions */
-- Line comments for single line notes
select
  client_id,  -- user identifier
  submission_date
from main_summary;

Parentheses

-- Opening paren ends the line
-- Closing paren aligns with starting line
-- Contents indented
with sample as (
  select
    client_id,
    submission_date
  from main_summary
  where sample_id = '42'
)

Boolean operators

-- AND/OR at beginning of line
where
  submission_date > '20180101'
  and sample_id = '42'

Table design rules

  • Always add id column of type identity generated always
  • Always add table comments using comment on table...
  • Default to public schema
  • Include schema in queries for clarity
  • Use singular table names with _id suffix for foreign keys

Best practices

  • Use CTEs instead of nested queries
  • Explicit column names in GROUP BY (except for expressions - see below)
  • Functions treated as identifiers: date_trunc() not DATE_TRUNC()
  • One argument per line for multi-argument clauses
  • Use meaningful aliases that reflect the data being selected

GROUP BY exception

-- Acceptable: use numbers to avoid repeating complex expressions
select
  date_trunc('minute', xact_start) as xact_start_minute,
  count(*)
from pg_stat_activity
group by 1
order by 1;

Examples

Good

select
  t.client_id as client_id,
  date(t.created_at) as day
from telemetry as t
inner join users as u
  on t.user_id = u.id
where
  t.submission_date > '2019-07-01'
  and t.sample_id = '10'
group by t.client_id, day;

Bad

SELECT t.client_id, DATE(t.created_at) day
FROM telemetry t, users u
WHERE t.user_id = u.id AND t.submission_date > '2019-07-01'
GROUP BY 1, 2;