Database Lab Engine configuration reference
#
OverviewDatabase Lab Engine behavior can be controlled using the main configuration file that has YAML format. This reference describes available configuration options.
Example config files can be found here: https://gitlab.com/postgres-ai/database-lab/-/tree/v2.2/configs.
Useful guides that help manage Database Lab Engine:
tip
The configuration of Database Lab Engine can be reloaded without downtime:
docker exec -it dblab_test kill -SIGHUP 1
#
The list of configuration sectionsHere is how the configuration file is structured:
Section | Description |
---|---|
global | Global parameters such as path to data directory or enabling debugging. |
server | Database Lab Engine API server. |
poolManager | Manages filesystem pools or volume groups. |
provision | How thin cloning is organized. |
retrieval | Defines the data flow: a series of "jobs" for initial retrieval of the data, and, optionally, continuous data synchronization with the source, snapshot creation and retention policies. The initial retrieval may be either "logical" (dump/restore) or "physical" (based on replication or restoration from a archive). |
cloning | Thin cloning policies. |
platform | Postgres.ai Platform integration (provides GUI, advanced features such as user management, logs). |
observer | CI Observer configuration. CI Observer helps verify database schema changes (database migrations) automatically, in CI/CD pipelines. Available on the Postgres.ai Platform. |
global
: global parameters#
Section engine
- defines the Database Lab Engine. Supported engines:postgres
debug
- allows seeing more in the Database Lab Engine logsdatabase
(key-value, optional) - contains default configuration options of the restored databaseusername
(string, optional, default: "postgres") - a default username for logical/physical restore jobsdbname
(string, optional, default: "postgres") - a default database name for logical/physical restore jobs
server
: Database Lab Engine API server#
Section verificationToken
(string, required) - the token that is used to work with Database Lab APIhost
(string, optional, default: "") - the host to which the Database Lab server accepts HTTP connectionsport
(string, required) - HTTP server port
poolManager
: filesystem pools or volume groups management#
Section mountDir
(string, required) - specifies the location of the pools mount directory (can contain multiple pool directories)dataSubDir
(string, optional, default: "") - specifies the location of restored data by Database Lab Engine relative to the pool which is placed inside the mount directory (mountDir
)clonesMountSubDir
(string, required) - the directory that will be used to mount clonessocketSubDir
(string, required) - the UNIX socket directory that will be used to establish local connections to cloned databasespreSnapshotSuffix
(string, required) - the suffix to denote preliminary snapshots
provision
: thin cloning environment settings#
Section portPool
(key-value, required) - defines a pool of ports for Postgres clonesfrom
(integer, required) - the lowest port value in the poolto
(integer, required) - the highest port value in the pool
dockerImage
(string, required) - the Postgres Docker image that to be used when cloninguseSudo
(boolean, optional, default: false) - use sudo for ZFS/LVM and Docker commands if Database Lab server running outside a containerkeepUserPasswords
(bool, optional, default: "false") - By default, in addition to creating a new user with administrative privileges, Database Lab Engine resets passwords for all existing users. This is done for security reasons. If this behavior is undesirable and you want to keep the ability authenticate for the existing users with their unchanged passwords, then set the value of the variable totrue
.containerConfig
(key-value, optional) - options to pass custom parameters to clone containers
retrieval
: data retrieval#
Section refresh
(key-value, optional) - describes configuration for a full refresh.timetable
(string, optional, default: "") - defines a timetable in crontab format: https://en.wikipedia.org/wiki/Cron#Overview
jobs
(list, optional) - declares the set of running jobs. Stages must be defined in thespec
sectionspec
(key-value, optional) - contains a configuration spec for each job
#
Data retrieval jobsAvailable job names:
logicalDump
logicalRestore
logicalSnapshot
physicalRestore
physicalSnapshot
info
You need to choose either "logical" or "physical" set of jobs. Mixing is not allowed
Note, that all jobs are optional. For example, all of the following approaches defining the initial data retrieval process are allowed:
- You may consider using both
logicalDump
andlogicalRestore
to make a dump to a file and then restore from it - You may use only
logicalRestore
and restore from an already prepared dump file - You may use only
logicalDump
, withoutlogicalRestore
(however, this approach makes sense only if you defineimmediateRestore
option in thelogicalDump
job, to perform dump & restore on-the-fly, without saving the dump to a file)
logicalDump
#
Job Dumps a PostgreSQL database from a provided source to an archive or to the Database Lab Engine instance.
Options:
dumpLocation
(string, required) - the dump file (or directory, for a directory-format archive) will be automatically created on this location on the host machinedockerImage
(string, required) - specifies the Docker image containing the dump-required toolsource
(key-value, required) - describes source of data:type
(string, required) - defines location type of a dumped database. Available values:local
,remote
,rdsIam
connection
(key-value, required) - defines connection parameters of source:dbname
(string, required) - defines the database dbname to be restoredhost
(string, required) - defines hostname of the databaseport
(integer, optional, default: 5432) - defines port of the databaseusername
(string, optional, default: postgres) - defines database username to connect to the databasepassword
(string, optional, default: "") - defines username password to connect to the database; the environment variable PGPASSWORD can be used instead of this option; the environment variable has a higher priority
rdsIam
(key-value, optional) - contains options specific for RDS IAM source typeawsRegion
(string, required) - AWS Region where RDS is locateddbInstanceIdentifier
(string, required) - RDS instance IdentifiersslRootCert
(string, required) - path on the host machine to the SSL root certificate. You can download it from https://s3.amazonaws.com/rds-downloads/rds-combined-ca-bundle.pem
parallelJobs
(integer, optional, default: 1) - defines the number of concurrent jobs using thepg_dump
optionjobs
. This option can dramatically reduce the time to dump a large databasepartial
(key-value, optional) - defines options for partial dumping. Available options:tables
:tables
(list of strings, optional) - dumps definition and/or data of only the listed tables
immediateRestore
(key-value, optional) - provides options for direct restore to a Database Lab Engine instance.forceInit
(boolean, optional, default: false) - init data even if the Postgres directory (see the configuration optionsglobal.mountDir
andglobal.dataSubDir
) is not empty; note the existing data might be overwritten
logicalRestore
#
Job Restores a PostgreSQL database from an archive created by pg_dump in one of the non-plain-text formats.
Options:
dbname
(string, required) - defines the database dbname to be restoreddumpLocation
(string, required) - specifies the location of the archive file (or directory, for a directory-format archive) on the host machine to be restoreddockerImage
(string, required) - specifies the Docker image containing the restore-required toolforceInit
(boolean, optional, default: false) - init data even if the Postgres directory (see the configuration optionsglobal.mountDir
andglobal.dataSubDir
) is not empty; note the existing data might be overwrittenparallelJobs
(integer, optional, default: 1) - defines the number of concurrent jobs using thepg_restore
optionjobs
. This option can dramatically reduce the time to restore a large database to a server running on a multiprocessor machinepartial
(key-value, optional) - defines options for partial restoring. Available options:tables
:tables
(list of strings, optional) - restores definition and/or data of only the listed tables
logicalSnapshot
#
Job Prepares a snapshot for logical restored PostgreSQL database.
Options:
dataPatching
(key-value, optional) - defines SQL queries for data patchingdockerImage
(string, optional) - specifies the Docker image to run a data patching containerqueryPreprocessing
(key-value, optional) - defines pre-processing parametersqueryPath
(string, optional, default: "") - specifies the path to SQL pre-processing queries; an empty string means that no pre-processing definedmaxParallelWorkers
(integer, optional, default: 2) - defines the worker limit for parallel queries
preprocessingScript
(string, optional) - path on the host machine to a pre-precessing scriptconfigs
(key-value, optional) - applies PostgreSQL configuration parameters when preparing a working snapshot. These parameters are inherited by all clones. See also: How to configure PostgreSQL used by Database Lab Engine
physicalRestore
#
Job Restores data from a physical backup.
Supported restore tools:
- WAL-G (
walg
) - is an archival restoration tool (https://github.com/wal-g/wal-g) - Custom (
custom
) - allows defining own command to restore data
Options:
tool
(string, required) - defines the tool to restore data. See available restore tools listdockerImage
(string, required) - specifies the Docker image containing the restoring toolsync
(key-value, optional) - refresh PGDATA after data fetching:enabled
(boolean, optional, default: false) - runs a separate container to refresh Database Lab datahealthCheck
(key-value, optional) - describes health check options for a sync container:interval
(int, optional, default: 5) - health check interval for a data sync container (in seconds)maxRetries
(int, optional, default: 200) - maximum number of health check retries
configs
(key-value, optional) - applies PostgreSQL configuration parameters to the sync instance
envs
(key-value, optional) - passes custom environment variables to the Docker container with the restoring toolwalg
(key-value, optional) - defines WAL-G configuration options:backupName
(string, required) - defines the backup name to restore
customTool
(key-value, optional) - defines configuration options for custom restoring tool:command
(string, required) - defines the command to restore data using a custom toolrestore_command
(string, optional) - defines the PostgreSQLrestore_command
configuration option to refresh data; Database Lab Engine automatically propagates the specified value to the proper location, depending on the version of PostgreSQL: in versions 11 and older, it is to be stored inrecovery.conf
, while in 12 and newer, it is a part of the main file,postgresql.conf
physicalSnapshot
#
Job Prepares a snapshot for physical restored PostgreSQL database.
Options:
skipStartSnapshot
(boolean, optional, default: false) - skip taking a snapshot while the retrieval startspromotion
(key-value, optional) - promotes PGDATA after data fetching:enabled
(boolean, optional, default: false) - enable PGDATA promotiondockerImage
(string, optional) - specifies the Docker image containing the promotion-compatible PostgreSQL instancehealthCheck
(key-value, optional) - describes health check options for a data promotion container:interval
(int, optional, default: 5) - health check interval for a data promotion container (in seconds)maxRetries
(int, optional, default: 200) - maximum number of health check retries
queryPreprocessing
(key-value, optional) - defines pre-processing SQL queriesqueryPath
(string, optional, default: "") - specifies the path to SQL pre-processing queries; an empty string means that no pre-processing definedmaxParallelWorkers
(integer, optional, default: 2) - defines the worker limit for parallel queries
configs
(key-value, optional) - applies PostgreSQL configuration parameters to the promotion instance
sysctls
(key-value, optional) - allows configuring namespaced kernel parameters (sysctls) of Docker container for a promotion stage of taking a snapshot. See supported parameters: https://docs.docker.com/engine/reference/commandline/run/#configure-namespaced-kernel-parameters-sysctls-at-runtimepreprocessingScript
(string, optional) - path on the host machine to a pre-precessing scriptconfigs
(key-value, optional) - applies PostgreSQL configuration parameters to snapshot. These parameters are inherited by all clones. See also: How to configure PostgreSQL used by Database Lab Engineenvs
(key-value, optional) - passes custom environment variables to the promotion Docker containerscheduler
(key-value, required) - contains tasks which run on a schedule:snapshot
(key-value, optional) - defines rules to create a new snapshot on a schedule:timetable
(string, required) - defines a timetable in crontab format: https://en.wikipedia.org/wiki/Cron#Overview
retention
(key-value, optional) - defines rules to clean up old snapshots on a schedule:timetable
(string, required) - defines a timetable in crontab format: https://en.wikipedia.org/wiki/Cron#Overviewlimit
(integer, required) - defines how many snapshots should be held
cloning
: thin cloning policies#
Section accessHost
(string, required) - the host that will be specified in the database connection string to inform users about how to connect to database clonesmaxIdleMinutes
(integer, optional, default: 0) - automatically delete clones after the specified minutes of inactivity
platform
: Postgres.ai Platform integration#
Section url
(string, optional, default: "https://postgres.ai/api/general") - Platform API URLaccessToken
(string, required) - the token for authorization in Platform API. This token can be obtained on the Postgres.ai ConsoleenablePersonalTokens
(boolean, optional, default: false) - enables authorization with personal tokens of the organization's members
observer
: CI Observer configuration#
Section replacementRules
(key-value, optional) - set up rules based on regular expressions (a pair of values"regexp":"replace"
; to check syntax, use this document) for Postgres logs that will be sent to the Platform when running Observed Sessions; this helps ensure that sensitive data is masked properly and it doesn't leave the origin Replacement rules applies to the following log fields:message
,detail
,hint
,internal_query
,query