Working with Amazon RDS? See Database Lab tutorial for Amazon RDS.
Database Lab is used to boost software development and testing processes via enabling ultra-fast provisioning of databases of any size.
In this tutorial, we are going to create a "demo" PostgreSQL database and then set up a Database Lab Engine for it. Database Lab Engine will be installed on an AWS EC2 instance (alternatively, it can be an instance on another cloud platform such as GCP, or a bare-metal machine) with Ubuntu 18.04, and an additional EBS volume to store PostgreSQL data directory. Optionally, you'll be able to skip creating the "demo" database and use an existing database instead.
Database Lab Engine is hosted and developed on GitLab.com. Why? GitLab Inc. is our (Postgres.ai) long-term client and an early adopter (see GitLab Development Docs). GitLab has an open-source version. Last but not least: GitLab uses PostgreSQL.
However, nowadays, not many open-source projects are hosted at GitLab.com unfortunately. ⭐️ Please support the project by giving a star on GitLab! It's on the main page of the Database Lab Engine repository, in the upper right corner:
- Prepare a virtual machine (we will use AWS EC2) with an additional disk to store data, install Docker to run containers, and ZFS to enable copy-on-write for thin cloning
- Configure and launch the Database Lab Engine
- Start using Database Lab API and client CLI to clone Postgres database in seconds
Not using AWS? Not a problem: if you want to use any other cloud provider (like Google Cloud) or run your Database Lab on VMWare, or on bare metal; only the first step is going to be slightly different. In general, the overall procedure is pretty much the same.
LVM support as an alternative to ZFS first appeared in version 0.3.0 of Database Lab. Note that LVM, unlike a ZFS, does not support multiple snapshots and related features.
Step 1. Prepare a machine with disk, Docker, and ZFS
Prepare a machine
Create an EC2 instance with Ubuntu 18.04 and an additional EBS volume to store data. You can find detailed instructions on how to create an AWS EC2 instance here (if you want to use Google Cloud, see the GCP documentation).
(optional) Ports need to be open in the Security Group being used
You will need to open the following ports (outbound rules in your Security Group):
22: to connect to the instance using SSH
2345: to work with Database Lab Engine API (can be changed in the Database Lab Engine configuration file)
6000-6100: to connect to PostgreSQL clones (this is the default port range used in the Database Lab Engine configuration file, and can be changed if needed)
For real-life use, it is not a good idea to open ports to the public. Instead, it is recommended to use VPN or SSH port forwarding to access both Database Lab API and PostgreSQL clones, or to enforce encryption for all connections using NGINX with SSL and configuring SSL in PostgreSQL configuration.
Additionally, to be able to install software, allow access to external resources using HTTP/HTTPS (edit the inbound rule in your Security Group):
Here is how the inbound and outbound rules in your Security Group may look like:
If needed, you can find the detailed installation guides for Docker here.
Further, we will need environment variable
$DBLAB_DISK. It must contain the device name that corresponds to the disk where all the Database Lab Engine data will be stored.
To understand what needs to be specified in
$DBLAB_DISK in your case, check the output of
AWS local ephemeral NVMe disks; EBS volumes for instances built on the Nitro system:$ sudo lsblkNAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT...xvda 202:0 0 8G 0 disk└─xvda1 202:1 0 8G 0 part /nvme0n1 259:0 0 777G 0 disk$ export DBLAB_DISK="/dev/nvme0n1"
AWS EBS volumes for older (pre-Nitro) EC2 instances:$ sudo lsblkNAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT...xvda 202:0 0 8G 0 disk└─xvda1 202:1 0 8G 0 part /xvdb 202:16 0 777G 0 disk$ export DBLAB_DISK="/dev/xvdb"
Set up either ZFS or LVM to enable thin cloning
ZFS is a recommended way to enable thin cloning in Database Lab. LVM is also available, but has certain limitations:
- much less flexible disk space consumption and risks for a clone to be destroyed during massive operations in it
- inability to work with multiple snapshots ("time travel"), cloning always happens based on the most recent version of data
Create a new ZFS storage pool (make sure
$DBLAB_DISK has the correct value, see the previous step!):
And check the result using
zfs list and
lsblk, it has to be like this:
Step 2. Configure and launch the Database Lab Engine
To make your work with Database Lab API secure, do not open Database Lab API and Postgres clone ports to the public and instead use VPN or SSH port forwarding. It is also a good idea to encrypt all the traffic: for Postgres clones, set up SSL in the configuration files; and for Database Lab API, install, and configure NGINX with a self-signed SSL certificate. See the How to Secure Database Lab Engine.
Prepare database data directory
Next, we need to get the data to the Database Lab Engine server. For our testing needs, we have 3 options:
- "Generated database": generate a synthetic database for testing purposes
- "Physical copy" (
pg_basebackup): copy an existing database (perform "think cloning" once) using a "physical" method such as
- "Logical copy" (dump/restore): copy an existing database using the "logical" method (dump/restore)
- 1. Generated database
- 2. Physical copy (pg_basebackup)
- 3. Logical copy (dump/restore)
If you don't have an existing database for testing, then let's just generate some synthetic database in the data directory ("PGDATA") located at
/var/lib/dblab/data. A simple way of doing this is to use PostgreSQL standard benchmarking tool,
pgbench. With scale factor
-s 100, the database size will be ~1.4 GiB; feel free to adjust the scale factor value according to your needs.
To generate PGDATA with
pgbench, we are going to run a regular Docker container with Postgres temporarily. We will use
POSTGRES_HOST_AUTH_METHOD=trust to allow a connection without authentication (not suitable for real-life use).
Generate data in the
test database using
PostgreSQL data directory is ready. Now let's stop and remove the container:
Now, we need to take care of Database Lab Engine configuration. Copy the contents of configuration example
config.example.logical_generic.yml from the Database Lab repository to
~/.dblab/server.yml and edit the following options:
- Set secure
server:verificationToken, it will be used to authorize API requests to the Database Lab Engine
- If your Postgres major version is not 12 (default), set the proper version in Postgres Docker images tags:
Launch Database Lab server
How to check the Database Lab Engine logs
Need to start over? Here is how to clean up
If something went south and you need to make another attempt at the steps in this tutorial, use the following steps to clean up:
Step 3. Start cloning!
Install Database Lab client CLI
CLI can be used on any machine, you just need to be able to reach the Database Lab Engine API (port 2345 by default). In this tutorial, we will install and use CLI locally on the EC2 instance.
Initialize CLI configuration:
Check the configuration by fetching the status of the instance:
Create a clone
After a second or two, if everything is configured correctly, you will see that the clone is ready to be used. It should look like this:
Connect to a clone
Now you can work with this clone using any PostgreSQL client, for example,
psql. Use the following connection info (
db section of the response of the
dblab clone create command):
Check the available table:
Now let's see how quickly we can reset the state of the clone. Delete some data or drop some table.
To reset, use the
clone reset command (replace
my_first_clone with the ID of your clone if you changed it). You can do it not leaving psql -- for that, use the
Check the status of the clone:
Notice how fast the resetting was, just a few seconds! 💥
Reconnect to the clone:
Now check the database objects you've dropped or partially deleted – everything should be the same as when we started.
For more, see the full client CLI reference.
GUI (Database Lab Platform)
To use the GUI, you need to sign up for Database Lab Platform.
How to get the Database Lab GUI
Currently, the Database Lab GUI is in "private beta" mode. The onboarding consists of two steps. Step 1: Sign up using a Google, LinkedIn, GitLab, or GitHub account. Step 2: Next, the Postgres.ai team will contact you and schedule a demo, during which your account will be activated.
Add Database Lab Engine to the Platform
- On the Database Lab instances page of your organization click the Add instance button.
- On the Add instance page fill in the following:
Project: choose any project name, it will be created automatically
Verification token: specify the same verification token that you've used in the Database Lab Engine configuration file
URL: Database Lab API server (EC2 instance public IP or hostname, specify port if needed, e.g.
- Click the Verify URL button to check the availability of the Engine. Ignore the warning about insecure connection – in this Tutorial, we have skipped some security-related steps.
- Click the Add button to add the instance to the Platform.
Create a clone
- Go to the Database Lab instance page.
- Click the Create clone button.
- Fill the ID field with a meaningful name.
- (optional) By default, the latest data snapshot (closest to production state) will be used to provision a clone. You can choose another snapshot if any.
- Fill database credentials. Remember the password (it will not be available later, Database Lab Platform does not store it!) – you will need to use it to connect to the clone.
- Click the Create clone button and wait for a clone to be provisioned. The process should take only a few seconds.
- You will be redirected to the Database Lab clone page.
Connect to a clone
- From the Database Lab clone page under section Connection info, copy the psql connection string field contents by clicking the Copy button.
- Here we assume that you have
psqlinstalled on your working machine. In the terminal, type
psqland paste the psql connection string field contents. Change the database name
DBNAMEparameter, you can always use
postgresfor the initial connection.
- Run the command and type the password you've set during the clone creation.
- Test established connection by listing tables in the database using
Reach out to our team, we'll be happy to help! Use the Intercom widget located at the right bottom corner.
👋 Database Lab "Private Beta" program
Database Lab Platform (SaaS) is currently in a "private beta" mode, being tested by several hundred engineers. Want to become an early adopter? Join Database Lab by Postgres.ai "Private Beta" program today: https://postgres.ai/console/.