How to install Database Lab with Terraform on AWS
The most convenient and fastest way to install the Database Lab Engine (DLE) and other Database Lab Platform components is to use our Terraform module. Your source PostgreSQL database can be located anywhere; DLE and the other components will be created under your AWS account on an EC2 instance.
Currently, only the "logical" mode of data retrieval (dump/restore) is supported – the only method available for managed PostgreSQL cloud services such as RDS Postgres, RDS Aurora Postgres, Azure Postgres, or Heroku. "Physical" mode is not yet supported by the module, but it will be in the future. Read more about the various data retrieval options for DLE.
Prerequisites
- AWS cloud account
- You must have AWS Access Keys and a default region in your Terraform environment. To successfully run this Terraform module, the IAM User/Role must have the following permissions:
- Read/Write permissions on EC2
- Read/Write permissions on Route53
- Read/Write permissions on Cloudwatch
- The DLE runs on an EC2 instance that can be accessed using a selected set of SSH keys uploaded to EC2. Use the Terraform parameter `aws_keypair` to specify which EC2 key pair to use.
- AWS Route 53 Hosted Zone for a domain or subdomain you control.
- For instructions on how to set up a Route53 Hosted Zone, see Amazon's Documentation
- Note that if you are using a domain registrar other than Amazon, follow Amazon's guide to making Route53 the DNS provider for your domain name.
- Terraform CLI
- Minimum version: 1.0
- 🚧 Currently, it is assumed that you run `terraform` commands on a Linux or macOS machine. Windows is not yet supported.
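Before proceeding, you can sanity-check that Terraform will see valid AWS credentials and a default region. A quick check, assuming the AWS CLI is installed (the CLI itself is not required by the module):
aws sts get-caller-identity   # Shows the account and IAM identity that will be used.
aws configure get region      # Prints the default region, if one is configured.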
Install Database Lab components using Terraform
The following steps were tested on Intel / Ubuntu 20.04 and Apple M1 / macOS Big Sur. They should work in all modern Linux and macOS environments with no or only minor modifications.
1. Install Terraform (optional)
- SSH to any machine with internet access; it will be used as the deployment machine.
- Install Terraform CLI (see the official guide). Example for Ubuntu:
sudo apt-get update && sudo apt-get install -y gnupg software-properties-common curl
curl -fsSL https://apt.releases.hashicorp.com/gpg | sudo apt-key add -
sudo apt-add-repository "deb [arch=amd64] https://apt.releases.hashicorp.com $(lsb_release -cs) main" # Adjust if you are on an ARM platform.
sudo apt-get update && sudo apt-get install terraform
# Verify installation.
terraform -help
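You can also confirm that the installed version meets the minimum requirement (1.0):
terraform version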
2. Get and configure Database Lab Terraform Module
The module will be available in the Terraform Registry soon; meanwhile, you can clone the module's Git repository and adjust the code to your needs.
- Get our Terraform Module for Database Lab using Git:
git clone https://gitlab.com/postgres-ai/terraform-postgres-ai-database-lab.git
cd terraform-postgres-ai-database-lab/
- To configure parameters used by Terraform (and the Database Lab Engine itself), you will need to modify `terraform.tfvars` and create a `secret.tfvars` file with secrets.
- The variables can be set in multiple ways with the following precedence order (lowest to highest): default values in `variables.tf`, then values defined in `terraform.tfvars`, and finally values passed on the command line (see the override example in the deployment step below).
- All variables starting with `source_` represent the database connection parameters for the source data to be retrieved by DLE. That database must be accessible from the instance hosting the DLE (the one created by Terraform); a quick connectivity check is sketched after the example file below.
- Edit the `terraform.tfvars` file. Variables whose values start with `YOUR_` are required to be configured for your deployment; the others are optional.
dle_version = "3.0.0"
aws_ami_name = "DBLABserver*"
aws_deploy_region = "us-east-1"
aws_deploy_ebs_availability_zone = "us-east-1a"
aws_deploy_ec2_instance_type = "t2.large"
aws_deploy_ec2_instance_tag_name = "DBLABserver-ec2instance"
aws_deploy_ec2_volumes_count = "YOUR_NUMBER_OF_EBS_VOLUMES_FOR_DLE_ZFS_POOLS" # e.g. 2
aws_deploy_ebs_size = "YOUR_INSTANCE_DISK_SIZE" # e.g. "40".
aws_deploy_ebs_type = "gp2"
aws_deploy_allow_ssh_from_cidrs = ["0.0.0.0/0"]
aws_deploy_dns_zone_name = "YOUR_HOSTED_ZONE" # e.g. "mycompany.com".
aws_deploy_dns_api_subdomain = "dle-tf-test" # Requires Route 53 hosted zone setup.
# Data source. You can choose one of two options:
# - direct connection to source DB (source_type = "postgres")
# - dump stored on AWS S3 (source_type = "s3")
# Option 1 – direct Postgres connection.
source_type = "postgres"
source_postgres_version = "YOUR_POSTGRES_VERSION" # e.g. "13".
source_postgres_host = "YOUR_POSTGRES_HOST" # e.g. "ec2-3-215-57-87.compute-1.amazonaws.com".
source_postgres_port = "YOUR_POSTGRES_PORT" # e.g. "5432".
source_postgres_dbname = "YOUR_POSTGRES_DBNAME" # e.g. "postgres".
source_postgres_username = "YOUR_POSTGRES_USERNAME" # e.g. "postgres".
# Option 2 – dump on S3.
# Important: your AWS user has to be able to create IAM roles
# to work with S3 buckets in your AWS account.
# source_type = "s3" # source is dump stored on demo s3 bucket
# source_pgdump_s3_bucket = "YOUR_S3_BUCKET" # e.g. tf-demo-dump", this is an example public bucket.
# source_pgdump_path_on_s3_bucket = "YOUR_PGDUMP_FILEPATH" # e.g. "heroku.dmp" for one-file dump(plain text or compressed plain text)
# or test/heroku_dmp_2 for directory type dump, these are example dumps from the demo database.
# Important: all vCPUs of the EC2 instance will be utilized for the dump/restore, so restore time depends on the EC2 instance class.
dle_debug_mode = "true"
dle_retrieval_refresh_timetable = "0 0 * * 0"
# Include all libraries your installation is using.
# Database Lab DB Migration Checker requires the "logerrors" extension (works with Postgres 10 and newer).
postgres_config_shared_preload_libraries = "pg_stat_statements,logerrors"
platform_project_name = "aws_test_tf"
# SSH public keys can be copied to the provisioned machine to allow connecting as the ubuntu user.
# Keys can be provided as a list of local file names. All listed files must exist and be accessible.
ssh_public_keys_files_list = ["~/.ssh/id_rsa.pub"]
# or provide them as string (it is OK to use both options at the same time)
ssh_public_keys_list = [
"ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDhbblazDXCFEc21DtFzprWC8DiqidnVRROzp6J6BeJR9+XydPUtl0Rt2mcNvxL5ro5bI9u5JRW8aDd6s+Orpr66hEDdwQTbT1wp5nyduFQcT3rR+aeDSilQvAHjr4/z/GZ6IgZ5MICSIh5hJJagHoxAVqeS9dCA27tv/n2T2XrxIUeBhywH1EmfwrnEw97tHM8F+yegayFDI1nVOUWUIxFMaygMygix8uKbQ2fl4rkkxG2oEx7uyAFMXHt4bewNbZuAp8b/b5ODL6tGHuHhcwfbWGriCO+l7UOf1K9maTx00o4wkzAPyd+qs70y/1iMX2YOOLYaYYdptEnFal2DVoD [email protected]"
]
# If no keys are provided, the provisioned machine allows connection with ${aws_deploy_ec2_instance_tag_name}.pem only. This file is created locally by Terraform.
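Before applying, it can be useful to verify that the source database is reachable with the `source_*` values you set. A minimal sketch using `pg_isready` and `psql` (host, port, user, and database are the illustrative values from the example above):
# Quick reachability check; no password required.
pg_isready -h ec2-3-215-57-87.compute-1.amazonaws.com -p 5432
# Full login check; prompts for the password.
psql -h ec2-3-215-57-87.compute-1.amazonaws.com -p 5432 -U postgres -d postgres -c 'select version();'
Keep in mind that the DLE instance created by Terraform must also be able to reach this host; a check from your deployment machine is only a first approximation.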
- Create `secret.tfvars` containing `platform_access_token`, `source_postgres_password`, and `vcs_github_secret_token`. An example:
# Database Lab Platform.
# Open https://console.postgres.ai/, choose your organization,
# then "Access tokens" in the left menu, generate token under "Add token"
# section, "Personal token" should be unchecked.
platform_access_token = "YOUR_PLATFORM_ACCESS_TOKEN"
# Postgres password, set only if you are using a direct Postgres connection.
source_postgres_password = "YOUR_POSTGRES_PASSWORD"
# GitHub token. To generate, open https://github.com/settings/tokens/new.
vcs_github_secret_token = "YOUR_VCS_SECRET_TOKEN"
If you need to use your own PostgreSQL configuration parameters (copying them into the `postgresql_clones_custom.conf` file), make sure there is a private key in this Terraform module directory to access the EC2 machine.
- Set environment variables with AWS credentials:
# Browse to AWS Console / My security credentials / Access keys for CLI, SDK, & API access.
# Create or use existing Access Key.
# Link: https://console.aws.amazon.com/iam/home#/security_credentials?credentials=iam
export AWS_ACCESS_KEY_ID="YOUR_KEY_ID"
export AWS_SECRET_ACCESS_KEY="YOUR_SECRET_ACCESS_KEY"
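If your shell environment does not already define a default region, you can export it the same way (optional; the module's deployment region is set separately via the `aws_deploy_region` variable):
export AWS_DEFAULT_REGION="us-east-1"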
- Review and adjust the `postgresql_clones_custom.conf` file. This file will be added to all Postgres instances in clones created by DLE. If you don't need any custom parameters, leave this file empty, but do not delete it.
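If you do want custom parameters there, here is a minimal hypothetical example; the parameter choice is purely illustrative, not a recommendation:
cat > postgresql_clones_custom.conf <<'EOF'
# Example only: log the duration of every statement in clones.
log_min_duration_statement = 0
EOF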
3. Run deployment
- Initialize Terraform working directory:
terraform init
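- Optionally, preview the planned changes before applying (a standard Terraform step):
terraform plan -var-file="secret.tfvars"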
- Deploy:
terraform apply -var-file="secret.tfvars" -auto-approve
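- As noted in the configuration step, values passed on the command line take the highest precedence and override both `variables.tf` defaults and `terraform.tfvars`. For example (the overridden variable and value here are purely illustrative):
terraform apply -var-file="secret.tfvars" -var "aws_deploy_ec2_instance_type=t3.large" -auto-approve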
- If everything goes well, you should get an output like this:
vcs_db_migration_checker_verification_token = "gsio7KmgaxECfJ80kUx2tUeIf4kEXZex"
dle_verification_token = "zXPodd13LyQaKgVXGmSCeB8TUtnGNnIa"
ec2_public_dns = "ec2-11-111-111-11.us-east-2.compute.amazonaws.com"
ec2instance = "i-0000000000000"
ip = "11.111.111.11"
platform_joe_signing_secret = "lG23qZbUh2kq0ULIBfW6TRwKzqGZu1aP"
public_dns_name = "demo-api-engine.aws.postgres.ai" # todo: this should be URL, not hostname – further we'll need URL, with protocol – `https://`
- To verify the result and check the progress, you might want to connect to the just-created EC2 machine using the IP address or hostname from the Terraform output. In our example, it can be done using this one-liner:
echo "sudo docker logs dblab_server -f" | ssh [email protected] -i postgres_ext_test.pem
If you see a message like the following, the DLE server started successfully and is waiting for your commands:
2021/07/02 10:28:51 [INFO] Server started listening on :2345.
You can find more about DLE logs and configuration on this page.
4. Set up Database Lab Platform
🚀 We are working on automating this step; the auto-registration feature will be available soon.
Sign in to the Postgres.ai Platform and register your new DLE server:
- Go to `Database Lab / Instances` in the left menu.
- Press the `Add instance` button.
  - `Project` – specify any name (this is how your DLE server will be named in the Platform).
  - `Verification token` – use the token generated above (the `dle_verification_token` value); do NOT press the "Generate" button here!
  - `URL` – use the value generated above.
- Click the `Verify URL` button to check the connectivity, then click the `Add` button to register the DLE instance. If everything is right, you should see the DLE page with a green "OK" status.
Add the Joe chatbot for an efficient SQL optimization workflow:
- Go to the `SQL Optimization / Ask Joe` page using the left menu, click the `Add instance` button, and specify the same project as you defined in the previous step.
  - `Signing secret` – use `platform_joe_signing_secret` from the Terraform output.
  - `URL` – use the `public_dns_name` value from the Terraform output with port `444`; in our example, it's `https://demo-api-engine.aws.postgres.ai:444`.
- Click the `Verify URL` button to check the connectivity, then click the `Add` button to register the Joe instance.
Now you can start using the Joe chatbot for troubleshooting SQL execution plans and verifying optimization ideas. As a quick test, go to `SQL Optimization / Ask Joe` in the left menu, select the instance, and enter the `\dt+` command (a psql command to show the list of tables with sizes). You should see how Joe creates a thin clone behind the scenes and immediately runs this psql command, presenting the result to you.
5. Set up Database Migration Checker
- Prepare a repository with your DB migrations (Flyway, Sqitch, Liquibase, etc.)
- Add secrets:
  - `DLMC_CI_ENDPOINT` – an endpoint of your Database Lab Migration Checker service; use `vcs_db_migration_checker_registration_url` from the Terraform output.
  - `DLMC_VERIFICATION_TOKEN` – a verification token for the Database Lab Migration Checker API; use `vcs_db_migration_checker_verification_token` from the Terraform output.
- Configure a new workflow in the created repository (see an example configuration: https://github.com/postgres-ai/green-zone/blob/master/.github/workflows/main.yml):
  - add a custom action: https://github.com/marketplace/actions/database-lab-realistic-db-testing-in-ci
  - provide input params for the action (see the full list of available input params)
  - provide environment variables:
    - `DLMC_CI_ENDPOINT` – use the CI Checker endpoint from the repository secrets
    - `DLMC_VERIFICATION_TOKEN` – use the verification token from the repository secrets
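If you use the GitHub CLI, both repository secrets can be added from a terminal instead of the web UI. A sketch, assuming `gh` is installed and authenticated; substitute the values from your Terraform output:
gh secret set DLMC_CI_ENDPOINT --body "YOUR_REGISTRATION_URL"
gh secret set DLMC_VERIFICATION_TOKEN --body "YOUR_VERIFICATION_TOKEN"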
6. Install and try the client CLI (`dblab`)
- Follow the guide to install Database Lab CLI
- Initialize CLI:
dblab init --environment-id=<ANY NAME FOR ENVIRONMENT> --url=https://<public_dns_name> --token=<your_personal_token_from_postgres_ai_platform>
- Try it:
dblab instance status
It should return the OK status:
{
"status": {
"code": "OK",
"message": "Instance is ready"
},
...
}
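As a next step, you can create your first clone from the CLI. A sketch; the clone ID, username, and password below are illustrative (see the CLI guide linked above for all available flags):
dblab clone create --username dblab_user_1 --password secret_password --id my_first_clone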
Important Note
When the DLE creates new database clones, it makes them available on incremental ports in the 6000 range (e.g., 6000, 6001, ...). The DLE CLI will also report that a clone is available on a port in the 6000 range. However, note that these are the ports used when accessing the DLE from `localhost`. This Terraform module deploys Envoy to handle SSL termination and port forwarding for connections to the clones provisioned by the DLE.
Bottom line: when connecting to clones, add 3000 to the port number reported by the DLE CLI. For example, if the CLI reports that a new clone is available at port `6001`, connect to that clone at port `9001`.
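For example, a connection to that clone might look like this (a sketch; hostname, user, and database are the illustrative values used earlier in this guide):
# The CLI reported port 6001, so we connect through Envoy on 6001 + 3000 = 9001.
psql "host=demo-api-engine.aws.postgres.ai port=9001 user=dblab_user_1 dbname=postgres"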
Known Issues
Certificate Authority Authorization (CAA) for your Hosted Zone
Depending on your DNS provider and configuration, you may need to create a CAA record in your hosted zone. On instance creation, this Terraform module will use Let's Encrypt to generate a valid SSL certificate. For that to succeed, Let's Encrypt must be recognized as a valid issuing CA by your domain. To do this, add a DNS record that looks like this:
| Domain | Record type | Value |
|--------|-------------|-------|
| example.com. | CAA | 0 issue "letsencrypt.org" |
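You can verify that the record is visible in DNS with `dig` (assuming it is installed):
dig +short example.com CAA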
Troubleshooting
If you need help deploying the DLE, here are two great ways to get it:
- Join the Database Lab Community Slack
- Reach out to the Postgres.ai team on Intercom chat widget (located at the bottom right corner)
Reporting Issues & Contributing
We want to make deploying and managing the Database Lab Engine as easy as possible! Please report bugs and submit feature ideas using GitLab Issues.