The Database Lab Engine (DLE) is an open-source technology that enables thin cloning for PostgreSQL. Thin clones are exceptionally useful when you need to scale the development process. DLE can manage dozens of independent clones of your database on a single machine, so each engineer or automation process works with its own database, provisioned in seconds at no extra cost.
DLE 2.4 brings two major capabilities to those who are interested in working with PostgreSQL thin clones:
- a new component, DB Migration Checker, that automates DB migration testing in CI/CD pipelines
- a Terraform module to deploy DLE in AWS using the "logical" provisioning mode
Additionally, this release has a lot of improvements and fixes.
DB migrations are database schema and data changes, usually controlled by a special tool that tracks the changes in Git. There are many such tools: Flyway, Liquibase, and Active Record Migrations, to name a few. These tools are necessary for keeping DB schema changes sane, reliable, and predictable.
However, in most cases, testing of these changes in CI/CD is weak because it runs against either an empty database or a tiny, mocked one. As a result, as databases and workloads grow, deployments of DB changes fail more often. The problem can be annoying enough that some people consider switching to schemaless NoSQL databases to forget about such issues, only to eventually meet a bunch of others: data inconsistency or update difficulties caused by a lack of normalization.
With DLE 2.4 and its DB Migration Checker component, it becomes easy to get realistic testing using thin clones of PostgreSQL databases of any size right in CI/CD pipelines. With this technology, you can drastically decrease the risk of deploying harmful DB schema changes and continue using PostgreSQL, with its excellent feature set and reliability, without compromising development speed.
For a basic demonstration of realistic testing of DB migrations in CI/CD pipelines, let's build an index on a table that has a significant number of rows. You can see the details of this testing in this GitHub PR: https://github.com/postgres-ai/green-zone/pull/4.
If we use CREATE INDEX to build an index on a table with data, forgetting to add CONCURRENTLY, this will block all writes to the table (INSERT, UPDATE, DELETE) while CREATE INDEX is running. The larger our table is, the more noticeable the negative effect on our production workload will be. In the case of large tables, such mistakes cause partial downtime resulting in direct income and/or reputation losses, depending on the type of business.
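To make the mistake concrete, here is a minimal sketch of a static check that flags CREATE INDEX statements written without CONCURRENTLY. It is only an illustration and not part of DLE or DB Migration Checker: the function names, the table and index names, and the naive comment and statement splitting are all assumptions. A check like this cannot tell how long a lock would actually be held on real data, which is what testing on a thin clone adds.

```python
import re
from typing import List

# Matches "CREATE [UNIQUE] INDEX" and captures an optional CONCURRENTLY keyword.
CREATE_INDEX_RE = re.compile(
    r"\bCREATE\s+(?:UNIQUE\s+)?INDEX\s+(CONCURRENTLY\b)?",
    re.IGNORECASE,
)


def strip_sql_comments(sql: str) -> str:
    """Remove /* */ and -- comments (naive: does not understand string literals)."""
    sql = re.sub(r"/\*.*?\*/", " ", sql, flags=re.DOTALL)
    return re.sub(r"--[^\n]*", " ", sql)


def find_blocking_index_builds(sql: str) -> List[str]:
    """Return every CREATE INDEX statement that lacks CONCURRENTLY."""
    flagged = []
    # Naive statement split on ";" (hypothetical helper logic, fine for simple migrations).
    for statement in strip_sql_comments(sql).split(";"):
        match = CREATE_INDEX_RE.search(statement)
        if match and match.group(1) is None:
            flagged.append(" ".join(statement.split()))
    return flagged


# Hypothetical migration: CONCURRENTLY is commented out in the first statement.
migration = """
CREATE INDEX /* CONCURRENTLY */ i_orders_created ON orders (created_at);
CREATE INDEX CONCURRENTLY i_orders_user ON orders (user_id);
"""
print(find_blocking_index_builds(migration))
# ['CREATE INDEX i_orders_created ON orders (created_at)']
```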
To demonstrate how Database Lab Engine catches this problem during automated testing in CI/CD, I'm going to use a GitHub repository with some example DB migrations (managed by Sqitch) for our Demo database that contains random data. As you can see in commit 839be90, I commented out the word CONCURRENTLY. Once git push is done and the GitHub Action has finished, we can see that our change was marked as failed by DLE's DB Migration Checker:
Let's open this job and see the details:
What happened here? Behind the scenes, a pre-installed DLE server (in AWS) quickly provisioned a thin clone of the Demo database. Next, the DB change was applied to this clone, and DB Migration Checker collected telemetry. The telemetry made it clear that such a change is going to hold an AccessExclusiveLock, blocking other queries for a significant time (according to the settings, longer than 10 seconds). Therefore, this change was marked as failed in CI/CD. This is exactly the protection we need to avoid deploying such changes to production.
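The pass/fail decision described above can be sketched roughly as follows. This is an assumption-laden illustration, not DB Migration Checker's actual code: the `LockObservation` shape, the function names, and the `MAX_LOCK_SECONDS` constant are hypothetical, and only the 10-second threshold and the AccessExclusiveLock mode come from the example above.

```python
from dataclasses import dataclass
from typing import List

# Threshold taken from the example above; the real setting name is not shown here.
MAX_LOCK_SECONDS = 10.0

# Lock modes treated as dangerous in this sketch (assumption: only this one).
DANGEROUS_MODES = {"AccessExclusiveLock"}


@dataclass
class LockObservation:
    """One lock seen while the migration ran in the clone (hypothetical shape)."""
    relation: str
    mode: str
    held_seconds: float


def dangerous_locks(
    observations: List[LockObservation],
    max_seconds: float = MAX_LOCK_SECONDS,
) -> List[LockObservation]:
    """Return dangerous locks that were held longer than the threshold."""
    return [
        o for o in observations
        if o.mode in DANGEROUS_MODES and o.held_seconds > max_seconds
    ]


def migration_passes(
    observations: List[LockObservation],
    max_seconds: float = MAX_LOCK_SECONDS,
) -> bool:
    """A migration passes only if no dangerous lock exceeded the threshold."""
    return not dangerous_locks(observations, max_seconds)


# Hypothetical telemetry for a migration that ran for 42.5 seconds.
observed = [
    LockObservation("orders", "AccessExclusiveLock", 42.5),
    LockObservation("orders", "RowExclusiveLock", 42.5),
]
print(migration_passes(observed))  # False
```

The point of measuring on a clone of realistic size is that `held_seconds` reflects how long the lock would plausibly last in production, which an empty test database cannot show.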
Of course, if we put the word CONCURRENTLY back (as I did in commit 6059bf4), we'll have our "green light":
- Automated: DB migration testing in CI/CD pipelines
- Realistic: test results are realistic because real or close-to-real (the same size but no personal data) databases are used, thin-cloned in seconds, and destroyed after testing is done
- Fast and inexpensive: a single machine with a single disk can operate dozens of independent thin clones
- Well-tested DB changes to avoid deployment failures: DB Migration Checker automatically detects (and prevents!) long-lasting dangerous locks that could take your production systems down
- Secure: DB Migration Checker runs all tests in a secure environment: data cannot be copied outside the secure container
- Lots of helpful data points: collect useful artifacts (such as pg_stat_*** system views) and use them to empower your DB changes review process
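As one way such artifacts might be used in a review, here is a small sketch that compares two snapshots of cumulative counters taken before and after a migration. It does not reflect how DB Migration Checker itself processes its artifacts: the function name and the sample numbers are made up, while the counter names are borrowed from columns of PostgreSQL's pg_stat_user_tables view.

```python
from typing import Dict


def counter_delta(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, int]:
    """Return after - before for every counter whose value changed."""
    return {
        name: after[name] - before.get(name, 0)
        for name in after
        if after[name] != before.get(name, 0)
    }


# Hypothetical snapshots for one table, taken before and after the migration.
before = {"seq_scan": 12, "idx_scan": 340, "n_tup_upd": 0}
after = {"seq_scan": 13, "idx_scan": 340, "n_tup_upd": 1000000}

print(counter_delta(before, after))
# {'seq_scan': 1, 'n_tup_upd': 1000000}
```

A reviewer reading this delta would see that the change performed one sequential scan and updated a million rows, which is hard to notice from the migration text alone.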
Currently, full automation is supported for the DB migrations tracked in GitHub repositories using one of the following tools:
- Sqitch (Example: https://github.com/agneum/runci)
- Flyway (Example: https://github.com/postgres-ai/dblab-ci-test-flyway)
- Liquibase (Example: https://github.com/postgres-ai/dblab-ci-test-liquibase)
- Ruby on Rails: Active Record Migrations (using
- Django migrations
It is also assumed that the automated testing is done using GitHub Actions. However, the list of supported Git platforms, CI/CD tools, and DB migration version control systems is quite easy to extend: you can do it yourself (please publish an MR if you do!) or open an issue to ask about it in the DLE & DB Migration Checker issue tracker.
The Terraform module for Database Lab helps you deploy the Database Lab Engine in the cloud. You can find the code and a detailed README here: https://gitlab.com/postgres-ai/database-lab-infrastructure.
Supported platforms and limitations of this Terraform module:
- Your source PostgreSQL database can be located anywhere.
- DLE with its components will be deployed in AWS under your AWS account.
- Currently, only the "logical" mode of data retrieval (dump/restore) is supported – the only available method for most so-called managed PostgreSQL cloud platforms such as RDS Postgres, RDS Aurora Postgres, Azure Postgres, Heroku. "Physical" mode is not yet supported.
Feedback and contributions are very welcome.
- CHANGELOG – DLE and DB Migration Checker 2.4: https://gitlab.com/postgres-ai/database-lab/-/releases#2.4.0
- Read more about DB migration testing in the Database Lab docs:
- GitHub Action to integrate Database Lab with GitHub: https://github.com/marketplace/actions/database-lab-realistic-db-testing-in-ci
- Database Lab Terraform module repository (includes detailed README): https://gitlab.com/postgres-ai/database-lab-infrastructure
Feedback and contributions would be greatly appreciated:
- Database Lab Community Slack: https://slack.postgres.ai/
- DLE & DB Migration Checker issue tracker: https://gitlab.com/postgres-ai/database-lab/-/issues
- Issue tracker of the Terraform module for Database Lab: https://gitlab.com/postgres-ai/database-lab-infrastructure/-/issues