
5 posts tagged with "database lab engine"


DLE 2.4: realistic DB testing in GitHub Actions; Terraform module

· 6 min read

DLE 2.4: DB Migration Checker and Terraform module

Database Lab Engine 2.4 is out

The Database Lab Engine (DLE) is an open-source technology that enables thin cloning for PostgreSQL. Thin clones are exceptionally useful when you need to scale the development process. DLE can manage dozens of independent clones of your database on a single machine, so each engineer or automation process works with their own database, provisioned in seconds without extra costs.

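DLE 2.4 brings two major capabilities to those who are interested in working with PostgreSQL thin clones: DB Migration Checker, which adds DB change testing to CI/CD pipelines, and a Terraform module to deploy DLE and its components in AWS.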

Additionally, this release has a lot of improvements and fixes.

🚀 Add DB change testing to your CI/CD pipelines

DB migrations are database schema and data changes, usually controlled by a special tool that tracks the changes in Git. There are many such tools: Flyway, Liquibase, and Active Record Migrations, to name a few. These tools are necessary for keeping DB schema changes sane, reliable, and predictable.

However, in most cases, testing of these changes in CI/CD is very weak because it is done using either an empty database or a tiny, mocked one. As a result, as databases and workloads grow, deployments of DB changes fail more often. This problem can be so annoying that some people even consider switching to schemaless NoSQL databases to forget about such issues, only to eventually run into others: data inconsistency or update difficulties caused by the lack of normalization.

With DLE 2.4 and its DB Migration Checker component, it becomes easy to get realistic testing using thin clones of PostgreSQL databases of any size right in CI/CD pipelines. With this technology, you can drastically decrease the risk of deploying harmful DB schema changes and continue using PostgreSQL, with its excellent feature set and reliability, without compromising development speed.

An example

To have a basic demonstration of realistic testing of DB migrations in CI/CD pipelines, let's build an index on a table that has a significant number of rows. You can see the details of this testing in this GitHub PR: https://github.com/postgres-ai/green-zone/pull/4.

If we use CREATE INDEX to build an index on a table with data, forgetting to add CONCURRENTLY, this will block all writes (INSERT, UPDATE, DELETE) to the table while CREATE INDEX is running. The larger our table is, the more noticeable the negative effect on our production workload will be. In the case of large tables, such mistakes cause partial downtime resulting in direct income and/or reputation losses, depending on the type of business.
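To make the mistake concrete, here is a minimal static check, written as a sketch, that flags CREATE INDEX statements lacking CONCURRENTLY in a migration script. This is not how DB Migration Checker works (it observes real locks on a thin clone, as described further on); the function name `find_blocking_index_builds` is hypothetical, and the splitting on `;` and the comment stripping are deliberately naive.

```python
import re

# "CREATE [UNIQUE] INDEX" that is NOT immediately followed by CONCURRENTLY.
_BLOCKING_INDEX = re.compile(
    r"\bCREATE\s+(?:UNIQUE\s+)?INDEX\b(?!\s+CONCURRENTLY\b)",
    re.IGNORECASE,
)


def find_blocking_index_builds(migration_sql: str) -> list[str]:
    """Return the statements that build an index without CONCURRENTLY.

    Naive on purpose: statements are split on ';' and only '--' line
    comments are stripped, so string literals containing those characters
    would confuse it.
    """
    flagged = []
    for statement in migration_sql.split(";"):
        # Remove "--" comments so a commented-out keyword does not count.
        code = "\n".join(
            line.split("--", 1)[0] for line in statement.splitlines()
        )
        if _BLOCKING_INDEX.search(code):
            flagged.append(" ".join(code.split()))
    return flagged
```

A lint like this only catches one known pattern. Testing on a realistic clone is what shows how long the lock is actually held.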

To demonstrate how Database Lab Engine catches this problem during automated testing in CI/CD, I'm going to use a GitHub repository with some example DB migrations (managed by Sqitch) for our Demo database that contains random data. As you can see in commit 839be90, I commented out the word CONCURRENTLY. Once git push is done and our GitHub Action has finished, we can see that our change was marked as failed by DLE's DB Migration Checker:


DB Migration Checker capturing dangerous CREATE INDEX (without CONCURRENTLY)

Let's open this job and see the details:


DB Migration Checker capturing dangerous CREATE INDEX (without CONCURRENTLY)

What happened here? Behind the scenes, a pre-installed DLE server (in AWS) quickly provisioned a thin clone of the Demo database. Next, the DB change was applied in this clone while DB Migration Checker collected telemetry, and it became clear that such a change would hold an AccessExclusiveLock, blocking other queries for a significant time (according to the settings, longer than 10 seconds). Therefore, this change was marked as failed in CI/CD. This is exactly the protection we need to avoid deploying such changes to production.
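The pass/fail decision described above can be sketched as a simple threshold check. This is an illustration only: the `ObservedLock` structure, the `check_migration` function, and the set of dangerous lock modes are assumptions made for this example, not DB Migration Checker's actual data model or API. The 10-second limit mirrors the setting mentioned in the text.

```python
from dataclasses import dataclass


@dataclass
class ObservedLock:
    """One lock seen while the migration ran on the clone (hypothetical shape)."""
    relation: str
    mode: str
    held_seconds: float


# Assumption for this sketch: only AccessExclusiveLock counts as dangerous.
DANGEROUS_MODES = {"AccessExclusiveLock"}


def check_migration(locks, max_lock_seconds=10.0):
    """Return (passed, offenders) for the locks observed during a test run."""
    offenders = [
        lock
        for lock in locks
        if lock.mode in DANGEROUS_MODES and lock.held_seconds > max_lock_seconds
    ]
    return (not offenders, offenders)
```

With this rule, a dangerous lock held for 42 seconds fails the run, while the same lock held for 2 seconds passes.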

Of course, if we get the word CONCURRENTLY back (as I did in commit 6059bf4), we'll have our "green light":


DB Migration Checker passing CREATE INDEX CONCURRENTLY

Key features of DLE's DB Migration Checker

  • Automated: DB migration testing in CI/CD pipelines
  • Realistic: test results are realistic because real or close-to-real (the same size but no personal data) databases are used, thin-cloned in seconds, and destroyed after testing is done
  • Fast and inexpensive: a single machine with a single disk can operate dozens of independent thin clones
  • Well-tested DB changes to avoid deployment failures: DB Migration Checker automatically detects (and prevents!) long-lasting dangerous locks that could put your production systems down
  • Secure: DB Migration Checker runs all tests in a secure environment: data cannot be copied outside the secure container
  • Lots of helpful data points: Collect useful artifacts (such as pg_stat_*** system views) and use them to empower your DB changes review process

Currently supported tools and platforms

Currently, full automation is supported for the DB migrations tracked in GitHub repositories using one of the following tools:

It is also assumed that the automated testing is done using GitHub Actions. However, the list of supported Git platforms, CI/CD tools, and DB migration version control systems is quite easy to extend: you can do it yourself (please publish an MR if you do!) or open an issue to ask about it in the DLE & DB Migration Checker issue tracker.

🔷 Terraform module to deploy DLE and its components in AWS

The Terraform module for Database Lab helps you deploy the Database Lab Engine in the cloud. You can find the code and a detailed README here: https://gitlab.com/postgres-ai/database-lab-infrastructure.

Supported platforms and limitations of this Terraform module:

  • Your source PostgreSQL database can be located anywhere
  • DLE with its components will be deployed in AWS under your AWS account.
  • Currently, only the "logical" mode of data retrieval (dump/restore) is supported – the only available method for most so-called managed PostgreSQL cloud platforms such as RDS Postgres, RDS Aurora Postgres, Azure Postgres, Heroku. "Physical" mode is not yet supported.

Feedback and contributions are very welcome.

Useful links

Request for feedback and contributions

Feedback and contributions would be greatly appreciated:

Database Lab Engine 2.2 and Joe Bot 0.9

· 5 min read

DLE 2.2 and Joe 0.9

About Database Lab Engine

The Database Lab Engine (DLE) is an open-source experimentation platform for PostgreSQL databases. The DLE instantly creates full-size thin clones of your production database which you can use to:

  1. Test database migrations
  2. Optimize SQL queries
  3. Deploy full-size staging applications

The Database Lab Engine can generate thin clones for any size database, eliminating the hours (or days!) required to create “thick” database copies using conventional methods. Thin clones are independent, fully writable, and will behave identically to production: they will have the same data and will generate the same query plans.

Learn more about the Database Lab Engine and sign up for an account at https://postgres.ai/.

Database Lab Engine 2.2.0

Database Lab Engine (DLE) 2.2.0 further improves support for both types of PostgreSQL data directory initialization and synchronization: “physical” and “logical”. In particular, for the “logical” type (which is useful for users of managed cloud PostgreSQL services such as Amazon RDS), it is now possible to set up multiple disks or disk arrays and automate data retrieval on a schedule. This gracefully cleans up the oldest versions of data, without downtime or interruptions in the lifecycle of clones.

Other improvements include:

  • Auto completion for the client CLI (“dblab”)
  • Clone container configuration: Docker parameters can now be defined in the DLE config (such as --shm-size, which is needed to avoid errors in newer versions of Postgres when parallel workers are used to process queries)
  • Requesting a clone with non-superuser access is now allowed: this appears as a new option in the API and CLI called “restricted”
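To illustrate what passing Docker parameters through a configuration means, here is a small sketch that turns a mapping of container parameters into `docker run` arguments. The function name `docker_run_args` and the shape of the mapping are assumptions for this example; the real key names and structure are defined in the DLE configuration reference, which this snippet does not reproduce. `--shm-size` and `--detach` are standard `docker run` flags.

```python
def docker_run_args(image: str, container_config: dict[str, str]) -> list[str]:
    """Build a `docker run` command line from a parameter mapping.

    Each key is used as a long flag name, e.g. {"shm-size": "1gb"}
    becomes "--shm-size=1gb". The command is only built, not executed.
    """
    args = ["docker", "run", "--detach"]
    for key, value in sorted(container_config.items()):
        args.append(f"--{key}={value}")
    args.append(image)
    return args
```

For example, `docker_run_args("postgres:13", {"shm-size": "1gb"})` yields a command that starts the container with a larger shared memory segment than Docker's default.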

Database Lab Engine links:

Joe Bot 0.9.0 - A Virtual DBA for SQL Optimization

“Joe Bot”, a virtual DBA for SQL optimization, is a revolutionary new way to troubleshoot and optimize PostgreSQL query performance. Instead of running EXPLAIN or EXPLAIN (ANALYZE, BUFFERS) directly in production, users send queries for troubleshooting to Joe Bot. Joe Bot uses the Database Lab Engine (DLE) to:

  • Generate a fresh thin clone
  • Execute the query on the clone
  • Return the resulting execution plan to the user

The returned plan is identical to production in terms of structure and data volumes – this is achieved thanks to two factors:

  • thin clones have the same data and statistics as production (at a specified point in time), and
  • the PostgreSQL planner configuration on clones matches the production configuration.
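The three steps above can be sketched as follows. This is an assumption-laden illustration, not Joe Bot's implementation: `create_clone` is a hypothetical callable standing in for a request to DLE, and the object it returns is assumed to expose a `run(sql)` method that returns the plan text.

```python
from typing import Callable, Protocol


class Clone(Protocol):
    """Hypothetical handle to a thin clone that can execute SQL."""

    def run(self, sql: str) -> str: ...


def explain_on_clone(query: str, create_clone: Callable[[], Clone]) -> str:
    """Run EXPLAIN (ANALYZE, BUFFERS) for `query` on a fresh clone.

    The query is executed on the clone, never on production, and only
    the resulting execution plan is returned to the caller.
    """
    statement = "EXPLAIN (ANALYZE, BUFFERS) " + query.strip().rstrip(";")
    clone = create_clone()
    return clone.run(statement)
```

Because ANALYZE actually executes the statement, running it on a disposable clone is what makes the procedure safe.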

Joe Bot users not only get reliable and risk-free information on how a query will be executed on production, but can also easily apply any changes to their own thin clones and see how query behavior is affected. For example, it is possible to add a new index and see if it actually helps to speed up the query.

One key aspect of Joe Bot is that users do not see the data directly; they only work with metadata. Therefore, teams without access to production data can be granted permissions to use this tool [1].

The main change in Joe Bot 0.9.0 is improved security: past versions used a DB superuser, while now a non-superuser is used for all requests. This makes it impossible to use plpythonu, COPY TO PROGRAM, FDW, or dblink to perform a massive copy of data out of infrastructure that is not well protected by a strict firewall. All users are strongly recommended to upgrade as soon as possible.

Another major new feature is the production duration estimator, currently in an “experimental” state. This feature is intended to help users understand how long a specific operation - for example, an index creation operation - will actually take on the production database, which is likely to have a different physical infrastructure (for example a different filesystem, more RAM, and/or more CPU cores) than the thin clone running on the DLE. Read more: “Query duration difference between Database Lab and production environments”.

SQL Optimization Chatbot “Joe Bot” links:


[1] Although only metadata is returned from Joe Bot, it is possible to probe data for specific values using EXPLAIN ANALYZE. Please consult security experts in your organization before providing Joe Bot to people without production-level access.


Both Joe Bot and Database Lab Engine are distributed under the OSI-approved AGPLv3 license.


Your feedback is highly appreciated!

Database Lab Engine 2.1

· 3 min read

Database Lab 2.1 release

Database Lab Engine 2.1 for PostgreSQL released

We are happy to announce version 2.1.0 of Database Lab Engine (DLE), an open-source tool for building powerful development and testing environments based on thin cloning of PostgreSQL databases. Using Database Lab API or CLI (and if you are using Database Lab SaaS, GUI), on a single machine with, say, a 1 TiB disk, you can easily create and destroy dozens of database copies of size 1 TiB each. All these copies are independently modifiable and created/destroyed in just a few seconds. This can become a game-changer in your development and testing workflow, improving time-to-market, and reducing costs of your non-production infrastructure.
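A rough back-of-the-envelope model shows why thin clones change the storage picture. It assumes copy-on-write semantics, where every clone shares the base copy and only stores the blocks it changes; the figure of 2 GiB changed per clone is an invented example value, and the model ignores filesystem compression and metadata overhead.

```python
GIB = 1024 ** 3
TIB = 1024 ** 4


def thick_copies_bytes(db_bytes: int, copies: int) -> int:
    """Storage for conventional full copies: every copy stores all data."""
    return db_bytes * copies


def thin_clones_bytes(db_bytes: int, changed_bytes_per_clone: list[int]) -> int:
    """Storage for thin clones: one shared base plus each clone's changes."""
    return db_bytes + sum(changed_bytes_per_clone)


# Example: 30 copies of a 1 TiB database, each clone changing 2 GiB (assumed).
thick = thick_copies_bytes(1 * TIB, 30)
thin = thin_clones_bytes(1 * TIB, [2 * GIB] * 30)
print(f"thick: {thick / TIB:.2f} TiB, thin: {thin / TIB:.2f} TiB")
```

Under these assumptions, the thick approach needs 30 TiB, while the thin clones need a little over 1 TiB in total.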

In 2.1, the main new features are:

  • Better data protection and security:
    • robust configuration defining how data is patched when snapshots are automatically created (both shell and SQL scripts are now supported),
    • an option specifying whether or not passwords for the existing DB users need to be preserved.
  • [experimental] DLE API and the CLI tool are extended to have a new feature: "CI Observer" helping control DB schema changes (DB migrations) — here is the reference on how to use it https://postgres.ai/docs/reference-guides/dblab-client-cli-reference#subcommand-start-observation. This is a small step towards the big goal: have 100% coverage for testing DB migrations in CI using full-sized thin clones. Watch the demo (turn captions on):

Links

Check out:

Please send us any feedback you have. It is hard to overestimate its importance for such a young project.


Database Lab Engine allows cloning PostgreSQL databases of any size in just a few seconds. This can save a lot of money for development and testing infrastructure, and at the same time, drastically improve development quality and time-to-market. Database Lab Engine is open-source software distributed under OSI-approved AGPLv3 license.

Database Lab Engine is equipped with an API and a CLI. Additionally, we at Postgres.ai continue developing the Enterprise version that offers a GUI, authentication flexibility, user management for the Database Lab Engine API and CLI, and more. The Enterprise version is in the "private beta" mode; we encourage you to sign up and request a demo.

Database Lab Engine 2.0

· 4 min read

Database Lab 2.0 release

Database Lab Engine 2.0 for PostgreSQL released

The Postgres.ai team is proud to announce version 2.0 of Database Lab Engine (DLE) for PostgreSQL, a modern database tool for building powerful development and testing environments based on thin cloning. Using Database Lab API or CLI (and if you are using Database Lab SaaS, GUI), on a single machine with, say, a 1 TiB disk, you can easily create and destroy dozens of database copies of size 1 TiB each. All these copies are independently modifiable and created/destroyed in just a few seconds. This can become a game-changer in your development and testing workflow, improving time-to-market, and reducing costs of your non-production infrastructure.

This release continues our strategy to automate all routine tasks such as initialization of the PostgreSQL data directory, data transformation, and snapshot management. In DLE 2.0, all these tasks can be flexibly configured in a single configuration file. As a result, building dev&test environments for projects with many databases (such as those that adopted microservice architecture) becomes much easier.

The previous versions of the Database Lab introduced the core technology: thin clone provisioning, based on either ZFS (default) or LVM. It was already possible to provision full-sized multi-terabyte database clones in just a few seconds and use them for a broad spectrum of tasks such as database schema changes verification, SQL query analysis, or general application testing.
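For readers unfamiliar with the ZFS primitives behind thin cloning, the sketch below builds the standard `zfs snapshot`, `zfs clone`, and `zfs destroy` command lines. It only constructs the argument lists and does not run them. The dataset and clone names are hypothetical, and this is not a description of how DLE invokes ZFS internally.

```python
def zfs_snapshot_command(dataset: str, snapshot: str) -> list[str]:
    """Command that freezes the current state of `dataset` as a snapshot."""
    return ["zfs", "snapshot", f"{dataset}@{snapshot}"]


def zfs_clone_command(dataset: str, snapshot: str, clone: str) -> list[str]:
    """Command that creates a writable clone backed by an existing snapshot.

    The clone shares unchanged blocks with the snapshot, which is why it
    appears in seconds regardless of the database size.
    """
    return ["zfs", "clone", f"{dataset}@{snapshot}", clone]


def zfs_destroy_command(clone: str) -> list[str]:
    """Command that removes a clone once it is no longer needed."""
    return ["zfs", "destroy", clone]


# Hypothetical names: one snapshot can back many independent clones.
commands = [zfs_snapshot_command("dblab_pool/data", "snap1")]
commands += [
    zfs_clone_command("dblab_pool/data", "snap1", f"dblab_pool/clone{i}")
    for i in range(3)
]
```

Each resulting list could be passed to `subprocess.run` on a machine with ZFS installed and sufficient privileges.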

Version 2.0 speeds up and empowers the initialization of DLE itself. Instead of using custom scripts for initial and continuous data retrieval, it is now possible to configure everything in a declarative manner to get the data and be up and running.

Updates in DLE 2.0

  • Automated data retrieval: specify the source and the method of initializing the data directory and how it is to be updated
  • Both physical (pg_basebackup, WAL-G, more) and logical methods (dump/restore, Amazon RDS, Heroku Postgres, more) are supported (see the guide Database Lab Engine data sources)
  • Any managed cloud PostgreSQL offering is now supported, with additional features for Amazon RDS (see DLE tutorial for Amazon RDS and the guide Data source: AWS RDS)
  • For continuously updated physically initialized data directory (which effectively makes your DLE a specialized replica), snapshot management is fully automated: snapshots are created and destroyed based on the schedule defined in the configuration file (see the reference Job physicalSnapshot)
  • Basic data transformation and masking supported: specify any custom script that will be applied each time a new snapshot is prepared (option preprocessingScript in both logicalSnapshot and physicalSnapshot jobs, see the Configuration reference)
  • License changed to AGPLv3
  • The documentation is significantly extended: 3 tutorials, 26 user guides, 6 references, and counting: http://postgres.ai/docs

What's next

Check out:

Please send us any feedback you have. It is hard to overestimate its importance for such a young project.


Database Lab Engine 2.0 beta: one config to rule them all; support for Amazon RDS

· 2 min read

Database Lab Engine 2.0 beta: one config to rule them all; support for Amazon RDS

This summer, we were super-busy achieving two goals that defined version 2.0 of Database Lab Engine:

  1. Make all the things in Database Lab configurable in a unified manner (single configuration file): first of all, data initialization and snapshot management.
  2. Support both physical and logical types of initialization. Particularly, allow working with an RDS database as a source.

Both goals turned out to be quite challenging, but they are finally done, and now we are happy to see that all the pieces of Database Lab Engine work in containers, the whole workflow is described in a single YAML configuration file, and, last but not least, it works with RDS Postgres databases. Yay!

Check out Database Lab Engine release notes, Tutorial for RDS users, and Database Lab Engine configuration reference.

As usual, please send us any feedback you have. It is hard to overestimate its importance for such a young project.

Database Lab Engine is open-source software distributed under the OSI-approved AGPLv3 license. Database Lab Engine allows cloning PostgreSQL databases of any size in just a few seconds. This can save you a lot of money on development and testing infrastructure and, at the same time, drastically improve development quality and time-to-market.

The open-source Database Lab Engine is equipped with a convenient API and CLI. Additionally, we continue developing the Enterprise version that offers a GUI, authentication flexibility, user management for the Database Lab Engine API and CLI, and more. The Enterprise version is in the "private beta" mode; we encourage you to sign up and request a demo.