Skip to main content

Database Lab development roadmap

Roadmap#

We work hard to develop the Database Lab Platform (DLP) and its open-source core component, Database Lab Engine (DLE). Below you can find the main ideas we are working on now or planning to work soon.

Updated: 2022-01-20

[DLE] Physical provisioning#

Physical provisioning: native support of provisioning from archives created by a specific backup solution or based on an existing Postgres database

  • Support various sources
    • Generic (anything: pg_basebackup, rsync, any backup tools)
    • Native support for specific backup tools
      • WAL-E/WAL-G
      • pgBackRest
      • Barman
      • pg_probackup
  • Continuously updated state (physical replication based on WAL shipping)
  • Snapshot management (schedule, retention policy)

[DLE] Logical provisioning#

Logical provisioning: native support of provisioning for managed PostgreSQL databases

  • Support various sources
    • Any PostgreSQL DB via dump/restore: Amazon RDS, Heroku Postgres, Azure PostgreSQL, GCP CloudSQL, Digital Ocean Postgres, etc.
    • AWS: RDS IAM
    • GCP: CloudSQL IAM
    • Azure IAM
    • Restore from backups stored on AWS S3
      • uncompressed
      • compressed (gzip, bzip2)
  • Continuously updated state (logical replication)
  • Dump/restore on the fly (without need to save dumps on disk)
  • Multiple pools, rotation/refresh on schedule
    • "Selected pool": allow to specify pool for the case when multiple DLEs are running on the same machine
    • Advanced refresh policies: force refresh on pools being in use, warning period, number of retries before forcing
  • Partial data retrieval
    • specific databases
    • specific tables
    • arbitrary filtering (column and row filtering)

[DLE] Engine features#

  • Persist clones when the engine restarts (added in DLE 3.0.0)
  • Point-in-time recovery (PITR) (Can be used for ultra-fast recovery of accidentally deleted data)
  • Troubleshoot/test standby behavior
    • Create clone running in read-only mode to allow troubleshooting hot standby issues
    • Allow launching N replicas for a clone
    • For the "physical" mode: create clone from "pre" snapshot (read-only, unpromoted; admins only)
  • Duplicate DLE (create a new DLE based on existing one)
  • Clone observability
    • "Temporary" system- and Postgres-level monitoring for clones/sessions
    • Clone analytics
    • Advanced audit
    • perf/FlameGraphs
  • Utilization of DLE instance and alerts
  • Usage and estimated savings reports
  • SSH port forwarding for API and Postgres connections
  • Tags
  • Framework for macro database experiments (work with thick/regular clones)
  • Auto-register DLE in Platform
  • Resource usage quotas for clones: CPU, RAM (container quotas, supported by Docker)
  • User quotas
  • Disk quotas (zfs set quota=xx)
  • GUI with key features (added in DLE 3.0.0)
  • Fast connection to clone's DB via CLI
  • Advanced snapshot management
    • API handle to create/destroy snapshots (for continuously updated state)
    • User-defined snapshots for clones
    • Snapshot export/import (S3)
  • Advanced schema management
    • schema diff
    • zero-downtime DDL auto-generation
  • Reset clone's state to particular database version – keeping DB creds the same (including port)
    • physical: allow choosing dataStateAt
    • logical: allow "jumping" between DB versions (pools)

[DLP] Platform features#

  • Support working with multiple DLEs
  • Backups, PITR
  • User management: basic permissions
  • User management: advanced permissions
  • SSO
  • Clone (Postgres, Postgres over SSH / port forwarding) connection options
    • LDAP
    • SSH key management
  • Security
    • Security overview: software used, incidents, code analysis
    • Basic audit logging
    • Advanced audit logging and alerting
    • Export audit logs from GUI
  • Usage stats
  • Monitoring
  • Notifications
    • Notification management – turn on/off all or specific ones
    • Non-deletable clone is abandoned / not used for too long
    • Clone and snapshot is using too much disk space / out-of-disk-space risks
    • CPU saturation
    • Disk space saturation
    • Disk IO saturation
    • Refresh cannot be done because all pools are busy and policy doesn't allow forced refresh
    • Full refresh started
    • Full refresh finished
    • Lag value is too high ("sync" container)
    • Initial data retrieval started
    • Initial data retrieval finished
    • Snapshot created
    • Snapshot deleted
  • Pricing, billing
    • pricing based on disk space used
    • report usage to postgres.ai
    • flexible pricing options
    • AWS: instance type and size based

[DLP] Automated verification of database schema and complex data changes a.k.a. DB migrations#

  • History and logging for clones/sessions
  • Automated detection of locking issues
  • Setting custom statement_timeout
  • PostgreSQL logs for the migration
  • Report in CI and Platform
  • Integration with CI tools – advanced integration
    • GitHub Actions
    • Bitbucket CI/CD
    • CitcleCI
    • Jenkins
    • GitLab CI/CD
    • Bamboo
    • TravisCI
  • Support vairous database migration tools + demo
    • Sqitch
    • Flyway
    • Liquibase
    • Ruby on Rails Active Record
    • Django migrations
  • "Production timing" estimator (experimental feature, added in DLE 2.3.0)
  • More artifacts to support decisions: pgstat*, system usage, WAL, checkpoints, etc.

[DLE] Cloning (CoW technology)#

  • ZFS
  • LVM
  • Storage-based CoW

[DLP] Automation, clouds, Kubernetes#

  • Simplify setup for major Cloud Service Providers, automation of installation in clients' accounts
    • Basic Terraform templates
    • One-click setup on AWS. AWS Marketplace
    • One-click setup on GCP. GCP Marketplace
    • One-click setup on Azure. GCP Marketplace
  • Cloud DLP ("DLP SaaS"): cloud offering (fully managed DLE and DLP)
    • AWS
    • GCP
    • Azure
  • Self-managed DLP ("DLP Enterprise"): work multiple DLEs and all DLP features in customer's account
  • Cost optimization
    • AWS spot instances
    • GCP preemptible instances (24h max)
    • Azure spot instances
    • AWS/GCP/Azure: Self-stopping instances for cost savings, keeping disk present, and refreshing when needed
  • Kubernetes support

[DLP/Joe] SQL optimization chatbot#

  • Web UI version
  • Slack chatbot
  • Telegram chatbot
  • History with Search and Share options
  • Visualizations
    • explain.depesz
    • explain.dalibo (PEV2)
    • pgMustard (WebUI/SaaS only)
    • FlameGraphs
  • Better optimization recommendations
  • Macroanalysis insights (suggestions based on postgres-checkup / pgss)
  • Hypothetical indexes
  • Hypothetical partitioning
  • Index advisor
  • Utilization control
  • Restore user sessions after Joe container restarts
  • Better chatbot security
    • Do not use DB superuser
    • Quotas (rate limits)
    • Alert admins when a quota is reached
  • Runtime execution plan observability: pg_query_state
  • Reset to specific dataStateAt
  • perf/FlameGraphs
  • Wait event sampling

[SaaS] Data masking and anonymization#

  • Basic support for masking and obfuscation
  • Hybrid setup: raw and obfuscaned/masked clones on the same DLE instance
  • Dump/restore with runtime anonimization, parallelized, via GitOps
  • Simplified setup for anonymization - GUI
  • Automated masking / anonymization

[Docs] Better documentation and demo#

  • Tutorials
    • Basic
    • RDS
    • SQL optimization using Joe bot
    • DB migration testing in CI/CD pipelines
    • Katacoda
  • User Guides
    • DLE setup and administration
    • Cloning
    • SQL optimization with Joe bot
    • Building non-production environments
  • References
    • DLE API
    • CLI
    • DLE configuration
    • Joe configuration
  • Explanations and concepts
    • SQL optimization workflow with Database Lab
    • Configuration details
    • Why and how SQL optimization is possible on thin clones
    • Security aspects
    • Secure and robust test/staging environments
  • Interactive demo
    • Basic DLE features
    • SaaS features
    • SQL optimization using Joe bot (WebUI)
    • DB migration testing (GitHub Actions)
    • Advanced examples, use cases
  • Video demonstrations