Automated validation of DB schema migrations

Object Relational Mapping frameworks, such as Hibernate, very often provide a way to automatically generate the database schema based on persistent classes. While it is very convenient as a proof of concept or for very simple applications, it is not a production ready approach. In reality you have to write scripts to version your schema and to migrate from one version to another. It is especially challenging when you have existing data in the database. Typically tools such as Liquibase or Flyway are used to manage schema and data migrations.

In automated tests, though, ORM capability of schema auto-generation is very useful — usually it is much faster than running migration scripts. A very common approach is to set up an in-memory database (e.g. H2) for tests and let ORM generate the schema. In case of JPA/Hibernate, it would mean that hbm2ddl.auto property would be set to update for tests and validate for production.

This approach imposes a challenge of ensuring that our schema migration scripts are up to date with our ORM entities. It is very error prone to do it manually. In this post I will present a way of testing it automatically. The test will be responsible for generating a diff between ORM-generated schema and schema generated with migration scripts. If there are any differences, the test will fail, so the consistency will be ensured.

Idea

Liquibase, a popular migration scripts tool, provides a diff capability, i.e. an API to compare two database schemas. We will leverage this functionality in our test. It does not mean that you have to use Liquibase for your migrations scripts, though. The diff mechanism can be used for any schemas, regardless of what has created them. Therefore you can use any migrations framework (Liquibase, Flyway or something different) and whatever ORM framework you want.

Later in the article I will provide an implementation example with Hibernate/JPA and Liquibase, but please keep in mind that the general idea is framework-agnostic.

High level test structure

The general flow in our test is going to be as follows:

  1. Provide access to the schema generated by the ORM framework — in the example, we are using JPA, Spring and HSQL database so we can just use @Autowired to inject our Datasource to the test.
  2. Provide access to the schema created by migration scripts — in the example, a new in-memory HSQLD Datasource is created and then we manually execute Liquibase scripts against it.
  3. Use Liquibase to compare two schemas and calculate DiffResult.
  4. Log all the differences so that it is possible to easily find the missing migrations.
  5. Fail if the result contains any differences.

Limitations

The presented approach allows you to compare database schema structure only (e.g. if all tables exist and have appropriate columns). It will not check if you migrated your records properly (e.g. if you moved all records from one table to another or if you properly reformatted values in a column). Data consistency must be checked by using other means (and typically a generic solution which would work for all the cases just does not exist).

Implementation example

I published a working Spring Boot application on Github. It contains two entities, migration scripts and a working test. You can play with it (by altering entity classes and/or migration scripts) to see different test results.

It produces log output similar to:

====[ Diff Result ]====
Reference Database: SA @ jdbc:hsqldb:mem:testdb (Default Schema: PUBLIC)
Comparison Database: SA @ jdbc:hsqldb:mem:migrationsTestDb (Default Schema: PUBLIC)
Compared Schemas: PUBLIC
Product Name: EQUAL
Product Version: EQUAL
Missing Catalog(s): NONE
Unexpected Catalog(s): NONE
Changed Catalog(s): NONE
Missing Column(s): 
     PUBLIC.ORDERS_CUSTOMERS.CUSTOMERS_ID
     PUBLIC.CUSTOMERS.NAME
     PUBLIC.ORDERS_CUSTOMERS.ORDERS_ID
Unexpected Column(s): NONE
Changed Column(s): NONE
Missing Foreign Key(s): 
     FKHKSFFJAVVQ4MK288GPGCVRI09(ORDERS_CUSTOMERS[CUSTOMERS_ID] -> CUSTOMERS[ID])
     FKN5NN4A0LNTWFXOUSQIPEYII8D(ORDERS_CUSTOMERS[ORDERS_ID] -> ORDERS[ID])
Unexpected Foreign Key(s): NONE
Changed Foreign Key(s): NONE
Missing Index(s): 
     SYS_IDX_FKHKSFFJAVVQ4MK288GPGCVRI09_10104 on PUBLIC.ORDERS_CUSTOMERS(CUSTOMERS_ID)
     SYS_IDX_FKN5NN4A0LNTWFXOUSQIPEYII8D_10108 on PUBLIC.ORDERS_CUSTOMERS(ORDERS_ID)
Unexpected Index(s): NONE
Changed Index(s): NONE
Missing Primary Key(s): NONE
Unexpected Primary Key(s): NONE
Changed Primary Key(s): NONE
Missing Schema(s): NONE
Unexpected Schema(s): NONE
Changed Schema(s): NONE
Missing Sequence(s): NONE
Unexpected Sequence(s): NONE
Changed Sequence(s): NONE
Missing Stored Procedure(s): NONE
Unexpected Stored Procedure(s): NONE
Changed Stored Procedure(s): NONE
Missing Table(s): 
     ORDERS_CUSTOMERS
Unexpected Table(s): NONE
Changed Table(s): NONE
Missing Unique Constraint(s): NONE
Unexpected Unique Constraint(s): NONE
Changed Unique Constraint(s): NONE
Missing View(s): NONE
Unexpected View(s): NONE
Changed View(s): NONE

Caveats

When you take a look at the code (MigrationsTest), you may notice that DiffResult generated by Liquibase is filtered. Sometimes it is needed and the reasons include:

  1. You may want to ignore tables not generated by ORM (e.g. Liquibase changelog tables).
  2. The diff may be too strict (e.g. you may want to ignore column size, as default values may differ if you compare different database engines (e.g. H2 and HSQLDB).

Filtering code is not particularly beautiful, but usually you write it once and then the test works for years.

Summary

Liquibase provides an API for comparing database schemas. It can be used to write an automated test to ensure that schema created with your migration scripts is the same as the one generated by ORM. This is very helpful — there is much lower chance that you forget about a migration when you change your entities. Please take a look at the example on Github. Perhaps this approach will be helpful for your project?

Related Post

1 response to "Automated validation of DB schema migrations"

Leave a Reply

Your email address will not be published. Required fields are marked *