Object Relational Mapping frameworks, such as Hibernate, very often provide a way to automatically generate the database schema based on persistent classes. While it is very convenient as a proof of concept or for very simple applications, it is not a production ready approach. In reality you have to write scripts to version your schema and to migrate from one version to another. It is especially challenging when you have existing data in the database. Typically tools such as Liquibase or Flyway are used to manage schema and data migrations.
In automated tests, though, ORM capability of schema auto-generation is very useful — usually it is much faster than running migration scripts. A very common approach is to set up an in-memory database (e.g. H2) for tests and let ORM generate the schema. In case of JPA/Hibernate, it would mean that hbm2ddl.auto property would be set to update for tests and validate for production.
This approach imposes a challenge of ensuring that our schema migration scripts are up to date with our ORM entities. It is very error prone to do it manually. In this post I will present a way of testing it automatically. The test will be responsible for generating a diff between ORM-generated schema and schema generated with migration scripts. If there are any differences, the test will fail, so the consistency will be ensured.
Liquibase, a popular migration scripts tool, provides a diff capability, i.e. an API to compare two database schemas. We will leverage this functionality in our test. It does not mean that you have to use Liquibase for your migrations scripts, though. The diff mechanism can be used for any schemas, regardless of what has created them. Therefore you can use any migrations framework (Liquibase, Flyway or something different) and whatever ORM framework you want.
Later in the article I will provide an implementation example with Hibernate/JPA and Liquibase, but please keep in mind that the general idea is framework-agnostic.
High level test structure
The general flow in our test is going to be as follows:
- Provide access to the schema generated by the ORM framework — in the example, we are using JPA, Spring and HSQL database so we can just use @Autowired to inject our Datasource to the test.
- Provide access to the schema created by migration scripts — in the example, a new in-memory HSQLD Datasource is created and then we manually execute Liquibase scripts against it.
- Use Liquibase to compare two schemas and calculate DiffResult.
- Log all the differences so that it is possible to easily find the missing migrations.
- Fail if the result contains any differences.
The presented approach allows you to compare database schema structure only (e.g. if all tables exist and have appropriate columns). It will not check if you migrated your records properly (e.g. if you moved all records from one table to another or if you properly reformatted values in a column). Data consistency must be checked by using other means (and typically a generic solution which would work for all the cases just does not exist).
I published a working Spring Boot application on Github. It contains two entities, migration scripts and a working test. You can play with it (by altering entity classes and/or migration scripts) to see different test results.
It produces log output similar to:
====[ Diff Result ]==== Reference Database: SA @ jdbc:hsqldb:mem:testdb (Default Schema: PUBLIC) Comparison Database: SA @ jdbc:hsqldb:mem:migrationsTestDb (Default Schema: PUBLIC) Compared Schemas: PUBLIC Product Name: EQUAL Product Version: EQUAL Missing Catalog(s): NONE Unexpected Catalog(s): NONE Changed Catalog(s): NONE Missing Column(s): PUBLIC.ORDERS_CUSTOMERS.CUSTOMERS_ID PUBLIC.CUSTOMERS.NAME PUBLIC.ORDERS_CUSTOMERS.ORDERS_ID Unexpected Column(s): NONE Changed Column(s): NONE Missing Foreign Key(s): FKHKSFFJAVVQ4MK288GPGCVRI09(ORDERS_CUSTOMERS[CUSTOMERS_ID] -> CUSTOMERS[ID]) FKN5NN4A0LNTWFXOUSQIPEYII8D(ORDERS_CUSTOMERS[ORDERS_ID] -> ORDERS[ID]) Unexpected Foreign Key(s): NONE Changed Foreign Key(s): NONE Missing Index(s): SYS_IDX_FKHKSFFJAVVQ4MK288GPGCVRI09_10104 on PUBLIC.ORDERS_CUSTOMERS(CUSTOMERS_ID) SYS_IDX_FKN5NN4A0LNTWFXOUSQIPEYII8D_10108 on PUBLIC.ORDERS_CUSTOMERS(ORDERS_ID) Unexpected Index(s): NONE Changed Index(s): NONE Missing Primary Key(s): NONE Unexpected Primary Key(s): NONE Changed Primary Key(s): NONE Missing Schema(s): NONE Unexpected Schema(s): NONE Changed Schema(s): NONE Missing Sequence(s): NONE Unexpected Sequence(s): NONE Changed Sequence(s): NONE Missing Stored Procedure(s): NONE Unexpected Stored Procedure(s): NONE Changed Stored Procedure(s): NONE Missing Table(s): ORDERS_CUSTOMERS Unexpected Table(s): NONE Changed Table(s): NONE Missing Unique Constraint(s): NONE Unexpected Unique Constraint(s): NONE Changed Unique Constraint(s): NONE Missing View(s): NONE Unexpected View(s): NONE Changed View(s): NONE
When you take a look at the code (MigrationsTest), you may notice that DiffResult generated by Liquibase is filtered. Sometimes it is needed and the reasons include:
- You may want to ignore tables not generated by ORM (e.g. Liquibase changelog tables).
- The diff may be too strict (e.g. you may want to ignore column size, as default values may differ if you compare different database engines (e.g. H2 and HSQLDB).
Filtering code is not particularly beautiful, but usually you write it once and then the test works for years.
Liquibase provides an API for comparing database schemas. It can be used to write an automated test to ensure that schema created with your migration scripts is the same as the one generated by ORM. This is very helpful — there is much lower chance that you forget about a migration when you change your entities. Please take a look at the example on Github. Perhaps this approach will be helpful for your project?