Export and compare existing databases using Liquibase

In previous post, we discussed how we can use Liquibase to deploy changeLogs to databases. Often times, when you start implementing Liquibase, you would already have a database which is being used by the application. Though Liquibase does not depend upon the existing schema, it is still a good idea to export all of the existing schema and possibly data (to some extent) and also put that information in the version control. Also, sometimes you would like to compare databases (both schema and data, again to some extent) to figure out issues, like why a special behavior is being observed in the dev environment but not in your production. In this blog post, we’ll go about same.

Exporting Schema from Existing Database

We’ll use the same database as we have been using till now, AdventureWorks2017, to export schema. It is one of the sample databases available from Microsoft. Note that we have already learned how to use liquibase.properties file to minimize the information that needs to be typed while using Liquibase command. So we need not be typing all parameters and values again. Liquibase allows you to do this with the generateChangeLog option:


liquibase
–driver=com.microsoft.sqlserver.jdbc.SQLServerDriver
–classpath="C:\\Program Files\\Microsoft JDBC Driver 6.0 for SQL Server\\sqljdbc_6.0\\enu\\jre8\\sqljdbc42.jar"
–url="jdbc:sqlserver://localhost:1433;databaseName=AdventureWorks2017;integratedSecurity=false;"
–changeLogFile="D:\Source\generateChangeLog.xml"
–username=liquibase
–password=liquibase@123
–logLevel=info
generateChangeLog

In the above command, the changeLogFile value determines the name of the file that will contain the exported information. The extension of the file i.e. XML/JSON etc. will determine the type in which information is generated. Below is one of the snippets from file generated in my case:

export database schema using liquibase

The actual file is too large to be displayed here. Also, by default, Liquibase will generate code only for dbo schema. This can be overridden by using –– defaultSchemaName option at the command line. So, we can use below command to generate changeLog for ‘Person’ schema:


liquibase
–driver=com.microsoft.sqlserver.jdbc.SQLServerDriver
–classpath="C:\\Program Files\\Microsoft JDBC Driver 6.0 for SQL Server\\sqljdbc_6.0\\enu\\jre8"
–url="jdbc:sqlserver://localhost:1433;databaseName=AdventureWorks2017;integratedSecurity=false;"
–changeLogFile="D:\Source\generateChangeLog–PersonSchema.xml"
–username=liquibase
–password=liquibase@123
–logLevel=info
–defaultSchemaName=Person
generateChangeLog

If you have a large number of schemas in your SQL Server database, you have to repeat this step to generate changelogs for each of those schemas. Obviously this can be time consuming, but with a little bit of scripting, this can be automated as well.

Exporting Data from Existing Database

To export data from the database, we can use another option: ––diffTypes and it has to have a value of ‘data’. The command to be run is as below:


liquibase
–driver=com.microsoft.sqlserver.jdbc.SQLServerDriver
–classpath="C:\\Program Files\\Microsoft JDBC Driver 6.0 for SQL Server\\sqljdbc_6.0\\enu\\jre8"
–url="jdbc:sqlserver://localhost:1433;databaseName=AdventureWorks2017;integratedSecurity=false;"
–changeLogFile="D:\Source\generateChangeLog–PersonSchema.xml"
–username=liquibase
–password=liquibase@123
–logLevel=info
–defaultSchemaName=dbo
–diffTypes=data
generateChangeLog

Below is one of the snippets from file generated in my case:

export database data using liquibase

Again, by default, it will export information for dbo schema.

Limitations with exporting data/schema

Note that the open source version of Liquibase currently has some limitations. It does not export the following types of objects: Stored procedures, functions, packages, Triggers. However the commercial version does not have this limitation.

Another one of the limitations with the exported data/schema is that it is one large changeLog file with possibly thoushands of changeSets. However we can easily write a script to parse the file and then categorize the information in nice hierarchy like we can store 1 stored procedure per file and make all stored procedures under a directory named stored procedures. Same goes for other objects like tables etc. Alternatively, we can run database specific tools / languages to export this information in Database specific native SQL, and arrange in nice hierarchy. And we can also automate the mechanism used to invoke those tools. That are the topics for another blog post in future :).

Compare Databases using Liquibase

It is useful from time to time when you want to compare databases to observe how they differ mostly in terms of schema and may be data. This can be particularly helpful in determining how your dev database differs from production and why in one environment you are able to reproduce the bug but not in the other environment. Doing this manually can be cumbersome and error prone as there can be any number of objects inside the database. However, we can use liquibase to do this as well.

From liquibase perspective, we can run below command to compare the databases:


liquibase
–driver=com.microsoft.sqlserver.jdbc.SQLServerDriver
–classpath="C:\\Program Files\\Microsoft JDBC Driver 6.0 for SQL Server\\sqljdbc_6.0\\enu\\jre8"
–url="jdbc:sqlserver://localhost:1433;databaseName=AdventureWorksProd;integratedSecurity=false;"
–username=liquibase
–password=liquibase@123
–logLevel=debug
diff
–referenceUrl="jdbc:sqlserver://localhost:1433;databaseName=AdventureWorks2017;integratedSecurity=false;"
–referenceUsername=liquibase
–referencePassword=liquibase@123

Alternatively, we can store this all information in liquibase.properties file and run just liquibase diff. For this case, liquibase.properties file needs to contain below values:


driver:com.microsoft.sqlserver.jdbc.SQLServerDriver
classpath:..\\Program Files\\Microsoft JDBC Driver 6.0 for SQL Server\\sqljdbc_6.0\\enu\\jre8\\sqljdbc42.jar
url:jdbc:sqlserver://localhost:1433;databaseName=AdventureWorksProd;integratedSecurity=false;
username:liquibase
password:liquibase@123
logLevel:debug
referenceUrl:jdbc:sqlserver://localhost:1433;databaseName=AdventureWorks2017;integratedSecurity=false;
referenceUsername:liquibase
referencePassword:liquibase@123

On successful run, we should see like below output:

diff databases using liquibase.PNG

And like other database diff tools, Liquibase can also generate a changelog for the missing objects. This is done with the diffChangeLog command:

database difference changelog

Since this is in XML format and liquibase ready, you can apply the same to negate the difference. Database objects to include in the changelog can be controlled with the includeObjects or excludeObjects parameters.

Currently, Liquibase runs the following comparisons:

  1. Version Differences
  2. Missing/unexpected tables
  3. Missing/unexpected views
  4. Missing/unexpected columns
  5. Missing/unexpected primary keys
  6. Missing/unexpected unique constraints
  7. Missing/unexpected foreign Keys
  8. Missing/unexpected sequences
  9. Missing/unexpected indexes
  10. Column definition differences (data type, auto-increment, etc.)
  11. View definition differences
  12. Data differences (limited), not checked by default

It does not (currently) check:

  1. Non-foreign key constraints (check, etc)
  2. Stored Procedures
  3. Data type length

Liquibase can diff different database types, but the results may be skewed due to differences in case and data types.

Note that if you do not see the expected difference in the output, the answer lies in the permissions on the user which is used to establish the database connection. Again just to re-emphasize, the same is applicable to all the database types supported by liquibase.

4 thoughts on “Export and compare existing databases using Liquibase

  1. Hello Sir,
    Nice Info! I have the below query.
    ——————————————
    Lets say that I have 2 tables in a db schema. Now when I do a generateChangeLog I get the info of the 2 tables as change sets. Now again I create 2 more new tables. Now when I perform generateChangeLog It is listing info about all the 4 tables. Is there any way to get only the delta of changes i.e the latest tables that are created in this case ????

    Like

    • The generated list of tables would depend on the checkpoints made in the databasechangelog table. This option is ideally suitable where you are deploying on the same database using liquibase, so that you can get delta based on previous deployments made using liquibase

      Like

  2. Really appreciate such a detailed post !!! It is helpful for anyone who wants to liquibase for their DB schema change automation

    Like

Leave a comment