postgrails   >   20201228-unalembicize  

Unalembicize: Moving away from Alembic

published 2020-12-28

I'm starting a new project and find that I don't like defining tables in Python code at all. Instead, I strongly prefer to define my database schemas in SQL. This goes with my overall philosophy of (a) writing in the language of the system you're using and (b) removing unnecessary layers of abstraction between your writing and the system you're using. These two points are just two ways of saying the same thing.

The problem with alembic is that you have to define your tables in a bespoke DSL which adds a layer of abstraction on top of SQL. Why not just write in SQL? Why are so many programmers so averse to writing SQL? SQL was designed for this job. Let's use it.

There are two potential issues with skipping alembic: (1) What about migrations? (2) What about SQLAlchemy core queries?

  1. Migrations are very nice, and I've found that diesel provides a good system for doing migrations with SQL files. It's not perfect, but it works well, and it just requires getting a diesel binary to do migrations. Now, the diesel project doesn't provide a nicely-packaged binary, so I created a docker image that provides a diesel binary -- see kruxia/diesel. Adding diesel to another glibc-based image is now as simple as
COPY --from=kruxia/diesel:buster-slim /usr/local/bin/diesel /usr/local/bin/diesel

and now you have a 7 MB executable that is ready to use:

$ ls -laFh /usr/local/bin/diesel
-rwxr-xr-x 1 root root 7.0M Nov  1 15:48 /usr/local/bin/diesel*
  1. SQLAlchemy core queries are also very nice, especially for the boring CRUD stuff that we do all day long — SELECT, INSERT, UPDATE, and DELETE that map to individual table records. Using SQLAlchemy core requires that the tables be defined as Python objects. But we don't have to define the tables in Python code. Instead of defining individual tables, we'll just reflect the database into Python from the database (see Reflecting All Tables at Once in the SQLAlchemy documentation). That way, the database is the single source of truth regarding its schema, and at the same time we can have access to everything in Python.

In summary, we can define our database schema in SQL, while enjoying all the benefits of database migrations via diesel and accessing table definitions in python via SQLAlchemy core.