|
| 1 | +# Database |
| 2 | + |
| 3 | +Information about the database and how to work with it. |
| 4 | + |
| 5 | +## Migrations |
| 6 | + |
| 7 | +MySQL does not support transactional DDL (Data Definition Language statements like `ALTER TABLE`, `CREATE TABLE`, etc.). |
| 8 | + |
| 9 | +Therefore, Django migrations run against MySQL are not truly atomic, even if `atomic = True` is set on the migration class. |
| 10 | + |
| 11 | +Here's what that means: |
| 12 | + |
| 13 | +- Implicit Commits: MySQL automatically commits the transaction before and after most DDL statements. |
| 14 | +- Partial Application: If a migration involves multiple operations (e.g., adding a field, then creating an index) and fails during the second operation, the first operation (adding the field) will not be rolled back automatically. The database schema will be left in an intermediate state. |
| 15 | +- `atomic = True` Limitation: While Django attempts to wrap the migration in a transaction, the underlying database behavior with DDL prevents full atomicity for schema changes in MySQL. DML operations (like `RunPython` or data migrations) within that transaction might be rolled back, but the DDL changes won't be. |
| 16 | + |
| 17 | +### Writing migrations |
| 18 | + |
| 19 | +Keep these tips in mind when writing migrations: |
| 20 | + |
| 21 | +1. Always create a migration via `./manage.py makemigrations` |
| 22 | + |
| 23 | + This ensures that the migration is created with the correct dependencies, sequential naming etc. (to create an empty migration that executes arbitrary code, use `./manage.py makemigrations --empty <name>`). |
| 24 | + |
| 25 | +2. If you need to execute arbitrary code, use `RunPython` or `RunSQL` operations. |
| 26 | + |
| 27 | + These operations are not wrapped in a transaction, so you need to be careful with how you write them and be mindful that the state of the database might not be consistent every time the migration is run. Validate assumptions and be careful. |
| 28 | + |
| 29 | + If your migration requires `RunPython`, make that the only operation in the migration. Split database table modification to a separate migration to ensure a partial application due to failure does not result in an invalid database state. |
| 30 | + |
| 31 | +3. Large data migrations should be run via `tasks` |
| 32 | + |
| 33 | + This ensures they run on an environment that supports long running work. In production (kubernetes) pods are disposable, so you should not assume you can run a long migration in an arbitrary pod. |
| 34 | + |
| 35 | +### Testing migrations |
| 36 | + |
| 37 | +You can test migrations locally. This should not be considered a safe verification that a migration will work in production because the data in your local database will differ significantly from production. But this will ensure your migration runs against the same schema and with some seeded data. |
| 38 | + |
| 39 | +- Find the current version of AMO on production. |
| 40 | + |
| 41 | + ```bash |
| 42 | + curl -s "https://addons.mozilla.org/__version__" |
| 43 | + ``` |
| 44 | + |
| 45 | +- Checkout the tag |
| 46 | + |
| 47 | + ```bash |
| 48 | + git checkout <tag> |
| 49 | + ``` |
| 50 | + |
| 51 | +- Make up initialized database |
| 52 | + |
| 53 | + ```bash |
| 54 | + make up INIT_CLEAN=True |
| 55 | + ``` |
| 56 | + |
| 57 | +- Checkout your branch and run the migrations |
| 58 | + |
| 59 | + ```bash |
| 60 | + git checkout <branch> |
| 61 | + make up |
| 62 | + ``` |
| 63 | + |
| 64 | +### Deploying migrations |
| 65 | + |
| 66 | +Migrations are deployed to production automatically via the deployment pipeline. |
| 67 | +Importantly, migrations are deployed before new code is deployed. That menas: |
| 68 | + |
| 69 | +- If a migration fails, it will cancel the deployment. |
| 70 | +- Migrations must be backwards compatible with the previous version of the code. see [testing migrations](#testing-migrations) |
| 71 | + |
| 72 | +Migrations run on dedicated pods in kubernetes against the primary database. That means: |
| 73 | + |
| 74 | +- Changes to the database schema will not be immediately reflected across all replicas immediately after the migration is deployed. |
| 75 | +- Long running migrations could be interupted by kubernetes and should be avoided. See [writing migrations](#writing-migrations) |
| 76 | + |
| 77 | +If you have an important data migration, consider shipping it in a dedicated release to ensure the database is migrated before required code changes are deployed. See [cherry-picking](https://mozilla.github.io/addons/server/push_duty/cherry-picking.html) for details. |
0 commit comments