Autogenerated migration script reapplies changes

flavio · October 18, 2023, 9:02pm

Hi there, we’re seeing a few issues with using the autogen tool to create migration scripts between two environments. The main issue we’re seeing is that the tool tries to reapply schema changes that aren’t required, or that have already been applied.

One of the issues we’re seeing is that the script tries to create model/blocks that already exist (which will fail due to duplicated names), and delete the previous model/block. Could this be because the IDs are different?

Part of the issue seems to be an incompatibility between the autogen tool and schema changes that have been applied manually. I’m not sure whether the diff is computed by ID or by name, but the issue might be related to that.

Workaround:
The current workaround whenever this starts happening is to fork envs to ensure they are copies of each other. This is not ideal, since some envs might need data changes that others don’t. However, it does stop the reapplication of schema changes, since the autogen tool correctly identifies that the envs have identical fields, models, and blocks.

m.finamor · October 19, 2023, 11:34am

Hello @flavio

Exactly!
For the auto-generated script to work, the IDs of the already existing (pre-change) entities must be the same across the environments. That’s why we recommend doing a fresh fork before making the changes on the environment and auto-generating the script.

This is also the case if you manually changed something after forking two environments, as that will generate an ID discrepancy between the two environments.

So the solution is to make sure that the environment you run the auto-migration on is a freshly forked environment. Otherwise, manual migration scripts need to be used instead.
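To illustrate why an ID discrepancy produces "create + delete" operations for entities that already exist, here is a minimal sketch. It is a toy model that assumes the diff matches entities purely by ID; it is not DatoCMS's actual implementation, and the IDs, names, and the `diff_by_id` function are all made up for the example.

```python
# Toy model: each environment maps entity ID -> API name.
# All IDs and names here are hypothetical.

def diff_by_id(source, target):
    """Return the operations that make `target` match `source`,
    matching entities by ID only (names are never compared across IDs)."""
    ops = []
    for entity_id, name in source.items():
        if entity_id not in target:
            ops.append(("create", name))
        elif target[entity_id] != name:
            ops.append(("rename", target[entity_id], name))
    for entity_id, name in target.items():
        if entity_id not in source:
            ops.append(("destroy", name))
    return ops

# Fresh fork: "article" keeps the same ID on both sides,
# so only the genuinely new "author" model is reported.
prod = {"101": "article"}
dev_forked = {"101": "article", "102": "author"}
fresh_fork_ops = diff_by_id(dev_forked, prod)

# Same schema by name, but "article" was created by hand in dev
# and therefore carries a different ID than the one in prod.
dev_manual = {"201": "article", "102": "author"}
manual_ops = diff_by_id(dev_manual, prod)

print(fresh_fork_ops)
print(manual_ops)
```

In the second case the diff proposes creating an "article" model that already exists by name and destroying the existing one, which matches the symptom described in the first post.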

flavio · October 23, 2023, 2:26pm

Hey @m.finamor, thank you for the reply. I’m not sure I follow how to migrate on a freshly forked env. This is the process we’re using:

We have 3 envs: dev, staging, and prod.

dev

Developers work against dev and apply any schema changes they need on that env. When PRs get merged, we migrate schema changes from dev to staging and populate any new data that might be needed. Optionally, after the migration, we might fork staging onto a new dev env if we need to clean up the data on dev. (Data on dev is always considered “dirty” and not ready for production.)

staging

Stakeholders and QA validate the staging env, where features and bugfixes get bundled while waiting for a deployment to the production website. Once that’s ready, we migrate schema changes from staging to prod. This process is almost identical to dev -> staging, so if needed, we might delete the staging env and fork prod onto a new staging env in order to update and clean up the data on staging, which is also considered “dirty” and not ready for production.

prod

The content and marketing teams change data directly on the prod environment, leveraging Dato’s draft system to preview any changes before publishing them. The prod env is considered the source of truth for data and should never have its content modified by unauthorized users and, consequently, should never be deleted.

Question

Hence the question: if migrations should be applied to freshly forked envs, how can we apply changes to the prod env? Wouldn’t we lose its data by forking a “lower” env on the hierarchy?

m.finamor · October 24, 2023, 7:28am

Hello @flavio

This is what you need to make sure:

The schema of the prod environment should not be altered after you have forked a new staging environment (and subsequently a dev environment). If you do alter it, the changes you made on prod will generate an ID mismatch with the other environments.

So the way to go is:

Fork prod into staging and staging into dev. This will make the IDs consistent across all environments.

Do not make schema changes directly on prod, and do not make schema changes directly on staging.

When you want to make a schema change, make it on dev, migrate it to staging, and then migrate it to prod if necessary.

Do not reuse the migrations you generated for dev → staging to migrate staging → prod. Always generate a separate one for staging → prod.

Every time you complete a migration, you should re-fork the child environments:
If you just finished a migration dev → staging: Delete dev and fork a new one from the new staging.
If you just finished a migration staging → prod: Delete staging and dev, and fork a new staging and dev.

This way you can make sure that the IDs are always consistent across all environments.
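The re-fork rule above can be sketched as a small planning helper. This is only an illustration: `HIERARCHY` and `refork_plan` are hypothetical names, and the returned strings are plain descriptions of the steps, not real CLI commands.

```python
# Environments ordered from the top of the hierarchy down (names from this thread).
HIERARCHY = ["prod", "staging", "dev"]

def refork_plan(migrated_into):
    """List the delete/fork steps to run after a migration into `migrated_into`.

    Every environment below the migration target is deleted and re-forked
    from its parent, top-down, so each fork copies an already-updated parent.
    """
    start = HIERARCHY.index(migrated_into)
    steps = []
    for parent, child in zip(HIERARCHY[start:], HIERARCHY[start + 1:]):
        steps.append(f"delete {child}")
        steps.append(f"fork {parent} -> {child}")
    return steps

# After a dev → staging migration, only dev needs to be re-forked.
print(refork_plan("staging"))

# After a staging → prod migration, both staging and dev are re-forked.
print(refork_plan("prod"))
```

Note that prod itself is never deleted or forked over in either plan, which answers the concern about losing its data.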

Keep in mind that this applies only to the schema (models, blocks, and fields), not to records: auto-migration does not work for content. If you also want to migrate content, manual migration scripts are necessary.