We got crushed with some downtime after all our queries to DatoCMS for our article model started returning this error:
Error [ApiError]: Request failed: {"errors":[{"message":"Records of model article are currently being re-validated: when X-Exclude-Invalid header is turned on, all records needs to be ensured as valid"}]}
From my research it seems like the following is true:
This error indicates that existing content is being re-validated against recent schema changes on the article model.
The process is automatic and is triggered by structural modifications, such as adding/removing fields or changing validation rules.
That matches our situation: we had recently modified a field on the article model.
We also use the X-Exclude-Invalid header in our queries (it’s a fantastic feature for strict TypeScript workflows).
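For reference, our failing requests look roughly like this. A minimal sketch: the token variable and the allArticles query are placeholders for our setup, while the endpoint and header follow the Content Delivery API docs:

```ts
// Minimal sketch: how our queries hit the DatoCMS Content Delivery API.
// DATOCMS_API_TOKEN and the allArticles query are placeholders for our setup.
const response = await fetch("https://graphql.datocms.com/", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.DATOCMS_API_TOKEN}`,
    "X-Exclude-Invalid": "true", // drop invalid records so generated types stay strict
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ query: `{ allArticles(first: 10) { id title } }` }),
});

const { data, errors } = await response.json();
if (errors) {
  // While the article model is being re-validated, the
  // "currently being re-validated" error surfaces here.
  throw new Error(JSON.stringify(errors));
}
console.log(data);
```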
My question is: how are we supposed to modify a schema without bringing down queries that use X-Exclude-Invalid until all records are re-validated? Schema modifications are unavoidable in any project. It can’t be the case that every change results in failed queries until validation finishes.
What’s the correct flow here? Are we missing a best practice to avoid downtime in these situations, or is this something that needs improvement on the API side of DatoCMS? I know for certain we’ve changed schemas before without running into this issue.
I think this is unfortunately a problem of a certain scale. With 100 records, it would be very fast. With 10,000, it would take a while (a few minutes?). If you try to make a query during that time, you will indeed get the error: API endpoint and header modes | Documentation - DatoCMS
The suggestion there is to do the schema migration in a forked sandbox env, let it take its sweet time there, and then promote it to primary (which is instant) once it’s ready. If you can do a content freeze during this process, you might not even need a migration script… just freeze the primary, fork it, do the schema changes in the sandbox, and then promote the sandbox to primary once it’s ready and unfreeze again. There should be no downtime that way.
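If it helps, a rough sketch of that freeze/fork/promote flow with the @datocms/cma-client-node client could look like this. The environment names are placeholders, and the maintenanceMode/environments calls are based on my reading of the CMA docs, so double-check them against your project:

```ts
import { buildClient } from "@datocms/cma-client-node";

const client = buildClient({ apiToken: process.env.DATOCMS_CMA_TOKEN! });

// 1. Content freeze: put the project in maintenance mode so editors
//    can't write to the primary while we work ({ force: true } may be
//    needed if there are active editing sessions).
await client.maintenanceMode.activate();

// 2. Fork the primary environment into a sandbox ("main" and the
//    sandbox id are placeholders for your own environment names).
await client.environments.fork("main", { id: "schema-migration" });

// 3. Apply the schema changes in the sandbox (UI or migration script)
//    and wait there until re-validation finishes.

// 4. Promote the sandbox to primary: the swap itself is instant.
await client.environments.promote("schema-migration");

// 5. Unfreeze.
await client.maintenanceMode.deactivate();
```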
It’s a hassle, but I don’t think it’s quite avoidable… how would we be able to determine the validity of the records upon a change, without iterating through them and checking each one against the new rules / schema?
@roger Thanks. We have ~27k records in this project. We’re ~4 hours in and still seeing the errors. Is that expected?
In your migration example you said “let it take its sweet time there, and then promote it to primary (which is instant) once it’s ready.” Where do we monitor revalidation progress/readiness? Is there a retrievable status or UI for this?
When I filter records by validity, they all show valid. Given we’re using the X-Exclude-Invalid header, is there another state besides “invalid” that would still trigger the “re-validated” error during this process, so we know we’re safe to ship? Lastly, just double-checking: revalidation is isolated per environment, yeah?
@brian1 I don’t think it should take that long. Could you please email us at support@datocms.com with the specific project & environment (like its URL) that this is happening to, and I will have a developer manually check this?
No, unfortunately not. You just have to query it until it succeeds in the sandbox; then you know it’s “safe” for promotion. But it shouldn’t take hours. Let me take a look at the specifics here and see what is happening… it’s possible you ran into a bug of some sort.
Sorry about that!
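For the “query it until it succeeds” part, a crude readiness probe could look something like this. Just a sketch: the allArticles query and token are placeholders, and it assumes the per-environment CDA endpoint (https://graphql.datocms.com/environments/<name>) from our docs:

```ts
// Poll a sandbox environment until queries with X-Exclude-Invalid succeed,
// i.e. re-validation is done and the environment is safe to promote.
// The environment name, token, and allArticles query are placeholders.
async function waitForRevalidation(env: string): Promise<void> {
  for (;;) {
    const res = await fetch(`https://graphql.datocms.com/environments/${env}`, {
      method: "POST",
      headers: {
        Authorization: `Bearer ${process.env.DATOCMS_API_TOKEN}`,
        "X-Exclude-Invalid": "true",
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ query: `{ allArticles(first: 1) { id } }` }),
    });
    const { errors } = await res.json();
    if (!errors) return; // no re-validation error: safe to promote
    await new Promise((r) => setTimeout(r, 30_000)); // retry every 30s
  }
}
```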
Are you saying it’s only the meta count that hasn’t been updated yet (pending completion of the revalidation), or is there some other filter that allows you to “bypass” X-Exclude-Invalid and still get back those records directly?
Yeah, the records and schemas are per-environment, so the revalidations are tied to those. But in your case, it seems like something has gone wrong, so we need to investigate further!