Remix + CDN Caching Headers

mattrothenberg_4zns · May 10, 2024, 2:18pm

Hello!

I’m about to build my 10th website with DatoCMS (!!!), and this time my client has a hard requirement that the site update as close to instantly as possible.

This rules out building the site to static (with a framework like Astro, which I’ve used for almost all of my sites so far), and I was interested in using Remix for this upcoming project.

The trouble is, the site is going to be heavily trafficked, and I’m concerned I will quickly run up against the rate limits of the professional plan if every visit to the site is N number of API requests over to DatoCMS.

As I understand it:

currently every call both to our REST and GraphQL endpoints count against the API call limit.

And so I’m curious, what’s recommended or possible in terms of caching content from DatoCMS (via something like the stale-while-revalidate header)? Am I stuck hitting the CDN for every request, or is there some middle ground where I can cache payloads for some pre-determined amount of time and revalidate in the background?

Thank you!

roger · May 10, 2024, 8:19pm

Hey @mattrothenberg_4zns, welcome to the forum, and thanks for being a loyal customer!

I think the nuance here is what your client means by “as close to instantly as possible”. Is a 1-minute delay acceptable? 5 seconds? Is it ok for a few visitors to see stale content for several seconds while the cache updates in the background, or does every single visitor need real-time information?

If this is something that has to be truly real-time with updates visible in < 1 second (like live chat, or video games, or high frequency stocks, or some other use case like that), caching it will be very difficult. It’s possible that DatoCMS (or other headless services, for that matter) wouldn’t even be the right fit for that kind of data, vs having it live in some edge KV store or a fast redis server or similar. It really depends on the use case here… what kind of data are they storing, how often is it written and read, how quickly do invalidations have to happen, how long can a client wait for a network response, etc.?

If it’s a more common use case like blog posts, many clients SAY they want fast updates, but if they are not very technical, sometimes they don’t really understand the range of what “fast” could mean in the web context.

As you probably know, using stale-while-revalidate in combination with something like incremental static regeneration is a typical way to solve “fast enough” for things like blog posts. It’s built into frameworks like Next and can be manually added to Astro with a lot of work. You can also build similar things on top of Cloudflare Workers yourself, but by that point you’re basically reinventing Vercel.

I’m not familiar enough with Remix to be able to explain how it handles this, but it does seem like it supports SWR for rebuilding and redeploying pages (but it’s not clear to me how it ties into a hosting provider… seems to me it might just work automatically on Vercel, or manually using remix-serve on VMs? I’m not sure). Here are some links discussing it, though:

In every case, the common thing between those implementations is that they all have some sort of external caching going on, where pages are pre-baked into HTML files and served from a CDN. Those requests don’t hit our API at all and so you won’t be charged for them (except for the invalidations).

If page-level caching isn’t fast enough for you, you might have to roll your own caching proxy for our API, again using something like a serverless worker, or even just a simple Cloudflare Page Rule that caches our paths on a time-based invalidation.

If that still isn’t fast enough, you could consider mirroring all your DatoCMS content (it’s just a bunch of JSONs) into a memory cache like redis, but at that point I’d question whether this is the right service to use for data that realtime =/

Worst case, yes, you can have every visitor directly hit our API from their browser, but that will very quickly exhaust your API limits for any popular site. It also runs the risk of a malicious user extracting your API key and DoSing your account limit by just making the requests over and over again.

Despite this, some of our customers do exactly that (have their visitors directly query our APIs)… we don’t recommend it, though, because it does get very expensive quickly. Usually they come back and ask how to make it not so expensive, and we just recommend the same caching strategies as above (basically, static generation + regeneration at the file level, or at the very least a caching proxy for our API).

If I were in your situation, I’d probably try to build a few simple prototypes (in various frameworks), comparing the different caching strategies for their performance and costs. You can then make a report for your client, estimating the costs (made-up example): Approach A would cost $20/mo, but maybe 0.1% of your visitors would see slightly stale data for one minute, and your host/CDN would protect you from DoS attacks. Approach B means everybody sees up-to-the-second data all the time, but it’ll cost you $500/mo in estimated monthly overages, potentially spiking a lot higher if you get DoSed. Explain the tradeoffs to them (in performance vs cost) and let them pick which one they’d want you to build, and charge them accordingly?
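For the report, the API-call side of the comparison can be estimated with simple arithmetic. The sketch below is only a back-of-the-envelope model: the function names and every input number are made-up placeholders, and it assumes a single shared cache with a fixed time-based expiry. It says nothing about actual pricing.

```typescript
// Approach B: every visit triggers its queries directly against the API.
function directCallsPerMonth(visitsPerMonth: number, queriesPerVisit: number): number {
  return visitsPerMonth * queriesPerVisit;
}

// Approach A: a single shared cache refetches each distinct query at most
// once per TTL window, so the upper bound does not depend on traffic at all.
function cachedCallsPerMonthUpperBound(distinctQueries: number, ttlSeconds: number): number {
  const secondsPerMonth = 30 * 24 * 60 * 60; // 2,592,000 seconds in a 30-day month
  return distinctQueries * Math.ceil(secondsPerMonth / ttlSeconds);
}

// Placeholder inputs: 2,000,000 visits at 3 queries each,
// versus 20 distinct queries cached for 60 seconds.
const direct = directCallsPerMonth(2000000, 3); // 6,000,000
const cached = cachedCallsPerMonthUpperBound(20, 60); // 864,000
console.log({ direct, cached });
```

The cached figure is a ceiling that is only reached if every query is requested in every TTL window. If the cache is replicated (for example, one copy per CDN edge node), the ceiling is multiplied by the number of copies.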

Probably most clients would be ok sacrificing a LITTLE bit of freshness for a lot of dollar savings, but it really depends on their use case!

Hope that helps a bit?

mattrothenberg_4zns · May 10, 2024, 8:50pm

Hi Roger, thanks for the thorough and comprehensive answer. This is really helpful!

A minute delay is totally acceptable. And in the past, when using Vercel + Astro, updates usually went out in a matter of minutes. It just felt wasteful to rebuild the entire universe when replacing a comma with a semicolon, or changing a single word in a paragraph of copy. I totally understand that ISR (especially with frameworks like Next.js) is the optimal solution to that pain point. My clients also hated having to manually press a build button. It made sense to them that when they published the content, it would be “live.”

This time around, though, I won’t be deploying on Vercel, and won’t have the luxury of being able to trigger rebuilds via a webhook. Which is why I’ve opted for a traditional server-rendered framework like Remix, which is effectively a long-running Node process that spits out some HTML. Every Remix route has the option of specifying a set of headers, and I was hoping to pass along a custom Cache-Control header that allows me to simultaneously:

- minimize the number of requests I make to DatoCMS, so as to not go over quota
- ensure a timely update of content, where a minute or two of inconsistency isn’t the end of the world.
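A minimal sketch of what such a route-level header could look like. The numbers and the `buildCacheControl` helper are hypothetical. Note that the header only reduces DatoCMS requests if a shared cache that honors it (a CDN or reverse proxy) sits in front of the Node server. Remix sends the header but does not cache responses itself, and not every CDN supports `stale-while-revalidate`.

```typescript
// Hypothetical policy values; tune them to what the client tolerates.
const FRESH_SECONDS = 60; // caches may serve the response as fresh for 1 minute
const STALE_SECONDS = 300; // then serve it stale for up to 5 more minutes while refetching

// Hypothetical helper that assembles the Cache-Control value.
function buildCacheControl(freshSeconds: number, staleSeconds: number): string {
  return [
    "public", // shared caches (CDNs, proxies) may store the response
    `max-age=${freshSeconds}`,
    `stale-while-revalidate=${staleSeconds}`,
  ].join(", ");
}

// In a Remix route module this function would be exported under the name
// `headers` (i.e. `export const headers = ...`); it is left unexported here
// so the snippet runs standalone.
const headers = (): Record<string, string> => ({
  "Cache-Control": buildCacheControl(FRESH_SECONDS, STALE_SECONDS),
});

console.log(headers()["Cache-Control"]);
// prints "public, max-age=60, stale-while-revalidate=300"
```

With these values, a cached page is at most about six minutes old in the worst case. Shrinking both numbers trades more origin (and therefore DatoCMS) requests for fresher content.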

Maybe going with a fully static representation of the world is the optimal choice here and I’m hung up on the site being a more traditional, server-rendered sort of thing.

In any event, this is really helpful and thanks again!

roger · May 13, 2024, 7:29pm

Hi @mattrothenberg_4zns, thanks for the detailed explanation!

I am not 100% sure how Remix handles its internal caching, but what you’re suggesting makes sense to me (the SSR acting as an in-memory cache, with Remix’s Node server only pinging Dato if its own memory cache is expired). I’m not familiar enough with Remix to know if that’s actually how it works, but it seems to me like it could be worth some experimentation? If it actually works that way, it SHOULD limit your Dato API calls. Your visitors would only ever hit your node server, and your node server would manage its own invalidations and query Dato again as necessary (hopefully no more than once per invalidation).
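One way to run that experiment is to put the memory cache in application code rather than relying on the framework. The sketch below is a hypothetical helper (`createSwrCache` is not a Remix or DatoCMS API): it wraps any async fetcher, serves the cached value while it is fresh, and once it is stale keeps serving the old value while a single background refresh runs.

```typescript
type Fetcher<T> = () => Promise<T>;

interface Entry<T> {
  value: T;
  fetchedAt: number; // timestamp in milliseconds
}

// Hypothetical helper: a one-slot stale-while-revalidate cache around `fetcher`.
// `now` is injectable so the expiry logic can be tested without real timers.
function createSwrCache<T>(
  fetcher: Fetcher<T>,
  ttlMs: number,
  now: () => number = Date.now,
): () => Promise<T> {
  let entry: Entry<T> | undefined;
  let inFlight: Promise<T> | undefined;

  // Start a refresh unless one is already running, so concurrent visitors
  // share a single upstream request.
  const refresh = (): Promise<T> => {
    if (!inFlight) {
      inFlight = fetcher().then(
        (value) => {
          entry = { value, fetchedAt: now() };
          inFlight = undefined;
          return value;
        },
        (err) => {
          inFlight = undefined;
          throw err;
        },
      );
    }
    return inFlight;
  };

  return async function get(): Promise<T> {
    if (!entry) {
      return refresh(); // cold cache: the first caller has to wait
    }
    if (now() - entry.fetchedAt > ttlMs) {
      // Stale: refresh in the background and ignore failures here,
      // so visitors keep getting the last good value.
      refresh().catch(() => {});
    }
    return entry.value;
  };
}
```

In a Remix loader, `fetcher` would be the function that queries DatoCMS, and each distinct query would need its own cache instance (or a map of them). The cache lives in one process's memory, so it is lost on restart and is not shared between multiple server instances.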

However, from a performance standpoint, it doesn’t seem to me like this method (for this use case) would have any advantage over a static build. The Node server becomes a single point of failure (and a network bottleneck), vs a built HTML file that can be easily duplicated and served from CDN edge nodes all over the world. Even if, in both cases, the invalidation can be deferred (SWR), I think it’d still be faster to serve a cached HTML file than for a Node process to use a cached JSON to regenerate an HTML page and then serve that. You could also put your own Varnish cache in front of the Node server, but I think that’s just duplicating (poorly) what a good CDN would do anyway.

So TLDR, if you think there’s an advantage to using SSR for this case, some experimentation might be good… I just can’t think of one off the top of my head! Typically SSR would block the render (by checking the cache ahead of time, and fetching new data before returning anything to the visitor) to ensure freshness, at the cost of performance. Using SSR to do SWR seems to me like a roundabout way to get what a static build would do.