Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better support stale-while-revalidate option #43

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

Embraser01
Copy link

@Embraser01 Embraser01 commented Nov 16, 2023

Currently, there was an issue with the stale-while-revalidate options. The satisfiesWithoutRevalidation method didn't allow stale response to be returned when swr was set.

I changed the logic so that the stale response is returned in that case.
I also added tests on this and on some edge case with the max-stale request option (I'm not 100% sure it the right way but didn't find any specs on this).

Related issues: #41, #27

@kornelski
Copy link
Owner

Thanks for the PR. This functionality is quite desirable.

However, my concern is how to communicate to library users that they still need to do the revalidation. If someone simply wrote code like recommended so far:

if (cached.satisfiesWithoutRevalidation(req)) {
    return recycle(cached);
    // gone! no request made
}

then they won't know that they still need to perform revalidation in the background. I wouldn't want them to have to check headers themselves separately.

Any ideas how the interface could be changed to be more robust in this case?

I'm thinking that perhaps satisfiesWithoutRevalidation could continue to return false, and users would use some other method in the "cache miss" code path to check if they can return stale in the meantime. I'd rather require caches supporting stale-while-revalidate to have extra code, than to have a pitfall of existing users not revalidating when needed.

@Embraser01
Copy link
Author

I agree that this is not the best user interface. I tried to limit the impact of my changes so that it could be usable (for now it isn't really usable).
It would be nice for the user to know if the response given by satisfiesWithoutRevalidation is staled or fresh (given that max-stale can also return stale data).

I think having a boolean as a return value prevent complexity and is simple to use and understand however, there is some cases (swr, max-stale) where this boolean doesn't give enough info on what to do.
For now with max stale, it's easy to know if the res is staled content or not with the canStale() function (if it satisfies and can stale => staled content), but it isn't straight forward and doesn't work with stale while revalidate.

I'm not sure on what interface would work better, maybe instead of returning a boolean, it should return an enum or maybe something more flexible like an object like:

const { satisfied, revalidate } = cached.satisfiesWithoutRevalidation(req);

Where satisfied is true as long as the cached content can be returned ASAP
revalidate is true if cache is expired and should be refreshed

Most cases would be satisfied(true), revalidate(false) or satisfied(false); revalidate(true).
When using stale-while-revalidate you could have satisfied(true), revalidate(true)

That could be a new method so that would break anything

@kornelski
Copy link
Owner

Perhaps a small change could be to keep satisfiesWithoutRevalidation return false for s-w-r, and repurpose useStaleWhileRevalidate(req) to be equivalent of satisfiesWithoutRevalidation that supports s-w-r.

I also maintain a Rust version of the same package, and I'll have to add it there too. Thanks to Rust's support for enums the API ended up looking completely different:

match policy.before_request(req) {
     Fresh(response) => return response,
     Stale(request) => return policy.after_response(send(request)),
     // SWF(request, response) => { update_cache(request)); return response }
}

So maybe in JS this could be generalized to something like this:

let { request, response } = policy.evaluate(request, response);
if (request) { update_cache(request) }
if (response) { return response } else { wait for the cache here…? }

@Embraser01
Copy link
Author

Perhaps a small change could be to keep satisfiesWithoutRevalidation return false for s-w-r, and repurpose useStaleWhileRevalidate(req) to be equivalent of satisfiesWithoutRevalidation that supports s-w-r.

I think it would be better to keep a single API for all the cases. To not create a breaking change, it can be a new recommended API that would handle the swr case and maybe other cases like in your evaluate solution.

So maybe in JS this could be generalized to something like this:

let { request, response } = policy.evaluate(request, response);
if (request) { update_cache(request) }
if (response) { return response } else { wait for the cache here…? }

I'm a bit confused on the request & response naming, it doesn't give any semantics information. I think the evaluate function could return (naming TBD):

  • revalidate: true as soon as the request should be sent to the origin server
  • asyncRevalidate: true for stale-while-revalidate, false otherwise
  • cacheHit: true when the cached response can be sent (cannot be false if revalidate is false)
  • responseHeaders: (optional) The updated response headers (only if applicable and not too computing intensive 🤔)

That system would allow users to only implement simple case (no swr)

const { revalidate, asyncRevalidate, cacheHit } = policy.evaluate(req, res);

if (revalidate) {
	const newRes = await update_cache(req);
	return newRes;
}
if (cacheHit) {
	res.headers = policy.responseHeaders(res);
	return res;
}

but with easy support of swr

const { revalidate, asyncRevalidate, cacheHit } = policy.evaluate(req, res);

if (asyncRevalidate) {
	update_cache(req).catch(err => logError(err));
} else if (revalidate) {
	const newRes = await update_cache(req);
	return newRes;
}
if (cacheHit) {
	res.headers = policy.responseHeaders(res); // Could directly come from evaluate
	return res;
}

@kornelski
Copy link
Owner

I've jumped a few steps ahead in my explanation.

The updated response headers (only if applicable and not too computing intensive 🤔)

Update of headers is not optional. You're simply not allowed to skip that step, even if it took whole day to compute (but it doesn't). Therefore, I want to change the API to always give you the updated headers.

Similarly with response revalidation. If the cached version is unusable, you'll have to make a request. If it's s-w-r, you have to make the request anyway. So I want to provide up-to-date request headers for fetch/revalidate whenever it's needed.

So if you have something to return (it's fresh or s-w-r), you'll get response headers to use. If you have to make a request to the origin (because it's a miss or s-w-r), you'll get a set of headers for making that request. The observation here is that it doesn't really matter if it's a request due to cache miss or a revalidation request, because you're going to send it to the origin anyway, and it'll go back to the cache to update it.

So you don't really need to know the details of what happened. You only have two possible actions: return a response or make an upstream request. S-w-r is a case where you do both, but doesn't change what inputs you need — it's still a response to return and a request to make. That's why I suggested { request, response } as a returned object, but I take this may be unclear naming.

How about this:

  • { freshResponse: { headers: {…} } } if it matches,
  • { staleResponse: { headers: {…} }, originRequest: { headers: {…} } } if it's s-w-r,
  • { cacheMiss: true, originRequest: { headers: {…} } } if it's a miss.

@Embraser01
Copy link
Author

Thanks for the explanation, much clearer now!
I like the new naming, but after trying to write some of this to test a bit how that would work in real life, I found that it wasn't really easy to use and still let some tricky thing up to the user.

I tried again but with some inspiration from your rust example. I think I found something that would solve this:

The idea would be to have the evaluate function look like this:

type EvaluateRes<T> = {
  cacheResponse: Response;
  asyncRevalidationPromise?: Promise<T>;
};

async evaluate<T>(
  req: Request,
  res: Response,
  doRequest: (req: Request) => Promise<T>,
): Promise<EvaluateRes<T>> {
  // If cache is fresh
  //    return updated response
  // If cache is stale and stale-while-revalidate is valid
  //    call doRequest(req).catch(logSomewhere)
  //    return updated response
  // Else
  //    call await doRequest(req);
  //    send updated response

  return {} as EvaluateRes<T>;
}

And then, use it like this:

async function doRequest<T>(request: Request): Promise<T> {
  const response = await callHTTPEndpoint(request);

  const policy = new CachePolicy(request, response, {});
  if (policy.storable()) {
    await cache.save(request, { policy, response }, policy.timeToLive());
  }
  return response;
}

export async function test() {
  // And later, when you receive a new request:
  const newReq: Request = { headers: {} };

  const cacheRes = await cache.get(newReq);
  if (!cacheRes) {
    return await doRequest(newReq);
  }
  const { policy, response } = cacheRes;

  const { cacheResponse, asyncRevalidationPromise } = await policy.evaluate(
    newReq,
    response,
    doRequest,
  );

  // (optional) Let the user have some control on stale while revalidate
  if (asyncRevalidationPromise) {
    asyncRevalidationPromise.catch(() =>
      console.error('SWR request was not successful'),
    );
  }
  return cacheResponse;
}

This system remove the need for the user to know when to do the request and if they should await or not the response. It always gives back an updated response.

@Embraser01
Copy link
Author

@kornelski friendly ping on this discussion 🙂

@kornelski
Copy link
Owner

kornelski commented Dec 5, 2023

I'm not a fan of inversion of control approach. While it does make it a bit more foolproof, it also makes the API more opaque, and I'm worried it could be too restrictive for some advanced uses.

I think there can be two ways forward:

  1. Keep satisfiesWithoutRevalidation return false for s-w-r, and repurpose useStaleWhileRevalidate(req) to be equivalent of satisfiesWithoutRevalidation that supports s-w-r.

  2. Add beforeRequest and afterResponse like this: https://docs.rs/http-cache-semantics/latest/http_cache_semantics/struct.CachePolicy.html#method.before_request https://docs.rs/http-cache-semantics/latest/http_cache_semantics/struct.CachePolicy.html#method.after_response mapped to something like {freshResponse|staleResponse|originRequest} mentioned in my previous comment.

davidaucoin7377

This comment was marked as off-topic.

davidaucoin7377

This comment was marked as spam.

Copy link
Owner

@kornelski kornelski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

davidaucoin7377

This comment was marked as spam.

@kornelski kornelski requested review from davidaucoin7377 and kornelski and removed request for davidaucoin7377 and kornelski July 1, 2024 17:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants