Skip to content

feat: stale-if-error support#1307

Open
TartanLlama wants to merge 40 commits intomainfrom
sy/stale-if-error
Open

feat: stale-if-error support#1307
TartanLlama wants to merge 40 commits intomainfrom
sy/stale-if-error

Conversation

@TartanLlama
Copy link
Copy Markdown
Contributor

@TartanLlama TartanLlama commented Feb 12, 2026

@TartanLlama TartanLlama marked this pull request as draft February 12, 2026 13:55
Copy link
Copy Markdown
Collaborator

@cceckman-at-fastly cceckman-at-fastly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! One nit and one symbol fix

Comment thread runtime/fastly/builtins/fetch/fetch.cpp Outdated
Comment thread runtime/fastly/host-api/fastly.h Outdated
@TartanLlama TartanLlama changed the title feat: Implement stale-if-error support feat: stale-if-error support Apr 8, 2026
@TartanLlama TartanLlama marked this pull request as ready for review April 9, 2026 14:48
@TartanLlama TartanLlama requested a review from zkat April 9, 2026 14:48
Copy link
Copy Markdown
Collaborator

@cceckman-at-fastly cceckman-at-fastly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly nits & clarifying questions -- but the thing on 5xx handling is a blocker.

response.staleIfError = 300;
},
TypeError,
'Response set: staleIfError must be set only on unsent cache transaction responses',
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Is this the same criterion as "candidate responses", below?

In the Rust universe, we have separate types for the two, but I think they're the same in JS -- the restriction makes sense here, just might be clearer if it's using the same language.

Comment thread runtime/fastly/builtins/fetch/fetch.cpp Outdated
if (!state_res.is_err()) {
auto cache_state = state_res.unwrap();
auto status = Response::status(response_obj);
if (cache_state.is_usable_if_error() && (status >= 500 && status < 600)) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm reading this right -- I don't think we should have the 5xx check here.

Here, we say about the Delivery behavior:

Fastly will, by default, accept and serve any response that is syntactically valid HTTP, which includes error responses.

This is also how the Rust SDK works: if there's a stale-if-error response in cache, and we get a 5xx from the origin, we return the 5xx to the user by default. The user can override that (in vcl_fetch, or in after_send in Rust) by matching the 5xx and returning an error from that handler... which causes the send (Rust) to return the stale response.

If I'm understanding this code right -- this precludes the ability for the guest to prefer the 5xx to the stale-if-error? Like, if the server returns a 5xx, they'll just always get SIE instead? I don't think we want that.

(Note: the only case in which Delivery "forces" the stale-if-error response is if the backend is failing a health check, according to here; all other cases require an explicit choice of deliver_stale.)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I may be misreading where in the stack this code lives, though, so please correct me if so!)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, I've removed this behaviour

Comment on lines +1188 to +1193
// We have a usable stale-if-error response
auto chose_stale_res = cache_entry.transaction_choose_stale();
if (auto *err = chose_stale_res.to_err()) {
HANDLE_ERROR(cx, *err);
return std::nullopt;
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I need a bigger picture here...

usable_if_error can show up in two cases:

  1. This client is a follower of a request collapse. Revalidation has already happened, and failed, so lookup returns the stale-if-error response as "immediately available" (FOUND | USABLE are also set).
  2. This client is the leader of a request collapse. In this case the MUST_INSERT_OR_UPDATE flag is also set.

transaction_choose_stale should only be called in case (2). Are there any paths by which try_serve_stale_if_error can be hit from case (2)?

(In case (1), we shouldn't be performing the backend send or evaluating the beforeSend / afterSend hooks, so... probably OK?)

Comment thread runtime/fastly/builtins/fetch/fetch.cpp Outdated
transaction_res = host_api::HttpCacheEntry::transaction_lookup(
request_handle, std::span<uint8_t>{override_key_hash.data(), override_key_hash.size()});
}

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this would be the same as

Suggested change
transaction_res = host_api::HttpCacheEntry::transaction_lookup(
request_handle, std::span<uint8_t>{override_key_hash.data(), override_key_hash.size()});

right? Since .empty() implies .size() == 0 -- or is there something about whether the pointer has data or not that matters?

Comment on lines +1423 to +1430
if (cache_state.is_usable_if_error()) {
// We've got a stale-if-error response, so swap it out for the error and notify any
// request collapse followers.
auto chose_stale_res = cache_entry.transaction_choose_stale();
if (auto *err = chose_stale_res.to_err()) {
HANDLE_ERROR(cx, *err);
return false;
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to understand the interaction between this and try_serve_stale_with_error... is it: try_serve_stale_if_error is only invoked as part of promise processing, and this bit runs if the error occurs "not in a promise"?

Comment thread runtime/fastly/builtins/fetch/request-response.cpp Outdated
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants