Motivations
We were investigating flaky requests to Cloudflare. The client already had a generous retry limit, but the failures were classified as Fatal by the default policy, so they were never retried.
As it turns out, one of the errors looked like:
    reqwest::Error {
        kind: Request,
        url: Url { ... },
        source: hyper_util::client::legacy::Error(SendRequest, hyper::Error(Io, Custom { kind: InvalidData, error: "received fatal alert: BadRecordMac" }))
    }
There are various reports of this BadRecordMac (rustls) or ERR_SSL_BAD_RECORD_MAC_ALERT (openssl) error when using Cloudflare.
Retrying mitigates the issue, but because the error is classified as Fatal rather than Transient, the request fails without being retried.
Solution
Update classify_io_error to treat std::io::ErrorKind::InvalidData as transient:
 fn classify_io_error(error: &std::io::Error) -> Retryable {
     match error.kind() {
-        std::io::ErrorKind::ConnectionReset | std::io::ErrorKind::ConnectionAborted => {
+        std::io::ErrorKind::ConnectionReset | std::io::ErrorKind::ConnectionAborted | std::io::ErrorKind::InvalidData => {
             Retryable::Transient
         }
         _ => Retryable::Fatal,
     }
 }
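For reference, here is a minimal std-only sketch of the behaviour after the change. The Retryable enum below is a local stand-in for reqwest_retry::Retryable, and bad_record_mac_error is a hypothetical helper that mimics the innermost io::Error from the report above; neither is the crate's actual code.

```rust
use std::io;

/// Local stand-in for `reqwest_retry::Retryable` (assumption: only these two
/// variants matter here), so the sketch compiles without the crate.
#[derive(Debug, PartialEq, Eq)]
enum Retryable {
    Transient,
    Fatal,
}

/// Proposed classification: `InvalidData` joins the kinds treated as transient.
fn classify_io_error(error: &io::Error) -> Retryable {
    match error.kind() {
        io::ErrorKind::ConnectionReset
        | io::ErrorKind::ConnectionAborted
        | io::ErrorKind::InvalidData => Retryable::Transient,
        _ => Retryable::Fatal,
    }
}

/// Hypothetical helper: builds an error shaped like the innermost one in the
/// report (kind `InvalidData`, message from the TLS alert).
fn bad_record_mac_error() -> io::Error {
    io::Error::new(
        io::ErrorKind::InvalidData,
        "received fatal alert: BadRecordMac",
    )
}
```

With this, classify_io_error(&bad_record_mac_error()) yields Retryable::Transient, while unrelated kinds such as NotFound stay Fatal. Note that InvalidData is a broad kind, so this may also retry failures that are not actually transient; the retry limit still bounds the cost.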
Alternatives
Consider even more variants to be marked as transient.
I haven't investigated all of them, but some that might be transient from their description:
Additional context
Tested with
- reqwest-retry 0.7.0
- reqwest-middleware 0.4.0
- reqwest 0.12.4 (including rustls-tls-native-roots)
- hyper 1.3.1