-
Notifications
You must be signed in to change notification settings - Fork 0
Open
Labels
Description
Summary
- Context:
ProviderCircuitStatemanages the in-memory circuit-breaker state for API providers to prevent overloading them after receiving rate-limit signals. - Bug: A race condition in the state update logic allows
isAvailable()to read an uninitialized (null)nextRetryTimewhile the circuit is marked as open, leading to aNullPointerException. - Actual vs. expected: Concurrent threads can encounter a crash (NPE) when a provider first hits a rate limit, instead of correctly identifying that the provider is unavailable.
- Impact: The application fails to process requests and may crash during the transition to a rate-limited state, especially under high concurrency when a provider is first throttled.
Code with bug
private volatile Instant nextRetryTime; // <-- BUG 🔴 [Uninitialized; remains null until first recordRateLimit call]
boolean isAvailable() {
if (circuitOpen && Instant.now().isBefore(nextRetryTime)) { // <-- BUG 🔴 [NPE if circuitOpen is true but nextRetryTime is still null]
return false;
}
// ...
}
void recordRateLimit(long retryAfterSeconds) {
circuitOpen = true; // <-- BUG 🔴 [Visibility race: other threads see circuitOpen=true before nextRetryTime is set]
if (retryAfterSeconds > 0) {
nextRetryTime = Instant.now().plusSeconds(retryAfterSeconds);
return;
}
// ...
}Evidence
- Reproduction Test Results: A multi-threaded test case concurrently calling
recordRateLimit()andisAvailable()on a freshProviderCircuitStateinstance consistently triggers the following exception:java.lang.NullPointerException: Cannot invoke "java.time.Instant.isBefore(java.time.Instant)" because "this.nextRetryTime" is null at com.williamcallahan.javachat.service.ProviderCircuitState.isAvailable(ProviderCircuitState.java:33) - Logical Trace:
- The
ProviderCircuitStateis instantiated withnextRetryTimeasnull. - Thread A receives a rate-limit error and calls
recordRateLimit(10). - Thread A executes
circuitOpen = true. SincecircuitOpenisvolatile, this change is immediately visible to other CPUs. - Thread B concurrently calls
isAvailable(). - Thread B reads
circuitOpen == trueand proceeds to evaluateInstant.now().isBefore(nextRetryTime). - Because Thread A has not yet reached the line assigning a value to
nextRetryTime, Thread B readsnull. Instant.isBefore(null)throws aNullPointerException.
- The
Why has this bug gone undetected?
This bug is extremely transient and only manifests the very first time a specific provider is rate-limited during the application's lifecycle. Once nextRetryTime is assigned any non-null value, it is never reset to null, meaning subsequent rate-limit events will not trigger the NPE. Furthermore, the race window is limited to the tiny gap between two volatile writes, making it difficult to hit without significant concurrent load at the exact moment of the first provider failure.
Recommended fix
Initialize nextRetryTime to a safe default and update the state in an order that ensures consistency for readers.
ProviderCircuitState(int maxBackoffMultiplier) {
this.nextRetryTime = Instant.EPOCH; // <-- FIX 🟢 [Initialize to prevent null reads]
// ...
}
void recordRateLimit(long retryAfterSeconds) {
Instant retryTime;
if (retryAfterSeconds > 0) {
retryTime = Instant.now().plusSeconds(retryAfterSeconds);
} else {
backoffMultiplier = Math.min(backoffMultiplier * 2, maxBackoffMultiplier);
retryTime = Instant.now().plusSeconds(backoffMultiplier);
}
this.nextRetryTime = retryTime;
this.circuitOpen = true; // <-- FIX 🟢 [Set circuitOpen LAST to ensure nextRetryTime is visible]
}Reactions are currently unavailable