Updates #1

Open
autonull wants to merge 1207 commits into deepstupid:multilevel_research from ben-manes:master
Conversation

@autonull
No description provided.

The lambda declared (k, executor) but applied the captured outer key
instead of k. Functionally identical, but k should be used directly.
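
A minimal sketch of the shape of the fix, with hypothetical names:

  import java.util.concurrent.CompletableFuture;
  import java.util.concurrent.Executor;
  import java.util.function.BiFunction;

  class ShadowedKeyExample {
    static CompletableFuture<String> load(String key, Executor executor) {
      return CompletableFuture.supplyAsync(() -> key, executor);
    }

    static BiFunction<String, Executor, CompletableFuture<String>> loader(String key) {
      // Before: the body captured the outer `key` even though the lambda
      // declared its own parameter `k`; harmless here because the lambda
      // is only applied to that same key, but misleading to readers.
      return (k, executor) -> load(key, executor);
      // After: return (k, executor) -> load(k, executor);
    }
  }
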
Fix advance() to correctly expire timers when nanoTime() crosses from
negative to zero by using separate variables for delta computation
(compensated) and bucket scanning (raw), and changing the compensation
condition from > 0 to >= 0. Fix expire() to clamp delta in long space
before narrowing to int, preventing silent no-op on large deltas. Fix
getExpirationDelay() to clamp async computing entries' negative age to
zero, preventing ~220-year scheduler delays (aedile#49).
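
A hedged sketch of the two clamps, with illustrative names (the real
TimerWheel logic also handles the compensated/raw split):

  class ExpireDelayExample {
    // Clamp in long space before narrowing, so a huge positive delta
    // saturates instead of wrapping to a negative int and becoming a no-op.
    static int toIntDelay(long delayNanos) {
      return (int) Math.min(delayNanos, Integer.MAX_VALUE);
    }

    // Clamp a negative age (an async computing entry whose timestamp sits
    // in the future) to zero rather than letting it reach the scheduler
    // as an absurdly long delay.
    static long toNonNegativeAge(long ageNanos) {
      return Math.max(ageNanos, 0L);
    }
  }
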
Policy methods (weightOf, ageOf, getExpiresAfter, setExpiresAfter)
now check node.getValue() == null before returning metadata, consistent
with containsKey's handling of GC-collected weak/soft value entries.
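
A hedged sketch of the check, using an illustrative Node shape:

  import java.time.Duration;
  import java.util.Optional;

  class PolicyMetadataExample {
    interface Node { Object getValue(); long getAccessTime(); }

    // Treat an entry whose weak/soft value was collected as absent,
    // mirroring containsKey, instead of reporting metadata for a
    // logically removed entry.
    static Optional<Duration> ageOf(Node node, long now) {
      if ((node == null) || (node.getValue() == null)) {
        return Optional.empty();
      }
      return Optional.of(Duration.ofNanos(now - node.getAccessTime()));
    }
  }
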
The async branch wrapped the future-resolution work in
whenCompleteAsync(..., executor()) and thenAcceptAsync(..., executor()),
so the outer comparison stage was submitted to the executor directly.
If the executor rejected the continuation (shutdown, bounded queue
saturated), the dependent stage completed exceptionally but nobody
observed the rejection — the REPLACED notification was silently lost.

Run the nv/ov comparisons synchronously on the future-completer thread
and let notifyRemoval perform the single executor submission. It
already falls back to running the listener inline if the executor
rejects, matching every other notifyRemoval call site. The comparison
itself is a pointer check, so the thread it runs on is irrelevant.
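
A minimal sketch of the submit-with-inline-fallback pattern the fix relies
on (names illustrative):

  import java.util.concurrent.Executor;
  import java.util.concurrent.RejectedExecutionException;

  class NotifyRemovalExample {
    // The single executor hand-off happens here, so a rejection is
    // observed and handled rather than completing an unobserved
    // dependent stage exceptionally.
    static void notifyRemoval(Executor executor, Runnable listenerTask) {
      try {
        executor.execute(listenerTask);
      } catch (RejectedExecutionException e) {
        listenerTask.run();  // deliver the notification inline
      }
    }
  }
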
The sync views (CacheView, LoadingCacheView) lacked writeReplace and
readObject protection, unlike the 8 concrete cache types. A crafted
byte stream could bypass writeReplace and directly deserialize a view
with a null or malicious asyncCache field.

Add SyncViewProxy that wraps the async cache and recreates the sync
view via asyncCache.synchronous() on deserialization. Add readObject
to AbstractCacheView that rejects direct deserialization attempts.
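
A hedged sketch of the proxy pattern (the real classes recreate the view
via asyncCache.synchronous(); the shapes here are simplified):

  import java.io.InvalidObjectException;
  import java.io.ObjectInputStream;
  import java.io.Serializable;

  class SerializationProxyExample {
    static final class SyncViewProxy implements Serializable {
      private static final long serialVersionUID = 1L;
      final Serializable asyncCache;
      SyncViewProxy(Serializable asyncCache) { this.asyncCache = asyncCache; }
      Object readResolve() {
        // Recreate the sync view from the deserialized async cache,
        // e.g. asyncCache.synchronous() in the real code.
        return asyncCache;
      }
    }

    abstract static class AbstractCacheView implements Serializable {
      final Serializable asyncCache;
      AbstractCacheView(Serializable asyncCache) { this.asyncCache = asyncCache; }
      Object writeReplace() { return new SyncViewProxy(asyncCache); }
      private void readObject(ObjectInputStream in) throws InvalidObjectException {
        throw new InvalidObjectException("Proxy required");
      }
    }
  }
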
Port the upstream JCTools fix (d64b9b44) for MpscGrowableArrayQueue.
If buffer allocation fails after the producer index is set to odd
(signaling resize-in-progress), restore it to the original even value
so other producers can resume instead of spinning indefinitely.
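
A hedged sketch of the rollback (the real MpscGrowableArrayQueue logic is
considerably more involved):

  class ResizeRollbackExample {
    volatile long producerIndex;

    // The producer index is set to an odd value to signal a resize in
    // progress; if allocation then throws, restore the even value so
    // other producers can resume instead of spinning indefinitely.
    void resize(long pIndex, int nextCapacity) {
      producerIndex = pIndex + 1;  // odd: resize in progress
      try {
        Object[] nextBuffer = allocate(nextCapacity);
        // ... link nextBuffer, publish the element, advance the index ...
      } catch (OutOfMemoryError e) {
        producerIndex = pIndex;  // restore the original even value
        throw e;
      }
    }

    // Non-static so a test can override it to simulate OOME during resize.
    Object[] allocate(int capacity) {
      return new Object[capacity];
    }
  }
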
- Clamp ceilingPowerOfTwo(int) to 2^30 for inputs exceeding the
  max signed power of two, avoiding a negative return value (see the
  sketch after this list)
- Add inline fallback to AsyncRemovalListener when the executor
  rejects, matching the sync notifyRemoval pattern
- Make MpscGrowableArrayQueue.allocate non-static so tests can
  override it to simulate OOME during resize
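
A hedged sketch of the first item's clamp:

  class PowerOfTwoExample {
    // For positive x: inputs above 2^30 would otherwise shift into the
    // sign bit and produce a negative "power of two".
    static int ceilingPowerOfTwo(int x) {
      int max = 1 << 30;  // largest positive power of two in an int
      return (x >= max) ? max : 1 << -Integer.numberOfLeadingZeros(x - 1);
    }
  }
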
- Add -Pasync[=format] to the JMH Gradle plugin. ap-loader 4.3-13 provides
  the native library cross-platform (macOS x64/arm64, Linux x64/arm64); the
  extraction directory is redirected into the build dir to avoid macOS TCC.
  Only one output format is requested at a time because JMH's async-profiler
  integration drops all samples when multiple are combined.
- Add .github/actions/run-benchmark: runs JMH, uploads results to the gist,
  and (when async-profiler is supported) emits a CPU flame graph SVG via
  FlameGraph.pl into the step summary and as an artifact. Skips the profiler
  on OpenJ9 (Semeru) since async-profiler relies on HotSpot-specific APIs.
- Collapse the six per-benchmark workflow jobs into one matrixed job over
  benchmark x JDK (36 cells); publish now depends on a single job.

The int overload defensively clamps at 1 << 30 but the long overload
silently overflowed to Long.MIN_VALUE for inputs above 1L << 62 (including
Long.MAX_VALUE). Currently unreachable in production since the long
overload is only called with small compile-time constants, but the
asymmetry with the int overload is a latent hazard for future callers.
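
A sketch of the symmetric long clamp, mirroring the int overload above:

  class LongPowerOfTwoExample {
    // Without the clamp, inputs above 1L << 62 (including Long.MAX_VALUE)
    // shift into the sign bit and return Long.MIN_VALUE.
    static long ceilingPowerOfTwo(long x) {
      long max = 1L << 62;  // largest positive power of two in a long
      return (x >= max) ? max : 1L << -Long.numberOfLeadingZeros(x - 1);
    }
  }
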
The inline `argumentProviders.add { ... }` lambda embedded the absolute
output-directory path in the returned arguments, which leaks into the
task's cache key and breaks build-cache relocatability across workspaces.
Replace with a typed `CommandLineArgumentProvider` whose `resultsDir`
field is `@get:Internal`, matching the `codeGenerationTask` pattern and
the project rule (PR #1947).
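
A sketch of the provider shape, written here in Java against Gradle's
public API (the flag names are illustrative, not the project's):

  import java.io.File;
  import java.util.List;
  import org.gradle.api.tasks.Internal;
  import org.gradle.process.CommandLineArgumentProvider;

  class ResultsDirProvider implements CommandLineArgumentProvider {
    private final File resultsDir;

    ResultsDirProvider(File resultsDir) {
      this.resultsDir = resultsDir;
    }

    @Internal  // not an input, so the absolute path stays out of the cache key
    public File getResultsDir() {
      return resultsDir;
    }

    @Override public Iterable<String> asArguments() {
      // Computed at execution time rather than snapshotted as an input,
      // keeping the task relocatable across workspaces.
      return List.of("--results-dir", resultsDir.getAbsolutePath());
    }
  }
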
AbstractCacheView.readObject rejected direct deserialization, but a
crafted byte stream whose class-descriptor chain omits AbstractCacheView
(via TC_NULL superClassDesc) bypassed the guard: Java neither invoked
readObject on the missing ancestor nor readObjectNoData (which was not
declared), leaving the concrete CacheView / LoadingCacheView instance
with a stream-controlled asyncCache field.

Declare readObjectNoData on AbstractCacheView so the hierarchy-skip path
also throws. Add byte-stream crafted tests for both concrete subtypes.
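
A hedged sketch of the double guard (simplified shape):

  import java.io.InvalidObjectException;
  import java.io.ObjectInputStream;
  import java.io.Serializable;

  abstract class GuardedViewExample implements Serializable {
    // Rejects an ordinary direct deserialization attempt.
    private void readObject(ObjectInputStream in) throws InvalidObjectException {
      throw new InvalidObjectException("Proxy required");
    }

    // Rejects the hierarchy-skip path: when the stream's descriptor chain
    // omits this ancestor (TC_NULL superClassDesc), Java invokes
    // readObjectNoData instead of readObject, so the guard must exist
    // here as well.
    private void readObjectNoData() throws InvalidObjectException {
      throw new InvalidObjectException("Proxy required");
    }
  }
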
The two-step pre-check `isCompletedExceptionally() || (isDone() && join()
== null)` had a race window: if the future was cancelled between the two
reads, the subsequent join() threw CancellationException out of put()
instead of treating it as a failed completion. The same hazard applied
to obtrudeException-style forced transitions.

Collapse to a single isDone() check that catches the join exception so
any done-but-not-ready state (failure, cancel, or null value) uniformly
removes the mapping without propagating.
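
A hedged sketch of the collapsed check (illustrative method, not the
actual call site):

  import java.util.concurrent.CancellationException;
  import java.util.concurrent.CompletableFuture;
  import java.util.concurrent.CompletionException;

  class DoneCheckExample {
    // One isDone() read with the join exception caught, so failure,
    // cancellation, and a null value are treated uniformly instead of
    // racing two separate state reads.
    static boolean isDoneButNotReady(CompletableFuture<?> future) {
      if (!future.isDone()) {
        return false;
      }
      try {
        return future.join() == null;
      } catch (CompletionException | CancellationException e) {
        return true;  // remove the mapping without propagating
      }
    }
  }
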
doComputeIfAbsent and remap recorded eviction stats (EXPIRED / COLLECTED)
when a user operation discovered an already-evicted entry, but the sibling
paths in put, remove(key), remove(key, value), and removeNode silently
fired notifyEviction + notifyRemoval without bumping the counter. The
asymmetry meant the same logical event was counted on compute paths but
invisible on direct mutation paths.

Capture oldWeight in EvictContext / RemoveContext under synchronized(node)
and record the eviction alongside notifyRemoval when wasEvicted() is true.
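
A hedged sketch of the shared recording step (StatsCounter.recordEviction
is the real API; the surrounding shape is illustrative):

  import com.github.benmanes.caffeine.cache.RemovalCause;
  import com.github.benmanes.caffeine.cache.stats.StatsCounter;

  class EvictionStatsExample {
    // The weight is captured under synchronized(node) by the caller; the
    // stat is then recorded whenever the cause is an eviction, so direct
    // mutation paths count the same logical event as the compute paths.
    static void recordIfEvicted(StatsCounter stats, RemovalCause cause, int oldWeight) {
      if (cause.wasEvicted()) {
        stats.recordEviction(oldWeight, cause);
      }
    }
  }
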
The sibling refresh paths in LocalLoadingCache.refresh and
LocalAsyncLoadingCache.refresh guard their REPLACED notification with
`if (discard[0] && (newValue != null))`. BoundedLocalCache.refreshIfNeeded
had no such guard on its fall-through branch, so a loader returning null
combined with a concurrent put (which removed the refresh from the
refreshes map before the handler ran) fired notifyRemoval(key, null,
REPLACED). Non-null-tolerant RemovalListener implementations NPE on the
null value, which the cache swallows and logs at WARN.

Gate the fall-through `cause[0] = REPLACED` on `value != null` to match
the sibling paths.

If the user's Weigher or Expiry throws inside remap during the refresh
handler's compute, the exception propagated out of refreshIfNeeded's
handle lambda unlogged and without any stats update. The refreshed
future completed exceptionally, the caller saw null, but the
LoadingCache.refresh javadoc's "exception will be logged (using
System.Logger) and swallowed" contract was silently violated.

Wrap the refresh commit in a try/catch that mirrors the sibling
LocalAsyncLoadingCache.refresh path: log at WARNING, record a load
failure, and return null. The entry's previous value remains in place.
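
A hedged sketch of the guarded commit (hypothetical helper; the real code
inlines this in the refresh handler):

  import java.lang.System.Logger;
  import java.lang.System.Logger.Level;
  import java.util.function.Supplier;

  class RefreshCommitExample {
    static final Logger logger = System.getLogger(RefreshCommitExample.class.getName());

    // A throwing Weigher or Expiry inside the compute is logged and
    // counted as a load failure instead of escaping the handler unseen;
    // returning null leaves the previous value mapped.
    static <V> V commitRefresh(Supplier<V> commit, Runnable recordLoadFailure) {
      try {
        return commit.get();
      } catch (RuntimeException e) {
        logger.log(Level.WARNING, "Exception thrown during refresh", e);
        recordLoadFailure.run();
        return null;
      }
    }
  }
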
The prior note described keyReference as "set once during construction
and never mutated", which is inaccurate: retire() and die() on generated
weak/soft-value nodes write sentinels into the field under
synchronized(node). The note also attributed the storeStoreFence's role
to publishing the newly-constructed reference's non-final fields;
the actual load-bearing role (per the JCStress IntermittentNull test) is
ordering the old reference's ref.clear() against the setRelease of the
new reference so the getValue re-check loop's invariant holds.

A refresh that is preempted by a concurrent write (a put / remove /
compute that discarded the refresh before its handler ran) falls through
to `return currentValue` in the refresh handler's compute lambda. Returning
the same instance previously caused `remap` to apply the normal update
block: `expireAfterUpdate`, `setWeight`, `setAccessTime`, `setVariableTime`,
and — when `exceedsWriteTimeTolerance` holds for slow refreshes —
`setWriteTime`. The entry's expireAfterWrite countdown effectively
restarted at the refresh completion time rather than at the user's put
time, shifting expiration by up to the refresh duration.

Add a package-private `LocalCache.compute` overload that accepts a
`boolean[] preserveTimestamps` reference. The remapping lambda sets it
to `true` on branches where the same-instance return represents a
no-op (not a legitimate same-instance replace). `BoundedLocalCache.remap`
short-circuits after the lambda returns when the flag is set, the new
value is the same instance as the old, and no eviction cause was set,
skipping all metadata updates and the post-compute afterWrite/afterRead
dispatch. `UnboundedLocalCache` ignores the flag (no timestamps to
preserve). `refreshes.remove` remains inside the compute lambda, so the
single-flight invariant is preserved (CHM bin lock still serializes the
claim with the ABA check).

Adopted in the three refresh handlers: `BoundedLocalCache.refreshIfNeeded`,
`LocalLoadingCache.refresh`, and `LocalAsyncLoadingCache.tryComputeRefresh`.
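
A hedged sketch of the short-circuit inside `remap` (illustrative shape;
the real method threads the flag through the generated node types):

  import java.util.function.BiFunction;
  import com.github.benmanes.caffeine.cache.RemovalCause;

  class PreserveTimestampsExample {
    // When the lambda flagged the same-instance return as a no-op and no
    // eviction cause was set, skip every metadata update (weight, access/
    // write/variable time) and the afterWrite/afterRead dispatch.
    static <K, V> V remap(K key, V oldValue, BiFunction<K, V, V> remapping,
        boolean[] preserveTimestamps, RemovalCause[] cause) {
      V newValue = remapping.apply(key, oldValue);
      if (preserveTimestamps[0] && (newValue == oldValue) && (cause[0] == null)) {
        return newValue;  // true no-op: leave the entry untouched
      }
      // ... normal update path: setWeight, setAccessTime, setWriteTime ...
      return newValue;
    }
  }
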
The 0L value doubles as the "unscheduled/cancelled" sentinel and as a
possible output of (now + TOLERANCE) or scheduleAt when the ticker lands
on exactly that value. Bump the result to 1L in that case so the recursion
guard in schedule() stays correct even under an immediate scheduler.
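
A minimal sketch of the bump (TOLERANCE value illustrative):

  class SentinelBumpExample {
    static final long TOLERANCE = 1_000_000_000L;

    // 0L is reserved as the unscheduled/cancelled sentinel, so a computed
    // time that lands exactly on 0L is nudged to 1L.
    static long scheduleTime(long now) {
      long next = now + TOLERANCE;
      return (next == 0L) ? 1L : next;
    }
  }
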
PMD and SpotBugs both emit per-source-set JSON/SARIF files named by
source set (main.json, test.json, etc.), not by module. The flat
`find ... -exec cp {} <dir>/ \;` clobbered same-basename files across
caffeine/guava/jcache/simulator, so only the last-visited module's
findings survived the merge into the Code Scanning upload. Prefix the
destination filename with the module so every module's results are
preserved.

Three defensive branches in BoundedLocalCache were uncovered because
their preconditions can't arise through the normal refresh protocol:

  1. In the refresh completion lambda, the `removed && (currentValue
     == oldValue)` guard at L1401 never sees `removed==true` with a
     mismatched currentValue — concurrent writes always drop the
     refresh token under the same CHM bin lock. Exercise it via
     `replace(..., shouldDiscardRefresh=false)` with a different new
     value.

  2. In `remap`, the `preserveTimestamps[0]` no-op short-circuit also
     checks `newValue == oldValue` and `cause == null`. Production
     callers couple the flag to returning `currentValue` with a null
     cause, so those fallback branches are unreachable. Cover them via
     the package-level `compute(..., preserveTimestamps)` interface
     with a pre-set flag and (a) a mismatched return value, (b) an
     expired entry where the lambda returns the captured old value.

Align the Policy setter with Caffeine.refreshAfterWrite, which has always
enforced duration > 0. Zero-duration refresh is pathological (every read
becomes refresh-eligible) and the asymmetry was unintentional.

Every getIfPresent on an expireAfterAccess cache wrote Node.accessTime
opaquely, and every read on an Expiry cache CAS'd Node.variableTime via
tryExpireAfterRead. Under concurrent reads of the same hot entry, every
write transitioned the Node's cache line from S to M in MESI,
invalidating the line on all other cores and forcing a refetch on the
next read. Opaque and CAS modes avoid the writer's fence but not the
coherence traffic.

Guard setAccessTime and tryExpireAfterRead with a tolerance check: skip
the store if the new value is within EXPIRE_TOLERANCE (1s) of the stored
one. Math.abs is used so the async-completion path (timestamps set to
now + ASYNC_EXPIRY during in-flight loads) still fires when the sentinel
is replaced. Tolerance is bypassed when the configured or per-entry
duration is <= tolerance so sub-second expiration windows still behave
exactly. Trade-off matches the pre-existing writeTime tolerance: entries
may expire up to 1s earlier than the configured duration, never later.

Renamed EXPIRE_WRITE_TOLERANCE to EXPIRE_TOLERANCE to reflect the
broader scope.

HotEntryBenchmark pins every thread to the same key. On Linux x86
(GHA, 4 threads, JDK 25) expireAccess_hot rises from 18.6M to 49.7M
ops/s (2.67x) after the change, matching expireAccess_spread. The
cache-line true-sharing cost is eliminated. ARM shows the same pattern.
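
A hedged sketch of the guard (illustrative Node shape; the real check also
covers tryExpireAfterRead's CAS path):

  class ToleranceGuardExample {
    static final long EXPIRE_TOLERANCE = 1_000_000_000L;  // 1s in nanos

    interface Node { long getAccessTime(); void setAccessTime(long nanos); }

    // Skip the store when the stored timestamp is already within
    // tolerance, avoiding the S-to-M cache-line transition on every read
    // of a hot entry. Math.abs keeps the async sentinel (now +
    // ASYNC_EXPIRY) replaceable, and the tolerance is bypassed for
    // sub-second expiration windows.
    static void setAccessTimeIfNeeded(Node node, long now, long durationNanos) {
      if ((durationNanos <= EXPIRE_TOLERANCE)
          || (Math.abs(now - node.getAccessTime()) > EXPIRE_TOLERANCE)) {
        node.setAccessTime(now);
      }
    }
  }
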
A recent audit caught that put/remove/clear paths weren't recording
eviction stats when the old entry had already been evicted (fixed in
5bb78f0). The bug survived because the ~100 parameterized tests that
already asserted on the eviction notification via

  assertThat(context).notifications().withCause(EXPIRED|COLLECTED|SIZE)
      .contains(...).exclusively();

all skipped the sibling

  assertThat(context).stats().evictions(N).evictionWeight(W);

so only a small handful of tests (~9 across EvictionTest / ReferenceTest /
BoundedLocalCacheTest) exercised the stats counters for eviction-cause
removals. The asymmetry was silent — the cause-based notification and the
stats counters were three observations of the same logical event and
nothing forced them to agree.

Fold both stats checks into the Exclusive.exclusively() subject: when the
cause wasEvicted(), assert evictionCount == entries.length AND
evictionWeight == sum of context.weigher().weigh(k, v) over the entries.
Null values (which the listener may receive on COLLECTED with cleared
weak/soft refs) are treated as weight 0 to avoid NPE in user weighers.

Also fold the symmetric "no evictions" check into ListenerSubject.isEmpty()
and ListenerSubject.hasNoEvictions(): both paths now also assert
evictions(0).evictionWeight(0).

When stats recording is disabled for a parameterization,
StatsSubject.awaitStatistic short-circuits to a no-op, so no-stats runs
are unaffected. This closes the gap for every existing eviction-cause
assertion and prevents future tests from reproducing the same blind spot.

The eviction-recording paths were split between two orderings:
- record-then-notify: evictEntry, doComputeIfAbsent's compute lambda, remap
- notify-then-record: removeNode, put-on-expired/collected,
  remove(key), remove(key, value)

For an executor(Runnable::run) cache with a synchronous RemovalListener that
reads cache.stats(), the same logical eviction event would surface with
evictionCount already incremented at some sites and not yet at others. Not
a documented contract — stats observability inside a listener isn't
guaranteed — but the asymmetry is gratuitous and was caught by review.

Flip the four notify-then-record sites to record-then-notify so every
eviction-recording path uses the same order. Also drop a stray double space
in refreshIfNeeded's containsKey call.