UPSTREAM PR #1364: feat: add support for the eta parameter to ancestral samplers by loci-dev · Pull Request #93 · auroralabs-loci/stable-diffusion.cpp

loci-dev · 2026-03-25T04:21:04Z

Note

Source pull request: leejet/stable-diffusion.cpp#1364

Applies the eta parameter to the DPM++(2s) and Euler ancestral implementations, so the amount of injected noise can be adjusted (e.g. Euler A with eta=0 should be the same as Euler). It reuses the calculation from the RES samplers, so it's mostly a refactor and interface/UI adjustments.
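
As a rough illustration of how eta scales the injected noise, the sketch below follows the ancestral-step formulation popularized by k-diffusion. It is an assumption that the PR's shared sigma_up/sigma_down calculation matches this form. The names `AncestralStep`, `ancestral_step`, and `euler_ancestral_update` are hypothetical and not taken from the diff.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper types and names, used for illustration only.
struct AncestralStep {
    float sigma_up;    // standard deviation of the noise re-injected after the step
    float sigma_down;  // noise level the deterministic part of the step targets
};

// Split a step from sigma_from to sigma_to into a deterministic part and a
// noise part. eta = 0 disables noise injection; eta = 1 is the usual
// ancestral behavior (assumed k-diffusion-style formula).
inline AncestralStep ancestral_step(float sigma_from, float sigma_to, float eta) {
    assert(sigma_from > 0.0f && sigma_to >= 0.0f && sigma_to <= sigma_from);
    float var_ratio = (sigma_from * sigma_from - sigma_to * sigma_to) /
                      (sigma_from * sigma_from);
    float sigma_up = std::min(sigma_to, eta * sigma_to * std::sqrt(var_ratio));
    // Clamp at zero to guard against tiny negative values from rounding.
    float sigma_down =
        std::sqrt(std::max(0.0f, sigma_to * sigma_to - sigma_up * sigma_up));
    return {sigma_up, sigma_down};
}

// One Euler ancestral update on a single scalar value. `denoised` stands in
// for the model's prediction and `noise` for a standard normal sample.
inline float euler_ancestral_update(float x, float denoised, float sigma_from,
                                    float sigma_to, float eta, float noise) {
    AncestralStep s = ancestral_step(sigma_from, sigma_to, eta);
    float d = (x - denoised) / sigma_from;  // derivative estimate
    x += d * (s.sigma_down - sigma_from);   // deterministic Euler part
    x += noise * s.sigma_up;                // stochastic part, scaled by eta
    return x;
}
```

Under this formulation, eta=0 gives sigma_up = 0 and sigma_down = sigma_to, so the update reduces to a plain Euler step regardless of the noise sample, which is the equivalence the description mentions.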

#1363 already includes this, but since this is self-contained and useful on its own, I believe it's worth including directly.

loci-review · 2026-03-25T05:14:47Z

Overview

Analysis of 49,629 functions across two binaries reveals minimal performance impact. Modified: 68 functions (0.14%), New: 4, Removed: 0, Unchanged: 49,557 (99.86%).

Binaries analyzed:

build.bin.sd-cli: +0.023% power consumption
build.bin.sd-server: -0.14% power consumption

Function Analysis

Most performance changes occur in C++ STL functions due to compiler code generation differences, not application source modifications:

std::_Rb_tree::end() (sd-cli): Response time +183ns (+228%), throughput time +183ns (+307%). CFG shows entry block increased 9x (21ns → 195ns) with added indirect jump. No source changes—system library function.

std::__make_move_if_noexcept_iterator (sd-cli): Response time +185ns (+196%), throughput time +185ns (+317%). Entry block split with additional branches. Used in prompt attention parsing, not inference hot path.

GGMLRunner::alloc_params_ctx (sd-server): Response time -171ns (-4.8%), throughput time unchanged. Improvement from optimized downstream functions (ggml_init, ggml_log_internal). One-time initialization function.

ggml_log_internal (sd-server): Response time -44ns (-9.8%), throughput time -44ns (-25.2%). CFG consolidated from 8 to 7 blocks, eliminating intermediate jump.

Other analyzed functions (std::vector::back(), smart pointer operations, iterators) show mixed changes (±38-190ns) in non-critical paths with no source modifications.

Additional Findings

No functions in performance-critical inference paths (UNet/DiT forward passes, attention mechanisms, VAE operations) were affected. All changes are in initialization, logging, or STL utilities. The consistent pattern of CFG reorganization across STL functions suggests compiler version or optimization flag differences between builds rather than code regressions.

🔎 Full breakdown: Loci Inspector
💬 Questions? Tag @loci-dev

wbruna added 2 commits March 24, 2026 08:18

refactor: group sigma_up and sigma_down calculations

4105a77

feat: add support for the eta parameter to ancestral samplers

dcd9d6a

loci-dev deployed to stable-diffusion-cpp-prod March 25, 2026 04:21 — with GitHub Actions Active

loci-dev wants to merge 2 commits into main from loci/pr-1364-sd_samplers_eta
