UPSTREAM PR #1354: feat: add Euler CFG++ and Euler-A CFG++ samplers #85
Overview
Analysis of 49,691 functions (85 modified, 74 new, 70 removed) across two binaries shows minor overall impact with near-zero power consumption changes.
Commit Function Analysis
Most Impacted Functions:
std::_Rb_tree::begin() (sd-cli, two variants) - Red-black tree iterator for tensor maps
std::vector::back() (sd-cli) - IMPROVEMENT
Sampler name matching lambda (sd-server, main.cpp:912-942)
GGMLRunner::alloc_params_ctx() (sd-server) - IMPROVEMENT
ggml_log_internal() (sd-server) - IMPROVEMENT
Other analyzed functions (regex operations, smart pointer cleanup, hash table operations, error handlers) showed minor changes (<200 ns) in non-critical paths with no impact on inference performance.
Additional Findings
GPU/ML Operations: No direct GPU kernel modifications. GGML infrastructure improvements (memory allocation 5-8% faster, logging 10% faster) benefit model initialization. Core inference pipeline unaffected.
STL Pattern: Compiler optimization differences created divergent outcomes - vector/smart pointer operations improved significantly while red-black tree/regex operations regressed. Net impact is negligible, as the regressions occur in non-hot paths (initialization, validation).
Note
Source pull request: leejet/stable-diffusion.cpp#1354
This PR adds support for the Euler CFG++ and Euler Ancestral CFG++ samplers: CFG++.
The logic was adapted from the CFG++ repository and checked against ComfyUI's implementation. I tried to keep the sampler style as close as possible to the existing ones.
Some changes were needed in src/stable-diffusion.cpp, as this specific sampler requires the unconditioned output in order to work.
This currently doesn't work with Spectrum cache.
As with any CFG++ sampler, you must use very low CFG values (for SDXL, often less than 2).
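For reviewers, here is a minimal sketch of why the unconditioned output is needed, assuming the Euler CFG++ update as I understand it: the step jumps to the guided denoised estimate but re-noises along a direction computed from the unconditioned prediction. The function name `euler_cfg_pp_step` and its parameters are hypothetical and work on a flattened latent; this is not the code added in src/stable-diffusion.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only, not the function added by this PR.
// Performs one Euler CFG++ step on a flattened latent.
//   x               : current noisy latent at noise level sigma
//   denoised_cfg    : denoised prediction after classifier-free guidance
//   denoised_uncond : denoised prediction from the unconditioned pass
std::vector<float> euler_cfg_pp_step(const std::vector<float>& x,
                                     const std::vector<float>& denoised_cfg,
                                     const std::vector<float>& denoised_uncond,
                                     float sigma,
                                     float sigma_next) {
    assert(sigma > 0.0f);
    assert(x.size() == denoised_cfg.size());
    assert(x.size() == denoised_uncond.size());

    std::vector<float> x_next(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        // Noise direction estimated from the unconditioned output only.
        const float d_uncond = (x[i] - denoised_uncond[i]) / sigma;
        // Move to the guided denoised estimate, then re-noise to sigma_next.
        x_next[i] = denoised_cfg[i] + d_uncond * sigma_next;
    }
    return x_next;
}
```

When `denoised_uncond` equals `denoised_cfg`, this reduces to the ordinary Euler step, and with `sigma_next` equal to 0 it returns the guided denoised estimate unchanged. The sketch only covers the deterministic variant; the ancestral one would additionally inject fresh noise.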
I'd be very grateful if anyone could review this, as it's the first sampler I have implemented that requires this kind of change.