
Feat/optimizations stability #21

Merged

Aatricks merged 2 commits into main from feat/optimizations-stability on Apr 13, 2026
Conversation

@Aatricks (Owner)

This pull request introduces several documentation and code improvements aimed at clarifying LightDiffusion-Next's optimization stack, improving extensibility for model function wrappers, and refining the behavior and documentation of various caching and batching optimizations. The most significant changes include a new implementation report in the documentation, enhanced explanations of caching strategies, improved handling of model function wrappers for optimization stacking, and refinements to conditional batching logic.

Documentation improvements:

  • Added links and sections for an "Implemented Optimizations Report" in the documentation, providing a source-based breakdown of current optimizations and their implementation status.
  • Clarified and updated the documentation for WaveSpeed/DeepCache and First Block Cache (FBCache), emphasizing which optimizations are user-facing and which are groundwork for future or specialized use. Updated configuration and usage instructions accordingly.
  • Updated prompt caching documentation to reflect current cache structure, behavior, and best practices.
  • Clarified support status for SageAttention and SpargeAttn on RTX 50 series GPUs in both the README and SageAttention documentation.
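To illustrate the kind of prompt caching the updated docs describe, here is a minimal sketch of a conditioning cache keyed by prompt text. All names (`PromptCache`, `get_or_encode`, the encoder callable) are illustrative assumptions, not LightDiffusion-Next's actual API:

```python
import hashlib
from collections import OrderedDict

class PromptCache:
    """LRU cache mapping a prompt string to its encoded conditioning.

    Hypothetical sketch: the real cache stores text-encoder outputs so
    repeated generations with the same prompt skip the encoder pass.
    """

    def __init__(self, max_entries=32):
        self.max_entries = max_entries
        self._store = OrderedDict()

    @staticmethod
    def _key(prompt: str) -> str:
        # Hash the prompt so keys stay small and uniform.
        return hashlib.sha256(prompt.encode("utf-8")).hexdigest()

    def get_or_encode(self, prompt, encode_fn):
        key = self._key(prompt)
        if key in self._store:
            self._store.move_to_end(key)  # refresh LRU position
            return self._store[key]
        cond = encode_fn(prompt)          # the expensive text-encoder call
        self._store[key] = cond
        if len(self._store) > self.max_entries:
            self._store.popitem(last=False)  # evict least recently used
        return cond
```

A repeated prompt hits the cache, so the encoder runs only once per distinct prompt until eviction.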

Codebase enhancements:

  • Introduced ModelFunctionWrapperChain in ModelPatcher.py to allow multiple model function wrappers to be composed and applied in order, preventing silent disabling of earlier optimizations when stacking wrappers. Updated set_model_unet_function_wrapper to use this chain.
  • Improved conditional batching logic in calc_cond_batch to make batched_cfg behavior more explicit and robust, with clearer separation of conditional and unconditional branches when batching is disabled.
  • Updated and expanded unit tests to support new batching logic and to record batch sizes for verification.
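The wrapper-chain idea above can be sketched as follows. This is an illustrative reconstruction under assumed conventions, not the actual ModelPatcher.py code: each wrapper is assumed to take `(inner, params)` and call `inner(params)` to continue the chain, so stacking a second wrapper no longer silently replaces the first:

```python
class ModelFunctionWrapperChain:
    """Compose several model-function wrappers instead of keeping only
    the last one set. Wrappers run in the order they were added; each
    receives a callable for the rest of the chain plus the params."""

    def __init__(self):
        self._wrappers = []

    def add(self, wrapper):
        self._wrappers.append(wrapper)
        return self

    def __call__(self, apply_model, params):
        # Build the call chain from the innermost callable outward so
        # the first-added wrapper ends up outermost.
        call = apply_model
        for wrapper in reversed(self._wrappers):
            call = (lambda w, nxt: lambda p: w(nxt, p))(wrapper, call)
        return call(params)
```

With two wrappers added, the first sees the second as its `inner`, and the second sees the bare model function, so both optimizations take effect.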

Other changes:

  • Minor clarification and rewording in advanced CFG optimizations documentation to better explain the effects and limitations of joint conditional/unconditional batching.
  • Small code cleanup: removed device capability check in sageattention_enabled() to reflect updated support for newer GPUs.
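The cleanup in the last bullet amounts to gating SageAttention availability on the package alone, with no compute-capability check now that RTX 50 series GPUs are supported. A hedged sketch (the function name matches the PR text; the body is an assumption about the simplified logic):

```python
import importlib.util

def sageattention_enabled() -> bool:
    # Availability now depends only on whether the sageattention
    # package can be found; the former device-capability gate for
    # newer GPUs has been dropped.
    return importlib.util.find_spec("sageattention") is not None
```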

- Updated optimizations.md to include a link to the Implemented Optimizations Report and clarified WaveSpeed caching strategies.
- Added prompt-caching.md to document the new prompt attention caching feature, including configuration and best practices.
- Revised sageattention.md to reflect the current support status for RTX 5060/5070/5080/5090.
- Modified wavespeed.md to clarify the caching strategies and their implementation details.
- Updated mkdocs.yml to include the new prompt-caching.md and implemented-optimizations-report.md in the navigation.
- Refactored Device.py to simplify the sageattention_enabled function.
- Introduced ModelFunctionWrapperChain in ModelPatcher.py to allow multiple model function wrappers to compose correctly.
- Enhanced calc_cond_batch function in cond.py to support a batched configuration toggle.
- Added unit tests for conditional batching and optimization plumbing to ensure correct functionality and integration.
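The `batched_cfg` toggle and the batch-size recording used by the tests can be sketched in miniature. This is a control-flow illustration only: the real `calc_cond_batch` operates on model patchers and tensors, and `model_fn`, `batch_log`, and the list-based conditioning here are assumptions:

```python
def calc_cond_batch(model_fn, conds, unconds, batched_cfg=False, batch_log=None):
    """Run conditional and unconditional passes, jointly or separately.

    batched_cfg=True  -> one joint forward pass over cond + uncond.
    batched_cfg=False -> two explicit passes, one per branch.
    batch_log, if given, records the batch size of each forward call
    (mirroring how the updated unit tests verify batching behavior).
    """
    if batched_cfg:
        batch = conds + unconds
        if batch_log is not None:
            batch_log.append(len(batch))
        out = model_fn(batch)
        return out[:len(conds)], out[len(conds):]
    # Batching disabled: keep the branches clearly separated.
    if batch_log is not None:
        batch_log.extend([len(conds), len(unconds)])
    return model_fn(conds), model_fn(unconds)
```

With batching enabled a single forward of size `len(conds) + len(unconds)` is issued; with it disabled, two smaller forwards run, which the recorded batch sizes make easy to assert on.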
@Aatricks Aatricks self-assigned this Apr 13, 2026
@Aatricks Aatricks merged commit ace06cd into main Apr 13, 2026
2 checks passed
