
Feat/optimizations stability #21

Merged

Aatricks merged 2 commits into main from feat/optimizations-stability on Apr 13, 2026
Conversation

@Aatricks (Owner)

This pull request introduces several documentation and code improvements aimed at clarifying LightDiffusion-Next's optimization stack, improving extensibility for model function wrappers, and refining the behavior and documentation of various caching and batching optimizations. The most significant changes include a new implementation report in the documentation, enhanced explanations of caching strategies, improved handling of model function wrappers for optimization stacking, and refinements to conditional batching logic.

Documentation improvements:

  • Added links and sections for an "Implemented Optimizations Report" in the documentation, providing a source-based breakdown of current optimizations and their implementation status.
  • Clarified and updated the documentation for WaveSpeed/DeepCache and First Block Cache (FBCache), emphasizing which optimizations are user-facing and which are groundwork for future or specialized use. Updated configuration and usage instructions accordingly.
  • Updated prompt caching documentation to reflect current cache structure, behavior, and best practices.
  • Clarified support status for SageAttention and SpargeAttn on RTX 50 series GPUs in both the README and SageAttention documentation.
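To illustrate the kind of prompt caching the updated docs describe, here is a minimal sketch of a conditioning cache keyed by prompt text. All names (`PromptCache`, `get_or_encode`, the encoder callable) are illustrative assumptions, not LightDiffusion-Next's actual API:

```python
import hashlib
from collections import OrderedDict

class PromptCache:
    """LRU cache mapping a prompt string to its encoded conditioning.

    Hypothetical sketch: the real cache stores text-encoder outputs so
    repeated generations with the same prompt skip the encoder pass.
    """

    def __init__(self, max_entries=32):
        self.max_entries = max_entries
        self._store = OrderedDict()

    @staticmethod
    def _key(prompt: str) -> str:
        # Hash the prompt so keys stay small and uniform.
        return hashlib.sha256(prompt.encode("utf-8")).hexdigest()

    def get_or_encode(self, prompt, encode_fn):
        key = self._key(prompt)
        if key in self._store:
            self._store.move_to_end(key)  # refresh LRU position
            return self._store[key]
        cond = encode_fn(prompt)          # the expensive text-encoder call
        self._store[key] = cond
        if len(self._store) > self.max_entries:
            self._store.popitem(last=False)  # evict least recently used
        return cond
```

A repeated prompt hits the cache, so the encoder runs only once per distinct prompt until eviction.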

Codebase enhancements:

  • Introduced ModelFunctionWrapperChain in ModelPatcher.py to allow multiple model function wrappers to be composed and applied in order, preventing silent disabling of earlier optimizations when stacking wrappers. Updated set_model_unet_function_wrapper to use this chain.
  • Improved conditional batching logic in calc_cond_batch to make batched_cfg behavior more explicit and robust, with clearer separation of conditional and unconditional branches when batching is disabled.
  • Updated and expanded unit tests to support new batching logic and to record batch sizes for verification.
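The wrapper-chain idea above can be sketched as follows. This is an illustrative reconstruction under assumed conventions, not the actual ModelPatcher.py code: each wrapper is assumed to take `(inner, params)` and call `inner(params)` to continue the chain, so stacking a second wrapper no longer silently replaces the first:

```python
class ModelFunctionWrapperChain:
    """Compose several model-function wrappers instead of keeping only
    the last one set. Wrappers run in the order they were added; each
    receives a callable for the rest of the chain plus the params."""

    def __init__(self):
        self._wrappers = []

    def add(self, wrapper):
        self._wrappers.append(wrapper)
        return self

    def __call__(self, apply_model, params):
        # Build the call chain from the innermost callable outward so
        # the first-added wrapper ends up outermost.
        call = apply_model
        for wrapper in reversed(self._wrappers):
            call = (lambda w, nxt: lambda p: w(nxt, p))(wrapper, call)
        return call(params)
```

With two wrappers added, the first sees the second as its `inner`, and the second sees the bare model function, so both optimizations take effect.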

Other changes:

  • Minor clarification and rewording in advanced CFG optimizations documentation to better explain the effects and limitations of joint conditional/unconditional batching.
  • Small code cleanup: removed device capability check in sageattention_enabled() to reflect updated support for newer GPUs.
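The cleanup in the last bullet amounts to gating SageAttention availability on the package alone, with no compute-capability check now that RTX 50 series GPUs are supported. A hedged sketch (the function name matches the PR text; the body is an assumption about the simplified logic):

```python
import importlib.util

def sageattention_enabled() -> bool:
    # Availability now depends only on whether the sageattention
    # package can be found; the former device-capability gate for
    # newer GPUs has been dropped.
    return importlib.util.find_spec("sageattention") is not None
```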

- Updated optimizations.md to include a link to the Implemented Optimizations Report and clarified WaveSpeed caching strategies.
- Added prompt-caching.md to document the new prompt attention caching feature, including configuration and best practices.
- Revised sageattention.md to reflect the current support status for RTX 5060/5070/5080/5090.
- Modified wavespeed.md to clarify the caching strategies and their implementation details.
- Updated mkdocs.yml to include the new prompt-caching.md and implemented-optimizations-report.md in the navigation.
- Refactored Device.py to simplify the sageattention_enabled function.
- Introduced ModelFunctionWrapperChain in ModelPatcher.py to allow multiple model function wrappers to compose correctly.
- Enhanced calc_cond_batch function in cond.py to support a batched configuration toggle.
- Added unit tests for conditional batching and optimization plumbing to ensure correct functionality and integration.
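The `batched_cfg` toggle and the batch-size recording used by the tests can be sketched in miniature. This is a control-flow illustration only: the real `calc_cond_batch` operates on model patchers and tensors, and `model_fn`, `batch_log`, and the list-based conditioning here are assumptions:

```python
def calc_cond_batch(model_fn, conds, unconds, batched_cfg=False, batch_log=None):
    """Run conditional and unconditional passes, jointly or separately.

    batched_cfg=True  -> one joint forward pass over cond + uncond.
    batched_cfg=False -> two explicit passes, one per branch.
    batch_log, if given, records the batch size of each forward call
    (mirroring how the updated unit tests verify batching behavior).
    """
    if batched_cfg:
        batch = conds + unconds
        if batch_log is not None:
            batch_log.append(len(batch))
        out = model_fn(batch)
        return out[:len(conds)], out[len(conds):]
    # Batching disabled: keep the branches clearly separated.
    if batch_log is not None:
        batch_log.extend([len(conds), len(unconds)])
    return model_fn(conds), model_fn(unconds)
```

With batching enabled a single forward of size `len(conds) + len(unconds)` is issued; with it disabled, two smaller forwards run, which the recorded batch sizes make easy to assert on.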
@Aatricks Aatricks self-assigned this Apr 13, 2026
@Aatricks Aatricks merged commit ace06cd into main Apr 13, 2026
2 checks passed
