- Updated `optimizations.md` to include a link to the Implemented Optimizations Report and clarified WaveSpeed caching strategies.
- Added `prompt-caching.md` to document the new prompt attention caching feature, including configuration and best practices.
- Revised `sageattention.md` to reflect the current support status for RTX 5060/5070/5080/5090.
- Modified `wavespeed.md` to clarify the caching strategies and their implementation details.
- Updated `mkdocs.yml` to include the new `prompt-caching.md` and `implemented-optimizations-report.md` in the navigation.
- Refactored `Device.py` to simplify the `sageattention_enabled` function.
- Introduced `ModelFunctionWrapperChain` in `ModelPatcher.py` to allow multiple model function wrappers to compose correctly.
- Enhanced the `calc_cond_batch` function in `cond.py` to support a batched configuration toggle.
- Added unit tests for conditional batching and optimization plumbing to ensure correct functionality and integration.
This pull request introduces several documentation and code improvements aimed at clarifying LightDiffusion-Next's optimization stack, improving extensibility for model function wrappers, and refining the behavior and documentation of various caching and batching optimizations. The most significant changes include a new implementation report in the documentation, enhanced explanations of caching strategies, improved handling of model function wrappers for optimization stacking, and refinements to conditional batching logic.
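To illustrate the wrapper-stacking idea described above, here is a minimal sketch of how a chain of model function wrappers could compose so that registering a new wrapper does not silently replace an earlier one. The class and function names below are illustrative only and are not the actual LightDiffusion-Next implementation.

```python
# Hypothetical sketch: compose multiple model-function wrappers so the
# last-registered wrapper runs outermost and each wrapper receives the
# next callable in the chain as its "inner" model function.
class ModelFunctionWrapperChain:
    def __init__(self):
        self._wrappers = []

    def add(self, wrapper):
        # wrapper has the signature wrapper(inner, params)
        self._wrappers.append(wrapper)

    def __call__(self, model_function, params):
        # Build the call chain from the inside out.
        call = model_function
        for wrapper in self._wrappers:
            call = (lambda w, inner: lambda p: w(inner, p))(wrapper, call)
        return call(params)


# Two toy wrappers that each tag the params dict before calling inner;
# if stacking worked, both tags survive in the final result.
def wrapper_a(inner, params):
    return inner(dict(params, a=True))

def wrapper_b(inner, params):
    return inner(dict(params, b=True))

chain = ModelFunctionWrapperChain()
chain.add(wrapper_a)
chain.add(wrapper_b)

result = chain(lambda p: p, {"x": 1})
# Both wrappers ran: result == {"x": 1, "a": True, "b": True}
```

Without such a chain, a second call to a setter like `set_model_unet_function_wrapper` would typically overwrite the first wrapper, which is the silent-disabling failure mode the PR addresses.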
Documentation improvements:
Codebase enhancements:
- Introduced `ModelFunctionWrapperChain` in `ModelPatcher.py` to allow multiple model function wrappers to be composed and applied in order, preventing silent disabling of earlier optimizations when stacking wrappers. Updated `set_model_unet_function_wrapper` to use this chain. [1] [2]
- Refactored `calc_cond_batch` to make `batched_cfg` behavior more explicit and robust, with clearer separation of conditional and unconditional branches when batching is disabled. [1] [2]

Other changes:
- Simplified `sageattention_enabled()` in `Device.py` to reflect updated support for newer GPUs.
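The `batched_cfg` refinement mentioned above can be sketched roughly as follows: when batching is enabled, conditional and unconditional inputs go through one model call over the concatenated batch; when disabled, the two branches are evaluated separately. The function signature and the toy model here are stand-ins, not the actual `cond.py` code.

```python
# Hypothetical sketch of a batched-CFG toggle in a calc_cond_batch-style
# function. Both paths must produce identical results; the toggle only
# trades one large model call for two smaller ones.
def calc_cond_batch(model, cond, uncond, batched_cfg=True):
    if batched_cfg:
        # Single call over the concatenated batch, then split the output.
        out = model(cond + uncond)
        return out[: len(cond)], out[len(cond):]
    # Explicitly separated conditional / unconditional branches.
    return model(cond), model(uncond)

# Toy "model" that negates each element.
model = lambda xs: [-x for x in xs]

batched = calc_cond_batch(model, [1, 2], [3, 4], batched_cfg=True)
split = calc_cond_batch(model, [1, 2], [3, 4], batched_cfg=False)
# Both paths agree: ([-1, -2], [-3, -4])
```

Keeping the two branches explicitly separated when batching is disabled makes the equivalence between the paths easy to verify in unit tests, which matches the conditional-batching tests added in this PR.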