Add KVAE 1.0 #13033

Merged
yiyixuxu merged 17 commits into huggingface:main from ddavidchick:add-kvae-1.0
Mar 23, 2026

Conversation

@ddavidchick
Contributor

@ddavidchick ddavidchick commented Jan 26, 2026

What does this PR do?

Add KVAE 1.0 from the Kandinsky team. git repo.

Before submitting

  • [-] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • [+] Did you read the contributor guideline?
  • [+] Did you read our philosophy doc (important for complex PRs)?
  • [-] Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
  • [+] Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • [-] Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@yiyixuxu @asomoza

@yiyixuxu
Collaborator

Thanks for the PR!
Can you explain what this is used for/with? Is it mainly for future research, or can we use it with an existing Kandinsky 5 pipeline we support?

@ddavidchick
Contributor Author

KVAEs are autoencoders with strong reconstruction quality and a well-structured latent space, developed by our team. They weren't used in Kandinsky 5, as they were created a bit later, so this is mainly for future research. They can be used as a VAE backbone for developing new diffusion models.

@leffff
Contributor

leffff commented Feb 9, 2026

Hi @yiyixuxu !
What are our next steps in merging this new VAE into Diffusers? We would love to integrate our latest VAE!
We are very excited, because this is the new SOTA VAE our team has developed. For now it can be used as is for training new diffusion models. Later, we expect our next models to use this VAE.

@leffff
Contributor

leffff commented Feb 10, 2026

Hi @asomoza !
What are our next steps in merging this new VAE into Diffusers?

@ddavidchick
Contributor Author

Hi @yiyixuxu @asomoza !
Just wanted to follow up on this PR. Please let us know if you've had a chance to look at it or if there’s any more information we can provide to help with the review. Thanks!

@asomoza
Member

asomoza commented Feb 16, 2026

Hi @leffff @ddavidchick, we don't have the bandwidth right now, especially for a model that's not currently in use. Please have a little more patience for the review and comments; we will address this when we have the bandwidth.

@ddavidchick
Contributor Author

Hi @yiyixuxu @asomoza !
Can you provide any updates? We're about to release new models, which are dependent on KVAE.

Collaborator

@yiyixuxu yiyixuxu left a comment

Thanks, I left some feedback.

Comment thread src/diffusers/models/autoencoders/autoencoder_kl_kvae_video.py Outdated
Comment thread src/diffusers/models/autoencoders/autoencoder_kl_kvae.py Outdated
Comment thread src/diffusers/models/autoencoders/autoencoder_kl_kvae.py Outdated
Comment thread src/diffusers/models/autoencoders/autoencoder_kl_kvae.py Outdated
Comment thread src/diffusers/models/autoencoders/autoencoder_kl_kvae.py Outdated
Comment thread src/diffusers/models/autoencoders/autoencoder_kl_kvae_video.py Outdated
Comment thread src/diffusers/models/autoencoders/autoencoder_kl_kvae_video.py Outdated
Comment thread src/diffusers/models/autoencoders/autoencoder_kl_kvae_video.py Outdated
Comment thread src/diffusers/models/autoencoders/autoencoder_kl_kvae_video.py Outdated
Comment thread src/diffusers/models/autoencoders/autoencoder_kl_kvae_video.py Outdated
@MeiYi-dev

MeiYi-dev commented Mar 6, 2026

Hi @yiyixuxu @asomoza ! Can you provide any updates? We're about to release new models, which are dependent on KVAE.

@leffff @ddavidchick Are the new models going to be unified audio-video generation models or just video only models?

@ddavidchick
Contributor Author

Thanks, @yiyixuxu. I’ve updated the code based on your feedback. Please take a look and let me know if it’s ready to merge.

Member

@asomoza asomoza left a comment

Thanks, I left some comments. We should also add some tests, as for the other VAEs.

Comment thread src/diffusers/models/autoencoders/autoencoder_kl_kvae.py Outdated
Comment thread docs/source/en/api/models/autoencoder_kl_kvae.md
Comment thread src/diffusers/models/autoencoders/autoencoder_kl_kvae.py Outdated
Comment thread src/diffusers/models/autoencoders/autoencoder_kl_kvae_video.py Outdated
Comment thread docs/source/en/api/models/autoencoder_kl_kvae.md
Comment thread src/diffusers/models/autoencoders/autoencoder_kl_kvae.py Outdated
@ddavidchick
Contributor Author

@asomoza, thanks! I've addressed all the comments and added tests. Let me know if anything else needs updating.

Member

@asomoza asomoza left a comment

Just one small comment; otherwise it looks good to me. I'll let @yiyixuxu do the final review and merge.

Comment thread src/diffusers/models/autoencoders/autoencoder_kl_kvae.py Outdated
@asomoza
Member

asomoza commented Mar 20, 2026

@bot /style

@github-actions
Contributor

Style fix is beginning .... View the workflow run here.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Collaborator

@yiyixuxu yiyixuxu left a comment

Thanks, I left some comments.
I think we can merge this soon!

Comment thread src/diffusers/models/autoencoders/autoencoder_kl_kvae.py Outdated
Comment thread src/diffusers/models/autoencoders/autoencoder_kl_kvae.py Outdated
Comment thread src/diffusers/models/autoencoders/autoencoder_kl_kvae.py Outdated
Comment thread src/diffusers/models/autoencoders/autoencoder_kl_kvae.py Outdated
Comment thread src/diffusers/models/autoencoders/autoencoder_kl_kvae.py Outdated
@ddavidchick
Contributor Author

@yiyixuxu thanks! I've addressed the latest comments

Collaborator

@yiyixuxu yiyixuxu left a comment

thanks!

@yiyixuxu
Collaborator

@bot /style

@github-actions
Contributor

Style fix is beginning .... View the workflow run here.

@yiyixuxu
Collaborator

Can you run make style and fix the whitespace issues? I will merge once CI passes.

@ddavidchick
Contributor Author

@yiyixuxu done

@yiyixuxu yiyixuxu merged commit ef309a1 into huggingface:main Mar 23, 2026
11 checks passed
6 participants