Add KVAE 1.0 by ddavidchick · Pull Request #13033 · huggingface/diffusers

ddavidchick · 2026-01-26T23:18:51Z

What does this PR do?

Add KVAE 1.0 from the Kandinsky team (git repo).

Before submitting

[-] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
[+] Did you read the contributor guideline?
[+] Did you read our philosophy doc (important for complex PRs)?
[-] Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
[+] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
[-] Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@yiyixuxu @asomoza

yiyixuxu · 2026-01-27T22:19:26Z

thanks for the PR!
Can you explain what this is used for/with? Is it mainly for future research, or can we use it with an existing Kandinsky 5 pipeline we support?

ddavidchick · 2026-01-28T00:12:44Z

KVAEs are autoencoders with strong reconstruction quality and a well-structured latent space, developed by our team. They weren't used in Kandinsky 5, as they were created a bit later, so this is mainly for future research. They can be used as a VAE backbone for developing new diffusion models.

leffff · 2026-02-09T11:45:26Z

Hi @yiyixuxu !
What are our next steps in merging this new VAE into Diffusers? We would love to integrate our latest VAE!
We are very excited, because this is the new SOTA VAE our team has developed. For now it can be used as is for training new diffusion models. Later we expect our next models to use this VAE.

leffff · 2026-02-10T12:40:38Z

Hi @asomoza !
What are our next steps in merging this new VAE into Diffusers?

ddavidchick · 2026-02-16T11:01:23Z

Hi @yiyixuxu @asomoza !
just wanted to follow up on this PR. Please let us know if you've had a chance to look at it or if there’s any more information we can provide to help with the review. Thanks!

asomoza · 2026-02-16T13:49:47Z

Hi @leffff @ddavidchick, we don't have the bandwidth right now, especially for a model that's not in use yet. Please have a little more patience for the review and comments; we will address this when we have the bandwidth.

ddavidchick · 2026-03-05T12:39:56Z

Hi @yiyixuxu @asomoza !
Can you provide any updates? We're about to release new models, which are dependent on KVAE.

yiyixuxu

thanks, I left some feedback

MeiYi-dev · 2026-03-06T06:33:41Z

Hi @yiyixuxu @asomoza ! Can you provide any updates? We're about to release new models, which are dependent on KVAE.

@leffff @ddavidchick Are the new models going to be unified audio-video generation models or just video only models?

ddavidchick · 2026-03-11T23:05:18Z

Thanks, @yiyixuxu. I’ve updated the code based on your feedback. Please take a look and let me know if it’s ready to merge.

asomoza

thanks, left some comments, we should also add some tests like the other vaes

ddavidchick · 2026-03-20T16:50:22Z

@asomoza, thanks! I've addressed all the comments and added tests. Let me know if anything else needs updating.

asomoza

just one small comment, otherwise looks good to me. I'll let @yiyixuxu do the final review and merge

asomoza · 2026-03-20T20:20:22Z

@bot /style

github-actions · 2026-03-20T20:20:47Z

Style fix is beginning .... View the workflow run here.

HuggingFaceDocBuilderDev · 2026-03-20T21:00:55Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

yiyixuxu

thanks, I left some comments
I think we can merge this soon!

ddavidchick · 2026-03-23T10:50:58Z

@yiyixuxu thanks! I've addressed the latest comments

yiyixuxu

thanks!

yiyixuxu · 2026-03-23T20:04:35Z

@bot /style

github-actions · 2026-03-23T20:05:01Z

Style fix is beginning .... View the workflow run here.

yiyixuxu · 2026-03-23T20:10:47Z

can you run make style and fix the whitespace issues? will merge once CI passes

ddavidchick · 2026-03-23T20:28:25Z

@yiyixuxu done

ddavidchick added 5 commits January 22, 2026 14:05

add kvae2d

53b4fd9

add kvae3d video

b72f733

add docs for kvae2d and kvae3d video

4799ea6

style fixes

6b6d7d7

fix kvae3d docs

9d52d68

yiyixuxu reviewed Mar 6, 2026

ddavidchick added 4 commits March 11, 2026 13:22

Merge branch 'main' of https://github.com/huggingface/diffusers into add-kvae-1.0

0afb7ce

fix normalzation

19f4206

fix kvae video for code style

f988d23

fix kvae video

b56b1bd

asomoza reviewed Mar 12, 2026

ddavidchick added 4 commits March 19, 2026 12:33

kvae minor fixes

e5b7991

add gradient ckpting for kvaes

6aae20c

get rid of inplace ops kvae video

e021f3f

add tests for KVAEs

cc31c77

asomoza approved these changes Mar 20, 2026

Comment thread src/diffusers/models/autoencoders/autoencoder_kl_kvae.py Outdated

yiyixuxu reviewed Mar 20, 2026

kvae2d normalization style change

a4b1670

Merge branch 'main' into add-kvae-1.0

20f8c5a

yiyixuxu approved these changes Mar 23, 2026

kvaes fix style

51224e6

update dummy_pt_objects test for kvaes

7ce6716

yiyixuxu merged commit ef309a1 into huggingface:main Mar 23, 2026
11 checks passed
