🎉 Excited to share our latest work on extracting better features from diffusion models, co-led by Stefan Baumann and Kolja Bauer.

✨ Diffusion models are amazing at learning world representations. Their features power many tasks:
• Semantic correspondence
• Depth estimation
• Semantic segmentation

🤔 But have you ever wondered why we extract diffusion features from noisy images? Doesn't that destroy valuable information? We show that it does - and that it forces you to find the right noise hyperparameters for every downstream task. We thought there had to be a better way. And there is.

🚀 With just 30 minutes of task-agnostic finetuning on a single GPU, we eliminate the need for noisy inputs.
✅ No noise
✅ No timestep tuning
✅ Better features, better performance across many tasks

We make code and cleaned 🧹 weights available for SD 1.5 and SD 2.1. Have a look now! ⬇️
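For context on the "noisy features" pipeline the post refers to, here is a minimal sketch of standard DIFT-style feature extraction with diffusers. The model ID, hooked UNet block, and timestep are illustrative assumptions, not the exact setup from the paper; CleanDIFT's finetuned weights are meant to remove the noise injection and timestep choice entirely.

```python
# Minimal sketch of standard DIFT-style feature extraction (the noisy-input
# baseline the post critiques). Model ID, hooked layer, and timestep t are
# illustrative choices, not the paper's exact configuration.
import torch
from diffusers import StableDiffusionPipeline, DDPMScheduler

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
unet, vae = pipe.unet, pipe.vae
scheduler = DDPMScheduler.from_config(pipe.scheduler.config)

features = {}
unet.mid_block.register_forward_hook(
    lambda module, args, output: features.update(mid=output)  # cache activations
)

@torch.no_grad()
def noisy_dift_features(image, prompt_embeds, t=261):
    """image: (1, 3, H, W) tensor in [-1, 1]; t is a per-task hyperparameter."""
    latents = vae.encode(image.half().cuda()).latent_dist.mode()
    latents = latents * vae.config.scaling_factor
    # The standard recipe perturbs the clean latents before the UNet pass,
    # which is exactly the information loss the post points out.
    noise = torch.randn_like(latents)
    timestep = torch.tensor([t], device=latents.device)
    noisy_latents = scheduler.add_noise(latents, noise, timestep)
    unet(noisy_latents, timestep, encoder_hidden_states=prompt_embeds)
    return features["mid"]
```

Here `prompt_embeds` would typically be an empty-prompt embedding from the pipeline's text encoder. The post's claim is that after their short task-agnostic finetune, the noise injection and per-task timestep search above are no longer needed.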
📝 Paper: https://arxiv.org/abs/2412.03439
💻 Code: https://github.com/CompVis/cleandift
🤗 Hugging Face: https://huggingface.co/CompVis/cleandift