Existing diffusion-based dataset distillation methods require fine-tuning the model with distillation losses to encourage diversity and representativeness, yet even then they do not guarantee sample diversity, which limits their performance.
We propose a mode-guided diffusion model that leverages a pre-trained diffusion model without any fine-tuning on distillation losses. Our approach addresses dataset diversity in three stages: Mode Discovery, which identifies distinct data modes within each class; Mode Guidance, which steers sampling toward those modes to enhance intra-class diversity; and Stop Guidance, which mitigates artifacts in the synthetic samples that would otherwise hurt performance.
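To make the three-stage pipeline concrete, here is a minimal sketch of how mode-guided sampling could be wired together. It is an illustration under assumptions, not our released implementation: the k-means mode discovery, the frozen feature `encoder`, the noise predictor `eps_model`, the DDIM-style update, and the guidance scale `s` and stop step `t_stop` are all placeholder choices; only the three stages themselves come from the description above.

```python
import torch
from sklearn.cluster import KMeans


def discover_modes(class_features, n_modes):
    """Stage 1 -- Mode Discovery: cluster one class's features into modes."""
    feats = class_features.detach().cpu().numpy()
    km = KMeans(n_clusters=n_modes, n_init=10).fit(feats)
    return torch.as_tensor(km.cluster_centers_, dtype=class_features.dtype)


def sample_with_mode_guidance(eps_model, encoder, mode, alphas_cumprod,
                              s=1.0, t_stop=50, shape=(1, 3, 64, 64)):
    """Stages 2 & 3 -- Mode Guidance with Stop Guidance (DDIM-style sketch)."""
    x = torch.randn(shape)
    for t in range(len(alphas_cumprod) - 1, -1, -1):
        a_t = alphas_cumprod[t]
        with torch.no_grad():
            eps = eps_model(x, t)  # pre-trained diffusion model, kept frozen
        if t > t_stop:  # Stage 3: stop guiding during the final steps
            # Stage 2: nudge the predicted clean image toward the target mode.
            x0 = ((x - (1 - a_t).sqrt() * eps) / a_t.sqrt()).requires_grad_(True)
            dist = (encoder(x0) - mode).pow(2).sum()
            grad = torch.autograd.grad(dist, x0)[0]
            eps = eps + s * (1 - a_t).sqrt() * grad
        # Deterministic DDIM update with the (possibly guided) noise estimate.
        a_prev = alphas_cumprod[t - 1] if t > 0 else torch.tensor(1.0)
        x0_hat = (x - (1 - a_t).sqrt() * eps) / a_t.sqrt()
        x = a_prev.sqrt() * x0_hat + (1 - a_prev).sqrt() * eps
    return x.detach()
```

A distilled set would then draw samples per discovered mode, e.g. setting `n_modes` to the images-per-class budget so each class's modes are covered.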
We evaluate our approach on ImageNette, ImageIDC, ImageNet-100, and ImageNet-1K, achieving accuracy improvements of 4.4%, 2.9%, 1.6%, and 1.6%, respectively, over state-of-the-art methods. Our method eliminates the need for fine-tuning diffusion models with distillation losses, significantly reducing computational costs.
```bibtex
@inproceedings{chan2025mgd3,
  title     = {{MGD}$^3$: Mode-Guided Dataset Distillation using Diffusion Models},
  author    = {Chan Santiago, Jeffrey A. and Tirupattur, Praveen and Nayak, Gaurav Kumar and Liu, Gaowen and Shah, Mubarak},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning (ICML)},
  year      = {2025},
}
```