AI & ML Papers
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
🔥 Reinforcing Few-step Generators via Reward-Tilted Distribution Matching

💡 The paper proposes Reward-Tilted Distribution Matching Distillation (RTDMD), a framework for aligning few-step image generation models with human preferences. Current few-step diffusion distillation methods generate images efficiently but struggle with that alignment. RTDMD is a two-stage approach that combines distribution matching distillation with reward-guided reinforcement learning.

In the first stage, the authors introduce Ambient-Consistent Distribution Matching Distillation, which performs distribution matching and uses a consistency regularizer to help the model track the generator distribution.

In the second stage, the authors jointly optimize two terms: a distribution matching term and a reward maximization term. They derive a hybrid policy gradient that combines a gradient-based estimator with direct reward backpropagation to reduce variance.
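One way to see why a distribution matching term and a reward term show up together is the standard reward-tilting identity, sketched here. The notation is an assumption for illustration and is not taken from the paper: $p_\theta$ is the few-step generator, $p_T$ the teacher distribution, $r$ the reward, $\beta > 0$ a temperature, and $Z$ a normalizing constant. The paper's exact objective and weighting may differ.

```latex
% Reward-tilted target (assumed form): the teacher reweighted by exp(r / beta)
\tilde{p}(x) = \frac{1}{Z}\, p_T(x)\, \exp\!\left(\frac{r(x)}{\beta}\right),
\qquad
Z = \int p_T(x)\, \exp\!\left(\frac{r(x)}{\beta}\right) dx

% Matching the generator to the tilted target splits into two terms
\mathrm{KL}\!\left(p_\theta \,\|\, \tilde{p}\right)
  = \underbrace{\mathrm{KL}\!\left(p_\theta \,\|\, p_T\right)}_{\text{distribution matching}}
  \;-\; \frac{1}{\beta}\,
    \underbrace{\mathbb{E}_{x \sim p_\theta}\!\left[ r(x) \right]}_{\text{reward maximization}}
  \;+\; \log Z
```

Since $\log Z$ does not depend on $\theta$, minimizing the left-hand side amounts to minimizing the matching term while maximizing the expected reward, with $\beta$ setting the trade-off.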

The authors also introduce step-subset GRPO to further reduce variance. The experiments demonstrate that RTDMD achieves state-of-the-art results across preference, aesthetic, and compositional metrics with only 4 inference steps, outperforming previous few-step text-to-image generation methods.
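As a rough illustration of the GRPO ingredient, the sketch below computes group-relative advantages for several images sampled from the same prompt, then picks a subset of denoising steps to receive the policy-gradient update. The step-subset rule here (uniform sampling without replacement) is an assumption about what "step-subset" could mean, not the paper's procedure, and the function names and reward values are hypothetical.

```python
import random
import statistics


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: standardize rewards within one prompt's group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


def sample_step_subset(num_steps, subset_size, rng):
    """Hypothetical rule: choose which denoising steps get a gradient update."""
    return sorted(rng.sample(range(num_steps), subset_size))


# Made-up rewards for four samples generated from the same prompt.
rewards = [0.2, 0.5, 0.9, 0.4]
advantages = group_relative_advantages(rewards)

# With a 4-step generator, update on only 2 of the 4 steps per iteration.
steps = sample_step_subset(num_steps=4, subset_size=2, rng=random.Random(0))
```

Standardizing within the group removes the need for a learned value baseline, and updating on fewer steps per iteration cuts the cost of each policy-gradient pass.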

RTDMD is evaluated on several base models, including SD3, SD3.5, and FLUX.2, and the results show that it generates high-quality images aligned with human preferences. The code and models are available for further research and development.


📅 Published on May 25

🔗 Links:
• GitHub: https://github.com/huggingface
• arXiv: https://arxiv.org/abs/2605.26108
• PDF: https://arxiv.org/pdf/2605.26108

🤖 Models citing this paper:
https://huggingface.co/Harahan/FLUX2-4B-RTDMD
https://huggingface.co/Harahan/SD35M-RTDMD

━━━━━━━━━━━━━━━━━━━━━━━━
📢 By: https://xn--r1a.website/PaperNexus

#FewStepGenerators #RewardTiltedDistributionMatching #ImageGenerationModels #DiffusionDistillationMethods #ReinforcementLearningForGenerativeModels