AI & ML Papers

🔥 Flow-OPD: On-Policy Distillation for Flow Matching Models

💡 Existing Flow Matching text-to-image models hit two bottlenecks under multi-task alignment: reward sparsity, induced by scalar-valued rewards, and gradient interference. Both degrade generation quality and alignment metrics. To overcome them, the authors propose Flow-OPD, a two-stage alignment approach that combines on-policy distillation with Manifold Anchor Regularization.

In the first stage, the authors train domain-specialized teacher models with single-reward GRPO fine-tuning, so that each expert can reach its performance ceiling. In the second stage, they establish a robust initial policy through a Flow-based Cold-Start scheme and consolidate the heterogeneous expertise into a single student model.
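A minimal sketch of the on-policy distillation idea, not the authors' implementation: a toy 1-D student with velocity v(x, c) = (w0 + w1 * c) * x is supervised at states visited by its own rollouts, with targets from a per-task teacher. The linear model, the `TEACHER_SLOPES` values, the task flag `c`, and the learning rate are all assumptions made for illustration.

```python
# Toy on-policy distillation for a 1-D flow model (illustrative only).
# Student velocity: v(x, c) = (w0 + w1 * c) * x, where c is a task flag.
# Teachers: one fixed slope per task, standing in for GRPO-tuned experts.

TEACHER_SLOPES = {0: -1.0, 1: 0.5}  # hypothetical experts, keyed by task


def rollout(w0, w1, task, x0=1.0, steps=4):
    """Euler-integrate the student's own velocity field; return visited states."""
    dt = 1.0 / steps
    x, states = x0, []
    for _ in range(steps):
        states.append(x)
        x = x + dt * (w0 + w1 * task) * x
    return states


def distill_step(w0, w1, task, lr=0.2):
    """One update: states come from the student, targets from the routed teacher."""
    states = rollout(w0, w1, task)  # on-policy samples
    slope_error = (w0 + w1 * task) - TEACHER_SLOPES[task]
    # Gradient of 0.5 * mean((v_student - v_teacher)^2), with states held constant.
    scale = sum(x * x for x in states) / len(states)
    grad_w0 = slope_error * scale
    grad_w1 = slope_error * scale * task
    return w0 - lr * grad_w0, w1 - lr * grad_w1


def train(rounds=500):
    """Alternate between tasks so one student absorbs both teachers."""
    w0, w1 = 0.0, 0.0
    for _ in range(rounds):
        for task in TEACHER_SLOPES:
            w0, w1 = distill_step(w0, w1, task)
    return w0, w1


w0, w1 = train()
print(round(w0, 3), round(w0 + w1, 3))
```

The two printed values are the student's effective slopes for task 0 and task 1, which should approach the two teacher slopes. The point of the toy is only the data flow: the student is corrected where it actually goes, rather than on a fixed dataset.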

The authors also introduce Manifold Anchor Regularization, which leverages a task-agnostic teacher to provide full-data supervision and anchors generation to a high-quality manifold. This helps mitigate aesthetic degradation commonly observed in purely RL-driven alignment.
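One way to picture the regularizer, as a hedged sketch: add a weighted penalty that keeps the student's predicted velocity close to a task-agnostic teacher's. The mean-squared-error form, the additive combination, and the name `anchor_weight` are assumptions here; the post does not give the paper's exact formulation.

```python
def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def regularized_loss(student_v, expert_v, anchor_v, anchor_weight=0.1):
    """Distillation loss plus an anchor term toward a task-agnostic teacher.

    student_v: velocity predicted by the student
    expert_v:  velocity from the task-specific teacher (distillation target)
    anchor_v:  velocity from the task-agnostic teacher (assumed anchor)
    """
    distill = mse(student_v, expert_v)
    anchor = mse(student_v, anchor_v)
    return distill + anchor_weight * anchor


loss = regularized_loss([0.0, 1.0], [1.0, 1.0], [0.0, 0.0])
print(loss)
```

With `anchor_weight=0` this reduces to plain distillation; larger values trade task reward for staying near the anchor teacher's outputs.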

Flow-OPD raises the GenEval score from 63 to 92 and OCR accuracy from 59 to 94. Overall, that is an improvement of roughly 10 points over vanilla GRPO, while preserving image fidelity and human-preference alignment. The authors also report an emergent teacher-surpassing effect, in which the student model outperforms its teacher models. Flow-OPD thus establishes a scalable alignment paradigm for building generalist text-to-image models.

📅 Published on May 8

🔗 Links:
• arXiv: https://arxiv.org/abs/2605.08063
• PDF: https://arxiv.org/pdf/2605.08063
• Project Page: https://costaliya.github.io/Flow-OPD/
• GitHub: https://github.com/CostaliyA/Flow-OPD ⭐ 79

🤖 Models citing this paper:
• https://huggingface.co/CostaliyA/Flow-OPD

━━━━━━━━━━━━━━━━━━━━━━━━
📢 By: https://xn--r1a.website/PaperNexus

#FlowMatchingModels #OnPolicyDistillation #TextToImageSynthesis #ManifoldAnchorRegularization #FlowOPD

🔥 AutoFigure: Generating and Refining Publication-Ready Scientific Illustrations

💡 Creating high-quality scientific illustrations is time-consuming and labor-intensive. To address this, the authors introduce FigureBench, a large-scale benchmark of 3,300 high-quality scientific text-figure pairs that covers a range of text-to-illustration tasks drawn from different sources. It provides a foundation for training and evaluating models that generate scientific illustrations from long-form scientific text.

The authors also propose AutoFigure, an agentic framework that automatically generates scientific illustrations from long-form scientific text. AutoFigure works through extensive thinking, recombination, and validation to produce a layout that is structurally sound and aesthetically refined, so the final illustration is both structurally complete and visually appealing.
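As a rough illustration of a draft, validate, refine loop of this kind (not AutoFigure's actual code), the sketch below iterates on a layout until a validator finds nothing missing. The functions `draft_layout`, `validate`, and `refine` are hypothetical stand-ins for what would be model calls in a real agent, and the "layout" is reduced to a list of component names.

```python
def draft_layout(components):
    """Hypothetical first draft: place only the first component."""
    return components[:1]


def validate(layout, components):
    """Return the components the layout is still missing."""
    return [c for c in components if c not in layout]


def refine(layout, missing):
    """Hypothetical refinement: add one missing component per round."""
    return layout + missing[:1]


def generate_figure_layout(components, max_rounds=5):
    """Iterate until validation passes or the round budget runs out."""
    layout = draft_layout(components)
    for round_idx in range(max_rounds):
        missing = validate(layout, components)
        if not missing:
            return layout, round_idx
        layout = refine(layout, missing)
    return layout, max_rounds


layout, rounds = generate_figure_layout(["encoder", "decoder", "loss"])
print(layout, rounds)
```

The round budget matters in practice: without `max_rounds`, a validator that can never be satisfied would loop forever.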

AutoFigure is evaluated on FigureBench, where it consistently outperforms the baseline methods and produces publication-ready scientific illustrations. The authors release the code, dataset, and other resources to support further research.

Overall, FigureBench and AutoFigure are a step toward automated tooling that eases the illustration bottleneck and improves the communication of complex scientific and technical concepts, with potential benefit to both academia and industry.

📅 Published on Feb 3

🔗 Links:
• GitHub: https://github.com/huggingface
• arXiv: https://arxiv.org/abs/2602.03828
• PDF: https://arxiv.org/pdf/2602.03828

📊 Datasets citing this paper:
• https://huggingface.co/datasets/WestlakeNLP/FigureBench
• https://huggingface.co/datasets/samhug856/FigureBench

🚀 Spaces citing this paper:
• https://huggingface.co/spaces/vikashmakeit/garment-to-pattern

━━━━━━━━━━━━━━━━━━━━━━━━
📢 By: https://xn--r1a.website/PaperNexus

#ScientificIllustrations #TextToImageSynthesis #FigureGeneration #AutoFigure #ScientificVisualization

🔥 Multi-Resolution Flow Matching: Training-Free Diffusion Acceleration via Staged Sampling

💡 The paper proposes MrFlow, a training-free acceleration strategy for text-to-image diffusion models. Existing multi-resolution generation strategies can produce noticeable blurring or artifacts because they upsample in the latent space and selectively modify partial regions.

MrFlow instead uses a staged low-to-high-resolution pipeline. It first generates the main structure at low resolution, then performs super-resolution in pixel space with a lightweight pretrained model, injects low-strength noise to enable high-frequency resampling, and finally refines the details at high resolution.

The results show a 10x end-to-end acceleration with only a 1 percent gap in performance compared to the original model. MrFlow can also be combined with other acceleration strategies, such as timestep distillation, to reach up to 25x. Because it requires no training and no runtime modifications, it is a hardware-agnostic and efficient way to accelerate text-to-image diffusion models.
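The four stages described above can be sketched on a toy 1-D "image" (a list of floats). This is an assumption-laden illustration, not MrFlow's code: nearest-neighbor repetition stands in for the super-resolution model, the noise blend x_t = (1 - t) * x + t * eps is an assumed flow-matching convention, and the refinement uses an oracle clean estimate where a real pipeline would call the diffusion model. All function names are hypothetical.

```python
import random


def generate_low_res(size, rng):
    """Stand-in for running the full sampler at low resolution."""
    return [rng.random() for _ in range(size)]


def upsample_pixels(image, factor):
    """Stand-in for the pixel-space super-resolution model (nearest neighbor)."""
    return [p for p in image for _ in range(factor)]


def inject_noise(image, strength, rng):
    """Assumed flow-matching blend: x_t = (1 - t) * x + t * eps, with small t."""
    return [(1.0 - strength) * p + strength * rng.gauss(0.0, 1.0) for p in image]


def refine_high_res(noisy, clean_estimate, steps):
    """Euler steps from t = strength down to 0, using a toy oracle clean estimate."""
    x = list(noisy)
    for i in range(steps):
        # Each step removes 1 / (remaining steps) of the residual noise.
        x = [xi - (xi - ci) / (steps - i) for xi, ci in zip(x, clean_estimate)]
    return x


def staged_sample(low_size=4, factor=2, strength=0.2, refine_steps=3, seed=0):
    rng = random.Random(seed)
    low = generate_low_res(low_size, rng)              # 1. structure at low resolution
    high = upsample_pixels(low, factor)                # 2. super-resolve in pixel space
    noisy = inject_noise(high, strength, rng)          # 3. low-strength noise injection
    return refine_high_res(noisy, high, refine_steps)  # 4. high-resolution refinement


image = staged_sample()
print(len(image))
```

In this toy the refinement simply recovers the upsampled image, because the clean estimate is given. In a real pipeline the model's prediction at high resolution is what would add the high-frequency detail. The speedup comes from running most sampling steps at low resolution and only a few, starting from a small noise level, at full resolution.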

📅 Published on Jul 2

🔗 Links:
• GitHub: https://github.com/huggingface
• arXiv: https://arxiv.org/abs/2607.01642
• PDF: https://arxiv.org/pdf/2607.01642

🤖 Models citing this paper:
• https://huggingface.co/Xingyu-Zheng/MrFlow

🚀 Spaces citing this paper:
• https://huggingface.co/spaces/Xingyu-Zheng/mrflow-fast-diffusion

━━━━━━━━━━━━━━━━━━━━━━━━
📢 By: https://xn--r1a.website/PaperNexus

#DiffusionModels #TextToImageSynthesis #MultiResolutionGeneration #StagedSampling #SuperResolutionTechniques
