AI & ML Papers
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
🔥 Robust-U1: Can MLLMs Self-Recover Corrupted Visual Content for Robust Understanding?

💡 The paper proposes Robust-U1, a framework for making multimodal large language models (MLLMs) robust to visual corruptions. Existing models perform poorly on real-world corruptions such as noise or blur, and current robustness methods either lack interpretability or cannot restore lost pixel-level details.

Robust-U1 gives the model an explicit visual self-recovery capability, so that it reconstructs corrupted visual content itself. The approach has three stages: (1) supervised fine-tuning for initial reconstruction, (2) reinforcement learning with dual rewards to push the recovered images toward high visual quality, and (3) multimodal reasoning that considers both the corrupted input and the recovered image.
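
The third stage (reasoning over both the corrupted input and the recovered image) and the idea of mixing two reward signals can be illustrated with a small Python sketch. This is not the authors' code: every name here (`answer_with_self_recovery`, `combined_reward`, `toy_recover`, `toy_reason`) is hypothetical, the toy functions stand in for the real model, and the summary does not say what the two rewards are or how they are combined, so the weighted average below is purely an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List

# Toy grayscale image: rows of pixel intensities, nominally in [0, 1].
Image = List[List[float]]


@dataclass
class RobustAnswer:
    recovered: Image
    answer: str


def answer_with_self_recovery(
    corrupted: Image,
    question: str,
    recover: Callable[[Image], Image],
    reason: Callable[[Image, Image, str], str],
) -> RobustAnswer:
    """Recover the image first, then reason over both views.

    `recover` and `reason` are hypothetical stand-ins for the model's
    reconstruction and question-answering steps.
    """
    recovered = recover(corrupted)
    answer = reason(corrupted, recovered, question)
    return RobustAnswer(recovered=recovered, answer=answer)


def combined_reward(reward_a: float, reward_b: float, weight: float = 0.5) -> float:
    """Assumed combination rule: a weighted average of two reward scores.

    The post only says "dual rewards"; the actual terms and weighting
    used in the paper are not given here.
    """
    return weight * reward_a + (1.0 - weight) * reward_b


def toy_recover(img: Image) -> Image:
    # Toy "recovery": clamp out-of-range pixels back into [0, 1].
    return [[min(1.0, max(0.0, p)) for p in row] for row in img]


def toy_reason(corrupted: Image, recovered: Image, question: str) -> str:
    # Toy "reasoning": describe the mean brightness of the recovered image.
    # A real model would attend to both images; this stub uses only one.
    pixels = [p for row in recovered for p in row]
    return "bright" if sum(pixels) / len(pixels) > 0.5 else "dark"


if __name__ == "__main__":
    noisy = [[1.4, 0.9], [-0.2, 0.8]]
    result = answer_with_self_recovery(
        noisy, "Is the scene bright or dark?", toy_recover, toy_reason
    )
    print(result.recovered, result.answer)
```

Passing the recovery and reasoning steps in as callables keeps the sketch independent of any particular model API; in the real system both roles would be played by the same MLLM.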

Robust-U1 achieves state-of-the-art robustness on a real-world corruption benchmark and maintains superior performance under adversarial corruptions on general visual question answering benchmarks. The analysis confirms that high-quality visual recovery directly improves reasoning performance, establishing self-recovery as a critical mechanism for robust visual understanding.


📅 Published on Jun 6

🔗 Links:
• GitHub: https://github.com/huggingface
• arXiv: https://arxiv.org/abs/2606.08063
• PDF: https://arxiv.org/pdf/2606.08063
• Project Page: https://huggingface.co/spaces/Jiaqi-hkust/Robust-U1

🤖 Models citing this paper:
https://huggingface.co/Jiaqi-hkust/Robust-U1-SFT
https://huggingface.co/Jiaqi-hkust/Robust-U1-RL
https://huggingface.co/Jiaqi-hkust/Robust-U1

🚀 Spaces citing this paper:
https://huggingface.co/spaces/Jiaqi-hkust/Robust-U1

━━━━━━━━━━━━━━━━━━━━━━━━
📢 By: https://xn--r1a.website/PaperNexus

#MultimodalLearning #VisualContentRecovery #RobustLanguageModels #SelfRecoveryMechanisms #CorruptionResistantAI