AI & ML Papers
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
🔥 AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning

💡 The paper presents AReaL, a large-scale asynchronous reinforcement learning system for training large language models on reasoning tasks. Existing synchronous reinforcement learning systems alternate between generation and training in a batch setting, so each generation step must wait for the longest output in the batch to finish before the model can be updated. The result is severe system-level inefficiency and underutilized GPUs.
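
To make the waiting cost concrete, here is a rough back-of-the-envelope sketch in Python. It assumes a simplified model that is not taken from the paper: every sequence in the batch holds one generation slot until the longest sequence finishes, and time is proportional to token count. The function name `idle_fraction` and the token counts are hypothetical.

```python
def idle_fraction(output_lengths):
    """Share of slot-time spent idle while waiting for the longest output.

    Simplifying assumption: each sequence occupies one slot for as long as
    the longest sequence in the batch is still being generated.
    """
    longest = max(output_lengths)
    used = sum(output_lengths)               # slot-time doing useful work
    total = longest * len(output_lengths)    # slot-time reserved by the batch
    return 1 - used / total


# Hypothetical output lengths (in tokens) for one synchronous batch.
lengths = [1000, 2000, 4000, 16000]
print(f"idle fraction: {idle_fraction(lengths):.3f}")
```

Under this toy model, a single long reasoning trace dominates the batch: the more skewed the output lengths, the larger the idle share.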

To address this issue, AReaL decouples generation from training, allowing rollout workers to continuously generate new outputs without waiting, while training workers update the model whenever a batch of data is collected. This asynchronous approach leads to substantially higher GPU utilization. To stabilize reinforcement learning training, AReaL balances the workload of rollout and training workers to control data staleness and adopts a staleness-enhanced PPO variant to better handle outdated training samples.
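
The decoupling and staleness control described above can be illustrated with a toy, single-process simulation. This is only a sketch under stated assumptions, not AReaL's implementation: the names (`Sample`, `simulate`, `max_staleness`) and the admission rule are hypothetical. Staleness here means the trainer's policy version minus the version that generated a sample.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Sample:
    policy_version: int  # version of the weights that generated this sample


def simulate(steps, batch_size, max_staleness, rollouts_per_tick, train_every):
    """Toy simulation of decoupled rollout and training.

    Returns the largest staleness seen in any batch the trainer consumed.
    """
    buffer = deque()   # samples waiting for the trainer (FIFO)
    version = 0        # current policy version held by the trainer
    submitted = 0      # total samples generated so far
    worst = 0
    for tick in range(steps):
        # Rollout side: keeps generating without waiting for the trainer.
        for _ in range(rollouts_per_tick):
            # Assumed admission rule: the batch this sample would land in
            # must not be more than max_staleness versions ahead.
            if submitted // batch_size > version + max_staleness:
                break
            buffer.append(Sample(version))
            submitted += 1
        # Trainer side: slower, updates whenever a full batch is available.
        if tick % train_every == train_every - 1 and len(buffer) >= batch_size:
            batch = [buffer.popleft() for _ in range(batch_size)]
            worst = max(worst, max(version - s.policy_version for s in batch))
            version += 1
    return worst


bounded = simulate(200, batch_size=4, max_staleness=2,
                   rollouts_per_tick=3, train_every=2)
unbounded = simulate(200, batch_size=4, max_staleness=10**6,
                     rollouts_per_tick=3, train_every=2)
print("bounded:", bounded, "unbounded:", unbounded)
```

In this sketch the rollout side outpaces the trainer, so without the admission check the buffered samples grow steadily more outdated, while the check keeps every consumed batch within the chosen bound. Setting `max_staleness=0` reduces the loop to on-policy, synchronous-style behavior.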

On math and code reasoning benchmarks, AReaL achieves up to a 2.57× training speedup over the best synchronous systems using the same number of GPUs, while matching or even improving final performance. The code for AReaL is publicly available.


📅 Published on May 30, 2025

🔗 Links:
• GitHub: https://github.com/inclusionAI/AReaL
• arXiv: https://arxiv.org/abs/2505.24298
• PDF: https://arxiv.org/pdf/2505.24298

🤖 Models citing this paper:
https://huggingface.co/inclusionAI/AReaL-boba-2-8B
https://huggingface.co/inclusionAI/AReaL-boba-2-14B
https://huggingface.co/inclusionAI/AReaL-boba-2-8B-Open

📊 Datasets citing this paper:
https://huggingface.co/datasets/inclusionAI/AReaL-tau2-data

🚀 Spaces citing this paper:
https://huggingface.co/spaces/rzvn/Medieval-Village-AI

━━━━━━━━━━━━━━━━━━━━━━━━
📢 By: https://xn--r1a.website/PaperNexus

#AsynchronousReinforcementLearning #LanguageReasoningTasks #LargeScaleLanguageModels #ReinforcementLearningSystems #DeepLearningForNaturalLanguageProcessing