AI & ML Papers
🔥 Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers
📅 Published on Aug 12, 2024
🔗 Links:
• arXiv: https://arxiv.org/abs/2408.06195
• PDF: https://arxiv.org/pdf/2408.06195
• GitHub: https://github.com/codelion/optillm ⭐ 3.7k
🚀 Spaces citing this paper:
• https://huggingface.co/spaces/algorithmicsuperintelligence/OptiLLM
• https://huggingface.co/spaces/fabiodr/optillm
• https://huggingface.co/spaces/EduuGomes/CachoeiraBot
━━━━━━━━━━━━━━━━━━━━━━━━
📢 By: https://xn--r1a.website/PaperNexus
#MutualReasoning #LLMProblemSolving #MonteCarloTreeSearch #SelfPlayLearning #LanguageModelOptimization
💡 This paper introduces rStar, an approach that improves the reasoning of small language models (SLMs) without fine-tuning or help from a larger model. SLMs often struggle with complex multi-step reasoning. rStar addresses this with a self-play mutual generation-discrimination process: one SLM generates reasoning trajectories using Monte Carlo Tree Search (MCTS) over a set of human-like reasoning actions, and a second SLM of similar capability acts as a discriminator that verifies each trajectory. Trajectories on which both models agree are treated as more likely to be correct. The authors report that rStar solves diverse reasoning problems, including math and strategy-based tasks, and substantially improves SLM accuracy. On GSM8K, for example, it raises accuracy from 12.51% to 63.91% for LLaMA2-7B and from 36.46% to 81.88% for Mistral-7B.
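The mutual-agreement step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the MCTS generator is not shown (candidate trajectories are taken as given), and the names `Trajectory`, `mutually_consistent`, `keep_fraction`, and `toy_discriminator` are hypothetical. It also assumes that "verifying" means showing the discriminator the first part of a trajectory and checking whether it reaches the same final answer on its own.

```python
from typing import Callable, List, NamedTuple


class Trajectory(NamedTuple):
    steps: List[str]  # intermediate reasoning steps from the generator
    answer: str       # final answer the generator reached


# Hypothetical discriminator interface: given the question and a prefix of
# reasoning steps, return the final answer reached by completing them.
Discriminator = Callable[[str, List[str]], str]


def mutually_consistent(
    question: str,
    candidates: List[Trajectory],
    discriminator: Discriminator,
    keep_fraction: float = 0.5,  # assumed share of steps revealed as a hint
) -> List[Trajectory]:
    """Keep only trajectories whose answer the discriminator reproduces."""
    agreed = []
    for traj in candidates:
        # Reveal the leading steps, hide the rest.
        cut = max(1, int(len(traj.steps) * keep_fraction))
        prefix = traj.steps[:cut]
        # The second model finishes the reasoning independently.
        completed_answer = discriminator(question, prefix)
        if completed_answer == traj.answer:
            agreed.append(traj)
    return agreed


# Toy stand-in for a second SLM: it always answers "8" for this question.
def toy_discriminator(question: str, prefix: List[str]) -> str:
    return "8"


candidates = [
    Trajectory(steps=["Add 3 and 5", "3 + 5 = 8"], answer="8"),
    Trajectory(steps=["Multiply 3 and 5", "3 * 5 = 15"], answer="15"),
]
kept = mutually_consistent("What is 3 + 5?", candidates, toy_discriminator)
```

With the toy discriminator, only the first trajectory survives, because it is the only one whose final answer the second model also reaches. In a real setup the discriminator would be another SLM call, and the surviving trajectories would still need a final selection rule.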