Gen-Verse/ReasonFlux
ReasonFlux-32B beats o1-preview and DeepSeek-V3 with only 500 thought templates
Language: Python
#chain_of_thought #deepseek_r1 #deepseek_v3 #llm_rlhf #o1_mini #o1_preview #reinforcement_learning #sft_data
Stars: 194 Issues: 2 Forks: 10
https://github.com/Gen-Verse/ReasonFlux
[NeurIPS 2025 Spotlight] ReasonFlux (long-CoT), ReasonFlux-PRM (process reward model) and ReasonFlux-Coder (code generation) - Gen-Verse/ReasonFlux