turningpoint-ai/VisualThinker-R1-Zero
Explore the Multimodal “Aha Moment” on 2B Model
Language:Python
Total stars: 208
Stars trend:
#python
#deepseek, #deepseekr1, #deepseekr1zero, #grpo, #multimodal, #multimodaljourney, #multimodalr1, #posttraining, #r1, #r1zero, #reasoning, #reinforcementlearning
Explore the Multimodal “Aha Moment” on 2B Model
Language:Python
Total stars: 208
Stars trend:
5 Mar 2025
1am ▍ +3
2am +0
3am ▏ +1
4am █ +8
5am ██▉ +23
6am ██▉ +23
7am ██▌ +20
8am ▉ +7
9am █▏ +9
10am ▉ +7
11am ▉ +7
12pm █▌ +12
#python
#deepseek, #deepseekr1, #deepseekr1zero, #grpo, #multimodal, #multimodaljourney, #multimodalr1, #posttraining, #r1, #r1zero, #reasoning, #reinforcementlearning
OpenPipe/ART
Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, Kimi, and more!
Language:Python
Total stars: 1030
Stars trend:
#python
#agent, #agenticai, #grpo, #kimiai, #llms, #lora, #qwen, #qwen3, #reinforcementlearning, #rl
Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, Kimi, and more!
Language:Python
Total stars: 1030
Stars trend:
11 Jul 2025
5pm ▉ +7
6pm █▌ +12
7pm ██ +16
8pm █▌ +12
9pm █▎ +10
10pm █▎ +10
11pm ▋ +5
12 Jul 2025
12am ▉ +7
#python
#agent, #agenticai, #grpo, #kimiai, #llms, #lora, #qwen, #qwen3, #reinforcementlearning, #rl