bytedance/UI-TARS-desktop
A GUI Agent application based on UI-TARS (Vision-Language Model) that allows you to control your computer using natural language.
Language: TypeScript
#agent #browser_use #computer_use #electron #gui_agents #vision #vite #vlm
Stars: 505 Issues: 8 Forks: 35
https://github.com/bytedance/UI-TARS-desktop
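A minimal sketch of the screenshot → VLM → GUI-action loop a UI-TARS-style agent runs. The endpoint URL, model name, and the "click(x, y)" action grammar parsed below are assumptions for illustration, not the project's actual API.

```python
# Hypothetical sketch of a UI-TARS-style control loop: capture the screen,
# ask a vision-language model for the next GUI action, execute it.
import base64
import io
import re

import pyautogui               # captures the screen and executes clicks
from openai import OpenAI      # assumed: model served behind an
                               # OpenAI-compatible chat endpoint

client = OpenAI(base_url="http://localhost:8000/v1",  # placeholder server
                api_key="EMPTY")

def step(instruction: str) -> None:
    # Capture the current screen and base64-encode it for the model.
    shot = pyautogui.screenshot()
    buf = io.BytesIO()
    shot.save(buf, format="PNG")
    image_b64 = base64.b64encode(buf.getvalue()).decode()

    resp = client.chat.completions.create(
        model="ui-tars",  # placeholder model name
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": instruction},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
            ],
        }],
    )
    action = resp.choices[0].message.content
    # Hypothetical action grammar: "click(x, y)".
    m = re.match(r"click\((\d+),\s*(\d+)\)", action or "")
    if m:
        pyautogui.click(int(m.group(1)), int(m.group(2)))

step("Open the browser and search for UI-TARS")
```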
ByteDance-Seed/Seed1.5-VL
Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving state-of-the-art performance on 38 out of 60 public benchmarks.
Language: Jupyter Notebook
#cookbook #large_language_model #multimodal_large_language_models #vision_language_model
Stars: 404 Issues: 0 Forks: 3
https://github.com/ByteDance-Seed/Seed1.5-VL
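A minimal sketch of querying a hosted Seed1.5-VL-style model with an image plus a question, assuming an OpenAI-compatible chat endpoint; the base URL and model ID are placeholders, not confirmed values from the cookbook.

```python
# Hypothetical multimodal Q&A call against an OpenAI-compatible endpoint.
from openai import OpenAI

client = OpenAI(
    base_url="https://example.com/api/v3",  # placeholder endpoint
    api_key="YOUR_API_KEY",
)

resp = client.chat.completions.create(
    model="seed-1.5-vl",  # placeholder model ID
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url",
             "image_url": {"url": "https://example.com/chart.png"}},
            {"type": "text",
             "text": "Summarize the trend shown in this chart."},
        ],
    }],
)
print(resp.choices[0].message.content)
```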
OpenHelix-Team/VLA-Adapter
VLA-Adapter: An Effective Paradigm for Tiny-Scale Vision-Language-Action Model
Language: Python
#embodied_ai #robotics #vision_language_action_model
Stars: 188 Issues: 4 Forks: 8
https://github.com/OpenHelix-Team/VLA-Adapter
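A conceptual PyTorch sketch of the adapter idea behind tiny-scale VLA models: a small trainable bridge maps features from a frozen VLM backbone into an action head, so only a tiny fraction of parameters is trained. Module names, dimensions, and the residual-bottleneck design are illustrative, not the repository's actual architecture.

```python
# Illustrative adapter + action head on top of frozen VLM features.
import torch
import torch.nn as nn

class VLAAdapter(nn.Module):
    def __init__(self, vlm_dim=1024, adapter_dim=256, action_dim=7):
        super().__init__()
        # Bottleneck adapter: down-project, nonlinearity, up-project.
        self.adapter = nn.Sequential(
            nn.Linear(vlm_dim, adapter_dim),
            nn.GELU(),
            nn.Linear(adapter_dim, vlm_dim),
        )
        # Action head regresses a continuous control vector
        # (e.g., 6-DoF end-effector delta + gripper open/close).
        self.action_head = nn.Linear(vlm_dim, action_dim)

    def forward(self, vlm_features: torch.Tensor) -> torch.Tensor:
        # vlm_features: (batch, vlm_dim) pooled from a frozen VLM backbone.
        h = vlm_features + self.adapter(vlm_features)  # residual adapter
        return self.action_head(h)

policy = VLAAdapter()
features = torch.randn(1, 1024)   # stand-in for frozen VLM output
print(policy(features).shape)     # torch.Size([1, 7])
```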