CircleRadon/Osprey
The code for "Osprey: Pixel Understanding with Visual Instruction Tuning"
Language: Python
#mllm #pixel_understanding #sam #visual_instruction_tuning
Stars: 200 Issues: 1 Forks: 6
https://github.com/CircleRadon/Osprey
GitHub
GitHub - CircleRadon/Osprey: [CVPR2024] The code for "Osprey: Pixel Understanding with Visual Instruction Tuning"
ictnlp/LLaVA-Mini
LLaVA-Mini is a unified large multimodal model (LMM) that can support the understanding of images, high-resolution images, and videos in an efficient manner.
Language: Python
#efficient #gpt4o #gpt4v #large_language_models #large_multimodal_models #llama #llava #multimodal #multimodal_large_language_models #video #vision #vision_language_model #visual_instruction_tuning
Stars: 173 Issues: 7 Forks: 11
https://github.com/ictnlp/LLaVA-Mini