OFA-Sys/Chinese-CLIP
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
Language: Python
#chinese #computer_vision #multi_modal_learning #nlp #pytorch #vision_and_language_pre_training
Stars: 80 Issues: 0 Forks: 7
https://github.com/OFA-Sys/Chinese-CLIP
  
  Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
Language: Python
#chinese #computer_vision #multi_modal_learning #nlp #pytorch #vision_and_language_pre_training
Stars: 80 Issues: 0 Forks: 7
https://github.com/OFA-Sys/Chinese-CLIP
GitHub
  
  GitHub - OFA-Sys/Chinese-CLIP: Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
  Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation. - OFA-Sys/Chinese-CLIP
👍1🔥1
  NVlabs/prismer
The implementation of "Prismer: A Vision-Language Model with An Ensemble of Experts".
Language: Python
#image_captioning #language_model #multi_modal_learning #multi_task_learning #vision_and_language #vision_language_model #vqa
Stars: 479 Issues: 6 Forks: 21
https://github.com/NVlabs/prismer
  
  The implementation of "Prismer: A Vision-Language Model with An Ensemble of Experts".
Language: Python
#image_captioning #language_model #multi_modal_learning #multi_task_learning #vision_and_language #vision_language_model #vqa
Stars: 479 Issues: 6 Forks: 21
https://github.com/NVlabs/prismer
GitHub
  
  GitHub - NVlabs/prismer: The implementation of "Prismer: A Vision-Language Model with Multi-Task Experts".
  The implementation of "Prismer: A Vision-Language Model with Multi-Task Experts". - NVlabs/prismer
🔥3
  HKUDS/VideoRAG
"VideoRAG: Retrieval-Augmented Generation with Extreme Long-Context Videos"
Language: Python
#large_language_models #llms #long_video_understanding #multi_modal_llms #rag #retrieval_augmented_generation
Stars: 201 Issues: 1 Forks: 14
https://github.com/HKUDS/VideoRAG
  
  "VideoRAG: Retrieval-Augmented Generation with Extreme Long-Context Videos"
Language: Python
#large_language_models #llms #long_video_understanding #multi_modal_llms #rag #retrieval_augmented_generation
Stars: 201 Issues: 1 Forks: 14
https://github.com/HKUDS/VideoRAG
GitHub
  
  GitHub - HKUDS/VideoRAG: "VideoRAG: Chat with Your Videos"
  "VideoRAG: Chat with Your Videos". Contribute to HKUDS/VideoRAG development by creating an account on GitHub.
⚡1👍1
  ses4255/Versatile-OCR-Program
Multi-modal OCR pipeline optimized for ML training (text, figure, math, tables, diagrams)
Language: Python
#doclayout #educational_data #exam_ocr #machine_learning #ml_datasets #multi_modal #ocr #openai #paper_ocr #table_parsing
Stars: 250 Issues: 0 Forks: 11
https://github.com/ses4255/Versatile-OCR-Program
  
  Multi-modal OCR pipeline optimized for ML training (text, figure, math, tables, diagrams)
Language: Python
#doclayout #educational_data #exam_ocr #machine_learning #ml_datasets #multi_modal #ocr #openai #paper_ocr #table_parsing
Stars: 250 Issues: 0 Forks: 11
https://github.com/ses4255/Versatile-OCR-Program
GitHub
  
  GitHub - ses4255/Versatile-OCR-Program: Multi-modal OCR pipeline optimized for ML training (text, figure, math, tables, diagrams)
  Multi-modal OCR pipeline optimized for ML training (text, figure, math, tables, diagrams) - ses4255/Versatile-OCR-Program
❤1👍1
  