clovaai/donut
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
Language: Python
#computer_vision #document_ai #eccv_2022 #multimodal_pre_trained_model #nlp #ocr
Stars: 98 Issues: 2 Forks: 5
https://github.com/clovaai/donut
  
  OpenGVLab/InternChat
InternChat allows you to interact with ChatGPT by clicking, dragging and drawing using a pointing device.
Language: Python
#chatgpt #click #foundation_model #gpt #gpt_4 #gradio #husky #image_captioning #internimage #langchain #llama #llm #multimodal #ocr #sam #segment_anything #vicuna #video #video_generation #vqa
Stars: 231 Issues: 1 Forks: 10
https://github.com/OpenGVLab/InternChat
  
  Danily07/Translumo
Advanced real-time screen translator for games, hardcoded subtitles in videos, static text, and more.
Language: C#
#autotranslate #easyocr #game_translation #mlnet #ocr #translation
Stars: 239 Issues: 5 Forks: 4
https://github.com/Danily07/Translumo
  
  junhoyeo/BetterOCR
Better text detection by combining multiple OCR engines (EasyOCR, Tesseract) with LLM.
Language: Python
#ai #chatgpt #chatgpt_api #easyocr #llm #ocr #openai #openai_api #tesseract #tesseract_ocr
Stars: 154 Issues: 4 Forks: 7
https://github.com/junhoyeo/BetterOCR
  
  reworkd/tarsier
Vision utilities for web interaction agents
Language: Jupyter Notebook
#gpt4v #llms #ocr #playwright #pypi_package #python #selenium #webscraping
Stars: 236 Issues: 3 Forks: 14
https://github.com/reworkd/tarsier
  
  VikParuchuri/texify
OCR model for math that outputs LaTeX and markdown
Language: Python
#deep_learning #latex #markdown #ocr
Stars: 142 Issues: 0 Forks: 7
https://github.com/VikParuchuri/texify
  
  robertknight/ocrs
A modern OCR engine (extracts text from images), written in Rust
Language: Rust
#computer_vision #machine_learning #ocr
Stars: 220 Issues: 3 Forks: 4
https://github.com/robertknight/ocrs
  
  VikParuchuri/tabled
Detect and extract tables to markdown and csv
Language: Python
#deep_learning #ocr #tables
Stars: 245 Issues: 4 Forks: 7
https://github.com/VikParuchuri/tabled
  
  umlx5h/LLPlayer
The media player for language learning, with dual subtitles, AI-generated subtitles, realtime-OCR, translation, word lookup, and more!
Language: C#
#asr #csharp #flyleaf #language_learning #media_player #ocr #player #tesseract #video #video_player #whisper #wpf #yt_dlp
Stars: 253 Issues: 5 Forks: 4
https://github.com/umlx5h/LLPlayer
  
  ses4255/Versatile-OCR-Program
Multi-modal OCR pipeline optimized for ML training (text, figure, math, tables, diagrams)
Language: Python
#doclayout #educational_data #exam_ocr #machine_learning #ml_datasets #multi_modal #ocr #openai #paper_ocr #table_parsing
Stars: 250 Issues: 0 Forks: 11
https://github.com/ses4255/Versatile-OCR-Program
  
  TimmyOVO/deepseek-ocr.rs
Rust implementation of DeepSeek-OCR with an OpenAI-compatible server and CLI. No Python environment needed: just download and run.
Language: Rust
#candle #ocr #ocr_recognition #openai #rust
Stars: 808 Issues: 4 Forks: 61
https://github.com/TimmyOVO/deepseek-ocr.rs
  
  