#rust #document_ocr #document_processing #ocr #ocr_recognition #pdf #pdf_parser #text_extraction
LiteParse is a fast, local PDF parser that extracts text with bounding boxes, supports optional OCR, and runs in Rust, Python, Node.js, and the browser. It can also generate screenshots and handles DOCX, XLSX, PPTX, and image files after conversion. Benefit: you can turn documents into clean text or JSON on your own machine, which keeps document processing private, fast, and structured.
https://github.com/run-llama/liteparse