paperless-ngx/paperless-ngx
A community-supported supercharged version of paperless: scan, index and archive all your physical documents
Language:Python
Total stars: 24407
Stars trend:
31 Jan 2025
6am ▎ +2
7am ▎ +2
8am ▎ +2
9am ▎ +2
10am +0
11am ▉ +7
12pm █▏ +9
1pm █▉ +15
2pm █▋ +13
3pm █▉ +15
4pm █▌ +12
5pm █▋ +13
#python
#angular, #archiving, #django, #dms, #documentmanagement, #documentmanagementsystem, #machinelearning, #ocr, #opticalcharacterrecognition, #pdf
documenso/documenso
The Open Source DocuSign Alternative.
Language:TypeScript
Total stars: 9272
Stars trend:
1 Feb 2025
3am ▌ +4
4am +0
5am +0
6am ▏ +1
7am ▏ +1
8am ▏ +1
9am +0
10am ▍ +3
11am ██▎ +18
12pm ██▎ +18
1pm ██ +16
2pm ██▎ +18
#typescript
#digitalsignature, #documentsigning, #docusignalternative, #esignature, #esign, #nextauth, #nextjs, #opensource, #padesstandard, #pdf, #pdfsign, #pdfsignature, #postgresql, #prisma, #selfhosted, #signing, #typescript
ocrmypdf/OCRmyPDF
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
Language:Python
Total stars: 14952
Stars trend:
2 Feb 2025
6am ▏ +1
7am +0
8am ▏ +1
9am ▏ +1
10am ▍ +3
11am ▎ +2
12pm ▍ +3
1pm █▎ +10
2pm ███▎ +26
3pm █▌ +12
4pm █▋ +13
5pm █▉ +15
#python
#imageprocessing, #ocr, #pdf, #python, #tesseract
T8RIN/ImageToolbox
🖼️ Image Toolbox is a powerful app for advanced image manipulation. It offers dozens of features, from basic tools like crop and draw to filters, OCR, and a wide range of image processing options
Language:Kotlin
Total stars: 4913
Stars trend:
9 Feb 2025
5am ▏ +1
6am █ +8
7am ▊ +6
8am █▋ +13
9am █▎ +10
10am █▍ +11
11am █ +8
12pm ▌ +4
1pm ▎ +2
2pm ▍ +3
3pm ▍ +3
4pm ▉ +7
#kotlin
#aes256, #android, #backgroundremoval, #cleanarchitecture, #crop, #djvu, #editphoto, #exif, #fdroid, #filterimage, #imagemanipulation, #jetpackcompose, #jxl, #kotlin, #materialyou, #ocrrecognition, #pdf, #psd, #qrcodescanner, #watermark
desgeeko/pdfsyntax
A Python library to inspect and modify the internal structure of a PDF file
Language:Python
Total stars: 621
Stars trend:
10 Feb 2025
2pm █████▎ +42
3pm ████▌ +36
4pm ███▉ +31
5pm ███▎ +26
6pm ██▎ +18
#python
#api, #browse, #inspection, #library, #parser, #pdf, #pdfsyntax, #python, #read, #syntax, #transformation, #write
Stirling-Tools/Stirling-PDF
#1 Locally hosted web application that allows you to perform various operations on PDF files
Language:Java
Total stars: 49976
Stars trend:
14 Feb 2025
3am ▏ +1
4am +0
5am ▉ +7
6am █▏ +9
7am ██ +16
8am ▉ +7
9am █▍ +11
10am ▉ +7
11am █▌ +12
12pm ██▍ +19
1pm ████▊ +38
#java
#docker, #java, #pdf, #pdfconverter, #pdfeditor, #pdfmanipulation, #pdfmerger, #pdfocr, #pdftools, #pdfwebapps
Goldziher/kreuzberg
A text extraction library supporting PDFs, images, office documents and more
Language:Python
Total stars: 304
Stars trend:
15 Feb 2025
12am █ +8
1am ▋ +5
2am █ +8
3am ▊ +6
4am ▉ +7
5am ▉ +7
6am ▊ +6
7am ▎ +2
8am █ +8
9am █ +8
10am █▋ +13
#python
#asyncio, #docx, #ocr, #pdf, #textextraction
CatchTheTornado/text-extract-api
Document (PDF, Word, PPTX ...) extraction and parse API using state of the art modern OCRs + Ollama supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown
Language:Python
Total stars: 2248
Stars trend:
15 Feb 2025
6am ▉ +7
7am █▎ +10
8am ▌ +4
9am ▉ +7
10am ▉ +7
11am ▍ +3
12pm ▊ +6
1pm ▋ +5
2pm █ +8
3pm █▎ +10
4pm █ +8
5pm ▍ +3
#python
#anonymization, #api, #extract, #json, #llm, #ocr, #ocrpython, #pdf, #pii
opendatalab/MinerU
A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF into Markdown and JSON formats.
Language:Python
Total stars: 27107
Stars trend:
3 Mar 2025
3am █▋ +13
4am ▋ +5
5am ▉ +7
6am █▍ +11
7am █▏ +9
8am ▊ +6
9am ▉ +7
10am █ +8
11am ▊ +6
12pm ▊ +6
1pm █ +8
2pm ▉ +7
#python
#ai4science, #documentanalysis, #extractdata, #layoutanalysis, #ocr, #parser, #pdf, #pdfconverter, #pdfextractorllm, #pdfextractorpretrain, #pdfextractorrag, #pdfparser, #python
koodo-reader/koodo-reader
A modern ebook manager and reader with sync and backup capacities for Windows, macOS, Linux and Web
Language:JavaScript
Total stars: 21188
Stars trend:
4 Mar 2025
12am █▉ +15
1am ▌ +4
2am █▏ +9
3am █▏ +9
4am ▎ +2
5am █▏ +9
6am ▊ +6
7am ▍ +3
8am ▌ +4
9am ▋ +5
10am ▉ +7
11am ▋ +5
#javascript
#book, #cb7, #cbr, #cbt, #cbz, #comic, #docx, #ebook, #epub, #fb2, #html, #markdown, #mobi, #pdf, #reader, #rtf, #txt, #xml
docling-project/docling
Get your documents ready for gen AI
Language:Python
Total stars: 26148
Stars trend:
6 Apr 2025
11am ▏ +1
12pm +0
1pm ▍ +3
2pm ▏ +1
3pm ▌ +4
4pm ▌ +4
5pm ██▏ +17
6pm █ +8
7pm ▊ +6
8pm █▏ +9
9pm █▌ +12
10pm ██ +16
#python
#ai, #convert, #documentparser, #documentparsing, #documents, #docx, #html, #markdown, #pdf, #pdfconverter, #pdftojson, #pdftotext, #pptx, #tables, #xlsx
oomol-lab/pdf-craft
PDF craft can convert PDF files into various other formats. This project will focus on processing PDF files of scanned books. The project has just started.
Language:Python
Total stars: 1537
Stars trend:
10 Apr 2025
4pm ▏ +1
5pm +0
6pm +0
7pm +0
8pm +0
9pm +0
10pm ▏ +1
11pm ▏ +1
11 Apr 2025
12am ████▊ +38
1am ██████████▊ +86
2am ████████ +64
#python
#ai, #document, #ocr, #pdf
QuivrHQ/MegaParse
File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs.
Language:Python
Total stars: 6094
Stars trend:
25 Apr 2025
3pm █▎ +10
4pm █▍ +11
5pm █▏ +9
6pm █▍ +11
7pm ▌ +4
8pm █ +8
9pm ▉ +7
10pm ▋ +5
11pm ▉ +7
26 Apr 2025
12am █▎ +10
1am █▎ +10
#python
#docx, #llm, #parser, #pdf, #powerpoint
EvanZhouDev/llm.pdf
Run LLMs inside a PDF file.
Language:Python
Total stars: 251
Stars trend:
26 Apr 2025
5pm ▎ +2
6pm ▉ +7
7pm █▍ +11
8pm █ +8
9pm █▍ +11
10pm █ +8
11pm ▋ +5
27 Apr 2025
12am █▎ +10
1am █ +8
2am ▍ +3
3am █▏ +9
4am █ +8
#python
#ai, #llm, #pdf
Stirling-Tools/Stirling-PDF
#1 Locally hosted web application that allows you to perform various operations on PDF files
Language:Java
Total stars: 58829
Stars trend:
16 May 2025
4am ▋ +5
5am █ +8
6am ▉ +7
7am █ +8
8am █▉ +15
9am █▌ +12
10am █ +8
11am ██▎ +18
12pm █▎ +10
1pm █▏ +9
2pm ▉ +7
3pm █▎ +10
#java
#docker, #java, #pdf, #pdfconverter, #pdfeditor, #pdfmanipulation, #pdfmerger, #pdfocr, #pdftools, #pdfwebapps
clawsoftware/clawPDF
Open Source Virtual (Network) Printer for Windows that allows you to create PDFs, OCR text, and print images, with advanced features usually available only in enterprise solutions.
Language:C#
Total stars: 1043
Stars trend:
19 May 2025
12pm ▍ +3
1pm █████▌ +44
2pm ███████▎ +58
3pm ██████▌ +52
4pm ██▋ +21
#csharp
#imageprocessing, #merge, #networkprinter, #ocr, #pdf, #pdfmerger, #pdfprinter, #print, #printer, #terminalserver, #windows
iamgio/quarkdown
🪐 Markdown with superpowers — from ideas to presentations, articles and books.
Language:Kotlin
Total stars: 1292
Stars trend:
2 Jun 2025
10pm ▏ +1
11pm +0
3 Jun 2025
12am +0
1am ▏ +1
2am ▏ +1
3am ▏ +1
4am +0
5am +0
6am +0
7am +0
8am ██████▍ +51
9am █████████ +72
#kotlin
#compiler, #markdown, #markdownparser, #markuplanguage, #paper, #pdf, #presentations, #programminglanguage, #scriptinglanguage, #slides, #typesetting, #typesettingsystem
T8RIN/ImageToolbox
🖼️ Image Toolbox is a powerful app for advanced image manipulation. It offers dozens of features, from basic tools like crop and draw to filters, OCR, and a wide range of image processing options
Language:Kotlin
Total stars: 7435
Stars trend:
7 Jun 2025
9pm ▍ +3
10pm ▏ +1
11pm ▏ +1
8 Jun 2025
12am ▋ +5
1am █ +8
2am █▌ +12
3am █▌ +12
4am █▍ +11
5am ▉ +7
6am █ +8
7am ▉ +7
#kotlin
#android, #backgroundremoval, #cleanarchitecture, #crop, #editphoto, #exif, #fdroid, #filterimage, #imagemanipulation, #jetpackcompose, #jxl, #kotlin, #materialyou, #ocrrecognition, #pdf, #photocollage, #photoeditor, #psd, #qrcodescanner, #watermark
opendatalab/MinerU
A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF into Markdown and JSON formats.
Language:Python
Total stars: 35977
Stars trend:
22 Jun 2025
3pm ▍ +3
4pm ▎ +2
5pm ██▍ +19
6pm ██▍ +19
7pm █▏ +9
8pm ▍ +3
9pm ▍ +3
10pm █▌ +12
11pm ██▍ +19
23 Jun 2025
12am ███ +24
1am ████▋ +37
2am █████▊ +46
#python
#ai4science, #documentanalysis, #extractdata, #layoutanalysis, #ocr, #parser, #pdf, #pdfconverter, #pdfextractorllm, #pdfextractorpretrain, #pdfextractorrag, #pdfparser, #python
docling-project/docling
Get your documents ready for gen AI
Language:Python
Total stars: 33012
Stars trend:
29 Jun 2025
6am ▎ +2
7am ▉ +7
8am ▎ +2
9am ▌ +4
10am █ +8
11am █▎ +10
12pm █ +8
1pm █▋ +13
2pm █▎ +10
3pm █▊ +14
4pm █▋ +13
5pm ▌ +4
#python
#ai, #convert, #documentparser, #documentparsing, #documents, #docx, #html, #markdown, #pdf, #pdfconverter, #pdftojson, #pdftotext, #pptx, #tables, #xlsx