zhihu/ZhiLight
A highly optimized inference acceleration engine for Llama and its variants.
Language: C++
#cpm #cuda #gpt #inference_engine #llama #llm #llm_serving #minicpm #pytorch #qwen
Stars: 192 Issues: 1 Forks: 16
https://github.com/zhihu/ZhiLight
liweiphys/layra
LAYRA is a ready-to-use visual RAG system with a complete UI built with Next.js and FastAPI, preserving document layout, tables, paragraphs, and graphical elements without any structural fragmentation.
Language: TypeScript
#agent #colpali #colqwen #document_parser #fastapi #gpt_4o #knowledge_base #llm #nextjs #pdf_parser #qwen #rag #visual_rag
Stars: 190 Issues: 3 Forks: 15
https://github.com/liweiphys/layra