zhihu/ZhiLight
A highly optimized inference acceleration engine for Llama and its variants.
Language: C++
#cpm #cuda #gpt #inference_engine #llama #llm #llm_serving #minicpm #pytorch #qwen
Stars: 192 Issues: 1 Forks: 16
https://github.com/zhihu/ZhiLight
  
liweiphys/layra
LAYRA is a ready-to-use visual RAG system with a complete UI built with Next.js and FastAPI, preserving document layout, tables, paragraphs, and graphical elements without any structural fragmentation.
Language: TypeScript
#agent #colpali #colqwen #document_parser #fastapi #gpt_4o #knowledge_base #llm #nextjs #pdf_parser #qwen #rag #visual_rag
Stars: 190 Issues: 3 Forks: 15
https://github.com/liweiphys/layra
  
yassa9/qwen600
Static suckless single batch CUDA-only qwen3-0.6B mini inference engine
Language: Cuda
#cuda #cuda_programming #gpu #llamacpp #llm #llm_inference #qwen #qwen3 #transformer
Stars: 287 Issues: 1 Forks: 17
https://github.com/yassa9/qwen600
  
tinkle-community/nofx
AI-powered Binance futures trading bot with DeepSeek/Qwen, featuring multi-AI competition, Sharpe ratio self-evolution, and real-time dashboard
Language: Go
#ai_trading #cryptocurrency #deepseek #futures_trading #llm #llm_trading #nof1ai #qwen #trading_bot
Stars: 381 Issues: 5 Forks: 115
https://github.com/tinkle-community/nofx
  
  