GitHub repos

SqueezeAILab/LLMCompiler
LLMCompiler: An LLM Compiler for Parallel Function Calling
Language: Python
#efficient_inference #function_calling #large_language_models #llama #llama2 #llm #llm_agent #llm_agents #llm_framework #llms #natural_language_processing #nlp #parallel_function_call #transformer
Stars: 216 Issues: 0 Forks: 11
https://github.com/SqueezeAILab/LLMCompiler

GitHub

GitHub - SqueezeAILab/LLMCompiler: [ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling

Writesonic/GPTRouter
Smoothly Manage Multiple LLMs (OpenAI, Anthropic, Azure) and Image Models (Dall-E, SDXL), Speed Up Responses, and Ensure Non-Stop Reliability.
Language: TypeScript
#anthropic #azure_openai #cohere #google_gemini #langchain #llama_index #llm #llmops #llms #mlops #openai #palm_api
Stars: 130 Issues: 6 Forks: 16
https://github.com/Writesonic/GPTRouter

SJTU-IPADS/PowerInfer
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
Language: C
#falcon #large_language_models #llama #llm #llm_inference #local_inference
Stars: 792 Issues: 8 Forks: 32
https://github.com/SJTU-IPADS/PowerInfer

GitHub

GitHub - SJTU-IPADS/PowerInfer: High-speed Large Language Model Serving for Local Deployment

hpcaitech/SwiftInfer
Efficient AI Inference & Serving
Language: Python
#artificial_intelligence #deep_learning #gpt #inference #llama #llama2 #llm_inference #llm_serving
Stars: 299 Issues: 3 Forks: 14
https://github.com/hpcaitech/SwiftInfer

mishushakov/llm-scraper
Turn any webpage into structured data using LLMs
Language: TypeScript
#ai #browser #browser_automation #gpt #langchain #llama #llm #openai #playwright #puppeteer #scraper
Stars: 332 Issues: 6 Forks: 23
https://github.com/mishushakov/llm-scraper

mbzuai-oryx/LLaVA-pp
🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)
Language: Python
#conversation #llama_3_llava #llama_3_vision #llama3 #llama3_llava #llama3_vision #llava #llava_llama3 #llava_phi3 #llm #lmms #phi_3_llava #phi_3_vision #phi3 #phi3_llava #phi3_vision #vision_language
Stars: 297 Issues: 2 Forks: 13
https://github.com/mbzuai-oryx/LLaVA-pp

WinampDesktop/winamp
Iconic media player
Language: C++
#llama #media #player #winamp
Stars: 1751 Issues: 15 Forks: 428
https://github.com/WinampDesktop/winamp

vietanhdev/llama-assistant
AI-powered assistant to help you with your daily tasks, powered by Llama 3.2. It can recognize your voice, process natural language, and perform various actions based on your commands: summarizing text, rephrasing sentences, answering questions, writing emails, and more.
Language: Python
#llama #llama_3_2 #llama3 #llava #moondream #owen #personal_assistant #private_gpt
Stars: 170 Issues: 0 Forks: 12
https://github.com/vietanhdev/llama-assistant

GitHub

AI-powered assistant to help you with your daily tasks, powered by Llama 3, DeepSeek R1, and many more models on HuggingFace. - nrl-ai/llama-assistant

maiqingqiang/NotebookMLX
📋 NotebookMLX - An Open Source version of NotebookLM (Ported NotebookLlama)
Language: Jupyter Notebook
#ai #llama #mlx #notebookllama #notebooklm
Stars: 121 Issues: 1 Forks: 7
https://github.com/maiqingqiang/NotebookMLX

GitHub

GitHub - johnmai-dev/NotebookMLX: 📋 NotebookMLX - An Open Source version of NotebookLM (Ported NotebookLlama)

edwko/OuteTTS
Interface for OuteTTS models.
Language: Python
#gguf #llama #text_to_speech #transformers #tts
Stars: 278 Issues: 6 Forks: 13
https://github.com/edwko/OuteTTS

papersgpt/papersgpt-for-zotero
Zotero AI plugin chatting papers with ChatGPT, Gemini, Claude, Llama 3.2, QwQ-32B-Preview, Marco-o1, Gemma, Mistral and Phi-3.5
Language: JavaScript
#ai #chatgpt #claude #gemini #gemma #llama #marco_o1 #mistral #paper #phi_3 #qwq_32b_preview #summary #zotero #zotero_plugin
Stars: 232 Issues: 3 Forks: 1
https://github.com/papersgpt/papersgpt-for-zotero

GitHub

A powerful Zotero AI and MCP plugin with ChatGPT, Gemini, Claude, Grok, DeepSeek, OpenRouter, Kimi, GLM, SiliconFlow, GPT-oss, Gemma 3, Qwen 3 - papersgpt/papersgpt-for-zotero

zhihu/ZhiLight
A highly optimized inference acceleration engine for Llama and its variants.
Language: C++
#cpm #cuda #gpt #inference_engine #llama #llm #llm_serving #minicpm #pytorch #qwen
Stars: 192 Issues: 1 Forks: 16
https://github.com/zhihu/ZhiLight

ictnlp/LLaVA-Mini
LLaVA-Mini is a unified large multimodal model (LMM) that can support the understanding of images, high-resolution images, and videos in an efficient manner.
Language: Python
#efficient #gpt4o #gpt4v #large_language_models #large_multimodal_models #llama #llava #multimodal #multimodal_large_language_models #video #vision #vision_language_model #visual_instruction_tuning
Stars: 173 Issues: 7 Forks: 11
https://github.com/ictnlp/LLaVA-Mini

therealoliver/Deepdive-llama3-from-scratch
Implement Llama 3 inference step by step: grasp the core concepts, work through the derivations, and write the code.
Language: Jupyter Notebook
#attention #attention_mechanism #gpt #inference #kv_cache #language_model #llama #llm_configuration #llms #mask #multi_head_attention #positional_encoding #residuals #rms #rms_norm #rope #rotary_position_encoding #swiglu #tokenizer #transformer
Stars: 388 Issues: 0 Forks: 28
https://github.com/therealoliver/Deepdive-llama3-from-scratch

dipampaul17/KVSplit
Run larger LLMs with longer contexts on Apple Silicon by using differentiated precision for KV cache quantization. KVSplit enables 8-bit keys & 4-bit values, reducing memory by 59% with <1% quality loss. Includes benchmarking, visualization, and one-command setup. Optimized for M1/M2/M3 Macs with Metal support.
Language: Python
#apple_silicon #generative_ai #kv_cache #llama_cpp #llm #m1 #m2 #m3 #memory_optimization #metal #optimization #quantization
Stars: 222 Issues: 1 Forks: 5
https://github.com/dipampaul17/KVSplit
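
The 59% figure in the KVSplit description can be sanity-checked with a little arithmetic. The sketch below is not taken from the repository. It assumes an FP16 KV cache as the baseline and llama.cpp-style block quantization (32 elements per block plus one 16-bit scale, as in the `q8_0` and `q4_0` formats); the function names are hypothetical.

```python
# Assumption: llama.cpp-style block formats (q8_0, q4_0) with
# 32 elements per block and one 16-bit scale stored per block.
BLOCK_SIZE = 32
SCALE_BITS = 16


def bits_per_element(quant_bits: int) -> float:
    """Effective storage cost per element, including the per-block scale."""
    return (BLOCK_SIZE * quant_bits + SCALE_BITS) / BLOCK_SIZE


def kv_reduction(key_bits: int, value_bits: int, baseline_bits: int = 16) -> float:
    """Fraction of KV-cache memory saved versus an unquantized baseline."""
    quantized = bits_per_element(key_bits) + bits_per_element(value_bits)
    baseline = 2 * baseline_bits  # one key element plus one value element
    return 1 - quantized / baseline


# 8-bit keys, 4-bit values: (8.5 + 4.5) / 32 bits remain
print(f"{kv_reduction(8, 4):.1%}")  # prints 59.4%
```

Without the per-block scale overhead, the saving would be 1 - 12/32 = 62.5%, so under these assumptions the overhead accounts for the gap down to roughly 59%.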

NU-QRG/optiml
Acceleration library for LLM agents.
Language: C++
#llama #llm
Stars: 198 Issues: 7 Forks: 44
https://github.com/NU-QRG/optiml
