GitHub repos – Telegram

GitHub repos

25.9K subscribers

18 photos

2 videos

11.2K links

Welcome to GitHub repos. Here you'll find valuable information on the latest trending projects. Subscribe to stay informed and gain insights from the thriving GitHub community.

thu-ml/SageAttention
Quantized Attention that achieves speedups of 2.1x and 2.7x compared to FlashAttention2 and xformers, respectively, without losing end-to-end metrics across various models.
Language: Python
#attention #inference_acceleration #llm #quantization
Stars: 145 Issues: 6 Forks: 3
https://github.com/thu-ml/SageAttention

Quantized Attention achieves speedup of 2-5x and 3-11x compared to FlashAttention and xformers, without losing end-to-end metrics across language, image, and video models. - thu-ml/SageAttention

1.83K views · 22:00

mit-han-lab/nunchaku
SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
Language: Cuda
#diffusion_models #flux #genai #lora #mlsys #quantization
Stars: 249 Issues: 10 Forks: 13
https://github.com/mit-han-lab/nunchaku

GitHub - mit-han-lab/nunchaku: [ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models

1.76K views · 11:00