✨SQ-format: A Unified Sparse-Quantized Hardware-friendly Data Format for LLMs
📝 Summary:
The SQ-format is a unified sparse-quantized data format for LLM post-training quantization. It improves the balance between accuracy and efficiency by combining sparse and low-precision matrix multiplications. This yields better performance and throughput, especially for outlier activations, supporting next...
🔹 Publication Date: Published on Dec 5
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.05409
• PDF: https://arxiv.org/pdf/2512.05409
==================================
For more data science resources:
✓ https://xn--r1a.website/DataScienceT
#LLMs #Quantization #SparseML #HardwareAcceleration #AIResearch
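A minimal sketch of the idea in the summary, assuming a plain magnitude threshold picks the outlier activations: the few outliers go through a sparse full-precision product, the rest through a low-precision product, and the two partial results are summed. The function names, the threshold, and the 4-bit integer grid are illustrative assumptions, not the paper's actual format.

```python
def quantize_int4(values):
    """Symmetric round-to-nearest onto the integer grid [-7, 7] with one shared scale."""
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / 7 if max_abs > 0 else 1.0
    codes = [max(-7, min(7, round(v / scale))) for v in values]
    return codes, scale


def split_dot(activations, weights, outlier_threshold):
    """Dot product as a sparse high-precision part plus a dense low-precision part."""
    # Sparse part: keep only the outlier activations, in full precision.
    sparse = {i: a for i, a in enumerate(activations) if abs(a) > outlier_threshold}
    sparse_sum = sum(a * weights[i] for i, a in sparse.items())

    # Dense part: zero out the outliers, then quantize what is left to 4 bits.
    dense = [0.0 if i in sparse else a for i, a in enumerate(activations)]
    codes, scale = quantize_int4(dense)
    dense_sum = scale * sum(c * w for c, w in zip(codes, weights))
    return sparse_sum + dense_sum


activations = [0.1, -0.2, 0.05, 12.0, 0.15, -0.1]  # one large outlier at index 3
weights = [1.0, 0.5, -1.0, 0.25, 2.0, -0.5]

exact = sum(a * w for a, w in zip(activations, weights))

# Baseline: quantize everything with one scale, so the outlier flattens the rest.
plain_codes, plain_scale = quantize_int4(activations)
plain = plain_scale * sum(c * w for c, w in zip(plain_codes, weights))

split = split_dot(activations, weights, outlier_threshold=1.0)
print(abs(plain - exact), abs(split - exact))
```

In this toy case the single outlier forces a coarse scale on the plain 4-bit path, while the split path quantizes the small values on a much finer grid.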
✨Fairy2i: Training Complex LLMs from Real LLMs with All Parameters in {±1, ±i}
📝 Summary:
Fairy2i converts pre-trained real-valued LLMs to a complex form, enabling efficient low-bit quantization while reusing existing checkpoints. It achieves near full-precision performance for LLaMA-2 7B at 2-bit, significantly outperforming real-valued binary methods.
🔹 Publication Date: Published on Dec 2
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.02901
• PDF: https://arxiv.org/pdf/2512.02901
• Github: https://github.com/PKULab1806/Fairy2i-W2
🔹 Models citing this paper:
• https://huggingface.co/PKU-DS-LAB/Fairy2i-W2
==================================
#LLM #Quantization #ModelCompression #DeepLearning #AIResearch
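To make the 2-bit claim concrete: the set {±1, ±i} has four elements, so each weight can be stored as a 2-bit index. A minimal sketch, assuming a nearest-phase projection of an already complex weight onto that set. The helper name is hypothetical, and the paper's conversion from real-valued checkpoints and its training procedure are not shown.

```python
import cmath
import math

CODEBOOK = [1 + 0j, 1j, -1 + 0j, -1j]  # the four fourth roots of unity, 2 bits each


def nearest_unit(w):
    """Return the 2-bit index of the codebook entry closest in phase to w."""
    quarter_turns = round(cmath.phase(w) / (math.pi / 2))
    return quarter_turns % 4


weights = [0.9 + 0.1j, -0.2 + 0.7j, -1.3 - 0.4j, 0.1 - 0.8j]
indices = [nearest_unit(w) for w in weights]
quantized = [CODEBOOK[i] for i in indices]
print(indices, quantized)
```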
✨BitNet b1.58 2B4T Technical Report
📝 Summary:
BitNet b1.58 2B4T is the first open-source, native 1-bit Large Language Model at the 2-billion-parameter scale, trained on a corpus of 4 trillion tokens. It matches full-precision LLM performance while offering significant gains in computational efficiency, such as reduced memory and energy use. The model weights are openly released for research.
🔹 Publication Date: Published on Apr 16, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2504.12285
• PDF: https://arxiv.org/pdf/2504.12285
• Github: https://github.com/microsoft/bitnet
🔹 Models citing this paper:
• https://huggingface.co/microsoft/bitnet-b1.58-2B-4T
• https://huggingface.co/microsoft/bitnet-b1.58-2B-4T-gguf
• https://huggingface.co/microsoft/bitnet-b1.58-2B-4T-bf16
✨ Spaces citing this paper:
• https://huggingface.co/spaces/suayptalha/Chat-with-Bitnet-b1.58-2B-4T
• https://huggingface.co/spaces/aizip-dev/SLM-RAG-Arena
• https://huggingface.co/spaces/Tonic/Native_1-bit_LLM
==================================
#LLM #AI #Quantization #OpenSourceAI #DeepLearning
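The "b1.58" in the name refers to ternary weights in {-1, 0, 1}. A minimal sketch, assuming the absmean scheme described for the BitNet b1.58 line of work (scale by the mean absolute weight, then round and clip). The function name and the sample values are illustrative, and activation quantization is left out.

```python
def absmean_ternary(weights, eps=1e-8):
    """Scale by the mean absolute value, then round and clip to {-1, 0, 1}."""
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    codes = [max(-1, min(1, round(w / scale))) for w in weights]
    return codes, scale


weights = [0.42, -0.05, -0.91, 0.10, 0.63, -0.27]
codes, scale = absmean_ternary(weights)
dequantized = [c * scale for c in codes]  # what the matmul effectively sees
print(codes)
```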
✨BitNet Distillation
📝 Summary:
BitNet Distillation fine-tunes LLMs to 1.58-bit precision using SubLN, attention distillation, and continual pre-training. It achieves comparable performance to full-precision models, offering 10x memory savings and 2.65x faster inference.
🔹 Publication Date: Published on Oct 15, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.13998
• PDF: https://arxiv.org/pdf/2510.13998
• Github: https://github.com/microsoft/BitNet
==================================
#LLM #Quantization #ModelCompression #DeepLearning #AI
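A quick back-of-the-envelope check on the memory figure, assuming a 16-bit baseline and counting weights only. A ternary weight carries log2(3) ≈ 1.58 bits, which is where the "1.58-bit" label comes from. This is an idealized bound rather than the paper's measurement, since real storage adds packing overhead and some tensors stay in higher precision.

```python
import math

bits_per_ternary_weight = math.log2(3)  # information content of {-1, 0, 1}
fp16_bits = 16

ideal_ratio = fp16_bits / bits_per_ternary_weight  # about 10x
packed_ratio = fp16_bits / 2  # if each weight is simply stored in 2 bits

print(round(bits_per_ternary_weight, 3), round(ideal_ratio, 2), packed_ratio)
```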
✨Quartet II: Accurate LLM Pre-Training in NVFP4 by Improved Unbiased Gradient Estimation
📝 Summary:
Quartet II improves LLM pre-training in NVFP4 by introducing MS-EDEN for enhanced unbiased gradient estimation, significantly reducing quantization error. This achieves better accuracy and up to 4.2x faster execution on NVIDIA Blackwell GPUs compared to BF16.
🔹 Publication Date: Published on Jan 30
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.22813
• PDF: https://arxiv.org/pdf/2601.22813
• Github: https://github.com/IST-DASLab/Quartet-II
==================================
#LLM #DeepLearning #Quantization #GPUAcceleration #AIResearch
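To illustrate what "unbiased" means for a quantizer, here is a sketch of plain stochastic rounding, the textbook unbiased scheme. It is swapped in for illustration only and is not MS-EDEN, whose details are not given in the summary.

```python
import math
import random


def stochastic_round(x, rng):
    """Round x up with probability equal to its fractional part, else down."""
    low = math.floor(x)
    return low + (1 if rng.random() < x - low else 0)


rng = random.Random(0)
x = 2.3
samples = [stochastic_round(x, rng) for _ in range(20000)]
mean_stochastic = sum(samples) / len(samples)  # close to 2.3 on average
nearest = round(x)  # deterministic rounding always gives 2, a bias of -0.3

print(mean_stochastic, nearest)
```

Each individual sample is noisier than round-to-nearest, but the error averages out over many gradient entries instead of accumulating in one direction.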
✨QuantLRM: Quantization of Large Reasoning Models via Fine-Tuning Signals
📝 Summary:
QuantLRM improves Large Reasoning Model quantization by using weight update magnitudes from fine-tuning to estimate channel importance. It protects both the smallest and the largest updates, consistently outperforming traditional methods, and it applies even to non-fine-tuned models.
🔹 Publication Date: Published on Jan 31
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.02581
• PDF: https://arxiv.org/pdf/2602.02581
• Github: https://github.com/psunlpgroup/QuantLRM
🔹 Models citing this paper:
• https://huggingface.co/nanzhang/QuantLRM-R1-Qwen-32B-3-bit
• https://huggingface.co/nanzhang/QuantLRM-R1-Llama-70B-3-bit
• https://huggingface.co/nanzhang/QuantLRM-R1-Qwen3-8B-3-bit
==================================
#Quantization #LargeLanguageModels #DeepLearning #AI #ModelCompression
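One way to read "protects both smallest and largest updates" is sketched below, assuming one row per channel and a simple rule that protects a fixed number of channels from each tail of the update-magnitude ranking. Both helper names and the tail rule are hypothetical, and the paper's actual importance function is not reproduced here.

```python
def channel_update_magnitude(base, tuned):
    """Mean absolute weight change per channel (one row = one channel)."""
    return [
        sum(abs(t - b) for b, t in zip(base_row, tuned_row)) / len(base_row)
        for base_row, tuned_row in zip(base, tuned)
    ]


def protected_channels(magnitudes, per_tail):
    """Indices of the per_tail smallest and per_tail largest update magnitudes."""
    order = sorted(range(len(magnitudes)), key=lambda i: magnitudes[i])
    return sorted(set(order[:per_tail] + order[-per_tail:]))


base = [[0.10, 0.20], [0.30, -0.40], [0.50, 0.60], [-0.70, 0.80]]
tuned = [[0.10, 0.21], [0.50, -0.10], [0.52, 0.57], [-0.70, 0.80]]

magnitudes = channel_update_magnitude(base, tuned)
keep = protected_channels(magnitudes, per_tail=1)
print(keep)  # channel 3 never moved, channel 1 moved the most
```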
✨Quantized Evolution Strategies: High-precision Fine-tuning of Quantized LLMs at Low-precision Cost
📝 Summary:
Quantized LLMs are difficult to fine-tune directly with existing methods. Quantized Evolution Strategies (QES) enables full-parameter fine-tuning of quantized LLMs. It uses error feedback and seed replay for high-precision optimization at low memory cost, outperforming prior methods.
🔹 Publication Date: Published on Feb 3
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.03120
• PDF: https://arxiv.org/pdf/2602.03120
• Github: https://github.com/dibbla/Quantized-Evolution-Strategies
==================================
#LLM #Quantization #FineTuning #EvolutionStrategies #AI
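A toy sketch of how the two ingredients named in the summary could fit together, under stated assumptions: weights live on an integer grid, "seed replay" means perturbations are regenerated from their seeds instead of being stored, and "error feedback" means the part of each update lost to rounding is carried into the next step. The function names, the ±1 noise, and the quadratic objective are all illustrative, not the authors' implementation.

```python
import random


def loss(weights, target):
    return sum((w - t) ** 2 for w, t in zip(weights, target))


def noise_from_seed(seed, size):
    """Seed replay: regenerate a perturbation from its seed instead of storing it."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(size)]


def qes_step(weights, residual, target, seeds, lr):
    """One evolution-strategies step on integer weights with an error-feedback residual."""
    # Pass 1: score each perturbed integer model, remembering only (seed, score).
    scores = []
    for seed in seeds:
        eps = noise_from_seed(seed, len(weights))
        plus = loss([w + e for w, e in zip(weights, eps)], target)
        minus = loss([w - e for w, e in zip(weights, eps)], target)
        scores.append((seed, plus - minus))

    # Pass 2: replay the seeds to rebuild the noise and form the update direction.
    grad = [0.0] * len(weights)
    for seed, score in scores:
        eps = noise_from_seed(seed, len(weights))
        grad = [g + score * e / (2 * len(seeds)) for g, e in zip(grad, eps)]

    # Error feedback: add the carried residual, apply the rounded part,
    # and keep what rounding discarded for the next step.
    desired = [r - lr * g for r, g in zip(residual, grad)]
    applied = [round(d) for d in desired]
    new_weights = [w + a for w, a in zip(weights, applied)]
    new_residual = [d - a for d, a in zip(desired, applied)]
    return new_weights, new_residual


target = [3, -2, 5, 1]
weights, residual = [0, 0, 0, 0], [0.0, 0.0, 0.0, 0.0]
initial_loss = loss(weights, target)
for step in range(100):
    seeds = range(step * 8, step * 8 + 8)
    weights, residual = qes_step(weights, residual, target, seeds, lr=0.1)
final_loss = loss(weights, target)
print(initial_loss, final_loss, weights)
```

Without the residual, any update smaller than half a grid step would round to zero and the integer weights would never move.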
✨QuantVLA: Scale-Calibrated Post-Training Quantization for Vision-Language-Action Models
📝 Summary:
QuantVLA is a training-free post-training quantization framework for vision-language-action models. Through scale-calibrated components, it significantly reduces memory and speeds up inference while maintaining performance, enabling efficient deployment for embodied AI.
🔹 Publication Date: Published on Feb 23
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.20309
• PDF: https://arxiv.org/pdf/2602.20309
• Project Page: https://quantvla.github.io/
==================================
#Quantization #VLAModels #EmbodiedAI #AIResearch #DeepLearning
✨MASQuant: Modality-Aware Smoothing Quantization for Multimodal Large Language Models
📝 Summary:
MASQuant improves multimodal LLM quantization by resolving smoothing misalignment and cross-modal invariance. It uses modality-aware smoothing and SVD whitening for cross-modal compensation, achieving stable, competitive performance.
🔹 Publication Date: Published on Mar 5
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.04800
• PDF: https://arxiv.org/pdf/2603.04800
==================================
#MultimodalAI #LLM #Quantization #DeepLearning #AIResearch
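A sketch of why a single smoothing scale can be misaligned across modalities, assuming SmoothQuant-style per-channel scales s_j = max|X_j|^α / max|W_j|^(1-α). The activation statistics are made-up numbers, and the SVD-whitening compensation mentioned in the summary is not shown.

```python
def smoothing_scales(act_absmax, weight_absmax, alpha=0.5):
    """SmoothQuant-style per-channel scale: max|X_j|**alpha / max|W_j|**(1 - alpha)."""
    return [a ** alpha / w ** (1 - alpha) for a, w in zip(act_absmax, weight_absmax)]


# Per-channel activation maxima, collected separately for each modality (made up).
act_absmax = {"text": [4.0, 1.0, 9.0], "vision": [64.0, 1.0, 0.25]}
weight_absmax = [1.0, 4.0, 1.0]

# One shared scale uses the pooled maximum, so the louder modality dominates.
pooled = [max(column) for column in zip(*act_absmax.values())]
shared = smoothing_scales(pooled, weight_absmax)

# Modality-aware variant: one set of scales per modality.
per_modality = {
    name: smoothing_scales(stats, weight_absmax) for name, stats in act_absmax.items()
}
print(shared, per_modality)
```

In channel 0 the shared scale follows the vision statistics, so text activations are divided by a factor several times larger than their own statistics call for. Because the weights are shared between modalities, per-modality scales cannot simply be folded into them, which is presumably where the cross-modal compensation step comes in.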
✨6Bit-Diffusion: Inference-Time Mixed-Precision Quantization for Video Diffusion Models
📝 Summary:
This paper introduces a mixed-precision quantization framework for video diffusion transformers. It dynamically allocates NVFP4/INT8 based on layer volatility and uses Temporal Delta Cache to skip computations, significantly reducing memory and cost while preserving quality.
🔹 Publication Date: Published on Mar 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.18742
• PDF: https://arxiv.org/pdf/2603.18742
==================================
#Quantization #DiffusionModels #VideoAI #DeepLearning #ModelOptimization
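A rough sketch of the two mechanisms in the summary, assuming that more volatile layers keep the higher-precision format and that a layer can be skipped when its input barely changes between steps. The volatility scores, the threshold, the tolerance, and the `DeltaCache` class are hypothetical stand-ins, not the paper's Temporal Delta Cache.

```python
def choose_precision(volatility, threshold):
    """Assumed rule: volatile layers keep INT8, stable layers drop to NVFP4."""
    return "INT8" if volatility > threshold else "NVFP4"


class DeltaCache:
    """Reuse the previous output when the input barely changed since it was computed."""

    def __init__(self, layer_fn, tolerance):
        self.layer_fn = layer_fn
        self.tolerance = tolerance
        self.last_input = None
        self.last_output = None
        self.skipped = 0

    def __call__(self, x):
        if self.last_input is not None:
            delta = max(abs(a - b) for a, b in zip(x, self.last_input))
            if delta < self.tolerance:
                self.skipped += 1
                return self.last_output
        self.last_input = list(x)
        self.last_output = self.layer_fn(x)
        return self.last_output


volatility = {"attn.0": 0.02, "mlp.0": 0.31, "attn.1": 0.08}
plan = {name: choose_precision(v, threshold=0.1) for name, v in volatility.items()}

layer = DeltaCache(lambda x: [2 * v for v in x], tolerance=0.01)
outputs = [layer(x) for x in ([1.0, 2.0], [1.001, 2.0], [1.5, 2.0])]
print(plan, outputs, layer.skipped)
```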
✨SignRoundV2: Closing the Performance Gap in Extremely Low-Bit Post-Training Quantization for LLMs
📝 Summary:
SignRoundV2 is a post-training quantization method for LLMs. It achieves competitive, near full-precision accuracy even at extremely low bit widths such as 2 bits. It does this through layer-wise bit allocation and a pre-tuning scale search.
🔹 Publication Date: Published on Dec 4, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2512.04746
• PDF: https://arxiv.org/pdf/2512.04746
• Project Page: https://github.com/intel/auto-round
• Github: https://github.com/intel/auto-round
🔹 Models citing this paper:
• https://huggingface.co/Intel/MiroThinker-v1.5-30B-gguf-q2ks-mixed-AutoRound
• https://huggingface.co/Intel/DeepSeek-R1-0528-Qwen3-8B-int4-AutoRound
==================================
#LLMs #Quantization #DeepLearning #AI #MachineLearning
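A minimal sketch of what a scale search before tuning might look like, assuming a grid search over shrunken versions of the max-based scale that keeps the candidate with the lowest mean squared error. The function names and candidate grid are illustrative, and layer-wise bit allocation and the tuning stage itself are not shown.

```python
def quantize_dequantize(weights, scale, bits):
    """Symmetric round-to-nearest onto a signed integer grid, then back to floats."""
    qmax = 2 ** (bits - 1) - 1
    qmin = -(2 ** (bits - 1))
    return [scale * max(qmin, min(qmax, round(w / scale))) for w in weights]


def search_scale(weights, bits, shrink_factors):
    """Try shrunken versions of the max-based scale and keep the lowest-MSE one."""
    base = max(abs(w) for w in weights) / (2 ** (bits - 1) - 1)
    best_scale, best_err = None, float("inf")
    for factor in shrink_factors:
        scale = base * factor
        deq = quantize_dequantize(weights, scale, bits)
        err = sum((w - d) ** 2 for w, d in zip(weights, deq)) / len(weights)
        if err < best_err:
            best_scale, best_err = scale, err
    return best_scale, best_err


weights = [0.5, -0.45, 0.55, -0.5, 1.0, 0.4]
shrink_factors = [i / 10 for i in range(10, 0, -1)]  # 1.0, 0.9, ..., 0.1

_, baseline_err = search_scale(weights, bits=2, shrink_factors=[1.0])
best_scale, best_err = search_scale(weights, bits=2, shrink_factors=shrink_factors)
print(baseline_err, best_scale, best_err)
```

At 2 bits the grid has only four levels, so clipping the largest weight in exchange for a finer step on the rest can cut the error substantially.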