✨Exposing the Systematic Vulnerability of Open-Weight Models to Prefill Attacks
📝 Summary:
A study reveals prefill attacks as a critical, underexplored vulnerability in open-weight language models. These attacks, which predefine initial response tokens, consistently compromise major models, necessitating urgent defense development.
🔹 Publication Date: Published on Feb 16
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.14689
• PDF: https://arxiv.org/pdf/2602.14689
==================================
For more data science resources:
✓ https://xn--r1a.website/DataScienceT
#PrefillAttacks #LLMSecurity #AIvulnerability #OpenWeightModels #LanguageModels
✨Can I Have Your Order? Monte-Carlo Tree Search for Slot Filling Ordering in Diffusion Language Models
📝 Summary:
McDiffuSE uses Monte Carlo Tree Search to optimize slot infilling order in Masked Diffusion Models, enhancing reasoning performance. It achieved significant gains, revealing that non-sequential generation and larger exploration are key to overcoming model biases.
🔹 Publication Date: Published on Feb 13
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.12586
• PDF: https://arxiv.org/pdf/2602.12586
#MonteCarloTreeSearch #DiffusionModels #NLP #LanguageModels #AI
✨HLE-Verified: A Systematic Verification and Structured Revision of Humanity's Last Exam
📝 Summary:
HLE-Verified systematically validates and revises the HLE benchmark, resolving noisy items through expert review and model-based checks. This improves language model evaluation accuracy by 7-10 percentage points, especially on erroneous items, enabling more reliable measurement of model capabilit...
🔹 Publication Date: Published on Feb 15
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.13964
• PDF: https://arxiv.org/pdf/2602.13964
✨ Datasets citing this paper:
• https://huggingface.co/datasets/skylenage/HLE-Verified
#LLMEvaluation #Benchmarking #LanguageModels #AIResearch #NLP
✨Sink-Aware Pruning for Diffusion Language Models
📝 Summary:
Diffusion Language Models have high inference costs. This paper finds that their attention sinks are often unstable, unlike in autoregressive models. Sink-Aware Pruning identifies and removes these unstable sinks, improving efficiency and quality without retraining.
🔹 Publication Date: Published on Feb 19
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.17664
• PDF: https://arxiv.org/pdf/2602.17664
• Github: https://github.com/VILA-Lab/Sink-Aware-Pruning
#DiffusionModels #LanguageModels #ModelPruning #NLP #AIResearch
✨One-step Language Modeling via Continuous Denoising
📝 Summary:
This paper introduces flow-based language models that use continuous denoising over one-hot token encodings. They surpass discrete diffusion models in quality and speed, particularly for few-step generation, challenging the assumption that discrete data requires discrete diffusion.
🔹 Publication Date: Published on Feb 18
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.16813
• PDF: https://arxiv.org/pdf/2602.16813
• Project Page: https://one-step-lm.github.io/
• Github: https://github.com/david3684/flm
#LanguageModels #GenerativeAI #DeepLearning #NLP #AI
✨Reward Prediction with Factorized World States
📝 Summary:
StateFactory transforms observations into hierarchical object-attribute structures using language models. This enables superior zero-shot reward prediction across domains by measuring semantic similarity, significantly improving agent planning performance.
🔹 Publication Date: Published on Mar 10
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.09400
• PDF: https://arxiv.org/pdf/2603.09400
• Project Page: https://statefactory.github.io/
• Github: https://github.com/yijunshens/StateFactory
✨ Datasets citing this paper:
• https://huggingface.co/datasets/YijunShen/RewardPrediction
#RewardPrediction #AI #LanguageModels #MachineLearning #AgentPlanning
✨Training Language Models via Neural Cellular Automata
📝 Summary:
This paper uses Neural Cellular Automata (NCA) to generate synthetic data for pre-pre-training language models, addressing natural language limitations. This approach improves performance, accelerates convergence, and transfers to reasoning tasks, often outperforming extensive natural l...
🔹 Publication Date: Published on Mar 9
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.10055
• PDF: https://arxiv.org/pdf/2603.10055
#AI #LanguageModels #NeuralCellularAutomata #SyntheticData #NLP
✨LoopRPT: Reinforcement Pre-Training for Looped Language Models
📝 Summary:
LoopRPT is a reinforcement pre-training framework for looped language models. It directly shapes intermediate representations by assigning reinforcement signals to latent steps, improving latent reasoning. This leads to better accuracy-computation trade-offs and enhanced early-stage reasoning.
🔹 Publication Date: Published on Mar 20
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.19714
• PDF: https://arxiv.org/pdf/2603.19714
#ReinforcementLearning #LanguageModels #AI #NLP #DeepLearning
✨Diffutron: A Masked Diffusion Language Model for Turkish Language
📝 Summary:
Diffutron introduces a compact masked diffusion language model for Turkish. It uses resource-efficient LoRA-based pre-training and progressive instruction tuning. The model achieves competitive performance for non-autoregressive Turkish text generation despite its small size.
🔹 Publication Date: Published on Mar 20
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.20466
• PDF: https://arxiv.org/pdf/2603.20466
🔹 Models citing this paper:
• https://huggingface.co/diffutron/DiffutronLM-0.3B-Instruct
• https://huggingface.co/diffutron/DiffutronLM-0.3B-Base
• https://huggingface.co/diffutron/DiffutronLM-0.3B-1st-Stage
✨ Datasets citing this paper:
• https://huggingface.co/datasets/diffutron/DiffutronLM-Pretraining-Corpus
#LanguageModels #TurkishNLP #DiffusionModels #NLP #AI
✨T5Gemma-TTS Technical Report
📝 Summary:
T5Gemma-TTS is an encoder-decoder codec language model that improves voice cloning and duration control for multilingual speech synthesis. It uses cross-attention for persistent text conditioning and Progress-Monitoring Rotary Position Embedding (PM-RoPE) for better target speech length tracking. I...
🔹 Publication Date: Published on Apr 2
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.01760
• PDF: https://arxiv.org/pdf/2604.01760
• Github: https://github.com/Aratako/T5Gemma-TTS
🔹 Models citing this paper:
• https://huggingface.co/Aratako/T5Gemma-TTS-2b-2b
✨ Spaces citing this paper:
• https://huggingface.co/spaces/Aratako/T5Gemma-TTS-Demo
• https://huggingface.co/spaces/litagin/T5Gemma-TTS-Demo
#SpeechSynthesis #TTS #VoiceCloning #Multilingual #LanguageModels