AI & ML Papers
32.8K subscribers
7.05K photos
519 videos
24 files
7.71K links
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Accelerating RL Post-Training Rollouts via System-Integrated Speculative Decoding

📝 Summary:
Speculative decoding accelerates RL post-training rollouts, improving throughput while preserving the output distribution, with a projected 2.5x speedup at large scales.
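
The claim that speculative decoding preserves the output distribution can be illustrated with the standard accept/reject rule from the speculative sampling literature. The sketch below is a minimal single-token version and is not taken from this paper's system; the function name `speculative_sample` and the toy distributions `p` (target model) and `q` (draft model) are assumptions made for illustration.

```python
import random

def speculative_sample(p, q, rng):
    """Return one token index distributed according to p, proposing from q.

    p: target-model probabilities, q: draft-model probabilities (toy lists).
    """
    # Draft step: the cheap model proposes a token from q.
    x = rng.choices(range(len(q)), weights=q)[0]
    # Verify step: accept the proposal with probability min(1, p[x] / q[x]).
    if rng.random() < min(1.0, p[x] / q[x]):
        return x
    # Rejected: resample from the residual max(0, p - q).
    # rng.choices normalizes the weights, so no explicit division is needed.
    residual = [max(0.0, pi - qi) for pi, qi in zip(p, q)]
    return rng.choices(range(len(p)), weights=residual)[0]

# Usage: empirical frequencies should approach p, not q.
rng = random.Random(0)
p = [0.6, 0.3, 0.1]
q = [0.2, 0.5, 0.3]
counts = [0, 0, 0]
for _ in range(20000):
    counts[speculative_sample(p, q, rng)] += 1
frequencies = [c / 20000 for c in counts]
```

Because rejected proposals are replaced by a draw from the residual distribution, the combined procedure samples exactly from `p`, which is why the technique can speed up rollouts without changing what the policy generates.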

🔹 Publication Date: Published on Apr 29

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.26779
• PDF: https://arxiv.org/pdf/2604.26779

==================================

For more data science resources:
https://xn--r1a.website/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
ClawGym: A Scalable Framework for Building Effective Claw Agents

📝 Summary:
ClawGym presents a scalable framework for developing Claw-style personal agents with synthetic training data, verified workspaces, and benchmark evaluation.

🔹 Publication Date: Published on Apr 29

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.26904
• PDF: https://arxiv.org/pdf/2604.26904
• Project Page: https://github.com/ClawGym

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Unified 4D World Action Modeling from Video Priors with Asynchronous Denoising

📝 Summary:
X-WAM is a unified 4D world model that combines real-time robotic action execution with high-fidelity 4D world synthesis using pretrained video diffusion models and asynchronous noise sampling for imp...

🔹 Publication Date: Published on Apr 29

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.26694
• PDF: https://arxiv.org/pdf/2604.26694
• Project Page: https://sharinka0715.github.io/X-WAM/

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
GLM-5V-Turbo: Toward a Native Foundation Model for Multimodal Agents

📝 Summary:
GLM-5V-Turbo is a foundation model that integrates multimodal perception as a core reasoning component for AI agents. This improves performance in multimodal coding and visual tool use, while maintaining strong text-only capabilities.

🔹 Publication Date: Published on Apr 29

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.26752
• PDF: https://arxiv.org/pdf/2604.26752
• Github: https://github.com/zai-org/GLM-V

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Diffusion Templates: A Unified Plugin Framework for Controllable Diffusion

📝 Summary:
Diffusion Templates presents a unified framework that decouples base-model inference from controllable capabilities, enabling modular and composable control methods across various diffusion model appl...

🔹 Publication Date: Published on Apr 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.24351
• PDF: https://arxiv.org/pdf/2604.24351
• Project Page: https://modelscope.github.io/diffusion-templates-web/

🔹 Models citing this paper:
https://huggingface.co/DiffSynth-Studio/Template-KleinBase4B-ControlNet
https://huggingface.co/DiffSynth-Studio/Template-KleinBase4B-Brightness
https://huggingface.co/DiffSynth-Studio/Template-KleinBase4B-SoftRGB

Datasets citing this paper:
https://huggingface.co/datasets/DiffSynth-Studio/ImagePulseV2-Edit-Inpaint
https://huggingface.co/datasets/DiffSynth-Studio/ImagePulseV2-Edit-Background
https://huggingface.co/datasets/DiffSynth-Studio/ImagePulseV2-Edit-Change

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
FASH-iCNN: Making Editorial Fashion Identity Inspectable Through Multimodal CNN Probing

📝 Summary:
FASH-iCNN is a multimodal system that identifies fashion house, era, and color tradition from garment photographs with high accuracy, revealing that texture and luminance are primary carriers of edito...

🔹 Publication Date: Published on Apr 29

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.26186
• PDF: https://arxiv.org/pdf/2604.26186

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
A Survey on LLM-based Conversational User Simulation

📝 Summary:
This paper surveys recent advancements in LLM-based conversational user simulation. It introduces a novel taxonomy of user granularity and simulation objectives, analyzing core techniques and evaluation methodologies to inform future research.

🔹 Publication Date: Published on Apr 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.24977
• PDF: https://arxiv.org/pdf/2604.24977

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Tequila: Trapping-free Ternary Quantization for Large Language Models

📝 Summary:
Tequila is a new ternary quantization method for LLMs that solves deadzone trapping. It reactivates trapped weights as dynamic biases, significantly improving accuracy and inference speed. This makes LLM deployment on resource-constrained devices practical.
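
To make the deadzone concrete, here is a minimal sketch of plain threshold-based ternary quantization, the baseline setting in which small weights collapse to zero. It is not Tequila's method and does not implement the reactivation of trapped weights; the function name `ternarize` and the 0.7 threshold factor (a common heuristic from ternary weight networks) are assumptions for illustration.

```python
def ternarize(weights, threshold_factor=0.7):
    """Map each weight to {-1, 0, +1} and return a shared scale.

    Assumes a non-empty list of floats. Weights with |w| <= delta fall
    into the deadzone and are quantized to 0.
    """
    mean_abs = sum(abs(w) for w in weights) / len(weights)
    delta = threshold_factor * mean_abs  # deadzone half-width (heuristic)
    codes = [0 if abs(w) <= delta else (1 if w > 0 else -1) for w in weights]
    # Scale: mean magnitude of the weights that survived the deadzone.
    kept = [abs(w) for w, c in zip(weights, codes) if c != 0]
    alpha = sum(kept) / len(kept) if kept else 0.0
    return codes, alpha, delta

# Usage: three of the six toy weights land in the deadzone.
weights = [0.9, -0.05, 0.4, -0.7, 0.02, 0.1]
codes, alpha, delta = ternarize(weights)
dequantized = [alpha * c for c in codes]
```

In this baseline the zeroed weights contribute nothing to the forward pass, which is the situation the summary describes as trapping.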

🔹 Publication Date: Published on Sep 28, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.23809
• PDF: https://arxiv.org/pdf/2509.23809
• Github: https://github.com/Tencent/AngelSlim

🔹 Models citing this paper:
https://huggingface.co/AngelSlim/Qwen3-a3B_eagle3
https://huggingface.co/AngelSlim/Qwen3-32B_eagle3
https://huggingface.co/AngelSlim/Qwen3-1.7B_eagle3

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Large Language Models Explore by Latent Distilling

📝 Summary:
Exploratory Sampling (ESamp) boosts LLM diversity beyond lexical variation. It uses a lightweight Distiller to predict hidden representations and uses the prediction error to bias decoding towards novel semantic patterns. ESamp improves reasoning efficiency and creative writing with low overhead.

🔹 Publication Date: Published on Apr 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.24927
• PDF: https://arxiv.org/pdf/2604.24927
• Github: https://github.com/LinesHogan/tllm

==================================

#LLM #AI #NLP #DeepLearning #GenerativeAI
Probing Visual Planning in Image Editing Models

📝 Summary:
This paper redefines visual planning as a single-step image transformation, using abstract puzzles for evaluation. The authors' EAR paradigm and AMAZE dataset reveal that current AI models, despite finetuning, cannot match human zero-shot efficiency, highlighting a gap in visual reasoning.

🔹 Publication Date: Published on Apr 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.22868
• PDF: https://arxiv.org/pdf/2604.22868
• Project Page: https://spatigen.github.io/amaze.io/
• Github: https://github.com/spatigen/amaze

==================================

#VisualPlanning #ImageEditing #ComputerVision #AIResearch #MachineLearning
RADIO-ViPE: Online Tightly Coupled Multi-Modal Fusion for Open-Vocabulary Semantic SLAM in Dynamic Environments

📝 Summary:
RADIO-ViPE is an online semantic SLAM system providing open-vocabulary grounding from raw monocular RGB video, needing no calibration or depth. It tightly couples vision-language embeddings with geometry, handling dynamic environments effectively. This enables robust real-world deployment for aut...

🔹 Publication Date: Published on Apr 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.26067
• PDF: https://arxiv.org/pdf/2604.26067
• Project Page: https://be2rlab.github.io/radio_vipe
• Github: https://github.com/be2rlab/RADIO-ViPE

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Praxy Voice: Voice-Prompt Recovery + BUPS for Commercial-Class Indic TTS from a Frozen Non-Indic Base at Zero Commercial-Training-Data Cost

📝 Summary:
Researchers enhanced a non-Indic text-to-speech system to achieve commercial-quality output for the Indic languages Telugu, Tamil, and Hindi at zero commercial data cost. They combined a unified phoneme space, LoRA adaptation, and voice-prompt recovery, matching or exceeding commercial baselines.

🔹 Publication Date: Published on Apr 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.25441
• PDF: https://arxiv.org/pdf/2604.25441
• Project Page: https://huggingface.co/spaces/Praxel/praxy-voice-demo
• Github: https://github.com/praxelhq/praxy

🔹 Models citing this paper:
https://huggingface.co/Praxel/praxy-voice-r6

Spaces citing this paper:
https://huggingface.co/spaces/Praxel/praxy-voice-demo

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
PSP: An Interpretable Per-Dimension Accent Benchmark for Indic Text-to-Speech

📝 Summary:
A new benchmark called PSP measures accent in Indic languages through six phonological dimensions, revealing inconsistencies between standard evaluation metrics and actual accent fidelity.

🔹 Publication Date: Published on Apr 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.25476
• PDF: https://arxiv.org/pdf/2604.25476
• Github: https://github.com/praxelhq/psp-eval

🔹 Models citing this paper:
https://huggingface.co/Praxel/praxy-voice-r6

Datasets citing this paper:
https://huggingface.co/datasets/Praxel/psp-native-centroids

Spaces citing this paper:
https://huggingface.co/spaces/Praxel/praxy-voice-demo

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
FAMA: Failure-Aware Meta-Agentic Framework for Open-Source LLMs in Interactive Tool Use Environments

📝 Summary:
Failure-Aware Meta-Agentic framework improves open-source LLM performance in conversational scenarios by identifying common errors and deploying specialized agents to correct them.

🔹 Publication Date: Published on Apr 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.25135
• PDF: https://arxiv.org/pdf/2604.25135

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Operating-Layer Controls for Onchain Language-Model Agents Under Real Capital

📝 Summary:
Autonomous language-model agents managing real cryptocurrency trades demonstrated high reliability through comprehensive system design encompassing prompt compilation, policy validation, and execution...

🔹 Publication Date: Published on Apr 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.26091
• PDF: https://arxiv.org/pdf/2604.26091
• Project Page: https://www.dxrg.ai/

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Enhanced Privacy and Communication Efficiency in Non-IID Federated Learning with Adaptive Quantization and Differential Privacy

📝 Summary:
Adaptive quantization combined with differential privacy reduces communication overhead in federated learning while maintaining model accuracy and privacy guarantees.
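
One way to picture how quantization and differential privacy combine on a client update is the generic pipeline below: clip the update, add Gaussian noise, then quantize before sending. This is a sketch under stated assumptions, not the paper's algorithm. All function names are hypothetical, the bit-width is fixed rather than adaptive, and `sigma` is not calibrated to any particular privacy budget.

```python
import math
import random

def clip_l2(update, max_norm):
    """Scale the update so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(u * u for u in update))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [u * scale for u in update]

def add_gaussian_noise(update, sigma, rng):
    """Add independent Gaussian noise to each coordinate."""
    return [u + rng.gauss(0.0, sigma) for u in update]

def quantize_uniform(update, bits):
    """Uniformly quantize to 2**bits levels between min and max."""
    lo, hi = min(update), max(update)
    levels = (1 << bits) - 1
    if hi == lo:
        return [0] * len(update), lo, 0.0
    step = (hi - lo) / levels
    codes = [round((u - lo) / step) for u in update]
    return codes, lo, step

def dequantize(codes, lo, step):
    """Server-side reconstruction from integer codes."""
    return [lo + c * step for c in codes]

# Usage: a toy update sent with 4 bits per coordinate.
rng = random.Random(0)
update = [3.0, -4.0, 0.5, 1.5]
noisy = add_gaussian_noise(clip_l2(update, max_norm=1.0), sigma=0.1, rng=rng)
codes, lo, step = quantize_uniform(noisy, bits=4)
restored = dequantize(codes, lo, step)
```

Quantizing after the noise is added is post-processing of the noisy values, so it does not weaken whatever privacy guarantee the noise provides. It only adds a bounded reconstruction error of at most half a quantization step per coordinate.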

🔹 Publication Date: Published on Apr 25

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.23426
• PDF: https://arxiv.org/pdf/2604.23426
• Github: https://github.com/eardic/FL_DPQS

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Sample Selection Using Multi-Task Autoencoders in Federated Learning with Non-IID Data

📝 Summary:
Federated learning sample selection methods using multitask autoencoders, outlier detection techniques, and deep support vector data description enhance model accuracy under non-IID and noisy conditio...

🔹 Publication Date: Published on Apr 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.26116
• PDF: https://arxiv.org/pdf/2604.26116
• Project Page: https://github.com/eardic/FL_DPQS

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Synthetic Computers at Scale for Long-Horizon Productivity Simulation

📝 Summary:
Synthetic Computers at Scale creates realistic computer environments with folders and content. This enables long-horizon productivity simulations for AI agents, improving their performance through experiential learning and scalable self-improvement.

🔹 Publication Date: Published on Apr 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.28181
• PDF: https://arxiv.org/pdf/2604.28181
• Project Page: https://huggingface.co/datasets/microsoft/synthetic-computers-at-scale

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Heterogeneous Scientific Foundation Model Collaboration

📝 Summary:
Eywa is a heterogeneous agentic framework that extends language-centric systems to scientific foundation models by integrating domain-specific models with language-based reasoning interfaces for impro...

🔹 Publication Date: Published on Apr 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.27351
• PDF: https://arxiv.org/pdf/2604.27351
• Project Page: https://www.zihao.website/eywa.github.io/
• Github: https://www.zihao.website/eywa.github.io/

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research