AI & ML Papers

🔥 Next-Latent Prediction Transformers Learn Compact World Models

💡 The paper introduces Next-Latent Prediction, a training method that adds a self-supervised latent-state prediction objective to standard next-token training. Standard transformers have no incentive to compress their history into compact latent states, which the authors argue leads to poor generalization. Next-Latent Prediction addresses this by training the transformer so that its latent representations can predict the next latent state given the next output token. This injects a recurrent inductive bias, encouraging the model to form a compact internal world model with its own belief states and transition dynamics. The method is simple and efficient: it leaves the transformer's architecture, parallel training, and inference unchanged. The authors report significant gains in downstream accuracy, representation compression, and lookahead planning on benchmarks covering world modeling, reasoning, planning, and language modeling, and present Next-Latent Prediction as an effective paradigm for shaping transformer representations toward stronger generalization.
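To make the training objective in the summary concrete, here is a minimal pure-Python sketch of a combined loss: next-token cross-entropy plus a term that asks the current latent and the next token to predict the next latent. It is an illustration under assumptions, not the authors' implementation. The linear transition (`w_latent`, `w_token`), the mean-squared-error distance, the `latent_weight` coefficient, and all function names are hypothetical choices made here. The paper may use a different predictor, distance, or gradient handling.

```python
import math


def next_token_loss(logits, targets):
    """Mean cross-entropy of per-position logits against target token ids."""
    total = 0.0
    for row, target in zip(logits, targets):
        peak = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_z = peak + math.log(sum(math.exp(v - peak) for v in row))
        total += log_z - row[target]
    return total / len(targets)


def predict_next_latent(latent, token_embedding, w_latent, w_token):
    """Hypothetical linear transition: h_hat[t+1] = W_h h[t] + W_x e(x[t+1])."""
    predicted = []
    for row_h, row_x in zip(w_latent, w_token):
        from_latent = sum(w * h for w, h in zip(row_h, latent))
        from_token = sum(w * e for w, e in zip(row_x, token_embedding))
        predicted.append(from_latent + from_token)
    return predicted


def next_latent_loss(latents, token_embeddings, w_latent, w_token):
    """Mean squared error between predicted and actual next latents.

    latents[t] is the latent at position t and token_embeddings[t] is the
    embedding of the token at position t (assumed layout).
    """
    total, count = 0.0, 0
    for t in range(len(latents) - 1):
        predicted = predict_next_latent(
            latents[t], token_embeddings[t + 1], w_latent, w_token
        )
        total += sum((p - a) ** 2 for p, a in zip(predicted, latents[t + 1]))
        count += len(predicted)
    return total / count


def combined_loss(logits, targets, latents, token_embeddings,
                  w_latent, w_token, latent_weight=1.0):
    """Next-token loss plus a weighted next-latent loss (weight is assumed)."""
    token_term = next_token_loss(logits, targets)
    latent_term = next_latent_loss(latents, token_embeddings, w_latent, w_token)
    return token_term + latent_weight * latent_term
```

In this sketch the auxiliary term only adds to the training loss, which matches the summary's point that the architecture and inference path are left unchanged. The extra predictor would be needed at training time only.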

📅 Published on Nov 8, 2025

🔗 Links:
• GitHub: https://github.com/huggingface
• arXiv: https://arxiv.org/abs/2511.05963
• PDF: https://arxiv.org/pdf/2511.05963
• Project Page: https://jaydenteoh.github.io/blog/2026/nextlat

📊 Datasets citing this paper:
• https://huggingface.co/datasets/JaydenTeoh/manhattan

━━━━━━━━━━━━━━━━━━━━━━━━
📢 By: https://xn--r1a.website/PaperNexus

#NextLatentPrediction #TransformerArchitectures #SelfSupervisedLearning #LatentStatePrediction #CompactWorldModels
