An X-ray-emitting protocluster at z ≈ 5.7 reveals rapid structure growth https://www.nature.com/articles/s41586-025-09973-1
Nature
An X-ray-emitting protocluster at z ≈ 5.7 reveals rapid structure growth
Nature - Discovery of a protocluster at z = 5.68, merely one billion years after the Big Bang, suggests that large-scale structure must have formed more rapidly in some regions of the...
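For scale, the quoted "one billion years" figure can be checked directly from the redshift; a minimal sketch using astropy's bundled Planck 2018 cosmology (the paper may adopt slightly different parameters):

```python
# Sanity check of "merely one billion years after the Big Bang" for z = 5.68.
# Assumes astropy's Planck 2018 flat-LambdaCDM parameters, which may differ
# slightly from the cosmology adopted in the paper.
from astropy.cosmology import Planck18

z = 5.68
age = Planck18.age(z)                  # age of the Universe at that redshift
lookback = Planck18.lookback_time(z)   # how long ago the light was emitted

print(f"age at z={z}: {age:.2f}")         # ~1.0 Gyr
print(f"lookback time: {lookback:.2f}")   # ~12.8 Gyr
```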
Soft Contamination Means Benchmarks Test Shallow Generalization https://arxiv.org/abs/2602.12413
arXiv.org
Soft Contamination Means Benchmarks Test Shallow Generalization
If LLM training data is polluted with benchmark test data, then benchmark performance gives biased estimates of out-of-distribution (OOD) generalization. Typical decontamination filters use n-gram...
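For context, an n-gram decontamination filter of the kind the abstract refers to looks roughly like the sketch below; the choice of n and the overlap threshold are illustrative, not the paper's:

```python
# Generic n-gram overlap decontamination: drop a training document if too many
# of its n-grams also appear in any benchmark test item. Illustrative defaults;
# real pipelines vary n, the tokenization, and the threshold.
def ngrams(text: str, n: int = 8) -> set:
    toks = text.lower().split()
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}

def is_contaminated(doc: str, test_items: list[str],
                    n: int = 8, threshold: float = 0.1) -> bool:
    doc_grams = ngrams(doc, n)
    if not doc_grams or not test_items:
        return False
    test_grams = set().union(*(ngrams(t, n) for t in test_items))
    overlap = len(doc_grams & test_grams) / len(doc_grams)
    return overlap >= threshold

# Toy usage with placeholder data.
benchmark_items = ["what is the capital of france paris is the capital of france and its largest city"]
train_docs = ["paris is the capital of france and its largest city on the seine river today"]
clean = [d for d in train_docs if not is_contaminated(d, benchmark_items)]
```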
Data Repetition Beats Data Scaling in Long-CoT Supervised Fine-Tuning https://arxiv.org/abs/2602.11149
arXiv.org
Data Repetition Beats Data Scaling in Long-CoT Supervised Fine-Tuning
Supervised fine-tuning (SFT) on chain-of-thought data is an essential post-training step for reasoning language models. Standard machine learning intuition suggests that training with more unique...
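The comparison at stake is essentially how to spend a fixed SFT token budget: many unique CoT examples seen once, or a smaller pool repeated over several epochs. A hypothetical sketch of the two mixtures (pool sizes, token counts, and budget are invented, not taken from the paper):

```python
# Two ways to spend the same long-CoT SFT token budget. Purely illustrative;
# the pool size, per-example token counts, and budget are made up.
import random

def build_mixture(pool, token_budget, repeats):
    """Cycle through `pool` `repeats` times, stopping once the budget is spent."""
    mixture, used = [], 0
    for ex in pool * repeats:
        if used + ex["tokens"] > token_budget:
            break
        mixture.append(ex)
        used += ex["tokens"]
    return mixture

random.seed(0)
pool = [{"id": i, "tokens": random.randint(2_000, 8_000)} for i in range(50_000)]
budget = 100_000_000  # ~100M training tokens

scaling    = build_mixture(pool,         budget, repeats=1)   # more unique data, single pass
repetition = build_mixture(pool[:5_000], budget, repeats=10)  # fewer unique examples, ~4 epochs
```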
rePIRL: Learn PRM with Inverse RL for LLM Reasoning https://arxiv.org/abs/2602.07832
arXiv.org
rePIRL: Learn PRM with Inverse RL for LLM Reasoning
Process rewards have been widely used in deep reinforcement learning to improve training efficiency, reduce variance, and prevent reward hacking. In LLM reasoning, existing works also explore...
Tensor Decomposition for Non-Clifford Gate Minimization https://arxiv.org/abs/2602.15285
arXiv.org
Tensor Decomposition for Non-Clifford Gate Minimization
Fault-tolerant quantum computation requires minimizing non-Clifford gates, whose implementation via magic state distillation dominates the resource costs. While $T$-count minimization is...
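A textbook example of why cutting $T$-count is a decomposition problem over phase polynomials (standard background, not necessarily the paper's construction): the CCZ gate puts the phase $(-1)^{xyz}$ on the basis state $|xyz\rangle$, and over $\{0,1\}$ the cubic monomial splits into parities,

$$4\,xyz = x + y + z - (x \oplus y) - (x \oplus z) - (y \oplus z) + (x \oplus y \oplus z),$$

so $\mathrm{CCZ} = \exp\!\big(i\tfrac{\pi}{4}\,[\,x + y + z - (x \oplus y) - (x \oplus z) - (y \oplus z) + (x \oplus y \oplus z)\,]\big)$. Each parity term costs one $T$ or $T^{\dagger}$ (the CNOTs that compute the parities are Clifford and free), which gives the familiar 7-$T$ Clifford+$T$ circuit; minimizing the number of such terms across a whole circuit becomes a rank/decomposition question for the associated tensor, which is presumably the setting the abstract is pointing at.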
BRIDGE: Predicting Human Task Completion Time From Model Performance https://arxiv.org/abs/2602.07267
arXiv.org
BRIDGE: Predicting Human Task Completion Time From Model Performance
Evaluating the real-world capabilities of AI systems requires grounding benchmark performance in human-interpretable measures of task difficulty. Existing approaches that rely on direct human task...
Forwarded from Love. Death. Transformers.
If you're preparing for an interview at a decent place, this is worth reading
https://djdumpling.github.io/2026/01/31/frontier_training.html
Alex Wa’s Blog
frontier model training methodologies
How do labs train a frontier, multi-billion parameter model? We look towards seven open-weight frontier models: Hugging Face’s SmolLM3, Prime Intellect’s Intellect 3, Nous Research’s Hermes 4, OpenAI’s gpt-oss-120b, Moonshot’s Kimi K2, DeepSeek’s DeepSeek…