A new work from Yoshua Bengio’s lab: Recursive Self-Aggregation > Gemini DeepThink.
It really is the best test-time scaling algorithm: it just crushed the ARC-AGI 2 public evals with Gemini 3 Flash and RSA.
Recursive Self-Aggregation Research
Recursive Self-Aggregation (RSA) for LLM Reasoning
Hybrid test-time scaling for LLMs: recursive aggregation of chains-of-thought, plus aggregation-aware RL.
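A minimal sketch of the RSA loop as the announcement describes it (a population of chains-of-thought, recursively aggregated in small groups); `llm` is a stand-in for any text-completion call and the prompts are illustrative, not the paper's:

```python
import random

def rsa(llm, problem, population=16, group_size=4, rounds=3):
    """Recursive Self-Aggregation, sketched: keep a population of
    chain-of-thought candidates and repeatedly aggregate small groups
    of them into improved candidates. Test-time compute scales by
    raising `population` or `rounds`."""
    # Round 0: sample independent chains of thought.
    candidates = [llm(f"Solve step by step:\n{problem}") for _ in range(population)]
    for _ in range(rounds):
        random.shuffle(candidates)
        improved = []
        for i in range(0, len(candidates), group_size):
            group = candidates[i:i + group_size]
            prompt = (
                f"Problem:\n{problem}\n\nCandidate solutions:\n"
                + "\n---\n".join(group)
                + "\n\nCombine the correct ideas above into one improved, "
                  "step-by-step solution."
            )
            # Each group yields group_size aggregated candidates,
            # keeping the population size constant across rounds.
            improved += [llm(prompt) for _ in range(group_size)]
        candidates = improved
    return candidates[0]  # or vote among the survivors
```

The aggregation-aware RL piece would then train the model on such aggregation prompts; that part is not sketched here.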
❤5🔥4🥰4
All about AI, Web 3.0, BCI
Nvidia will acquire assets and key talent from chipmaking startup Groq for $20B. Groq co-founder and CEO Jonathan Ross was the lead designer and architect of the first generation of Google's TPU chips. He'll join Nvidia along with president Sunny Madra and…
Nvidia is investing an additional $2 billion into CoreWeave to accelerate capacity buildout.
Nvidia will also make the Vera CPU available as a standalone offering, with CoreWeave to deploy it first. "Many" design wins to come.
Bloomberg.com
Nvidia Invests $2 Billion More in CoreWeave, Offers New Chip
Nvidia Corp., the dominant maker of artificial intelligence chips, invested an additional $2 billion in the cloud computing firm and key customer CoreWeave Inc., marking the latest example of the circular financing deals that have lifted valuations of AI…
❤3🔥3👏3
Nvidia introduced three new open-source models in the NV Earth-2 family, enabling weather forecasting with tools for data assimilation, forecasting, nowcasting, and downscaling.
Developers can also build climate simulations using PhysicsNeMo and create inference pipelines with the open source Earth2Studio framework.
👍4🔥4❤3
DeepSeek just released #DeepSeek-OCR 2
Now AI can "see" an image in the same logical order as a human!
Its new method, the DeepEncoder V2, teaches the AI to dynamically reorder the pieces of an image based on its meaning, instead of just scanning it rigidly from left to right. This mimics how humans follow the logical flow of a scene.
The result is a model that outperforms conventional vision-language models, especially on images with complex layouts like documents or diagrams, by enabling more intelligent, causally-informed visual understanding.
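DeepEncoder V2's actual architecture isn't spelled out in this post, so here is only a toy sketch of the core idea, content-dependent reordering of image patches before decoding, with a learned score head standing in for whatever mechanism the real model uses:

```python
import torch
import torch.nn as nn

class PatchReorderer(nn.Module):
    """Toy 'visual causal flow': score each image patch, then hand
    patches to the decoder in predicted reading order instead of a
    rigid left-to-right raster order."""
    def __init__(self, dim=256):
        super().__init__()
        self.score = nn.Linear(dim, 1)  # learned importance / order head

    def forward(self, patches):                   # (batch, n_patches, dim)
        scores = self.score(patches).squeeze(-1)  # (batch, n_patches)
        order = scores.argsort(dim=-1, descending=True)
        idx = order.unsqueeze(-1).expand_as(patches)
        return patches.gather(1, idx), order      # patches in reading order

x = torch.randn(2, 196, 256)            # e.g. a 14x14 grid of patch embeddings
reordered, order = PatchReorderer()(x)  # the decoder would consume `reordered`
```

Note that a hard argsort is not differentiable; a real system would need a relaxation or an RL-style training signal for the ordering.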
GitHub
GitHub - deepseek-ai/DeepSeek-OCR-2: Visual Causal Flow
Visual Causal Flow. Contribute to deepseek-ai/DeepSeek-OCR-2 development by creating an account on GitHub.
🔥4❤3👍2
The “One Person Company” (OPC) model is booming, especially in innovation hubs like Shenzhen, where AI-powered entrepreneurship is reshaping the business landscape.
These OPCs, often led by a single founder supported by AI and minimal staff, offer fast decision-making, low costs, and high flexibility. Shenzhen is building dedicated OPC hubs, attracting creators nationwide.
These OPCs, often led by a single founder supported by AI and minimal staff, offer fast decision-making, low costs, and high flexibility. Shenzhen is building dedicated OPC hubs, attracting creators nationwide.
🔥7💯4👏3🤡1
Moonshot AI released Kimi K2.5, Open-Source Visual Agentic Intelligence
Global SOTA on Agentic Benchmarks: HLE full set (50.2%), BrowseComp (74.9%)
Open-source SOTA on Vision and Coding: MMMU Pro (78.5%), VideoMMMU (86.6%), SWE-bench Verified (76.8%)
Code with Taste: turn chats, images & videos into aesthetic websites with expressive motion.
Agent Swarm (Beta): self-directed agents working in parallel, at scale. Up to 100 sub-agents, 1,500 tool calls, and 4.5× faster than a single-agent setup.
K2.5 is now live on kimi.com in chat mode and agent mode.
K2.5 Agent Swarm in beta for high-tier users.
For production-grade coding, you can pair K2.5 with Kimi Code.
Weights & code.
❤🔥7🔥4👏3
Qwen released Qwen3-Max-Thinking, its flagship reasoning model, along with DeepPlanning.
Qwen says it demonstrates performance comparable to models such as GPT-5.2 Thinking and Opus 4.5.
Key innovations:
1. Adaptive tool-use: intelligently leverages Search, Memory & Code Interpreter without manual selection
2. Test-time scaling: multi-round self-reflection beats Gemini 3 Pro on reasoning (sketched below)
3. From complex math (98.0 on HMMT Feb) to agentic search (49.8 on HLE)—it just thinks better.
DeepPlanning is a new benchmark for long-horizon agent planning in real-world scenarios.
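As a hedged illustration of point 2, here is the generic shape of a multi-round self-reflection loop; `llm` is a placeholder for any text-completion call, and this is not Qwen's actual implementation:

```python
def self_reflect(llm, question, rounds=3):
    """Test-time scaling via self-reflection: draft, critique, revise.
    More rounds buy more accuracy at the cost of more compute."""
    answer = llm(f"Answer step by step:\n{question}")
    for _ in range(rounds):
        critique = llm(
            f"Question:\n{question}\n\nDraft answer:\n{answer}\n\n"
            "List any mistakes or gaps in this draft."
        )
        answer = llm(
            f"Question:\n{question}\n\nDraft answer:\n{answer}\n\n"
            f"Critique:\n{critique}\n\nWrite a corrected final answer."
        )
    return answer
```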
HF
ModelScope.
❤5🔥5👍3
OpenAI introduced Prism, a free, AI-native workspace for scientists to write and collaborate on research, powered by GPT-5.2.
Accelerating science requires progress on two fronts:
1. Frontier AI models that use scientific tools and can tackle the hardest problems
2. Integrating that AI into the products scientists use every day
Prism is free to anyone with a ChatGPT account, with unlimited projects and collaborators.
OpenAI
Prism | A free, LaTeX-native workspace for scientists
Write, edit, and collaborate on scientific documents in LaTeX with Prism—a free workspace integrating GPT-5.2 into research and writing.
❤6🔥2👏2
Google introduced ATLAS: new scaling laws for massively multilingual language models.
Practical, data-driven guidance to balance data mix and model size, helping global developers better serve billions of non-English speakers.
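The post doesn't give ATLAS's functional form, so as a loose illustration of what "scaling laws that balance data and model size" let you do, here is a generic Chinchilla-style fit; the constants and the single-corpus form are placeholders, not ATLAS's multilingual law:

```python
import numpy as np
from scipy.optimize import minimize

# Generic power-law loss surface L(N, D) = E + A/N^alpha + B/D^beta.
# All constants are illustrative placeholders, not fitted ATLAS values.
E, A, B, alpha, beta = 1.7, 400.0, 4e3, 0.34, 0.28

def optimal_split(flops):
    """Choose parameters N and tokens D minimizing L(N, D) under the
    common training-cost heuristic flops ~= 6 * N * D."""
    def objective(log_n):
        n = np.exp(log_n[0])
        d = flops / (6 * n)
        return E + A / n**alpha + B / d**beta
    res = minimize(objective, x0=[np.log(1e8)], method="Nelder-Mead")
    n = float(np.exp(res.x[0]))
    return n, flops / (6 * n)

n_opt, d_opt = optimal_split(1e21)  # a 1e21-FLOP training budget
print(f"params ~ {n_opt:.3g}, tokens ~ {d_opt:.3g}")
```

A multilingual law like ATLAS would additionally split D into per-language budgets and optimize the mix; that extension is omitted here.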
research.google
ATLAS: Practical scaling laws for multilingual models
❤2🔥2🆒2👏1
Big news in clinical AI: Aidoc secured FDA clearance for healthcare’s first comprehensive AI triage solution for body CT, powered by their CARE foundation model.
Healthcare AI | Aidoc Always-on AI
Aidoc Secures New FDA Clearance
Aidoc announced 11 newly cleared indications, combined with three existing ones, to introduce an AI safety net for crowded Emergency Departments and imaging backlogs.
👏3🔥2💯2
Fidelity to launch dollar-backed stablecoin FIDD on Ethereum in coming weeks
The firm first said it was testing a stablecoin in early 2025, but had not committed to a launch at the time.
The token will be issued by Fidelity Digital Assets’ national trust bank and is expected to roll out to both retail and institutional customers.
Fidelity said it will oversee issuance and management of reserves for the stablecoin, leaning on its asset management arm, Fidelity Management & Research Company LLC, to handle reserve assets.
Customers will be able to purchase or redeem FIDD for $1 through Fidelity Digital Assets, Fidelity Crypto and Fidelity Crypto for Wealth Managers, with the stablecoin also transferable to any Ethereum mainnet address and available on major crypto exchanges where it is listed.
The Block
Fidelity to launch dollar-backed stablecoin FIDD on Ethereum in coming weeks
Fidelity Investments plans to launch its own Ethereum-based stablecoin, FIDD, as U.S. stablecoin regulation comes into focus.
❤3🔥2👏2
In the last month, 1X, Skild, and Physical Intelligence all signaled a shift to human data.
Robotics is caught in a tug-of-war between quality and scale, where reality is the referee.
This essay explains why the robot models that best navigate the “Data Pareto Frontier” will win in 2026.
vincentliu.org
The Robotics Data Pareto Frontier ― Vincent Liu
The defining narrative of robotics in 2025 was not a new model architecture, but an enthusiasm for data. Despite a consensus around teleoperation as the gold...
🔥3🥰2👏2
Researchers from IBM, the University of Melbourne, and Southeast University present a unified theory: "Neural Network Reprogrammability."
They show that methods like prompt tuning and in-context learning all work by manipulating the data flowing into a frozen model, not by changing the model itself.
The framework improves on isolated lines of research by providing a universal taxonomy for understanding and improving adaptation across data types and model architectures.
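A toy PyTorch sketch of the unifying view (the backbone stays frozen; only an input-side transformation is learned). The learned input offset below is one simple instance of the family, not the paper's specific formulation:

```python
import torch
import torch.nn as nn

# Frozen "pretrained" model: its weights are never updated.
backbone = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 10))
for p in backbone.parameters():
    p.requires_grad_(False)

# Reprogramming: adapt by learning a transformation of the *input*,
# analogous to soft prompts or in-context examples for a frozen LLM.
delta = nn.Parameter(torch.zeros(16))
opt = torch.optim.Adam([delta], lr=1e-2)

x, y = torch.randn(32, 16), torch.randint(0, 10, (32,))
for _ in range(100):
    logits = backbone(x + delta)   # only the input perturbation changes
    loss = nn.functional.cross_entropy(logits, y)
    opt.zero_grad(); loss.backward(); opt.step()
```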
GitHub.
❤3🔥2👏2
Ai2 released SERA-32B, an approach to coding agents that matches Devstral 2 at just $9,000.
It is fully open source, and you can train your own model easily, at 26× the efficiency of RL training.
Models and data.
GitHub.
❤3🔥2👏2
Oxford, in collaboration with Microsoft, is piloting an Oxford-built multi-agent AI assistant for cancer care.
Integrated directly into Microsoft Teams, the assistant orchestrates specialised agents to support tumour board (MDT) decision‑making in a clinically realistic setting at Oxford University Hospitals—demonstrating impact in real workflows, not just a lab prototype.
GitHub.
University of Oxford
Oxford-built multi-agent assistant for cancer care to be piloted in collaboration with Microsoft
Researchers at the University of Oxford have developed TrustedMDT, a multi-agent artificial intelligence (AI) system designed to support medical specialists during cancer treatment planning meetings.
🆒4🔥2🥰2👏1
Ant Group (an Alibaba affiliate) is going big on robotics with Robbyant.
Robbyant dropped LingBot-VLA and LingBot-Depth models.
LingBot-VLA is a pragmatic Vision-Language-Action model designed to bridge the gap between perception and execution in robotics.
LingBot-VLA-4B: Lightweight & versatile.
LingBot-VLA-4B-Depth: Enhanced for high-precision spatial tasks.
Powerful Core: built on the Qwen2.5-VL-3B foundation, mastering multi-tasking and dual-arm coordination across 9+ robot configs.
Elite Performance: Outperforms competitors like π0.5 and GR00T in success rates (SR) on both GM-100 (Real-world) and RoboTwin 2.0 (Sim).
Hyper-Efficient: 1.5–2.8x faster training than existing VLA codebases, scaling smoothly from 8 to 256 GPUs.
Spatial Precision: Features a depth-distilled version for pinpoint 3D accuracy in complex environments.
Massive Data: Pre-trained on 20,000+ hours of real-world data for unparalleled generalization.
Robbyant (Ant Lingbo Technology)
Robbyant - Exploring the Frontiers of Embodied Intelligence | Lingbo Technology
We focus on foundational large models for embodied intelligence: spatial perception, VLA, world models, and video-action. LingBot-Depth, LingBot-VLA, LingBot-World, LingBot-VA.
❤3🔥2👏2
Google DeepMind launched Project Genie, an experimental prototype of the world's most advanced world model.
Create entire playable worlds to explore in real-time just from a simple text prompt.
Available to Ultra subs in the US for now.
Google
Project Genie: Experimenting with infinite, interactive worlds
Google AI Ultra subscribers in the U.S. can now try out Project Genie.
🔥4👍2👏2
OpenAI is laying the groundwork for a Q4 IPO and has started informal talks with Wall Street banks while building out its finance team.
OpenAI is moving faster in part because it’s worried Anthropic could beat it to market.
The Wall Street Journal
Exclusive | OpenAI Plans Fourth-Quarter IPO in Race to Beat Anthropic to Market
The rivals are competing to be the first major generative AI startup to tap the public markets.
❤3🔥3🥰2
Cool work that aligns with how humans learn.
The model writes its own answers two ways:
1) without cheating
2) with cheating (seeing the true answer)
It learns to make (1) close to (2) by minimizing the KL divergence between them.
This prevents catastrophic forgetting in continual learning.
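A hedged PyTorch sketch of that objective, assuming a Hugging Face-style causal LM; the prompt layout and the restriction to the next-token distribution are illustrative simplifications:

```python
import torch
import torch.nn.functional as F

def self_distill_kl(model, tokenizer, question, answer):
    """Match the 'no cheating' prediction (answer hidden) to the
    'cheating' prediction (true answer visible in context) via KL."""
    plain = tokenizer(f"Q: {question}\nA:", return_tensors="pt")
    hinted = tokenizer(f"Answer: {answer}\nQ: {question}\nA:", return_tensors="pt")

    with torch.no_grad():                    # teacher sees the true answer
        teacher = model(**hinted).logits[:, -1]
    student = model(**plain).logits[:, -1]   # student answers blind

    # KL(teacher || student) over the next-token distribution.
    return F.kl_div(
        F.log_softmax(student, dim=-1),
        F.softmax(teacher, dim=-1),
        reduction="batchmean",
    )
```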
❤4👏4🔥3
New paper from Google DeepMind studying how LLMs' representations of things like factuality evolve over a conversation.
Researchers find that in edge case conversations, e.g. about model consciousness or delusional content, model representations can change dramatically.
In a simulated argument where two language models debate whether they are conscious (one pro, one anti), their representations for questions about consciousness flip back and forth as they play each role.
By contrast, contexts that are clearly framed as sci-fi stories result in less representational change.
Researchers think these results are interesting as one way models adapt to context, and are consistent with a "role-play" description in which models' representations evolve to reflect the current role, e.g. in an argument. (N.b. these conversations are mostly not on-policy!)
They also raise challenges for the construct validity of dimensions discovered using interpretability methods — dimensions may not have the same meaning w.r.t. ground truth at different points in a context. This poses challenges for probing and steering for safety, etc.
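For concreteness, this is the kind of linear probe the construct-validity worry applies to; the "concept" direction and hidden states below are toy stand-ins for ones extracted from a real model:

```python
import torch

def probe_over_context(hidden_states, direction):
    """Project each position's hidden state onto one fixed linear
    'concept' direction. If representations drift with context, the
    same readout can mean different things at different turns."""
    d = direction / direction.norm()
    return hidden_states @ d            # one scalar readout per position

hs = torch.randn(10, 512)               # toy hidden states over a conversation
concept = torch.randn(512)              # toy 'factuality' direction
print(probe_over_context(hs, concept))
```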
arXiv.org
Linear representations in language models can change dramatically...
Language model representations often contain linear directions that correspond to high-level concepts. Here, we study the dynamics of these representations: how representations evolve along these...
❤4🔥4👍3