OpenAI acquired Statsig (A/B testing) for $1.1 billion
Vijaye Raji, founder & CEO of Statsig, will join OpenAI as CTO of Applications to lead engineering for ChatGPT & Codex, following the acquisition.
Srinivas Narayanan promoted to CTO of B2B Applications.
Kevin Weil will be heading a new team as the VP of AI for Science.
OpenAI
Vijaye Raji to become CTO of Applications with acquisition of Statsig
We’re expanding our Applications leadership, the org responsible for how our research reaches and benefits the world.
EPFL and ETH Zurich released Apertus, Switzerland's first large-scale, multilingual LLM.
Swiss AI
Apertus | Swiss AI
NVIDIA just dropped Universal Deep Research. You can use it to design your own AI research assistant.
Universal Deep Research (UDR) is the first truly customizable research agent that breaks free from hard-coded limitations.
Unlike existing research tools that force you into predetermined workflows, UDR lets users create, edit, and refine completely custom research strategies without any training required.
The system comes with example strategies (minimal, expansive, intensive) but the real power is in the customization.
Code.
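A UDR-style strategy is essentially an ordered list of natural-language instructions that the system turns into executable research code. A minimal custom strategy might look like this (a hypothetical example and toy parser for illustration, not shipped UDR code):

```python
# Hypothetical UDR-style strategy: plain numbered instructions that the
# system would compile into executable research code (illustrative only).
STRATEGY = """
1. Split the research question into at most 3 sub-questions.
2. For each sub-question, run one web search and read the top 2 results.
3. Extract the claims relevant to each sub-question, with source URLs.
4. Merge all claims into a single structured report and stop.
"""

def parse_steps(strategy: str) -> list[str]:
    """Toy parser: each numbered line becomes one step of the plan."""
    return [line.split(". ", 1)[1] for line in strategy.strip().splitlines()]

plan = parse_steps(STRATEGY)
print(len(plan), plan[0])  # 4 steps, starting with the decomposition step
```

The point of the design is that editing the strategy text, not the tool's source, is how you change the agent's behavior.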
Nvidia
Universal Deep Research
Website for the project 'Universal Deep Research: Bring Your Own Model and Strategy'
Great work from MIT: General Social Agents.
In a nutshell, it's about how you might create "general" agents in the sense that their behavior would match real humans even in novel settings.
It's both about the method (agent-building pipeline) & some demonstrations that this "works" (using real human subject experiments).
University Health Network announced that Canada’s first Neuralink implant surgeries have been successfully completed at UHN.
These procedures mark the first Neuralink surgeries performed outside the United States, representing a significant milestone in global neurosurgical innovation.
The surgeries are part of the CAN-PRIME Study (Canadian Precise Robotically Implanted Brain-Computer Interface), a clinical trial evaluating the safety and functionality of Neuralink’s implant and surgical robot.
The study aims to enable individuals with quadriplegia to control external devices using their thoughts.
www.uhn.ca
UHN performs first Neuralink implant surgeries outside the U.S.
OpenAI announced the OpenAI Jobs Platform to connect AI-ready workers with companies that need AI skills, and OpenAI-Certified for workers to learn and demonstrate their AI skills.
OpenAI
Expanding economic opportunity with AI
Fidji Simo - CEO, Applications
LLM agents trained with dynamic planning learn when to spend test-time compute, balancing cost & performance.
This is the first work to explore training LLM agents for dynamic test-time compute allocation in sequential decision-making tasks.
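The gist, spending extra inference compute only where it is likely to pay off, can be sketched as a simple gating policy (a toy illustration of the concept, not the paper's trained method; all names and numbers are invented):

```python
def solve_fast(task):
    """Cheap direct answer: one unit of test-time compute."""
    return task["fast_answer"], 1

def solve_deep(task):
    """Expensive extended reasoning: ten units of compute."""
    return task["deep_answer"], 10

def dynamic_agent(task, threshold=0.8):
    """Spend extra test-time compute only when confidence is low.
    A trained agent would learn this decision; here it is a fixed rule."""
    if task["confidence"] >= threshold:
        return solve_fast(task)
    return solve_deep(task)

easy = {"confidence": 0.95, "fast_answer": "A", "deep_answer": "A"}
hard = {"confidence": 0.30, "fast_answer": "?", "deep_answer": "B"}
print(dynamic_agent(easy), dynamic_agent(hard))  # ('A', 1) ('B', 10)
```

The paper's contribution is learning this allocation decision with RL over sequential tasks, rather than hard-coding a threshold as above.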
Can AI agents benefit from personality? (ETH Zurich, BASF SE, Cledar, IDEAS Research Institute)
Researchers presented MBTI-in-Thoughts, a framework that conditions LLMs with psychologically grounded archetypes (e.g., MBTI types) via prompt engineering.
Findings:
- Emotional priming boosts narrative generation
- Analytical priming improves stability in game-theoretic tasks
- Multi-agent setups show better cooperation after self-reflection
- Personality persistence verified via 16Personalities test
- Generalizes beyond MBTI → Big Five, HEXACO, Enneagram
GitHub.
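Conditioning of this kind is plain prompt engineering, so it can be sketched in a few lines (an illustrative sketch in the spirit of the paper, not the authors' code; the archetype texts and message format are assumptions):

```python
# Hypothetical sketch of personality conditioning via a system prompt,
# in the spirit of MBTI-in-Thoughts (the archetype descriptions are invented).
ARCHETYPES = {
    "INTJ": "You are analytical and strategic, and you plan several steps ahead.",
    "ENFP": "You are enthusiastic, imaginative, and emotionally expressive.",
}

def condition(messages, mbti_type):
    """Prepend a psychologically grounded persona to the conversation."""
    persona = {"role": "system", "content": ARCHETYPES[mbti_type]}
    return [persona] + messages

chat = [{"role": "user", "content": "Negotiate a fair split of 100 coins."}]
primed = condition(chat, "INTJ")
print(primed[0]["role"], len(primed))  # system 2
```

The same pattern extends to Big Five, HEXACO, or Enneagram archetypes by swapping the persona dictionary.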
arXiv.org
Psychologically Enhanced AI Agents
We introduce MBTI-in-Thoughts, a framework for enhancing the effectiveness of Large Language Model (LLM) agents through psychologically grounded personality conditioning. Drawing on the...
Visa introduced Visa Intelligent Commerce, an initiative to empower agents to shop and buy.
Visa’s MCP gateway connects AI agents directly to Visa Intelligent Commerce APIs and allows them to discover, authenticate and invoke integrated services like Tokenization, Authentication and Personalization to build intelligent, payment-enabled experiences.
For developers, the MCP Server provides a faster path from idea to a secure, working agent:
1. No need to hand-code every API call
2. Prototypes in hours, not weeks
3. Lets agents dynamically apply Visa APIs in new contexts
The result → AI agents that browse, buy, and transact on your behalf.
Also piloting the Visa Acceptance Agent Toolkit, built on MCP. It lets developers and business users trigger Visa Acceptance actions with plain-language prompts—no code required.
University students: Get a FREE year of Gemini Pro and more. Sign up by Nov 3 in Germany, Egypt, Saudi Arabia, the UK, and Mexico.
+ Unlimited image uploads
+ Nano Banana for images
+ Veo 3 for videos
+ Personalized exam prep
+ Save hours with Deep Research
+ Talk it out with Gemini Live
Physical Intelligence introduced Real-Time Action Chunking, a method that lets VLAs execute actions while "thinking." Instead of waiting for inference to finish, the robot keeps acting on the current action chunk while the next one is computed, completing the given task more quickly.
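The overlap can be sketched as a control loop that executes the current action chunk while a background thread plans the next one (a toy single-process illustration of the idea, not Physical Intelligence's implementation; the timings and action names are invented):

```python
import queue
import threading
import time

def plan_chunk(obs):
    """Stand-in for VLA inference: slow, returns a chunk of actions."""
    time.sleep(0.05)                         # simulated inference latency
    return [f"action_{obs}_{i}" for i in range(4)]

def control_loop(n_chunks=3):
    """Execute the current chunk while the next one is being planned."""
    executed = []
    chunk = plan_chunk(0)                    # initial plan: robot waits once
    for t in range(1, n_chunks + 1):
        out = queue.Queue()
        planner = threading.Thread(target=lambda o=t: out.put(plan_chunk(o)))
        planner.start()                      # plan the next chunk in background
        for action in chunk:                 # ...while acting on the current one
            time.sleep(0.02)                 # simulated actuation time
            executed.append(action)
        planner.join()
        chunk = out.get()                    # switch to the freshly planned chunk
    return executed

print(len(control_loop()))  # 12 actions, with no planning pause after the first
```

Because executing a chunk here takes longer than planning one, inference latency is fully hidden after the initial plan, which is the effect RTC is after.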
Physical intelligence added pi-05 to the openpi repo: pi05-base, pi05-droid, pi05-libero. Also added PyTorch training code.
This should be a straight model upgrade over pi0 in all aspects. See, e.g., pi05_droid leading previous models in open RoboArena evals.
GitHub.
www.pi.website
A VLA with Open-World Generalization
Our latest generalist policy, π0.5, extends π0 and enables open-world generalization. Our new model can control a mobile manipulator to clean up an entirely new kitchen or bedroom.
Baidu launched ERNIE X1.1
In benchmark evaluations, it surpasses DeepSeek R1-0528 and performs on par with GPT-5 and Gemini 2.5 Pro.
Built on the foundation of ERNIE 4.5, the model is enhanced with extensive mid-training and post-training, including end-to-end reinforcement learning.
Available on ERNIE Bot, Wenxiaoyan app and MaaS platform Qianfan (via API)
Alibaba dropped an open-source Python framework to build multi-agent applications.
Build AI agents visually with MCP tools, memory, RAG, reasoning, and tracing.
GitHub
GitHub - agentscope-ai/agentscope: AgentScope: Agent-Oriented Programming for Building LLM Applications
AgentScope: Agent-Oriented Programming for Building LLM Applications - agentscope-ai/agentscope
Claude can now create and edit files.
Turn conversations into Excel spreadsheets, documents, PowerPoint slide decks, and PDFs directly.
Claude has access to a private computer environment where it can write code and run programs.
Claude
Claude can now create and edit files | Claude
Describe what you need and get back ready-to-use spreadsheets, documents, presentations, and PDFs instead of just text responses. Update: Now generally available for paid plans with network and egress controls (October 21, 2025).
Chinese researchers have unveiled SpikingBrain-1.0, a new AI system that mimics brain neurons for highly efficient training with minimal data.
Trained entirely on domestic GPUs, it matches Transformer-based models while using only ~2% of the data.
Its strength on ultra-long sequences makes it well suited to fields like law, medicine, physics, and genomics.
The team has open-sourced the model, released a public demo, and published a large bilingual technical report.
GitHub.
Models.
GitHub
GitHub - BICLab/SpikingBrain-7B: Spiking Brain-inspired Large Models, integrating hybrid efficient attention, MoE modules and spike…
Spiking Brain-inspired Large Models, integrating hybrid efficient attention, MoE modules and spike encoding into its architecture - BICLab/SpikingBrain-7B
Wow! Microsoft will use Anthropic models to power some features of Office 365 Copilot
Why? Microsoft product leaders genuinely say they are better than OpenAI models for certain tasks.
The Information
Microsoft to Buy AI From Anthropic in Partial Shift From OpenAI
Microsoft is taking its biggest step to lessen reliance on OpenAI’s artificial intelligence by embracing the startup’s bitter rival Anthropic to power its most important software business. Microsoft will pay to use Anthropic’s technology for some AI features…
Coinbase introduced the x402 Bazaar: The open, machine-readable discovery layer for x402.
Ecosystem where specialized AI services, data feeds, and APIs can thrive. A search engine for agents.
API providers: The x402 Bazaar means distribution.
List your x402 endpoint – its schema, price, a clear description – and suddenly, AI agents and developers building on x402 can find your service. This is how you tap into the coming agentic economy. Permissionless and open.
Developers and AI Agents: Your agent doesn't need pre-baked integrations for every service. It can query the Bazaar, find a service matching its requirements, and call it using x402. No keys, no pre-funding dozens of accounts.
Agents can become dynamic, autonomous entities.
Services priced, discovered, and consumed autonomously by machines. This can unlock long-tail API development and specialized AI services at a scale we haven't seen before.
Agent A needs data → finds Agent B's API → pays → gets data.
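That discover → pay → fetch loop can be sketched as follows (a hypothetical illustration of the HTTP 402-style flow x402 builds on; the index contents, endpoint URL, field names, and wallet are all invented, not the real Bazaar API):

```python
# Toy sketch of the x402 Bazaar loop: discover a service, hit the paywall,
# pay, and fetch. Everything below is invented for illustration.
BAZAAR = {
    "weather": {"url": "https://example.com/weather", "price_usdc": 0.01,
                "description": "current conditions by city"},
}

def discover(query):
    """Agent A searches the index for a service matching its need."""
    return next(listing for name, listing in BAZAAR.items() if query in name)

def call_with_x402(listing, wallet):
    """x402-style flow: a first request would answer HTTP 402 with the
    price; the agent pays from its wallet and retries with payment proof."""
    if wallet["usdc"] < listing["price_usdc"]:
        raise RuntimeError("insufficient funds")
    wallet["usdc"] -= listing["price_usdc"]      # settle the micropayment
    return {"data": f"payload from {listing['url']}", "paid": True}

wallet = {"usdc": 1.00}
listing = discover("weather")
result = call_with_x402(listing, wallet)
print(result["paid"], round(wallet["usdc"], 2))  # True 0.99
```

The design removes per-service API keys and pre-funded accounts: discovery, pricing, and payment all happen in-band at request time.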
Coinbase
Introducing x402 Bazaar: An index for self-improving AI agents
TL;DR: x402 Bazaar is the first discovery layer for agentic commerce. It gives agents a single place to find, interact with, and pay for new services - unlocking dynamic, self-improving agents that can evolve as the ecosystem grows.
Salesforce introduced SFR-DeepResearch (SFR-DR): RL-trained autonomous agents that can reason, search, and code their way through deep research tasks.
SFR-DR agents are trained to operate independently, without pre-defined multi-agent workflows. They autonomously plan, reason, and propose and take actions as defined by their tools.
SFR-DR-20B achieves 28.7% on Humanity's Last Exam (text-only) using only web search, browsing, and Python interpreter, surpassing DeepResearch with OpenAI o3 and Kimi Researcher.
SFR-DR agents are also trained to manage their own memory by summarizing previous results when context becomes limited. This gives them a virtually unlimited context window, enabling long-horizon tasks.
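The summarize-when-full pattern behind that memory management can be sketched as follows (an illustration of the general technique, not Salesforce's implementation; the budget and `summarize` stub stand in for token counting and an LLM call):

```python
def summarize(history):
    """Stand-in for an LLM call that compresses earlier tool results."""
    return f"<summary of {len(history)} earlier steps>"

def run_agent(steps, budget=5):
    """Keep acting indefinitely by folding old context into a summary."""
    context = []
    for step in steps:
        if len(context) >= budget:            # context window nearly full
            context = [summarize(context)]    # collapse history to a summary
        context.append(step)                  # then record the new result
    return context

ctx = run_agent([f"tool_result_{i}" for i in range(12)])
print(len(ctx))  # bounded by the budget no matter how many steps run
```

With this loop the visible context stays bounded while the agent can take arbitrarily many actions, which is what makes long-horizon research tasks feasible.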
arXiv.org
SFR-DeepResearch: Towards Effective Reinforcement Learning for...
Equipping large language models (LLMs) with complex, interleaved reasoning and tool-use capabilities has become a key focus in agentic AI research, especially with recent advances in...
A new research paper from Thinking Machines (ex-OpenAI team): Why LLMs Give Different Answers to the Same Question (And How to Fix It)
Ever notice that ChatGPT gives you slightly different responses when you ask the same question multiple times? Even at temperature 0, where the model should theoretically always pick the most likely token?
Most people assume this happens because of sampling randomness or GPU parallelization quirks. The conventional wisdom goes something like this: "GPUs do parallel calculations, floating-point math isn't associative, so results vary depending on which threads finish first."
This explanation isn't wrong, but it misses the real culprit. Horace He and the team at Thinking Machines dug deeper and found something more fundamental: batch invariance.
Here's what's actually happening: when you send a request to an LLM API, your output depends not just on your input, but on how many other people are using the service at the same time.
The server batches requests together for efficiency, and the batch size affects the numerical computations.
Even though each individual operation might be deterministic, the same input can produce different outputs depending on whether it's processed alone or with 10, 100, or 1000 other requests.
Think of it this way: you ask a question, but the answer changes based on how crowded the "room" is when you ask it.
This work challenges a common attitude in ML: "our systems are already probabilistic, so what's a little more randomness?" The researchers argue this is defeatist. With careful engineering, we can understand and eliminate these sources of nondeterminism.
They've open-sourced their implementation on top of vLLM, making it possible for others to achieve truly deterministic LLM inference today.
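The floating-point root cause is easy to demonstrate: addition is not associative, so any change in reduction order, which is exactly what a different batch size induces inside the kernels, can change the result (a minimal illustration of the numerics, unrelated to the actual vLLM changes):

```python
# IEEE-754 addition is not associative: grouping changes the result.
a, b, c = 0.1, 0.2, 0.3
print((a + b) + c == a + (b + c))   # False: 0.6000000000000001 vs 0.6

# The same effect at "batch" scale: summing identical numbers in two
# different chunkings (as different batch sizes would) can disagree.
xs = [0.1] * 10
whole = sum(xs)                      # one long reduction
halves = sum(xs[:5]) + sum(xs[5:])   # two partial reductions, then combine
print(whole == halves)               # False: 0.9999999999999999 vs 1.0
```

Batch invariance means choosing kernel reduction strategies that produce bit-identical results regardless of how many requests share the batch, which is what the open-sourced vLLM work achieves.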
Thinking Machines Lab
Defeating Nondeterminism in LLM Inference
Reproducibility is a bedrock of scientific progress. However, it’s remarkably difficult to get reproducible results out of large language models.
For example, you might observe that asking ChatGPT the same question multiple times provides different results.…