OpenAI released GPT-5-Codex — a version of GPT-5 further optimized for agentic coding in Codex.
It is available in the Codex CLI, IDE extension, web, and mobile, and for code reviews in GitHub.
OpenAI
Introducing upgrades to Codex
Codex just got faster, more reliable, and better at real-time collaboration and tackling tasks independently anywhere you develop—whether via the terminal, IDE, web, or even your phone.
ByteDance introduced EMPG, a framework that recalibrates the learning signal using the agent's own uncertainty.
Compared with GRPO and DAPO, it achieves promising gains on agent benchmarks such as WebShop, ALFWorld, and Deep Search.
Paper.
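To make "recalibrating the learning signal using the agent's own uncertainty" concrete, here is a minimal sketch. It is not the formula from the EMPG paper: the function names, the exponential weighting, and the parameter k are illustrative assumptions. The idea shown is only that steps where the policy was confident (low entropy) get a larger share of the update than steps where it was uncertain.

```python
import math
from typing import List


def token_entropy(probs: List[float]) -> float:
    """Shannon entropy (in nats) of one action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def modulate_advantages(advantages: List[float],
                        step_entropies: List[float],
                        k: float = 1.0) -> List[float]:
    """Rescale per-step advantages by a confidence weight (hypothetical scheme).

    Entropies are min-max normalized to [0, 1], mapped to exp(-k * h),
    and the weights are renormalized to mean 1 so that the overall
    update magnitude stays comparable to the unmodulated case.
    """
    if len(advantages) != len(step_entropies):
        raise ValueError("advantages and step_entropies must have equal length")
    if not advantages:
        return []
    lo, hi = min(step_entropies), max(step_entropies)
    span = hi - lo
    # If all steps are equally uncertain, no step is favored.
    norm = [0.0 if span == 0 else (h - lo) / span for h in step_entropies]
    weights = [math.exp(-k * h) for h in norm]
    mean_w = sum(weights) / len(weights)
    return [a * w / mean_w for a, w in zip(advantages, weights)]


# Usage: two confident steps (low entropy) and one uncertain step.
scaled = modulate_advantages([1.0, 1.0, -1.0], [0.1, 1.0, 0.1])
```

In this sketch a confident correct step is amplified, a confident wrong step is penalized more strongly, and the uncertain step contributes less either way.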
Google launched a new protocol for agent-driven purchases
Google announced a new open protocol for purchases initiated by AI agents, automated software programs that can shop and make decisions on behalf of users. Per Fortune, the payments protocol supports credit cards and stablecoins and was built with Coinbase, the Ethereum Foundation, and over 60 partners. GitHub.
Called the Agent Payments Protocol (AP2), the system is meant to be interoperable between AI platforms, payment systems and vendors, providing a traceable paper trail for each transaction.
In collaboration with cryptocurrency outfits Coinbase, MetaMask, and the Ethereum Foundation, Google also produced an extension that would integrate the cryptocurrency-oriented x402 protocol, allowing for AI-driven purchasing from crypto wallets.
A number of other tech companies are working on their own agentic purchasing systems, most notably Perplexity, which offers a Buy With Pro service in its agentic browser. The payment provider Stripe also produces software tools for agentic purchasing on its platform, though they are not as comprehensive as AP2.
TechCrunch
Google launches new protocol for agent-driven purchases | TechCrunch
Called the Agent Payments Protocol (AP2), the system is meant to be interoperable between AI platforms, payment systems, and vendors.
That's a lot of money for robots: Figure has exceeded $1B in funding at a $39B post-money valuation
The round was led by Parkway Venture Capital with significant investments from Brookfield Asset Management, NVIDIA, Macquarie Capital, Intel Capital, Align Ventures, Tamarack Global, LG Technology Ventures, Salesforce, T-Mobile Ventures, and Qualcomm Ventures.
The new funding will support Figure's momentum across three core areas:
1. Scaling humanoid robots into homes & commercial operations
2. Building next-generation GPU infrastructure to accelerate training & simulation
3. Launching advanced data collection efforts for Helix
FigureAI
Figure Exceeds $1B in Series C Funding at $39B Post-Money Valuation
Tongyi Lab dropped half a dozen new papers, most focused on Deep Research agents.
1. Tongyi DeepResearch: Open-source DeepResearch Agent
• First OSS web agent matching OpenAI’s DeepResearch
• SOTA on HLE (32.9), BrowseComp (43.4/46.7), xbench-DeepSearch (75)
• Full-stack pipeline: Agentic CPT → SFT → RL w/ synthetic data
• Native ReAct & new Heavy Mode (IterResearch) for long-horizon tasks
2. WebResearcher: Unbounded reasoning for long-horizon agents
• IterResearch: Iterative deep-research paradigm (avoids context suffocation & noise)
• WebFrontier: Tool-augmented data engine for complex research tasks
• Parallel agents + synthesis → scalable, evidence-grounded reasoning
• Beats proprietary systems: 36.7% on HLE, 51.7% on BrowseComp
3. AgentScaler: Towards General Agentic Intelligence
• Scales environments for diverse, realistic tool-calling
• Fully simulated envs = verifiable + scalable interactions
• SOTA on τ-bench, τ²-bench, ACEBench
• AgentScaler-30B matches 1T-parameter models with far fewer params
4. AgentFounder: Scaling Agents via Continual Pre-training
• First to propose Agentic CPT → builds agentic foundation models before fine-tuning
• Solves post-training bottlenecks (capabilities + alignment conflict)
• Data synthesis: First-order (planning/actions) + Higher-order (multi-step decision)
• Two-stage training (32K → 128K context)
• SOTA: 39.9% BrowseComp-en, 72.8% GAIA
5. WebWeaver: Structuring Web-Scale Evidence for Deep Research
• Dual-agent framework (Planner + Writer)
• Dynamic outlines: search ↔ refine ↔ search (human-like loop)
• Memory-grounded, section-by-section synthesis → avoids long-context failures
• SOTA across DeepResearch Bench, DeepConsult, DeepResearchGym
• Produces reliable, well-cited, structured reports
6. ReSum: Long-Horizon Web Agents Without Context Limits
• Problem: ReAct hits context limits in long searches (32k tokens)
• Solution: ReSum periodically compresses history → compact reasoning states
• ReSumTool-30B: specialized summarizer extracts key evidence & gaps
• ReSum-GRPO (RL): trains agents to adapt summaries into reasoning
• +4.5% over ReAct baseline, +8.2% with RL across web search benchmarks.
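The ReSum idea of periodically compressing history into a compact state can be sketched as below. This is only an illustration under stated assumptions: the whitespace token counter, the `naive_summarize` placeholder (which stands in for a real summarizer such as ReSumTool-30B), and the function names are all hypothetical and are not taken from the paper's code.

```python
from typing import Callable, List, Tuple


def count_tokens(text: str) -> int:
    """Crude whitespace token count; a real agent would use its tokenizer."""
    return len(text.split())


def naive_summarize(history: List[str]) -> str:
    """Placeholder summarizer: keep the first three words of each turn."""
    heads = [" ".join(turn.split()[:3]) for turn in history]
    return "SUMMARY: " + " | ".join(heads)


def run_with_compression(
    observations: List[str],
    budget: int,
    summarize: Callable[[List[str]], str] = naive_summarize,
) -> Tuple[List[str], int]:
    """Append each observation; when the history exceeds the token
    budget, replace the whole history with a single summary turn."""
    history: List[str] = []
    compressions = 0
    for obs in observations:
        history.append(obs)
        if sum(count_tokens(t) for t in history) > budget:
            history = [summarize(history)]
            compressions += 1
    return history, compressions


# Usage: four 5-token observations against a 12-token budget.
final_history, n_compressions = run_with_compression(
    ["alpha beta gamma delta epsilon"] * 4, budget=12
)
```

The agent then keeps reasoning from the summary turn instead of the full transcript, which is what lets a long search continue past a fixed context window.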
Anthropic shipped two updates for developers using Claude
1. Claude in Xcode 26: Claude Sonnet 4 is now available as a coding assistant directly in Apple's IDE. Developers can connect their Claude account to access natural language code interaction, documentation generation, and inline editing tools. The integration shares usage limits with other Claude platforms and works with Pro, Max, and premium Team/Enterprise plans.
2. Claude Code UX update: a small but useful interface improvement. Keywords like "think" and "ultrathink" are now highlighted when they would trigger extended thinking mode. Use /t to disable the mode, preventing accidental activation when these words appear in regular prompts.
Anthropic
Claude is now generally available in Xcode
Connect your Claude account to Xcode 26 for AI-powered coding assistance. Debug, refactor, and build Apple apps faster with Claude Sonnet 4 by Anthropic.
New a16z benchmark: Which AI-native Office tools actually work?
First, the market splits into two camps:
- Generalists (Assistants: Manus, Genspark; Browsers: Dia, Comet; Extensions: MaxAI, Monica) - flexible but less polished.
- Specialists (Email: Fyxer, Serif; Slides: Gamma, Chronicle; Notes: Mem, Granola) - focused and refined in a single workflow.
a16z benchmarked both across office tasks: summarization, communication, file understanding, research, planning, and execution in 5 use cases.
Key Takeaways from Testing:
1. Specialists still win in PPT, email, and notes - but generalists are catching up, boosted by rapid model progress (e.g. Manus).
2. The horizontal race is heating up - even labs (Anthropic, OpenAI) are entering to own the “work UI”.
3. Convergence is inevitable - verticals are expanding categories, and horizontals are doubling down on use cases.
Vitalik Buterin presented Ethereum’s roadmap at the Japan Dev Conference today:
1. short-term goals focus on scaling and increasing L1 gas limits;
2. mid-term aims target cross-L2 interoperability and faster responsiveness;
3. long-term vision emphasizes a secure, simple, quantum-resistant, and formally verified minimalist Ethereum.
Perceptron AI introduced Isaac 0.1, a model that can understand and interact with the physical world
Isaac 0.1 is an open-source, open-weights model with 2B parameters. It matches or beats significantly larger models on core perception.
Perceptron was founded by the team behind Meta's Chameleon multimodal models.
Isaac is tuned for where the physical world needs intelligence. Capability alone isn’t enough—you need capability at cost, power, and tail-latency constraints.
In-context learning for perception: show a few annotated examples (defects, safety flags, etc.) and the model adapts in-prompt. No YOLO-style fine-tuning or custom detector stacks.
marketing.perceptron.inc
A layer of intelligence for the physical world.
We are a research company building the future of Physical AGI.
Stanford introduced Paper2Agent, a system that turns static research papers into interactive AI agents that you can chat with and that can use the paper's tools and data.
The key idea is to represent a paper and its codebase as a Model Context Protocol (MCP) server, which provides the context for an AI agent to create a paper-specific agent.
GitHub.
MIT physicists have discovered a new form of magnetism, termed p-wave magnetism.
This breakthrough paves the way for a new class of ultrafast, compact, energy-efficient, and nonvolatile magnetic memory devices.
MIT News
Physicists observe a new form of magnetism for the first time
MIT physicists demonstrated a new form of magnetism, called p-wave magnetism, that could one day be harnessed to build faster, denser, and less power-hungry “spintronic” memory chips.
SakanaAI presented Robust Agentic CUDA Kernel Optimization
• Fuses ops, boosts forward/backward passes, outperforms torch baselines
• Agentic LLM pipeline: PyTorch → CUDA → evolutionary runtime optimization
• Soft-verification: LLMs flag incorrect kernels (↑30% verification success)
• robust-kbench: new benchmark for real kernel performance + correctness.
Paper.
GitHub
GitHub - SakanaAI/robust-kbench
Contribute to SakanaAI/robust-kbench development by creating an account on GitHub.
Chinese device assembler Luxshare has signed a deal with OpenAI to produce a consumer AI device, possibly the tiny robot.
OpenAI now appears to have multiple separate devices in the works, all planned to arrive next year.
CNBC
Apple-supplier Luxshare shares pop 10% on report of OpenAI hardware deal
Luxshare saw its shares jump about 10% following a report that the Chinese device assembler had signed a deal with OpenAI to produce a consumer AI device.
The first of Microsoft's many (connected!) Fairwater GB200 supercomputers is coming online.
It is 10x faster than the top system on the TOP500 list, with an innovative network that follows years of planning and design.
The Official Microsoft Blog
Inside the world’s most powerful AI datacenter
This week we have introduced a wave of purpose-built datacenters and infrastructure investments we are making around the world to support the global adoption of cutting-edge AI workloads and cloud services. Today in Wisconsin we introduced Fairwater, our…
DeepSeek introduced DeepSeek-V3.1-Terminus. The latest update builds on V3.1’s strengths while addressing key user feedback.
What’s improved?
1. Language consistency: fewer CN/EN mix-ups & no more random chars.
2. Agent upgrades: stronger Code Agent & Search Agent performance
OpenAI & NVIDIA announced partnership to deploy 10GW of NVIDIA Systems
OpenAI will build and deploy at least 10 gigawatts of AI datacenters with NVIDIA systems, representing millions of GPUs, for OpenAI’s next-gen AI infrastructure.
To support the partnership, NVIDIA intends to invest up to $100 billion in OpenAI progressively as each gigawatt is deployed.
The first gigawatt of NVIDIA systems will be deployed in the second half of 2026 on NVIDIA’s Vera Rubin platform.
OpenAI
OpenAI and NVIDIA announce strategic partnership to deploy 10 gigawatts of NVIDIA systems
OpenAI and NVIDIA announce a strategic partnership to deploy 10 gigawatts of AI datacenters powered by NVIDIA systems, with the first phase launching in 2026.
Alibaba introduced Qwen3-Omni — the first natively end-to-end omni-modal AI unifying text, image, audio & video in one model — no modality trade-offs
1. SOTA on 22/36 audio & AV benchmarks
2. 119 text languages / 19 speech-input languages / 10 speech-output languages
3. 211ms latency | 30-min audio understanding
4. Fully customizable via system prompts
5. Built-in tool calling
6. Open-source Captioner model
What’s Open-Sourced?
- Qwen3-Omni-30B-A3B-Instruct
- Qwen3-Omni-30B-A3B-Thinking
- Qwen3-Omni-30B-A3B-Captioner
GitHub.
HuggingFace.
MS Models.
Demo.
chat.qwen.ai
Qwen Chat
Qwen Chat offers comprehensive functionality spanning chatbot, image and video understanding, image generation, document processing, web search integration, tool utilization, and artifacts.
Nvidia presented ReaSyn: Rethinking molecule synthesizability
• Treats synthesis like CoT with Chain-of-Reaction (CoR) steps
• Each reaction = reasoning step → richer supervision & step-by-step learning
• Adds RL finetuning + test-time scaling for better optimization
• SOTA in reconstruction, goal-directed design & hit expansion
• Broad coverage of synthesizable chemical space → practical for drug discovery
Kaggle is hosting the 5-Day AI Agents Intensive course with Google on November 10-14.
This no-cost course is designed to help you explore the foundations and practical applications of AI agents.
Withgoogle
5-Day AI Agents Intensive Course with Google
Join our 5-day AI Agents Intensive Course with Google, November 10–14, to learn how to build, evaluate, and deploy agents.