Tencent presented Hunyuan-TurboS
- Hybrid Transformer-Mamba MoE (56B active params) trained on 16T tokens
- Dynamically switches between rapid responses and deep "thinking" modes
- Overall top 7 on LMSYS Chatbot Arena.
❤3
VanEck will launch a private digital assets fund in June 2025 focused on the Avalanche ecosystem.
The fund will invest in projects with long-term token utility around the TGE stage across sectors such as gaming, financial services, payments, and AI, while allocating idle capital to Avalanche-native RWA products to maintain onchain liquidity.
GlobeNewswire News Room
VanEck Prepares to Launch PurposeBuilt Fund to Invest in Real-World Applications on Avalanche
Managed by VanEck’s Digital Assets Alpha Fund investment team, the VanEck PurposeBuilt Fund will invest in Avalanche ecosystem founders building scalable...
👏3
G42 and OpenAI announced Stargate UAE
#Stargate UAE: a next-generation 1GW AI compute cluster that will be built by G42 and operated by OpenAI and Oracle.
The collaboration will also include Cisco and SoftBank Group. NVIDIA will supply the latest Blackwell GB300 systems. This will be at the heart of the 5GW AI campus announced last week.
Invent a Better Everyday | Abu Dhabi, UAE | G42
Invent a Better Everyday | Abu Dhabi, UAE | G42 | Global Tech Alliance Launches Stargate UAE
👍3
Researchers introduced MedBrowseComp, a challenging deep research benchmark for LLM agents in medicine
MedBrowseComp is the first benchmark that tests the ability of agents to retrieve & synthesize multi-hop medical facts from oncology knowledge bases.
moreirap12.github.io
MedBrowseComp
MedBrowseComp project page
🔥4
Claude 4 is here, and it’s Anthropic’s vision for the future of agents
More details about Claude 4:
—Both models are hybrid models
—Opus 4 is great at understanding codebases and “the right choice” for agentic workflows
—Sonnet 4 excels at everyday tasks and is your “daily go-to”.
Coding agents are a huge theme here at the event and clearly a major focus for what’s coming next.
—Claude 4 has significantly greater agentic capabilities
—A new code execution tool
—Claude Code coming to VS Code and JetBrains
—Can now run Claude Code in GitHub.
Some more details on Claude 4 Opus:
—Matches or beats the best models in the world
—SOTA for coding, agentic tool use, and writing
—Memory capabilities across sessions
—Extended thinking mode for complex problem-solving
—200K context window with 32K output tokens.
Claude Code:
—Now generally available
—Integrates with VS Code and JetBrains IDEs
—You can now see changes live inline in your editor
—A new Claude Code SDK for more flexibility.
If you want to read more about Sonnet & Opus 4, including a bunch of alignment and reward hacking findings, check out the model card.
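For developers, these capabilities surface through the existing Anthropic Messages API. Below is a minimal sketch, assuming the `anthropic` Python SDK, an `ANTHROPIC_API_KEY` in the environment, and `claude-opus-4-20250514` as the Opus 4 model ID; treat the exact model string and token budgets as assumptions to verify against Anthropic's docs.

```python
# Minimal sketch: calling Claude Opus 4 with extended thinking enabled.
# Assumptions: the `anthropic` Python SDK is installed, ANTHROPIC_API_KEY is set,
# and "claude-opus-4-20250514" is the Opus 4 model ID (check Anthropic's docs).
import anthropic

client = anthropic.Anthropic()  # picks up ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-opus-4-20250514",
    max_tokens=4096,
    # Extended thinking: give the model an internal reasoning budget
    # before it produces the visible answer.
    thinking={"type": "enabled", "budget_tokens": 2048},
    messages=[
        {"role": "user", "content": "Refactor this function to be iterative: ..."}
    ],
)

# The response interleaves "thinking" blocks with regular "text" blocks;
# print only the visible answer text here.
for block in response.content:
    if block.type == "text":
        print(block.text)
```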
Anthropic
Introducing Claude 4
Discover Claude 4's breakthrough AI capabilities. Experience more reliable, interpretable assistance for complex tasks across work and learning.
❤6👍3
ByteDance introduced MMaDA: Multimodal Large Diffusion Language Models
MMaDA is a novel class of multimodal diffusion foundation models designed to achieve superior performance across diverse domains such as textual reasoning, multimodal understanding, and text-to-image generation.
It surpasses LLaMA-3-7B and Qwen2-7B in textual reasoning, SDXL and Janus in text-to-image generation, and Show-o and SEED-X in multimodal understanding.
3 key innovations:
1. A unified diffusion architecture with a shared probabilistic formulation and a modality-agnostic design, eliminating the need for modality-specific components.
2. A mixed long chain-of-thought (CoT) fine-tuning strategy that curates a unified CoT format across modalities.
3. UniGRPO, a unified policy-gradient-based RL algorithm specifically tailored for diffusion foundation models.
GitHub.
arXiv.org
MMaDA: Multimodal Large Diffusion Language Models
We introduce MMaDA, a novel class of multimodal diffusion foundation models designed to achieve superior performance across diverse domains such as textual reasoning, multimodal understanding, and...
👏4
Humans can now see near-infrared light! Very cool tech development in biophotonics: engineered contact lenses convert invisible NIR signals into visible colors, enabling wearable, power-free NIR vision.
This has the potential to shift our perceptual boundaries, showing that the brain can integrate novel spectral inputs when they are mapped onto familiar visual codes, reframing light-based information processing and sensory integration.
❤4
AI models are finding zero-day vulnerabilities. A new era for cybersecurity.
Sean Heelan's Blog
How I used o3 to find CVE-2025-37899, a remote zeroday vulnerability in the Linux kernel’s SMB implementation
In this post I’ll show you how I found a zeroday vulnerability in the Linux kernel using OpenAI’s o3 model. I found the vulnerability with nothing more complicated than the o3 API…
The World Economic Forum has released a report on Asset Tokenization in Financial Markets.
Highlights
1. Tokenization offers a new model of digital asset ownership that enhances transparency, efficiency and accessibility.
2. This report analyses asset class use cases in issuance, securities financing and asset management, identifying factors that enable successful tokenization implementation.
3. Key differentiators include a shared system of record, flexible custody, programmability, fractional ownership and composability across asset types. These features can democratize access to financial markets and modernize infrastructure.
4. While the benefits are demonstrated, adoption is slowed by challenges such as legacy infrastructure, regulatory fragmentation, limited interoperability and liquidity issues.
5. Effective deployment requires phased approaches and strategic coordination among financial institutions, regulators and technology providers. Factors affecting design decisions – such as ledger type, settlement mechanisms and market operating hours – must also be carefully considered.
6. Ultimately, tokenization holds promise for a more inclusive and efficient financial system, provided stakeholders align on standards, safeguards and scalable solutions.
7. Tokenization is expected to reshape financial markets by increasing transparency, efficiency, speed, and inclusivity—paving the way for more resilient and accessible financial systems.
❤4
Singapore's Sharpa unveiled SharpaWave, a lifelike robotic hand
—Features 22 DOF, balancing dexterity and strength
—Each fingertip has 1,000+ tactile sensing pixels and 5 mN pressure sensitivity
—AI models adapt the hand's grip and modulate force
HouseBots
Sharpa Unveils SharpaWave: The World’s Most Tactile Dexterous Robot Hand — HouseBots
Singapore-based robotics startup Sharpa is redefining what robotic manipulation means with the debut of its latest innovation: SharpaWave , a 22-degree-of-freedom (DOF) dexterous hand that brings human-like precision and speed to the world of robotics.
🔥6
Researchers introduced SPORT, a multimodal agent that explores tool usage without human annotation.
It leverages step-wise DPO to further enhance tool-use capabilities following SFT.
SPORT achieves improvements on the GTA and GAIA benchmarks.
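For context, here is a minimal sketch of what a step-wise DPO objective looks like: preference pairs are scored per tool-call step rather than per full trajectory, and the standard DPO loss is applied at each step. This is a generic illustration under assumed tensor shapes and a placeholder beta, not SPORT's actual implementation.

```python
# Minimal sketch of a step-wise DPO loss (generic illustration, not SPORT's exact code).
# Inputs are per-step log-probabilities of the chosen vs. rejected tool-call action,
# under the policy being trained and under a frozen reference (SFT) model.
import torch
import torch.nn.functional as F

def stepwise_dpo_loss(
    policy_chosen_logps: torch.Tensor,    # (num_steps,) log pi(a_chosen | state)
    policy_rejected_logps: torch.Tensor,  # (num_steps,) log pi(a_rejected | state)
    ref_chosen_logps: torch.Tensor,       # (num_steps,) log pi_ref(a_chosen | state)
    ref_rejected_logps: torch.Tensor,     # (num_steps,) log pi_ref(a_rejected | state)
    beta: float = 0.1,                    # assumed temperature; tune per setup
) -> torch.Tensor:
    # Implicit per-step rewards relative to the reference model.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Standard DPO: maximize log-sigmoid of the margin, averaged over steps.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```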
Google introduced Lyria RealTime, a new experimental interactive music generation model that lets anyone interactively create, control, and perform music in real time.
It is available via the Gemini API, and you can try the demo app in Google AI Studio.
Amazon added AI-generated audio discussions about certain products, based on customer reviews and web searches.
About Amazon
Amazon's new generative AI-powered audio feature synthesizes product summaries and reviews to make shopping easier
The new AI shopping experts help save time by compiling research and providing product highlights for customers from product pages, reviews, and insights.
Anthropic is just now rolling out voice mode in beta on mobile.
Try starting a voice conversation and asking Claude to summarize your calendar or search your docs. Voice mode in beta is available in English and coming to all plans in the next few weeks.
Game-Changer for AI: Meet the Low-Latency-Llama Megakernel
Buckle up, because a new breakthrough in AI optimization just dropped, and it’s got even Andrej Karpathy buzzing.
The Low-Latency-Llama Megakernel is an approach to running models like Llama-1B faster and smarter on GPUs.
What’s the Big Deal?
Instead of splitting a neural network’s forward pass into multiple CUDA kernels (with pesky synchronization delays), this megakernel runs everything in a single kernel. Think of it as swapping a clunky assembly line for a sleek, all-in-one super-machine!
Why It’s Awesome:
1. No Kernel Boundaries, No Delays. By eliminating kernel switches, the GPU works non-stop, slashing latency and boosting efficiency.
2. Memory Magic. Threads are split into “loaders” and “workers.” While loaders fetch future weights, workers crunch current data, using 16KiB memory pages to hide latency.
3. Fine-Grained Sync. Without kernel boundaries, custom synchronization was needed. This not only solves the issue but unlocks tricks like early attention head launches.
4. Open Source. The code is fully open, so you can stop “torturing” your models with slow kernel launches (as the devs humorously put it) and optimize your own pipelines!
Why It Matters?
- Speed Boost. Faster inference means real-time AI applications (think chatbots or recommendation systems) with lower latency.
- Cost Savings. Optimized GPU usage reduces hardware demands, perfect for startups or budget-conscious teams.
- Flexibility. Open-source code lets developers tweak it for custom models or use cases.
Karpathy’s Take:
Andrej calls it “so so so cool,” praising the megakernel for enabling “optimal orchestration of compute and memory.” He argues that traditional sequential kernel approaches can’t match this efficiency.
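To make the loader/worker split described above concrete, here is a toy Python sketch of the same latency-hiding idea: one thread prefetches the next layer's weights into a small buffer while another computes the current layer. The real project is a hand-written CUDA megakernel with 16KiB shared-memory pages and on-GPU synchronization; this is only a CPU-side illustration, and every name in it is invented for the example.

```python
# Toy illustration of the loader/worker overlap (NOT the actual CUDA megakernel):
# a loader thread prefetches the next layer's weights while the worker computes
# the current layer, so memory latency is hidden behind compute.
import threading
import queue
import numpy as np

def run_pipelined_forward(x, weight_files):
    prefetched = queue.Queue(maxsize=1)  # double buffer: at most one layer ahead

    def loader():
        for path in weight_files:
            # In the real megakernel, "loader" lanes stream weight pages from HBM
            # into shared-memory pages; here we just load .npy files in the background.
            prefetched.put(np.load(path))
        prefetched.put(None)  # sentinel: no more layers

    threading.Thread(target=loader, daemon=True).start()

    h = x
    while True:
        w = prefetched.get()  # weights for the current layer, fetched concurrently
        if w is None:
            return h
        h = np.maximum(h @ w, 0.0)  # "worker" compute for one layer (matmul + ReLU)
```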
hazyresearch.stanford.edu
Look Ma, No Bubbles! Designing a Low-Latency Megakernel for Llama-1B
🆒5
Telegram + Grok = this summer https://xn--r1a.website/durov/422
Telegram
Pavel Durov
🔥 This summer, Telegram users will gain access to the best AI technology on the market. Elon Musk and I have agreed to a 1-year partnership to bring xAI’s chatbot Grok to our billion+ users and integrate it across all Telegram apps 🤝
💪 This also strengthens…
🆒5
Apple and Duke University introduced Interleaved Reasoning
Researchers train LLMs to alternate between thinking & answering.
Reducing Time-to-First-Token (TTFT) by over 80% and improving Pass@1 accuracy by up to 19.3%!
Market map for browser agents
New companies launch in the space every week, for both consumer and enterprise use cases. ManusAI is one of the most popular generalist consumer agents, and Athena Intelligence is already being used by companies like Anheuser-Busch.
Computer/browser use has become one of the most important frontiers for model capabilities, with OpenAI, Anthropic, and Google DeepMind dedicating teams to Operator, Claude Computer Use, and Project Mariner.
Open-source frameworks like Browser Use and Stagehand have become some of the most popular repos on GitHub, with tens of thousands of stars.
AI-first browsers are poised to disrupt the massive web browser market, with highly anticipated releases like Comet from Perplexity on the way. It's yet to be seen how Google integrates Project Mariner and other AI tools within Chrome.
🔥4
New paper from Google DeepMind: “Beyond Markovian: Reflective Exploration via Bayes-Adaptive RL for LLM Reasoning”
Researchers study why, how, and when LLMs should self-reflect and explore at test time—questions that conventional Markovian RL cannot fully answer.
HuggingFace
GitHub
🆒4