Boltz-2 is a new biomolecular foundation model that goes beyond AlphaFold3 and Boltz-1 by jointly modeling complex structures and binding affinities, a critical component towards accurate molecular design.
Boltz-2 is a new model capable of predicting not only structures but also binding affinities.
Boltz-2 is the first AI model to approach the accuracy of FEP (free energy perturbation) simulations while running more than 1000x faster.
All of it is open-sourced under the MIT license.
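As a rough illustration of how a joint structure-and-affinity prediction might be launched with the open-source package, here is a minimal sketch; the YAML schema and the --use_msa_server flag are assumptions that should be checked against the jwohlwend/boltz README.

```python
# Illustrative sketch: launching a Boltz-2 structure + affinity prediction via
# the open-source `boltz` CLI. The YAML fields below are assumptions; consult
# the jwohlwend/boltz README for the exact input format.
import subprocess
from pathlib import Path

input_yaml = """\
sequences:
  - protein:
      id: A
      sequence: MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ   # toy sequence
  - ligand:
      id: B
      smiles: CC(=O)Oc1ccccc1C(=O)O                # aspirin, for illustration
properties:
  - affinity:
      binder: B
"""

Path("complex.yaml").write_text(input_yaml)

# `boltz predict` is the documented entry point; flags may differ by version.
subprocess.run(["boltz", "predict", "complex.yaml", "--use_msa_server"], check=True)
```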
Paper
GitHub
GitHub - jwohlwend/boltz: Official repository for the Boltz biomolecular interaction models
Extract is an AI system developed by the UK Government’s AI Incubator team using Google’s Gemini model.
It aims to modernize the UK’s planning system by converting old, paper-based planning documents—such as blurry maps and handwritten notes—into clear, digital data in about 40 seconds.
Key points:
1. Speeds up processing of ~350,000 annual planning applications in England, supporting housing and infrastructure development.
2. How it works:
- Uses Gemini’s multimodal reasoning to extract critical information from text, handwritten notes, and low-quality map images.
- Identifies map features (e.g., boundaries, shaded areas) using tools like OpenCV, Ordnance Survey, and Segment Anything.
- Matches historical maps to modern equivalents using addresses, landmarks, and feature matching (e.g., LoFTR) to convert shapes into precise geographical coordinates (a rough sketch of this step follows the list).
3. Reduces council workload, simplifies processes, and frees staff for strategic planning. It supports the UK’s goal of building 1.5 million new homes.
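A minimal, self-contained sketch of that matching step under stated assumptions: plain OpenCV ORB features stand in for LoFTR, the file names are hypothetical, and the modern raster's geographic extent is a hard-coded bounding box rather than a real geotransform.

```python
# Rough sketch of the map-matching idea: align a scanned historical plan to a
# modern, georeferenced raster, then project a boundary polygon into lon/lat.
import cv2
import numpy as np

old = cv2.imread("historical_plan.png", cv2.IMREAD_GRAYSCALE)   # hypothetical files
new = cv2.imread("modern_basemap.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=4000)
kp_old, des_old = orb.detectAndCompute(old, None)
kp_new, des_new = orb.detectAndCompute(new, None)

matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des_old, des_new), key=lambda m: m.distance)[:200]

src = np.float32([kp_old[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp_new[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

# Project a boundary drawn on the old plan (pixel coords) onto the modern raster.
boundary_old = np.float32([[120, 80], [410, 95], [400, 330], [110, 300]]).reshape(-1, 1, 2)
boundary_new = cv2.perspectiveTransform(boundary_old, H).reshape(-1, 2)

# Assumed extent of the modern raster in WGS84, so pixels map linearly to
# coordinates; a real pipeline would read the raster's geotransform instead.
west, east, north, south = -0.52, -0.48, 51.62, 51.58
h, w = new.shape
lonlat = np.column_stack([
    west + boundary_new[:, 0] / w * (east - west),
    north + boundary_new[:, 1] / h * (south - north),
])
print(lonlat)
```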
Google
UK government harnesses Gemini to support faster planning decisions
Project_Pine_Tokenised_Financial_Markets_1749471407.pdf
1.7 MB
Project Pine - Tokenised Financial Markets by The Federal Reserve Bank of New York and BIS
Project Pine found that central banks could customise and deploy policy implementation tools using programmable smart contracts in a potential future state where commercial banks and other private sector financial institutions have widely adopted tokenisation for wholesale payments and securities settlement.
The project generated the prototype of a generic monetary policy implementation tokenised toolkit for potential further research and development by central banks across jurisdictions and currencies. The prototype was designed to be technically modifiable for different central banks' monetary policy frameworks and calibrated to conduct standard or emergency market operations.
The toolkit prototype was created in consultation with central banks' financial markets advisors from multiple jurisdictions, who helped outline the project scope and specific design requirements. It is not particular to any currency or jurisdiction. It can fulfil a common set of central bank implementation requirements, including paying interest on reserves, open market operations, and collateral management.
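As a purely conceptual toy (not the project's smart-contract prototype), the "paying interest on reserves" tool boils down to programmable logic of roughly this shape; the day-count convention and balances are illustrative.

```python
# Toy illustration of a programmable "interest on reserves" rule, in the spirit
# of a tokenised policy toolkit. Conceptual sketch only, not Project Pine's
# smart-contract prototype.
from dataclasses import dataclass, field

@dataclass
class ReserveLedger:
    rate_bps: int                                   # administered policy rate, in basis points
    balances: dict[str, float] = field(default_factory=dict)

    def set_rate(self, new_rate_bps: int) -> None:
        """Central bank adjusts the administered rate (a policy action)."""
        self.rate_bps = new_rate_bps

    def accrue_daily_interest(self) -> None:
        """Credit one day's interest to every bank's tokenised reserve balance."""
        daily_factor = self.rate_bps / 10_000 / 360  # ACT/360, for illustration
        for bank in self.balances:
            self.balances[bank] *= 1 + daily_factor

ledger = ReserveLedger(rate_bps=450, balances={"BANK_A": 1_000_000.0, "BANK_B": 250_000.0})
ledger.accrue_daily_interest()
print(ledger.balances)
```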
Alibaba's RL LLM training library: ROLL
ROLL is built upon several key modules that serve its target user groups effectively (a rough end-to-end sketch follows the list):
1. A single-controller architecture combined with an abstraction of the parallel worker simplifies the development of the training pipeline.
2. The parallel strategy and data transfer modules enable efficient and scalable training.
3. The rollout scheduler offers fine-grained management of each sample's lifecycle during the rollout stage.
4. The environment worker and reward worker support rapid and flexible experimentation with agentic RL algorithms and reward designs.
Finally, AutoDeviceMapping allows users to assign resources to different models flexibly across various stages.
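The class and method names below are hypothetical, not ROLL's actual API; the sketch only illustrates how a single controller can drive environment, reward, and training workers in the way the module list describes.

```python
# Hypothetical sketch of the single-controller pattern: one driver owns each
# sample's lifecycle, schedules rollouts, scores them with a reward worker, and
# hands batches to a trainer. Names are illustrative, not ROLL's real API.
from dataclasses import dataclass

@dataclass
class Sample:
    prompt: str
    response: str = ""
    reward: float = 0.0

class EnvironmentWorker:
    def rollout(self, sample: Sample, policy) -> Sample:
        sample.response = policy(sample.prompt)              # generate a trajectory
        return sample

class RewardWorker:
    def score(self, sample: Sample) -> Sample:
        sample.reward = float("4" in sample.response)        # toy reward design
        return sample

class Controller:
    """Single controller: manages every sample in a rollout batch end to end."""
    def __init__(self, env: EnvironmentWorker, reward: RewardWorker):
        self.env, self.reward = env, reward

    def step(self, prompts, policy, trainer):
        batch = [self.reward.score(self.env.rollout(Sample(p), policy)) for p in prompts]
        trainer(batch)                                       # e.g. a PPO/GRPO update
        return batch

controller = Controller(EnvironmentWorker(), RewardWorker())
controller.step(["2+2=?"], policy=lambda p: "the answer is 4", trainer=lambda b: None)
```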
GitHub.
Autonomous Agents That Think, Remember, and Evolve, from the AWS team
This project showcases how autonomous agents can move beyond simple tasks to reason, remember, and adapt
- powered by Mem0, AWS, and the Strands Agents SDK
From remembering past findings to adapting strategies on the fly, Cyber-AutoAgent uses long- and short-term memory to build real expertise - one interaction at a time.
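A hedged sketch of that memory loop: Memory.add and Memory.search follow Mem0's documented interface, but exact signatures and return shapes should be verified against the mem0 docs, and the llm callable is a stand-in.

```python
# Sketch of a long-term-memory loop using a Mem0-style store: recall prior
# findings, act with them in context, then persist what was learned. Return
# shapes from search() vary across mem0 versions, so they are handled generically.
from mem0 import Memory

memory = Memory()                 # long-term store; short-term memory is the current prompt
AGENT_ID = "cyber-autoagent"      # hypothetical identifier

def run_step(task: str, llm) -> str:
    # Recall prior findings relevant to this task (long-term memory).
    results = memory.search(task, user_id=AGENT_ID)
    hits = results.get("results", results) if isinstance(results, dict) else results
    context = "\n".join(str(h) for h in hits) if hits else "no prior findings"

    # Act with the task and recalled findings in the prompt (short-term memory).
    answer = llm(f"Prior findings:\n{context}\n\nTask: {task}")

    # Persist the new finding so the next run starts from accumulated expertise.
    memory.add(f"Task: {task}\nFinding: {answer}", user_id=AGENT_ID)
    return answer
```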
GitHub.
AWS
AWS Builder Center
At WWDC, Apple introduced a new generation of LLMs developed to enhance Apple Intelligence features.
Also introduced the new Foundation Models framework, which gives app developers direct access to the on-device foundation language model.
Apple Machine Learning Research
Updates to Apple’s On-Device and Server Foundation Language Models
GALBOT announced OpenWBT – an open-source, whole-body humanoid VR teleoperation system using Apple Vision Pro.
It supports Unitree G1 and H1 robots, enabling operators to control movements like walking, squatting, bending, grasping, and lifting.
GitHub
GitHub - GalaxyGeneralRobotics/OpenWBT: Official implementation of OpenWBT.
Microsoft introduced Code Researcher - a deep research agent for large systems code and commit history.
Achieves a 58% crash resolution rate on a benchmark of crashes in the Linux kernel, a complex codebase with 28M LOC & 75K files.
Microsoft Research
Code Researcher: Deep Research Agent for Large Systems Code and Commit History - Microsoft Research
Mistral introduced Magistral, its first reasoning model, designed to excel in domain-specific, transparent, and multilingual reasoning.
Magistral is available in two variants:
1. Magistral Small (24B parameter open-source version)
2. Magistral Medium (enterprise version).
mistral.ai
Magistral | Mistral AI
Stands to reason.
SEAL and Red Team at Scale AI presented a position paper outlining what they’ve learned from red teaming LLMs so far—what matters, what’s missing, and how model safety fits into broader system safety and monitoring.
Scale AI
It’s Time to Rethink Red Teaming | Scale
A roadmap for testing AI systems by prioritizing product specifications, realistic threats, and system-level awareness.
Modular + AMD: Breaking NVIDIA's AI Monopoly?
Modular just announced their software now works with AMD's latest AI chips, claiming major performance improvements.
Here's what this actually means.
Right now, if you want to run AI models, you basically have to use NVIDIA's expensive GPUs. AMD makes competitive chips, but the software ecosystem around them is weak. Most AI code is written for NVIDIA's CUDA platform.
What Modular Built
The Software: A new programming language called Mojo that can run the same code on both NVIDIA and AMD chips without changes. Think of it as a universal translator for AI hardware.
The Promise: Use cheaper AMD chips (with more memory) while keeping the same performance as expensive NVIDIA cards.
The Claims vs Reality
Modular shows benchmarks where AMD's new MI325X chip beats NVIDIA's H200 by 20-50% on certain AI tasks. AMD's chip also has almost twice the memory (256GB vs 141GB).
But: These are carefully selected benchmarks. Real-world performance across all AI workloads is probably more modest.
Why This Matters
For Companies: Potential to save money on AI infrastructure and avoid vendor lock-in
For Developers: More hardware choices could mean better prices and innovation
For The Market: Any credible alternative to NVIDIA is good for competition.
This is promising technology from a credible team, but NVIDIA's software advantage is huge. Most AI tools, libraries, and developer knowledge are built around NVIDIA's ecosystem.
Interesting development that could work well for specific use cases, but don't expect it to dethrone NVIDIA anytime soon. The real test is whether companies actually adopt it in production.
Modular
Modular: Modular + AMD: Unleashing AI performance on AMD GPUs
Who will control AI in the future — big, centralized companies, or communities of users?
Blockchains can counterbalance many of the centralizing forces we are already seeing in AI. Together, these trends enable useful applications.
a16z presented 11 crypto x AI use cases, from decentralized physical infrastructure to IP registries for creators.
1. Persistent data and context in AI interactions
2. Universal identity for agents
3. Forwards-compatible proof of personhood
4. Decentralized Physical Infrastructure (DePIN) for AI
5. Infrastructure and guardrails for interactions between AI agents, end-service providers, and users
6. Keeping AI/vibe-coded apps in sync
7. Micropayments that support revenue sharing
8. Blockchains as a registry for intellectual property and provenance
9. Webcrawlers that help compensate content creators
10. Privacy-preserving ads that are tailored, not creepy
11. AI companions, owned and controlled by humans.
a16z crypto
AI x crypto crossovers - a16z crypto
The open web is becoming a prompt bar. So Who will control future AI — big companies or communities of users? That's where crypto comes in.
You can now just add "hf.co/mcp" in Claude or Cursor to make use of the HF MCP server to look up models, datasets, papers, or apps, or specific information about them.
Salesforce presents new work on AI agent and LLM-judge safety: "Helpful Agent Meets Deceptive Judge: Understanding Vulnerabilities in Agentic Workflows"
As AI agents become increasingly autonomous, they often rely on feedback from judges (evaluators). These judges evaluate, critique, and guide agent behavior. But what if the feedback is not just flawed—but deceptively persuasive?
Researchers reveal how deceptive feedback can derail even state-of-the-art LLMs such as o4-mini.
A free guide from Anthropic "How we built our multi-agent research system"
Anthropic shares how they built Claude's new multi-agent Research feature, an architecture where a lead Claude agent spawns and coordinates subagents to explore complex queries in parallel.
They use the orchestrator-worker architecture.
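A generic sketch of the orchestrator-worker pattern (not Anthropic's implementation): a lead agent decomposes the query, fans out subagent calls in parallel, and synthesizes their findings. call_llm is a hypothetical stand-in for any chat-completion client.

```python
# Generic orchestrator-worker sketch: plan, fan out subagents concurrently,
# then synthesize. `call_llm` is a stubbed stand-in, not a real client.
import asyncio

async def call_llm(prompt: str) -> str:
    return f"[stubbed model output for: {prompt[:60]}]"   # replace with a real client

async def research(query: str) -> str:
    # 1. Lead agent plans: break the query into independent sub-questions.
    plan = await call_llm(f"Split into 3 independent research sub-questions: {query}")
    sub_questions = [q for q in plan.splitlines() if q.strip()][:3]

    # 2. Spawn subagents that explore the sub-questions in parallel.
    findings = await asyncio.gather(
        *(call_llm(f"Research this and cite sources: {q}") for q in sub_questions)
    )

    # 3. Lead agent synthesizes the subagents' findings into one answer.
    return await call_llm(f"Question: {query}\nFindings:\n" + "\n---\n".join(findings))

print(asyncio.run(research("How do multi-agent systems trade off cost and quality?")))
```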
Google's Approach for Secure AI Agents
The document addresses the promise and risks of autonomous AI agents—systems designed to perceive environments, make decisions, and act autonomously to achieve user-defined goals.
Below is a summary of the key points based on available information:
I. Google proposes 3 fundamental principles to secure AI agents:
1. Well-Defined Human Controllers
2. Limited Agent Powers
3. Observable Agent Actions
II. Key Risks Identified. The paper highlights two primary security concerns for AI agents:
- Rogue Actions
- Sensitive Data Disclosure
III. Google advocates a hybrid strategy combining (a toy sketch of this pattern follows the summary):
- Traditional Deterministic Controls
- Dynamic, Reasoning-Based Defenses
IV. Securing AI agents is complex due to:
- Difficulty distinguishing trusted user commands from untrusted inputs, increasing vulnerability to prompt injection.
- The need to balance agent utility with security, ensuring they remain useful without exposing excessive attack surfaces.
- The evolving threat landscape, requiring continuous assurance through regression testing, red teaming, and external research.
V. Practical Implementation
Google integrates these principles into its platforms, such as Google Cloud’s Agentspace, which includes:
- Authentication and Authorization
- Model Safeguards
- Logging and Detection
- Posture Assessment
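A toy sketch of the hybrid strategy from section III, combining a deterministic allow-list and hard limits with a reasoning-based injection check; all names, thresholds, and the classifier are illustrative, not Google's implementation.

```python
# Toy hybrid guardrail: deterministic controls run first (limited agent powers),
# then a model-based check scores untrusted content, and every decision is
# logged (observable actions). Names and thresholds are illustrative only.
ALLOWED_TOOLS = {"search_docs", "create_calendar_event"}     # limited agent powers
MAX_EVENTS_PER_SESSION = 3

def classify_injection_risk(observation: str) -> float:
    """Stand-in for a reasoning-based defense that scores untrusted content."""
    return 0.9 if "ignore previous instructions" in observation.lower() else 0.1

def guard_action(tool: str, args: dict, untrusted_context: str, session_state: dict) -> bool:
    # Deterministic controls: reject anything outside the agent's granted powers.
    if tool not in ALLOWED_TOOLS:
        return False
    if tool == "create_calendar_event" and session_state.get("events_created", 0) >= MAX_EVENTS_PER_SESSION:
        return False

    # Reasoning-based defense: block actions driven by likely-injected content.
    if classify_injection_risk(untrusted_context) > 0.5:
        return False

    # Observable actions: record every allowed action for later review.
    print(f"ALLOW {tool} {args}")
    return True

guard_action("create_calendar_event", {"title": "sync"}, "meeting notes ...", {"events_created": 0})
```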
Meta introduced V-JEPA 2, a new world model with SOTA performance in visual understanding and prediction.
V-JEPA 2 can enable zero-shot planning in robots—allowing them to plan and execute tasks in unfamiliar environments.
AI at Meta
Introducing V-JEPA 2
Video Joint Embedding Predictive Architecture 2 (V-JEPA 2) is the first world model trained on video that achieves state-of-the-art visual understanding and prediction, enabling zero-shot robot control in new environments.
Crypto group Tron to go public
According to the FT, with US regulators halting their investigation into Justin Sun’s Tron, the project is set to go public through a reverse merger with Nasdaq-listed SRM Entertainment.
The deal is arranged by Dominari Securities, a New York-based boutique investment bank with ties to Donald Trump Jr. and Eric Trump.
The newly formed joint entity will acquire and hold TRX, following a treasury strategy similar to that of Strategy (formerly MicroStrategy).
FT
Crypto group Tron to go public after US pauses probe into billionaire founder
Deal involving Justin Sun’s digital asset venture orchestrated by investment bank linked to Trump’s sons
Physical Intelligence introduced Real-Time Action Chunking, a method that lets VLAs execute actions while “thinking."
Instead of waiting for inference to finish, the robot keeps acting on its current plan while the next action chunk is computed, completing the given task more quickly.
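A toy sketch of the scheduling idea only (not Physical Intelligence's algorithm, which also keeps successive chunks consistent): keep executing the current action chunk while the next one is inferred in the background, then splice it in. The policy, observation, and execute callables are hypothetical.

```python
# Toy real-time action chunking: overlap slow VLA inference with execution of
# the previously predicted chunk, so the robot never idles waiting for the model.
import threading
import time

def infer_chunk(policy, observation, out):
    out["chunk"] = policy(observation)              # slow inference in a background thread

def control_loop(policy, get_observation, execute, num_chunks=100, steps_per_chunk=10, dt=0.02):
    chunk = policy(get_observation())               # initial (blocking) inference
    for _ in range(num_chunks):
        result = {}
        worker = threading.Thread(target=infer_chunk, args=(policy, get_observation(), result))
        worker.start()                              # start inferring the *next* chunk now

        for action in chunk[:steps_per_chunk]:      # ...while still executing the current one
            execute(action)
            time.sleep(dt)

        worker.join()                               # then splice in the freshly inferred chunk
        chunk = result["chunk"]
```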
www.pi.website
Real-Time Action Chunking with Large Models
A real-time system for large VLAs that maintains precision and speed in the face of high latency.