Wow! Researchers introduced a new RL algorithm to train agents that can build other agents
Weak-for-Strong (W4S): Training a Weak Meta-Agent to Harness Strong Executors.
With this, SLMs become powerful meta-agents that manage frontier LLMs in diverse agentic tasks.
Code.
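The W4S loop can be sketched in miniature. Everything below is a toy: the workflow names and rewards are made up, and an epsilon-greedy bandit stands in for the paper's actual RL procedure; only the shape of the loop (weak meta-agent proposes a workflow, the strong executor's task reward is the only feedback) follows the announcement.

```python
import random

def w4s_loop(workflows, evaluate, iterations=100, eps=0.2, seed=0):
    # Weak meta-agent: pick a workflow, hand it to the strong executor
    # (`evaluate`), and learn from the task reward alone.
    rng = random.Random(seed)
    value = {w: evaluate(w) for w in workflows}   # try each workflow once
    count = {w: 1 for w in workflows}
    for _ in range(iterations):
        if rng.random() < eps:
            w = rng.choice(workflows)             # explore an alternative
        else:
            w = max(workflows, key=value.get)     # exploit the best so far
        count[w] += 1
        value[w] += (evaluate(w) - value[w]) / count[w]  # running mean reward
    return max(workflows, key=value.get)

# Made-up rewards standing in for the strong executor's benchmark scores.
rewards = {"direct": 0.3, "plan-then-act": 0.9, "self-check": 0.6}
best = w4s_loop(list(rewards), lambda w: rewards[w])
```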
arXiv.org
Weak-for-Strong: Training Weak Meta-Agent to Harness Strong Executors
Efficiently leveraging the capabilities of contemporary large language models (LLMs) is increasingly challenging, particularly when direct fine-tuning is expensive and often impractical....
Anthropic is preparing Claude Code to be released on the mobile app
It now runs on Anthropic's own infrastructure, no longer only through GitHub.
Users will be able to connect Claude app to GitHub and run their coding prompts on the go.
TestingCatalog
Anthropic prepares Claude Code release for mobile apps
Anthropic prepares a Code section on web and mobile with GitHub integration, repository browsing, and Claude Code tasks tailored to developers.
Google introduced Gemini 2.5 Computer Use
- Control UIs with visual understanding and reasoning
- Use it for web and Android control
- Try it with Browserbase or locally
Google
Introducing the Gemini 2.5 Computer Use model
Today we are releasing the Gemini 2.5 Computer Use model via the API, which outperforms leading alternatives at browser and mobile tasks.
MIT Media Lab introduced NeuroChat: neuroadaptive chatbot that adapts its responses to your cognitive engagement.
NeuroChat is the first neuroadaptive system to use generative AI to adapt on the fly.
Every response — tone, depth, pacing — is co-authored by your brain and the model.
By reading brain signals in real time, NeuroChat personalizes its teaching style to your attention, curiosity, and focus.
Here’s how it works:
NeuroChat measures real-time brain activity using EEG - a lightweight, noninvasive sensor that captures your level of engagement while you learn.
The chatbot uses this engagement score to adjust how it teaches - simplifying, deepening, or changing pace to match your focus.
A live feedback loop between your mind and the model.
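The feedback loop above boils down to mapping an engagement score to a teaching adjustment. The thresholds and style labels below are illustrative assumptions, not taken from the MIT system:

```python
def choose_style(engagement, low=0.35, high=0.65):
    # Map a normalized EEG engagement score in [0, 1] to a teaching move.
    if engagement < low:
        return "simplify"    # attention flagging: shorten and re-engage
    if engagement > high:
        return "deepen"      # high focus: add depth and detail
    return "keep-pace"       # steady engagement: continue the current style

# One tick of the loop: score in, adjustment out.
style = choose_style(0.2)
```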
NeuroChat isn’t mind-reading.
It tracks only engagement signals, not specific thoughts, memories, or emotions.
All processing happens in your browser, and it’s compatible with local AI models, keeping brain data private.
Researchers ran a pilot study (n = 24) comparing NeuroChat to a non-adaptive chatbot:
1. EEG engagement: significantly higher with NeuroChat
2. Self-reports: users described it as more human-like, fluid, and enjoyable
3. Learning: similar short-term scores, but stronger sustained focus and curiosity
Tally Forms
NeuroChat Public Release Waitlist
Made with Tally, the simplest way to create forms.
Google shared a new paper, which provides an extremely sample-efficient way to create an agent that can perform well in multi-agent, partially-observed, symbolic environments.
The key idea is to use LLM-powered code synthesis to learn a code world model (in the form of Python code) from a small dataset of (observation, action) trajectories, plus some background information (in text form), and then to pass this induced WM, plus the observation history, to an existing solver, such as (information-set) MCTS, to choose the next action.
Researchers applied their method to various two-player games (with both perfect and imperfect information), and show that it works much better than prompting the LLM to directly generate actions, especially for novel games.
In particular, their agent beats Gemini 2.5 Pro in 7 out of 10 games and ties it in 2 out of 10.
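A minimal sketch of the recipe: the induced world model is plain Python (`step` and `reward` below form a made-up toy game), and a random-rollout planner stands in for the information-set MCTS solver the paper actually uses.

```python
import random

def plan_with_code_wm(state, legal_actions, step, reward,
                      rollouts=200, depth=5, seed=0):
    # Score each action under the induced (Python) world model by cheap
    # random rollouts, then act greedily.
    rng = random.Random(seed)

    def rollout(s):
        total = 0.0
        for _ in range(depth):                      # simulate a short future
            s = step(s, rng.choice(legal_actions(s)))
            total += reward(s)
        return total

    scores = {}
    for a in legal_actions(state):
        s1 = step(state, a)
        scores[a] = reward(s1) + sum(rollout(s1) for _ in range(rollouts)) / rollouts
    return max(scores, key=scores.get)

# Toy world model: walk on a number line; reward equals position, so the
# planner should prefer stepping right.
best = plan_with_code_wm(0, lambda s: [-1, 1], lambda s, a: s + a, float)
```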
arXiv.org
Code World Models for General Game Playing
Large Language Models (LLMs) reasoning abilities are increasingly being applied to classical board and card games, but the dominant approach -- involving prompting for direct move generation --...
Anthropic just now introduced Claude Code Plugins in public beta.
Plugins allow you to install and share curated collections of slash commands, agents, MCP servers, and hooks directly within Claude Code.
To get started, you can add a marketplace using: /plugin marketplace add user-or-org/repo-name.
Then browse and install from the /plugin menu.
Try out the multi-agent workflow we use to develop Claude Code:
/plugin marketplace add anthropics/claude-code
/plugin install feature-dev
Anyone can host a marketplace or make a plugin. All you need is a git repo with a .claude-plugin/marketplace.json file.
Claude
Customize Claude Code with plugins | Claude
Claude Code now supports plugins: custom collections of slash commands, agents, MCP servers, and hooks that install with a single command. Share your Claude Code setup with plugins Slash commands, agents, MCP servers, and hooks are all extension points you…
Google introduced Gemini Enterprise with Agents, Connectors and Agent Builder
“It allows you to chat with your company’s documents, data and apps as well as build and deploy AI agents, all grounded in your information and context.”
Google Cloud Blog
Introducing Gemini Enterprise | Google Cloud Blog
Today, we’re introducing Gemini Enterprise – the new front door for AI in the workplace. It’s our advanced agentic platform that brings the best of Google AI to every employee, for every workflow.
Researchers released “Barbarians at the Gate: How AI is Upending Systems Research”, a paper on AI-Driven Research for Systems (ADRS)
Researchers show how AI-Driven Research for Systems (ADRS) can rediscover or outperform human-designed algorithms across cloud scheduling, MoE expert load balancing, LLM-SQL optimization, transaction scheduling, and more — all within hours and under $20.
Code.
GitHub
GitHub - UCB-ADRS/ADRS: AI-Driven Research Systems (ADRS)
AI-Driven Research Systems (ADRS) . Contribute to UCB-ADRS/ADRS development by creating an account on GitHub.
Elon Musk’s xAI is developing world models: AI systems capable of understanding and designing physical environments.
To achieve this, xAI has hired former Nvidia specialists.
These world models are considered a more advanced form of AI than the LLMs trained primarily on text data, and are expected to surpass the limitations of popular AI tools such as ChatGPT and xAI’s own Grok.
xAI plans to apply world models first in the gaming sector.
The models could be used to automatically generate interactive 3D environments and potentially be applied to AI systems for robotics.
Some tech companies expect world models to become a next-generation core technology that could extend AI applications beyond software into physical products, such as humanoid robots.
Last month, Nvidia told the Financial Times that the market potential of world models could be nearly as large as the entire global economy.
Musk also reaffirmed his earlier goal by posting on X that xAI will release a “great AI-generated game” by the end of next year.
FT
Elon Musk’s xAI joins race to build ‘world models’ to power video games
Artificial intelligence group hired staff from Nvidia to work on advanced AI that can design and navigate physical spaces
Google shows how agents learn software skills by watching tutorials, converting them into action steps, and boosting task performance
The pipeline converts free videos into reliable supervision at scale. A vision model (an inverse dynamics model) predicts the action between 2 screenshots, like click, type, or scroll.
Training uses about 630K transitions, mixing 500K synthetic steps and 132K human ones.
The model then labels tutorial videos and turns them into executable step sequences. It produces about 53K trajectories across 69 apps, usable as in-context examples or training data.
As in-context examples, these steps add 2 to 3 points on OSWorld for strong models. As training data, they raise a general multimodal model from 1.9 to 13.0 on OSWorld.
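The labeling step can be sketched as follows; `predict_action` and the stub predictor are illustrative stand-ins for the trained inverse-dynamics model:

```python
def label_video(frames, predict_action):
    # An inverse-dynamics model predicts the action between each pair of
    # consecutive frames, turning a raw tutorial video into an executable
    # step sequence of (observation, action) records.
    trajectory = []
    for before, after in zip(frames, frames[1:]):
        trajectory.append({"obs": before,
                           "action": predict_action(before, after)})
    return trajectory

# Hypothetical stub predictor: call any change between frames a "click".
stub = lambda before, after: "noop" if before == after else "click"
traj = label_video(["f0", "f1", "f1", "f2"], stub)
```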
Another announcement: Broadcom and OpenAI struck a deal for another 10 GW of capacity
OpenAI and Broadcom today announced a collaboration on 10 gigawatts of custom accelerators designed by OpenAI, targeted to start in the second half of 2026 and complete by the end of 2029 across OpenAI's facilities and partner data centers.
OpenAI will design the accelerators and Broadcom will develop them. The two have been working toward this together for 18 months.
By designing its own chips and systems, OpenAI can embed what it’s learned from developing frontier models and products directly into the hardware, unlocking new levels of capability and intelligence.
So that adds roughly $500B-$600B of investment on top of OpenAI's existing $1.5T in commitments.
OpenAI
OpenAI and Broadcom announce strategic collaboration to deploy 10 gigawatts of OpenAI-designed AI accelerators
Multi-year partnership enables OpenAI and Broadcom to deliver accelerator and network systems for next-generation AI clusters.
Andrej Karpathy released new repo: nanochat
Unlike the earlier nanoGPT repo, which covered only pretraining, nanochat is a minimal, from-scratch, full-stack training/inference pipeline for a simple ChatGPT clone in a single, dependency-minimal codebase.
GitHub.
You boot up a cloud GPU box, run a single script, and in as little as 4 hours you can talk to your own LLM in a ChatGPT-like web UI.
It weighs ~8,000 lines of imo quite clean code to:
- Train the tokenizer using a new Rust implementation
- Pretrain a Transformer LLM on FineWeb, evaluate CORE score across a number of metrics
- Midtrain on user-assistant conversations from SmolTalk, multiple choice questions, tool use.
- SFT, evaluate the chat model on world knowledge multiple choice (ARC-E/C, MMLU), math (GSM8K), code (HumanEval)
- RL the model optionally on GSM8K with "GRPO"
- Run efficient inference in an Engine with KV cache, simple prefill/decode, and tool use (Python interpreter in a lightweight sandbox); talk to it over the CLI or a ChatGPT-like WebUI.
- Write a single markdown report card, summarizing and gamifying the whole thing.
Even for as low as ~$100 in cost (~4 hours on an 8XH100 node), you can train a little ChatGPT clone that you can kind of talk to, and which can write stories/poems, answer simple questions. About ~12 hours surpasses GPT-2 CORE metric. As you further scale up towards ~$1000 (~41.6 hours of training), it quickly becomes a lot more coherent and can solve simple math/code problems and take multiple choice tests. E.g. a depth 30 model trained for 24 hours (this is about equal to FLOPs of GPT-3 Small 125M and 1/1000th of GPT-3) gets into 40s on MMLU and 70s on ARC-Easy, 20s on GSM8K, etc.
GitHub
Introducing nanochat: The best ChatGPT that $100 can buy. · karpathy nanochat · Discussion #1
Ok so we just booted up an 8xH100 box from e.g. Lambda GPU Cloud. This is costing us about ~$24/hr, so there is no time to lose. Environment setup Clone the project: git clone git@github.com:karpat...
Meet Mamba-3: a paper submitted to ICLR 2026 introduces the model, which addresses several limitations of current sub-quadratic sequence models through three methodological changes grounded in classical state-space theory.
Code and a detailed implementation are not yet publicly available, as the paper is under review.
Core Modifications
1. Trapezoidal Discretization
The paper replaces Euler's rule (first-order approximation) with a generalized trapezoidal rule (second-order approximation) for discretizing the continuous-time SSM.
This results in:
- A recurrence that incorporates both current and previous inputs with data-dependent weights
- Ability to replace the short causal convolution when combined with learnable biases on B and C projections
- Lower approximation error: O(Δt²) vs O(Δt) for Euler's method
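The accuracy claim is easy to check numerically on a scalar SSM x' = a·x, where the exact solution is known (input terms are omitted here for brevity; the full recurrence also mixes the current and previous inputs, as noted above). A toy check, not the paper's implementation:

```python
import math

def euler_step(x, a, dt):
    # First-order (Euler) update: x_{t+1} = (1 + a*dt) * x_t
    return x + dt * a * x

def trapezoid_step(x, a, dt):
    # Second-order trapezoidal update: x_{t+1} = (1 + a*dt/2)/(1 - a*dt/2) * x_t
    return (1 + a * dt / 2) / (1 - a * dt / 2) * x

def simulate(stepper, a=-1.0, dt=0.1, steps=10, x0=1.0):
    x = x0
    for _ in range(steps):
        x = stepper(x, a, dt)
    return x

exact = math.exp(-1.0)                 # x' = -x has solution e^{-t}; here t = 1
err_euler = abs(simulate(euler_step) - exact)
err_trap = abs(simulate(trapezoid_step) - exact)
```

With the same step size, the trapezoidal error is roughly two orders of magnitude smaller, consistent with the O(Δt²) vs O(Δt) claim.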
2. Complex-Valued State Spaces
Mamba-2 simplified the transition matrix to a real scalar, which removed the model's ability to solve simple state-tracking tasks. Mamba-3 reintroduces complex SSMs:
- Enables rotational dynamics in hidden states
- Mathematically equivalent to applying data-dependent rotary embeddings to B and C projections
- Can be computed efficiently using the "RoPE trick"
- Recovers performance on parity and modular arithmetic tasks (100% vs <1% for Mamba-2).
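The claimed equivalence is easy to verify for a scalar state: multiplying a complex state by a unit-modulus transition e^(iθ) is exactly a 2-D rotation of its (real, imaginary) pair, which is what the rotary ("RoPE trick") formulation exploits:

```python
import cmath, math

def complex_step(h, theta):
    # Complex-valued state update: multiply by a unit-modulus transition.
    return h * cmath.exp(1j * theta)

def rotary_step(x, y, theta):
    # The same update expressed as a 2-D rotation of the (real, imag) pair.
    c, s = math.cos(theta), math.sin(theta)
    return c * x - s * y, s * x + c * y

h = complex_step(complex(1.0, 0.5), 0.3)
x, y = rotary_step(1.0, 0.5, 0.3)
```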
3. MIMO Formulation
Changes state update from outer-product to matrix-multiplication based:
- Increases arithmetic intensity from ~2.5 to ~2r (where r is MIMO rank)
- Better utilizes GPU accelerators during decode
- No increase in state size, maintaining inference speed
- Optional feature that can be enabled when compute efficiency is prioritized
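A back-of-envelope model shows why intensity rises with the MIMO rank r: FLOPs grow linearly in r, while memory traffic is dominated by reading and writing the state itself. The byte accounting below is an illustrative assumption, not the paper's exact model:

```python
def arithmetic_intensity(n, r, bytes_per_elem=2):
    # FLOPs of a rank-r update: an (n x r) @ (r x n) matmul into an n x n state.
    flops = 2 * n * n * r
    # Memory traffic: read + write the n x n state, plus the rank-r factors.
    bytes_moved = bytes_per_elem * (2 * n * n + 2 * n * r)
    return flops / bytes_moved

siso = arithmetic_intensity(1024, 1)   # rank-1 outer-product update
mimo = arithmetic_intensity(1024, 8)   # rank-8 MIMO update
```

For r much smaller than n the state size is unchanged, but the rank-8 update does roughly 8x the useful work per byte moved.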
Experimental Results
Language Modeling (100B FineWeb-Edu tokens):
Outperforms Mamba-2, Transformer, and Gated DeltaNet baselines at all tested scales (180M, 440M, 820M, 1.5B parameters)
Example: Mamba-3-1.5B achieves 56.4% average accuracy vs 55.7% for Mamba-2
State-Tracking Tasks:
Parity: 100.0% (Mamba-2: 0.9%)
Arithmetic without brackets: 98.5% (Mamba-2: 47.8%)
Arithmetic with brackets: 87.8% (Mamba-2: 0.9%)
Inference Performance:
Faster single-step decode than Mamba-2 despite more complex SSM
MIMO variant improves Pareto frontier: better perplexity at same state size
At 440M scale with 100B tokens, MIMO achieves 12.72 vs 12.87 perplexity for SISO
Citibank plans to launch crypto asset custody services in 2026
The project has been in preparation for two to three years and is currently underway.
Citi is exploring a dual-track approach involving both in-house technology and third-party solutions, aiming to directly custody native crypto assets.
CNBC
Citi targets 2026 launch for crypto custody service as Wall Street dives deeper into digital assets
A more favorable regulatory environment in the U.S. has prompted American banks to offer more services to do with digital assets.
Anthropic and Salesforce announced an expanded partnership to make Claude a preferred model for Salesforce's Agentforce platform in financial services, healthcare, cybersecurity, and life sciences. Anthropic becomes the first LLM provider fully integrated within the Salesforce trust boundary, with all of Claude's traffic contained within the Salesforce virtual private cloud.
The two companies will build AI solutions designed for specific industries, starting with financial services. Through Slack's Model Context Protocol server, Claude can access Slack channels, messages, and files, and users can invoke Claude directly within Slack to pull connected insights from Salesforce CRM data and other enterprise apps.
Anthropic
Anthropic and Salesforce expand partnership to bring Claude to regulated industries
Anthropic is an AI safety and research company that's working to build reliable, interpretable, and steerable AI systems.
Financial Times projected OpenAI’s cap table following its for-profit transition:
- Microsoft (30%)
- OpenAI Employees (30%)
- OpenAI Non-Profit (>20%)
- SoftBank (10%)
That leaves ~10% for existing investors (Thrive, Khosla, MGX etc).
Moving forward, Nvidia’s $100B investment will dilute existing investors and then there is also consideration for Sam Altman’s potential equity stake.
Microsoft AI announced its first image generator created in-house.
The MAI-Image-1 model has already secured a spot in the top 10 of the LMArena AI benchmark
Microsoft AI
Introducing MAI-Image-1, debuting in the top 10 on LMArena | Microsoft AI
Meta AI researchers propose a new learning paradigm for language agents called “early experience”, a reward-free method where agents learn by interacting with environments using their own suboptimal actions.
Instead of relying solely on human demonstrations or reinforcement signals, the agent learns from future outcomes it observes after taking alternative actions.
Two key strategies power this method:
1. Implicit World Modeling – grounding behavior in environment dynamics
2. Self-Reflection – learning from mistakes by generating natural language rationales
Tested across 8 diverse environments, the approach outperforms imitation learning alone and significantly boosts generalization, even improving downstream reinforcement learning.
It positions early experience as a scalable bridge between static supervised fine-tuning and full-on autonomous agents.
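The reward-free collection loop can be sketched like this; all function names and the toy environment are illustrative, not from the paper:

```python
def collect_early_experience(states, propose_actions, env_step):
    # Reward-free data collection: at each state the agent tries its own
    # alternative actions and records the observed outcomes as supervision
    # for implicit world modeling (no reward signal is used anywhere).
    dataset = []
    for s in states:
        for a in propose_actions(s):
            dataset.append({"state": s, "action": a,
                            "next_state": env_step(s, a)})
    return dataset

# Toy environment: integer states, actions shift the state.
toy = collect_early_experience([0, 1], lambda s: [1, -1], lambda s, a: s + a)
```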
OpenAI, Anthropic, and Google DeepMind jointly released a paper showing that current LLM safety defenses are extremely fragile
The paper systematically evaluates the robustness of current LLM safety defenses and finds that almost all existing methods can be bypassed by adaptive attacks.
Notably, all the big LLM names emphasize that reliable robustness evaluation of LLMs must incorporate adaptive attacks.
If a defense fails under a single adaptive loop, it cannot be considered robust.
1. The study tests 12 types of LLM defense mechanisms, covering jailbreak prevention and prompt-injection defenses. It shows that most current evaluation protocols rely on static or fixed attack samples, which fail to simulate a realistic adaptive attacker.
Once the attacker can adjust strategy, success rates of bypassing reach more than 90% for most models.
2. The authors propose a General Adaptive Attack Framework. It assumes attackers can systematically modify attack prompts based on defense feedback, using optimization methods such as gradient descent, reinforcement learning, random search, and human-in-the-loop exploration.
This framework successfully bypassed all 12 recently published defense methods.
3. Prompt-based defenses can resist fixed attacks but are ineffective against adaptive ones: Spotlighting / Prompt Sandwiching reach ASR (attack success rate) > 95%, and RPO reaches ASR ≈ 96–98%.
This shows such methods lack generalization and are easily defeated once new automated or human attack variants appear.
4. Training-based defenses fine-tune models with adversarial data.
However, adaptive attacks raised success rates from below 5 % to 96–100 %.
This confirms that static adversarial training cannot cover unseen adaptive attacks; dynamic retraining is required.
5. Filter-model defenses place an external classifier before or after the main model.
These are typically fine-tuned BERT detectors.
6. Secret-knowledge defenses rely on hidden triggers or unknown “canary” information to detect injection.
All four categories: prompt optimization, adversarial training, filtering, and secret-based detection, exhibit severe weaknesses.
Static or single-shot defenses cannot resist adaptive attack loops. Only dynamically optimized and continuously co-trained systems may achieve meaningful robustness.
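The paper's core argument, that evaluation must close the loop between attacker and defense, can be sketched with a toy filter and a toy mutation operator (both made up here):

```python
import random

def adaptive_attack(prompt, mutate, is_blocked, budget=100, seed=0):
    # Keep adapting the attack based on defense feedback until the filter
    # is bypassed or the budget runs out; return the winning prompt or None.
    rng = random.Random(seed)
    for _ in range(budget):
        if not is_blocked(prompt):
            return prompt                  # defense bypassed
        prompt = mutate(prompt, rng)       # adapt using the feedback
    return None                            # defense held within budget

# Toy static defense (keyword filter) and toy mutation (leetspeak swap).
is_blocked = lambda p: "attack" in p
mutate = lambda p, rng: p.replace("a", rng.choice(["a", "4", "@"]), 1)
result = adaptive_attack("attack plan", mutate, is_blocked)
```

A static evaluation would call this keyword filter robust; the adaptive loop slips past it within a few mutations, which is the paper's point in miniature.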
Ubyx_Corporate_Treasury_in_a_World_of_Wallets_1760531818.pdf
2.7 MB
Wallets are the new cash rail for enterprise. And they’re changing the way liquidity is managed.
A new report from Ubyx and Finmo highlights that while a bank account anchors value to a single institution, a wallet can hold tokenized deposits, regulated #stablecoins, tokenized #MMFs, and other instruments across multiple blockchains.
This enables treasurers to consolidate hundreds of accounts into programmable, multi-asset wallets with 24/7 settlement, automated liquidity optimization, and transparent auditability.
Wallet-based architectures promise radical simplification, continuous #yield optimization, and reduced counterparty dependence.
Legally and from an accounting perspective, tokenized deposits and regulated stablecoins are now being recognized as cash equivalents under IAS 7, removing a key barrier to adoption.
Technically, wallets bring programmability and instant settlement, but also new operational risks (key management, custody resilience).
They also bring regulatory challenges, requiring phased pilots, hybrid coexistence with traditional rails, and rigorous counterparty due diligence.
What we see is a competitive landscape that’s slowly coalescing:
1. Banks that tokenize deposits can defend client relationships
2. Tech providers are building ERP-integrated wallet rails
3. Digital-native firms supply programmability and scale.