All about AI, Web 3.0, BCI
This channel is about AI, Web 3.0, and brain-computer interfaces (BCI)

owner @Aniaslanyan
Physical Intelligence demonstrated a series of humanoid robot tasks modeled after Benjie Holson’s “Robot Olympics”: making peanut butter sandwiches, cleaning windows, peeling oranges, and washing pans.

Using their fine-tuned model π0.6, the robots autonomously tackled high-dexterity challenges that highlight Moravec’s Paradox: tasks humans find trivial are still incredibly hard for machines.

Their results show that fine-tuning large embodied models is essential: training from scratch failed on all tasks.
🔥2👏2💯2
Google DeepMind just released DeepSearchQA

A 900-prompt benchmark that evaluates AI agents on complex, multi-step web research tasks across 17 fields.
Nvidia will acquire assets and key talent from chipmaking startup Groq for $20B

Groq co-founder and CEO Jonathan Ross was the lead designer and architect for the first generation of Google’s TPU chips. He’ll join Nvidia along with president Sunny Madra and other top executives.

This is Nvidia’s largest deal ever, topping its $7B acquisition of data-centre networking firm Mellanox in 2019.
4🔥4👏2
This paper is a big deal. New research introduces Agent-R1, a framework for training LLM agents with end-to-end reinforcement learning across multi-turn interactions.

As agents move from predefined workflows to autonomous interaction, end-to-end RL becomes the natural training paradigm. Agent-R1 provides a modular foundation for scaling RL to complex, tool-using LLM agents.

Standard RL for LLMs assumes deterministic state transitions. You generate a token, append it to the sequence, done. But agents trigger external tools with uncertain outcomes. The environment responds unpredictably. State transitions become stochastic.

Therefore, the researchers extend the Markov Decision Process framework to capture this. State space expands to include full interaction history and environmental feedback. Actions can trigger external tools, not just generate text. Rewards become dense, with process rewards for intermediate steps alongside final outcome rewards.

Two core mechanisms make this work. An Action Mask distinguishes agent-generated tokens from environmental feedback, ensuring credit assignment targets only the agent's actual decisions. A ToolEnv module manages the interaction loop, handling state transitions and reward calculation when tools are invoked.
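
For intuition, here is a minimal sketch of the action-mask idea (names and tensor shapes are assumptions, not the Agent-R1 code): only agent-generated tokens contribute to the policy-gradient loss, while tool/environment feedback tokens are treated as state.

```python
import torch

def masked_policy_loss(logprobs: torch.Tensor,
                       advantages: torch.Tensor,
                       action_mask: torch.Tensor) -> torch.Tensor:
    """Toy REINFORCE-style loss with an action mask.

    logprobs:    (batch, seq) log-probs of the sampled tokens
    advantages:  (batch, seq) per-token advantage estimates
    action_mask: (batch, seq) 1.0 for agent-generated tokens, 0.0 for tool/env feedback
    """
    # Environment/tool tokens are part of the state, not the agent's action,
    # so they are zeroed out and receive no credit or blame.
    per_token = -(logprobs * advantages) * action_mask
    return per_token.sum() / action_mask.sum().clamp(min=1.0)
```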

GitHub.
🔥54💯3
LeCun's JEPA has evolved into a vision-language model, with 1.6B parameters rivaling the 72B Qwen-VL.

Instead of predicting words directly, the proposed VL-JEPA learns to predict the core "meaning" of a text in an abstract space, ignoring surface-level wording variations.

This method outperforms standard token-based training with 50% fewer parameters. It beats models like CLIP & SigLIP2 on video classification/retrieval tasks and matches larger VLMs on VQA, while using a decoder only when needed to cut decoding ops by nearly 3x.
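
As a rough illustration (an assumption-laden sketch, not the paper's actual objective), the training signal can be pictured as regression in embedding space rather than token-level cross-entropy:

```python
import torch.nn.functional as F

def embedding_space_loss(pred_emb, target_text_emb):
    """Score the prediction against the target text's embedding (e.g. from a
    frozen text encoder), so paraphrases with the same meaning map to roughly
    the same target. Illustrative only."""
    pred = F.normalize(pred_emb, dim=-1)
    target = F.normalize(target_text_emb, dim=-1)
    return (1.0 - (pred * target).sum(dim=-1)).mean()  # cosine-distance regression
```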
🔥53👏2
Meta introduced software agents that can self-improve via self-play RL

Self-play SWE-RL (SSR): training a single LLM agent to self-play between bug injection and bug repair, grounded in real-world repositories, with no human-labeled issues or tests.

Bug injection: the agent creates a standard suite of bug artifacts, which are further validated for consistency.

Key steps:
1) original tests must pass,
2) tests fail after applying the bug-injection patch,
3) weakened tests should pass
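
In pseudocode, the consistency filter amounts to something like the check below (run_tests and apply_patch are hypothetical helpers for illustration, not the paper's API):

```python
def is_valid_bug(repo, bug_patch, weakened_tests, run_tests, apply_patch):
    """Sketch of the three consistency checks for a self-generated bug."""
    if not run_tests(repo):                                  # 1) original tests must pass
        return False
    buggy_repo = apply_patch(repo, bug_patch)
    if run_tests(buggy_repo):                                # 2) tests must fail once the bug is in
        return False
    return run_tests(buggy_repo, tests=weakened_tests)       # 3) weakened tests should still pass
```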
🔥53👏2
2026_digital_assets__1767020549.pdf
1.2 MB
Grayscale shipped Digital Asset Outlook 2026.

Key Takeaways:

1. 2026 is expected to accelerate structural shifts in digital asset investing, which have been underpinned by two major themes: macro demand for alternative stores of value and improved regulatory clarity. Together, these trends should bring in new capital, broaden adoption (especially among advised wealth and institutional investors), and bridge public blockchains more fully into mainstream financial infrastructure.

2. As a result, valuations are expected to rise in 2026, along with the end of the so-called “four-year cycle,” or the theory that crypto market direction follows a recurring four-year pattern.

3. Bitcoin’s price will likely reach a new all-time high in the first half of the year.

4. Grayscale expects bipartisan crypto market structure legislation to become U.S. law in 2026. This will bring deeper integration between public blockchains and traditional finance, facilitate regulated trading of digital asset securities, and potentially allow for on-chain issuance by both startups and mature firms.

5. The outlook for fiat currencies is increasingly uncertain; in contrast, we can be highly confident that the 20 millionth Bitcoin will be mined in March 2026. Digital money systems like Bitcoin and Ethereum that offer transparent, programmatic, and ultimately scarce supply will be in rising demand, in our view, due to rising fiat currency risks.

6. More crypto assets will become available through exchange-traded products in 2026. These vehicles have had a successful start, but many platforms are still conducting due diligence and working to incorporate crypto into their asset-allocation process. As this process matures, look for more slow-moving institutional capital to arrive throughout 2026.

The report also outlines Top 10 Crypto Investing Themes for 2026, reflecting the breadth of use cases emerging across public blockchain technology.

- Dollar Debasement Risk Drives Demand for Monetary Alternatives

- Regulatory Clarity Supporting Adoption of Digital Assets

- Reach of Stablecoins to Grow in Wake of GENIUS Act

- Asset Tokenization at Inflection Point

- Privacy Solutions Needed as Blockchain Tech Goes Mainstream

- AI Centralization Calls for Blockchain Solutions

- DeFi Accelerates, Led by Lending

- Mainstream Adoption Will Demand Next-Generation Infrastructure

- A Focus on Sustainable Revenue

- Investors Seek Out Staking by Default

Finally, two topics that the report does not expect to influence crypto markets in 2026:

1. Quantum computing: We believe that research and preparedness will continue on post-quantum cryptography, but this issue is unlikely to affect valuations in the next year.

2. Digital asset treasuries: Despite the media attention they attract, the report does not expect DATs to be a major swing factor for digital asset markets in 2026.
🔥5💯4🥰2
Meta acquired Manus for $2–4B.

Manus hit $100M ARR just 8 months after launch, faster than any startup before it. It has no proprietary model, so people call it an “AI wrapper”; the same critique was leveled at Cursor.

Manus runs on Claude with custom tools built for orchestration and grounding. Their agentic environment enables agents to browse, write code, manipulate files, and execute multi-step workflows without a human in the loop.

They also beat OpenAI on GAIA. An interesting thing here is that they didn't build a foundation model. They built the most compatible environment for models to reason and act within.
👍2🔥2👏2
All about AI, Web 3.0, BCI
Tongyi Lab dropped half a dozen new papers, most focused on Deep Research agents. 1. Tongyi DeepResearch: Open-source DeepResearch Agent • First OSS web agent matching OpenAI’s DeepResearch • SOTA on HLE (32.9), BrowseComp (43.4/46.7), xbench-DeepSearch…
Tongyi released MAI-UI, a family of foundation GUI agents.

It natively integrates MCP tool use, agent user interaction, device–cloud collaboration, and online RL, establishing SOTA results in general GUI grounding and mobile GUI navigation, surpassing Gemini-2.5-Pro, Seed1.8, and UI-Tars-2 on AndroidWorld.

MAI-UI comes in a full spectrum of sizes, including 2B, 8B, 32B, and 235B-A22B variants.

MobileWorld benchmark.
👍3🔥2👏2
Happy new year folks! Wishing everyone a bright and inspiring new year 🎉

May 2026 be a year of bold ideas.

Let’s keep building, exploring, and pushing the boundaries together.

Happy New Year from @alwebbci 🚀
🆒4
So the first major paper of 2026, #DeepSeek mHC: Manifold-Constrained Hyper-Connections

This is actually an engineering paper, taking as its starting point ideas already laid out in the original Hyper-Connections (HC) paper from ByteDance, which is consequently prerequisite reading. So, some initial notes on that first.

The DeepSeek paper starts almost in medias res and first underlines a major success of the original HC approach: the increase in mathematical/topological complexity did not result in computational overhead.

Overall, the actual flex of the paper is not so much proving that Hyper-Connections can work at scale.

It’s: we have the internal capacity to re-engineer the complete training environment at all dimensions (kernels, memory management, inter-node communication) around highly experimental research ideas.
7🔥2👏2
This new open-source "brain" just became the world's best robot model. Spirit AI presents Spirit v1.5.

This new vision-language-action model translates what a robot sees into precise physical actions.

It now ranks #1 on the RoboChallenge Table30 benchmark, outperforming the previous leader, Pi0.5, in robotic reasoning and control.

Code.
Model.
52🔥2👏2
Apple and Google have entered into a multi-year collaboration under which the next generation of Apple Foundation Models will be based on Google's Gemini models and cloud technology.

These models will help power future Apple Intelligence features, including a more personalized Siri coming this year.

After careful evaluation, Apple determined that Google's AI technology provides the most capable foundation for Apple Foundation Models and is excited about the innovative new experiences it will unlock for Apple users. Apple Intelligence will continue to run on Apple devices and Private Cloud Compute, while maintaining Apple's industry-leading privacy standards.
🔥43👏2😁2
Huge, new release from DeepSeek & PKU. Enter "Engram," a new conditional memory module.

It's like a super-fast, internal lookup table for knowledge, freeing up the model's compute for actual reasoning.

#DeepSeek's new paper is a very nice read. The idea builds on previous work like Over-tokenized Transformer, Per-Layer Embeddings, and N-grammer; they scale it up and get some pretty convincing results!

The goal is quite simple: free up some effective depth for complex modules like MoE and attention by creating a new layer specialized in efficient retrieval. And of course it's DeepSeek, so the system design works nicely with hardware at inference and training time; in particular, you can scale model size with n-grams.
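
Very loosely, you can picture the retrieval layer as a hashed n-gram lookup added to the residual stream. The sketch below (bigram case, with assumed design details, building on the N-grammer idea rather than DeepSeek's actual Engram module) shows the basic shape:

```python
import torch
import torch.nn as nn

class BigramMemory(nn.Module):
    """Toy conditional lookup layer: the previous and current token ids are
    hashed into a large embedding table, so static 'knowledge' is fetched by
    lookup instead of being recomputed by attention/MoE layers.
    Illustrative only, not DeepSeek's Engram implementation."""
    def __init__(self, table_size: int, d_model: int):
        super().__init__()
        self.table = nn.Embedding(table_size, d_model)

    def forward(self, token_ids: torch.Tensor, hidden: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq); hidden: (batch, seq, d_model)
        prev = torch.roll(token_ids, shifts=1, dims=1)
        prev[:, 0] = 0                                       # no left context for the first token
        idx = (prev * 1000003 + token_ids) % self.table.num_embeddings
        return hidden + self.table(idx)                      # retrieved memory joins the residual stream
```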

Results: It beats iso-parameter MoE models across the board.

Big gains in general reasoning (BBH +5.0), knowledge (MMLU +3.4), code (HumanEval +3.0), math (MATH +2.4), and massively improves long-context retrieval.

If you still aren’t bullish on SSD demand, read this and get storage-pilled.

Paper.
Code.
🔥75🥰3
Anthropic introduced Cowork:
Claude Code for the rest of your work.


Cowork lets you complete non-technical tasks much like how developers use Claude Code.

In Cowork, you give Claude access to a folder on your computer. Claude can then read, edit, or create files in that folder.

Once you've set a task, Claude makes a plan and steadily completes it, looping you in along the way.

Claude will ask before taking any significant actions so you can course-correct as needed.

Claude can use your existing connectors, which link Claude to external information.

You can also pair Cowork with Claude in Chrome for tasks that need browser access.

Cowork is available as a research preview for Claude Max subscribers in the macOS app.

If you're on another plan, join the waitlist for future access here.
❤‍🔥55👍5
Tencent's WeChat AI presents WeDLM

It's a new diffusion decoding framework that uses standard, forward-looking attention. This lets it use the same high-speed caching systems as today's top LLMs, avoiding the slowdowns of other diffusion models.

The result? It matches the quality of top autoregressive models while delivering up to 3x faster inference on complex reasoning tasks and up to 10x faster on simpler text.

Paper
GitHub
Model.
🔥5👏43
Great paper on Agentic Memory.

LLM agents need both long-term and short-term memory to handle complex tasks.
However, the default approach today treats these as separate components, each with its own heuristics, controllers, and optimization strategies.

But memory isn't two independent systems. It's one cognitive process that decides what to store, retrieve, summarize, and forget.

This new research introduces AgeMem, a unified framework that integrates long-term and short-term memory management directly into the agent's policy through tool-based actions.

Instead of relying on trigger-based rules or auxiliary memory managers, the agent learns when and how to invoke memory operations: ADD, UPDATE, DELETE for long-term storage, and RETRIEVE, SUMMARY, FILTER for context management.
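
To make the tool-based framing concrete, here is a toy memory interface with those six operations (names and behavior are assumptions for illustration, not the AgeMem implementation; the real system performs retrieval and summarization with the LLM itself):

```python
from dataclasses import dataclass, field

@dataclass
class UnifiedMemory:
    """Toy unified memory exposed to the agent as tool calls."""
    long_term: dict = field(default_factory=dict)    # persistent key -> fact store
    context: list = field(default_factory=list)      # short-term working context

    # long-term operations
    def add(self, key, value):
        self.long_term[key] = value

    def update(self, key, value):
        self.long_term[key] = value

    def delete(self, key):
        self.long_term.pop(key, None)

    # short-term / context operations
    def retrieve(self, query):
        return [v for k, v in self.long_term.items() if query.lower() in k.lower()]

    def summary(self):
        return " | ".join(self.context[-5:])         # naive stand-in for LLM summarization

    def filter(self, keep=10):
        self.context = self.context[-keep:]          # drop stale context entries
```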

It uses a three-stage progressive RL strategy. First, the model learns long-term memory storage. Then it masters short-term context management. Finally, it coordinates both under full task settings.

To handle the fragmented experiences from memory operations, they design a step-wise GRPO (Group Relative Policy Optimization) that transforms cross-stage dependencies into learnable signals.

The results across five long-horizon benchmarks:

1. On Qwen2.5-7B, AgeMem achieves an average score of 41.96, compared to 37.14 for Mem0, a 13% improvement.

2. On Qwen3-4B, the gap widens: 54.31 vs 44.70. Adding long-term memory alone provides +10-14% gains.

3. Adding RL training adds another +6%.

4. The full unified system with both memory types achieves up to +21.7% improvement over no-memory baselines.

The unified memory management through learnable tool-based actions outperforms fragmented heuristic pipelines, enabling agents to adaptively decide what to remember and forget based on task demands.
5👍4
New paper from Google proves a novel theorem in algebraic geometry with an internal math-specialized version of Gemini.

This was a collaboration between Google DeepMind (Professor Freddie Manners and Blueshift team) and Professors Jim Bryan, Balazs Elek, and Ravi Vakil.

Coauthor Professor Ravi Vakil, president of the American Mathematical Society, said that Gemini’s “proof was rigorous, correct, and elegant... the kind of insight I would have been proud to produce myself.”
OMG! 1 billion cells. Illumina introduced the Billion Cell Atlas, creating the most comprehensive map of human disease biology — and unlocking unparalleled speed and scale in AI for drug discovery.

The Atlas will help researchers, including founding participants AstraZeneca, Merck, and Eli Lilly, study the effect of switching on and off all 20,000 genes in cells linked to diseases that have been historically difficult to decode.