All about AI, Web 3.0, BCI
This channel is about AI, Web 3.0, and brain-computer interfaces (BCI)

owner @Aniaslanyan
Happy new year folks! Wishing everyone a bright and inspiring new year 🎉

May 2026 be a year of bold ideas.

Let’s keep building, exploring, and pushing the boundaries together.

Happy New Year from @alwebbci 🚀
🆒4
So the first major paper of 2026, #DeepSeek mHC: Manifold-Constrained Hyper-Connections

This is actually an engineering paper. It takes as its starting point ideas already laid out in the original Hyper-Connections (HC) paper from ByteDance, which is therefore a prerequisite for reading. So, some initial notes on that first.

The DeepSeek paper starts almost in medias res and first underlines a major success of the original HC approach: the increase in mathematical/topological complexity did not result in computational overhead.

Overall, the actual flex of the paper is not so much proving that Hyper-Connections can work at scale.

It’s this: we have the internal capacity to re-engineer the complete training environment across all dimensions (kernels, memory management, inter-node communication) around highly experimental research ideas.
7🔥2👏2
This new open-source "brain" just became the world's best robot model. Spirit AI presents Spirit v1.5.

This new vision-language-action model translates what a robot sees into precise physical actions.

It now ranks #1 on the RoboChallenge Table30 benchmark, outperforming the previous leader, Pi0.5, in robotic reasoning and control.

Code.
Model.
52🔥2👏2
Apple and Google have entered into a multi-year collaboration under which the next generation of Apple Foundation Models will be based on Google's Gemini models and cloud technology.

These models will help power future Apple Intelligence features, including a more personalized Siri coming this year.

After careful evaluation, Apple determined that Google's AI technology provides the most capable foundation for Apple Foundation Models and is excited about the innovative new experiences it will unlock for Apple users. Apple Intelligence will continue to run on Apple devices and Private Cloud Compute, while maintaining Apple's industry-leading privacy standards.
🔥43👏2😁2
Huge, new release from DeepSeek & PKU. Enter "Engram," a new conditional memory module.

It's like a super-fast, internal lookup table for knowledge, freeing up the model's compute for actual reasoning.

#Deepseek's new paper is a very nice read. The idea builds on previous work like Over-tokenized Transformer, Per-Layer Embeddings and N-grammer; they scaled it up and got some pretty convincing results!

The goal is quite simple: free up some effective depth for complex modules like MoE and attention by creating a new layer specialized in efficient retrieval. And of course it's DeepSeek, so the system design works nicely with hardware at both inference and training. In particular, you can scale model size with n-grams.
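To make the "lookup table" idea concrete, here is a minimal sketch of a hashed n-gram memory. It is not DeepSeek's implementation: `TABLE_SIZE`, `DIM`, `slot_for`, and `ngram_memory` are hypothetical names, and the table is filled with toy values where a real model would hold trained parameters.

```python
import hashlib

TABLE_SIZE = 1024  # hypothetical number of memory slots
DIM = 4            # hypothetical embedding width


def slot_for(ngram):
    """Map a tuple of token ids to a stable table slot via hashing."""
    digest = hashlib.sha256(repr(ngram).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % TABLE_SIZE


# Toy table: in a real model these rows would be learned parameters.
table = [[((slot * 31 + j) % 7) / 7.0 for j in range(DIM)]
         for slot in range(TABLE_SIZE)]


def ngram_memory(token_ids, n=2):
    """Return one retrieved vector per position, keyed by the trailing n-gram."""
    out = []
    for i in range(len(token_ids)):
        ngram = tuple(token_ids[max(0, i - n + 1): i + 1])
        out.append(table[slot_for(ngram)])
    return out


vectors = ngram_memory([5, 9, 5, 9])
print(vectors[1] == vectors[3])  # same trailing bigram, same retrieved row
```

Each position costs one hash and one row read, regardless of table size, which is why capacity of this kind can grow without adding much compute.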

Results: It beats iso-parameter MoE models across the board.

Big gains in general reasoning (BBH +5.0), knowledge (MMLU +3.4), code (HumanEval +3.0), math (MATH +2.4), and massively improves long-context retrieval.

If you still aren’t bullish on SSD demand, read this and get storage-pilled.

Paper.
Code.
🔥75🥰3
Anthropic introduced Cowork:
Claude Code for the rest of your work.


Cowork lets you complete non-technical tasks much like how developers use Claude Code.

In Cowork, you give Claude access to a folder on your computer. Claude can then read, edit, or create files in that folder.

Once you've set a task, Claude makes a plan and steadily completes it, looping you in along the way.

Claude will ask before taking any significant actions so you can course-correct as needed.

Claude can use your existing connectors, which link Claude to external information.

You can also pair Cowork with Claude in Chrome for tasks that need browser access.

Cowork is available as a research preview for Claude Max subscribers in the macOS app.

If you're on another plan, join the waitlist for future access here.
❤‍🔥55👍5
Tencent's WeChat AI presents WeDLM

It's a new diffusion decoding framework that uses standard causal attention. This lets it use the same high-speed KV caching systems as today's top LLMs, avoiding the slowdowns of other diffusion models.

The result? It matches the quality of top autoregressive models while delivering up to 3x faster inference on complex reasoning tasks and up to 10x faster on simpler text.

Paper
GitHub
Model.
🔥6👏43
Great paper on Agentic Memory.

LLM agents need both long-term and short-term memory to handle complex tasks.
However, the default approach today treats these as separate components, each with its own heuristics, controllers, and optimization strategies.

But memory isn't two independent systems. It's one cognitive process that decides what to store, retrieve, summarize, and forget.

This new research introduces AgeMem, a unified framework that integrates long-term and short-term memory management directly into the agent's policy through tool-based actions.

Instead of relying on trigger-based rules or auxiliary memory managers, the agent learns when and how to invoke memory operations: ADD, UPDATE, DELETE for long-term storage, and RETRIEVE, SUMMARY, FILTER for context management.
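A rough sketch of what exposing these six operations as tools could look like. The `AgentMemory` class, the `call_tool` dispatcher, and the substring-matching and stub-summary logic are all hypothetical simplifications, not the paper's code. In AgeMem the interesting part is that the policy learns when to call each tool; here they are just called by hand.

```python
class AgentMemory:
    """Toy store exposing the six memory operations as callable tools."""

    def __init__(self):
        self.long_term = {}  # persistent facts: key -> text
        self.context = []    # short-term working context (list of strings)

    # Long-term storage tools
    def add(self, key, text):
        self.long_term[key] = text

    def update(self, key, text):
        if key not in self.long_term:
            raise KeyError(key)
        self.long_term[key] = text

    def delete(self, key):
        self.long_term.pop(key, None)

    # Context management tools
    def retrieve(self, query):
        hits = [t for t in self.long_term.values() if query.lower() in t.lower()]
        self.context.extend(hits)
        return hits

    def summary(self, keep_last=2):
        # Placeholder summarizer: collapse older items into one stub line.
        split = len(self.context) - keep_last
        if split > 0:
            older, recent = self.context[:split], self.context[split:]
            self.context = [f"[summary of {len(older)} items]"] + recent

    def filter(self, keyword):
        self.context = [t for t in self.context if keyword.lower() in t.lower()]


def call_tool(memory, name, **kwargs):
    """Dispatch a tool call emitted by the agent's policy."""
    tools = {
        "ADD": memory.add, "UPDATE": memory.update, "DELETE": memory.delete,
        "RETRIEVE": memory.retrieve, "SUMMARY": memory.summary,
        "FILTER": memory.filter,
    }
    return tools[name](**kwargs)


memory = AgentMemory()
call_tool(memory, "ADD", key="pref", text="User prefers Python")
print(call_tool(memory, "RETRIEVE", query="python"))
```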

It uses a three-stage progressive RL strategy. First, the model learns long-term memory storage. Then it masters short-term context management. Finally, it coordinates both under full task settings.

To handle the fragmented experiences from memory operations, they design a step-wise GRPO (Group Relative Policy Optimization) that transforms cross-stage dependencies into learnable signals.
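For readers new to GRPO, the core trick is to score each rollout relative to the other rollouts sampled for the same task, with no separate value network. The sketch below shows that standard group-relative normalization. Copying each rollout's advantage onto every one of its steps (`broadcast_to_steps`) is an assumption about what "step-wise" means here, not a detail confirmed by the post.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each rollout's reward against its own sampling group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def broadcast_to_steps(advantages, steps_per_rollout):
    """Assumed step-wise variant: every step inherits its rollout's advantage."""
    return [[a] * n for a, n in zip(advantages, steps_per_rollout)]


rewards = [1.0, 0.0, 1.0, 0.0]  # four rollouts of the same task
advantages = group_relative_advantages(rewards)
print(broadcast_to_steps(advantages, [2, 3, 1, 2]))
```

Under this assumption, memory operations early in a rollout receive the same learning signal as the final answer, which is one way a late reward can reach an early ADD or DELETE call.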

The results across five long-horizon benchmarks:

1. On Qwen2.5-7B, AgeMem achieves 41.96 average score compared to 37.14 for Mem0, a 13% improvement.

2. On Qwen3-4B, the gap widens: 54.31 vs 44.70. Adding long-term memory alone provides +10-14% gains.

3. Adding RL training adds another +6%.

4. The full unified system with both memory types achieves up to +21.7% improvement over no-memory baselines.

The unified memory management through learnable tool-based actions outperforms fragmented heuristic pipelines, enabling agents to adaptively decide what to remember and forget based on task demands.
7👍5👏2
New paper from Google, proving a novel theorem in algebraic geometry with an internal math-specialized version of Gemini. 

This was a collaboration between Google DeepMind (Professor Freddie Manners and Blueshift team) and Professors Jim Bryan, Balazs Elek, and Ravi Vakil.

Coauthor Professor Ravi Vakil, president of the American Mathematical Society, said that Gemini’s “proof was rigorous, correct, and elegant... the kind of insight I would have been proud to produce myself.”
👍2🔥2👏2
OMG! 1 billion cells. Illumina introduced the Billion Cell Atlas, creating the most comprehensive map of human disease biology — and unlocking unparalleled speed and scale in AI for drug discovery.

The Atlas will help researchers, including founding participants AstraZeneca, Merck, and Eli Lilly, study the effect of switching on and off all 20,000 genes in cells linked to diseases that have been historically difficult to decode.
1🔥1
Sakana AI introduced DroPE: Extending the Context of Pretrained LLMs by Dropping Their Positional Embeddings

Positional embeddings are just training wheels. They help convergence but hurt long-context generalization.

Sakana found that if you simply delete them after pretraining and recalibrate for < 1% of the original budget, you unlock massive context windows.

Paper
Code
2🔥2👏2
Agent Skills are now available in Google Antigravity

Skills are an open standard to extend what your agent can do. Whether it's project-specific workflows or global utilities, you can now package knowledge into reusable skills.
🔥5👏3👍2
How can we use new neuroscience insights to build adaptive AI agents and leverage the many foundation models (which are much like different brain areas)? Check out the paper.
🔥52🥰2
Anthropic is rolling out MCP Tool Search for Claude Code.

As MCP has grown into a more popular protocol and agents have become more capable, MCP servers may now have 50+ tools and take up a large amount of context.

Tool Search allows Claude Code to dynamically load tools into context when MCP tools would otherwise take up a lot of context.

How it works:
- Claude Code detects when your MCP tool descriptions would use more than 10% of context

- When triggered, tools are loaded via search instead of being preloaded

Otherwise, MCP tools work exactly as before.
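The trigger logic described above can be sketched roughly as follows. This is not Claude Code's implementation: the function names, the 4-characters-per-token estimate, and the keyword-overlap search are all assumptions made for illustration. Only the 10% threshold comes from the announcement.

```python
def estimate_tokens(text):
    """Very rough token estimate (assumption: about 4 characters per token)."""
    return max(1, len(text) // 4)


def should_defer(tools, context_window, threshold=0.10):
    """True when tool descriptions would exceed the share of context allowed."""
    total = sum(estimate_tokens(t["description"]) for t in tools)
    return total > threshold * context_window


def search_tools(tools, query, limit=3):
    """Hypothetical keyword search standing in for the real tool search."""
    words = set(query.lower().split())
    scored = []
    for tool in tools:
        text = (tool["name"] + " " + tool["description"]).lower()
        score = sum(1 for w in words if w in text)
        if score:
            scored.append((score, tool["name"]))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:limit]]


def tools_in_context(tools, context_window, query):
    """Preload everything when cheap; otherwise load only what the query needs."""
    if should_defer(tools, context_window):
        return search_tools(tools, query)
    return [t["name"] for t in tools]


tools = [
    {"name": "create_issue", "description": "create a github issue " * 50},
    {"name": "send_email", "description": "send an email message " * 50},
]
print(tools_in_context(tools, context_window=1000, query="email"))
```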

This resolves one of our most-requested features on GitHub: lazy loading for MCP servers.

Users were documenting setups with 7+ servers consuming 67k+ tokens.

If you're making an MCP server: things are mostly the same, but the "server instructions" field becomes more useful with tool search enabled. It helps Claude know when to search for your tools, similar to skills.

If you're making an MCP client: implementing the ToolSearchTool is highly recommended; you can find the docs here.

Anthropic implemented it with a custom search function to make it work for Claude Code.
🔥4🥰32👍2
Anthropic published 4th Economic Index report

This version introduces "economic primitives"—simple and foundational metrics on how AI is used: task complexity, education level, purpose (work, school, personal), AI autonomy, and success rates.

API data shows Claude is 50% successful at tasks of 3.5 hours, and highly reliable on longer tasks on Claude.ai.

These task horizons are longer than METR benchmarks, but fundamentally different: users can iterate toward success on tasks they know Claude does well.

Countries at different stages of economic development use Claude quite differently.

As GDP per capita increases, people use it more for work or personal use; as it decreases, they’re more likely to use AI for coursework.

Because Claude tends to better cover higher-skill tasks, if those get automated, workers may be left with more routine work—a “deskilling” effect.

However, this assumes that automation shrinks those aspects of the job; Anthropic can't be sure how jobs might evolve.
6🔥2👏2
ByteDance dropped a protein folding model better than Google's AlphaFold 3

SeedFold builds on top of AlphaFold3 and gets SOTA on FoldBench.

You can actually play with it and vibecode a 3D protein viewer.

The three main techniques they used are:
— 4xing the width of the Pairformer architecture of AF3
— More efficient linear triangular attention mechanism
— Distilling a 26.5M dataset from AF2 to increase training data

AlphaFold didn't directly participate in the latest CASP16 (2024), but most models that did well like Yang Lab (Shandong U), MULTICOM (UMissouri), Kiharalab (Purdue) and kozakovvajda (Stony Brook / Boston) are all based off AlphaFold3.

SeedFold has a decent chance of outperforming them and taking at least top 3 in CASP17 (2026) in various categories.
5👍2🔥2
This new research introduces UniversalRAG, a framework that retrieves and integrates knowledge from heterogeneous sources across diverse modalities and granularities.

Real-world queries vary widely in what knowledge they need. A universal RAG framework that dynamically routes to the right modality and granularity serves diverse information needs that no single-corpus approach can address.

Instead of forcing everything into one embedding space, UniversalRAG uses modality-aware routing. A router dynamically predicts which modality-specific corpus best matches the query, then performs targeted retrieval within it. This sidesteps the modality gap entirely by avoiding cross-modal comparisons.

Beyond modality, the framework also handles granularity. Complex analytical questions may need full documents or complete videos. Simple factoid questions are better served with paragraphs or short clips. UniversalRAG organizes each modality into multiple granularity levels: paragraphs and documents for text, clips and full videos for video, plus tables and images.

The router can be trained or training-free. The trained version uses inductive biases from existing benchmarks. The training-free version prompts frontier models like Gemini to predict the best modality-granularity pairs directly.
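A toy illustration of route-then-retrieve. The corpora, the keyword rules in `route`, and the word-overlap ranking are invented for this sketch; the actual router is either a trained model or a prompted frontier model, as described above.

```python
# Hypothetical corpora, keyed by (modality, granularity).
CORPORA = {
    ("text", "paragraph"): ["Paris is the capital of France."],
    ("text", "document"): ["Full report on European capitals and their history."],
    ("video", "clip"): ["clip: how to tie a bowline knot"],
    ("table", "table"): ["table: population by country, 2020"],
}


def route(query):
    """Stand-in router: pick one (modality, granularity) pair per query."""
    q = query.lower()
    if "show" in q or "watch" in q or "how to" in q:
        return ("video", "clip")
    if "how many" in q or "average" in q:
        return ("table", "table")
    if "explain" in q or "compare" in q:
        return ("text", "document")
    return ("text", "paragraph")


def retrieve(query, top_k=1):
    """Search only inside the routed corpus, so no cross-modal comparison."""
    key = route(query)
    words = set(query.lower().split())
    ranked = sorted(
        CORPORA[key],
        key=lambda doc: -len(words & set(doc.lower().split())),
    )
    return key, ranked[:top_k]


print(retrieve("how to tie a knot"))
```

Because ranking happens inside a single corpus, text passages are never scored against video clips, which is the sense in which the modality gap is sidestepped.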

Validation across 10 benchmarks spanning text, images, tables, and videos shows UniversalRAG outperforms both unimodal RAG baselines and unified embedding approaches by large margins on average.
4🔥4👍2
Google announced a deep learning approach that demonstrates the viability of smartwatches for estimation of walking metrics like gait speed and step length.

Wrist-worn devices can be as accurate as smartphones for continuous health tracking.
5🔥2💯2
Google's Titans architecture brings adaptive long-term memory to language models

Titans introduces a deep neural network (MLP) as a long-term memory module, separate from the main model.

This memory:
1. Updates its weights when encountering "surprising" information — tokens that deviate significantly from what the memory already encodes
2. Ignores routine, predictable tokens to maintain speed
3. Uses momentum to capture related context and adaptive forgetting to manage capacity
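The three properties above can be sketched with a much simpler memory than the paper's. This toy version assumes a single linear map instead of a deep MLP, uses the prediction error norm as a scalar stand-in for "surprise", and picks arbitrary hyperparameters. `SurpriseMemory` is a hypothetical name.

```python
class SurpriseMemory:
    """Linear associative memory updated by gradient 'surprise' at test time."""

    def __init__(self, dim, lr=0.1, momentum=0.5, forget=0.01):
        self.dim = dim
        self.lr, self.momentum, self.forget = lr, momentum, forget
        self.W = [[0.0] * dim for _ in range(dim)]  # memory weights
        self.S = [[0.0] * dim for _ in range(dim)]  # momentum of past surprise

    def read(self, key):
        return [sum(w * k for w, k in zip(row, key)) for row in self.W]

    def write(self, key, value):
        """Update weights in proportion to how badly the memory predicted value."""
        err = [p - v for p, v in zip(self.read(key), value)]
        for i in range(self.dim):
            for j in range(self.dim):
                grad = 2.0 * err[i] * key[j]  # d/dW[i][j] of ||W key - value||^2
                self.S[i][j] = self.momentum * self.S[i][j] - self.lr * grad
                # Forgetting decays old weights; surprise adds new information.
                self.W[i][j] = (1.0 - self.forget) * self.W[i][j] + self.S[i][j]
        return sum(e * e for e in err) ** 0.5  # scalar proxy for surprise


memory = SurpriseMemory(dim=2)
first = memory.write([1.0, 0.0], [1.0, 0.0])
second = memory.write([1.0, 0.0], [1.0, 0.0])
print(first, second)  # the repeated association is less surprising
```

A well-predicted input yields a near-zero gradient and barely changes the weights, which is how "ignoring routine tokens" falls out of the update rule rather than needing a separate filter.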

The "surprise metric" mirrors how human memory works: we forget the routine but retain the unexpected.

Why does it matter?

Standard transformers scale quadratically with context length. Linear RNNs and state-space models (like Mamba) scale efficiently but compress context into fixed-size states, losing information.

Titans combines both approaches:
- Attention handles precise short-term context
- The neural memory module compresses and retrieves long-range information
Inference cost stays linear.

Results
On the BABILong benchmark (reasoning across extremely long documents), Titans outperforms GPT-4 despite having far fewer parameters. The architecture scales effectively beyond 2 million tokens while maintaining stable accuracy.

Google also introduced MIRAS — a theoretical framework showing that transformers, RNNs, and SSMs are all variants of associative memory systems.

This opens the door to exploring non-Euclidean optimization objectives beyond standard MSE.
Potential applications: full-document analysis, genomic sequences, long-session agents, continuous context without chunking.
🔥43👍2👨‍💻1
MIT dropped a technique that makes ChatGPT reason like a team of experts instead of one overconfident intern.

It’s called “Recursive Meta-Cognition” and it outperforms standard prompts by 110%.

The problem with how you prompt AI:

- You ask one question. AI gives one answer. If it’s wrong, you never know.

- It’s like asking a random person on the street for medical advice and just… trusting them.

- No second opinion. No fact-checking. No confidence level.

The secret sauce is the confidence scoring. Every reasoning path gets a score from 0.0 to 1.0. Paths below 0.4? Rejected. Paths above 0.8? Trusted.
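The thresholding itself is easy to sketch. The function below is hypothetical, and the post does not say what happens to paths scoring between 0.4 and 0.8, so the "uncertain" bucket is an assumption.

```python
def triage(paths, reject_below=0.4, trust_above=0.8):
    """Sort (answer, score) pairs into trusted, uncertain, and rejected."""
    buckets = {"trusted": [], "uncertain": [], "rejected": []}
    for answer, score in paths:
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score out of range: {score}")
        if score < reject_below:
            buckets["rejected"].append(answer)
        elif score > trust_above:
            buckets["trusted"].append(answer)
        else:
            buckets["uncertain"].append(answer)  # assumed middle band
    return buckets


print(triage([("path A", 0.92), ("path B", 0.55), ("path C", 0.10)]))
```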

The multi-perspective check catches errors before they reach you. Most AI answers fail at least one of these. This framework catches them.

Best part: it doesn’t overthink simple questions. The system matches complexity to the problem. No wasted cycles.
4🔥4👏4
Another Chinese model fully trained on domestic chips, released by China Telecom

TeleChat3-36B-Thinking:
- Native support for the Ascend + MindSpore ecosystem
- Inspired by DeepSeek’s architecture design, bringing training stability and efficiency gains.
🔥5👏4💯4