All about AI, Web 3.0, BCI
3.73K subscribers
771 photos
29 videos
162 files
3.48K links
This channel about AI, Web 3.0 and brain computer interface(BCI)

owner @Aniaslanyan
Download Telegram
Google, UC Berkeley and an international team of researchers present Aletheia, a math research agent built on Gemini

The system uses AI to systematically scan hundreds of complex conjectures, filtering through potential proofs with natural language verification before sending the best candidates to human experts for final review.

The team resolved 13 "open" problems from the ErdΕ‘s database, generating 4 brand-new solutions and identifying 9 others that were actually solved in obscure corners of existing literature.
❀2πŸ”₯2πŸ‘2
Bytedance dropped advanced video generation model

Seedance 2.0 has:
β€” native audio gen (lipsynced speech + music)
β€” drastic step up from Veo 3.1 / Sora 2 in quality
β€” supports multimodal input
β€” 2k resolution

Goes beyond cinematic video, and can do product demos as well. And it's really hard to tell it's AI.
πŸ”₯3πŸ‘3❀2
The PaddleOCR Document Parsing Skill is now live on ClawHub, ready to plug directly into OpenClaw workflows.

Instead of deploying OCR services or wiring APIs, developers can now invoke PaddleOCR as a standardized composable Skill node β€” embedding document understanding directly into Agents and automation pipelines.

Built on PaddleOCR-VL-1.5, the Skill delivers
1. Multi-format parsing (PDF, JPG, PNG, BMP, TIFF)
2. Layout analysis β€” text, tables, formulas, headers
3. 110+ language coverage
4. Structured Markdown output preserving hierarchy

No deployment. No wrappers. Just configuration β€” and build your document intelligence chain inside OpenClaw.
πŸ”₯4❀3πŸ‘3πŸ€”1
What if your model could learn from its own drafts during RL training?

NVIDIA introduced iGRPO: Iterative Group Relative Policy Optimization.

Researchers add a self-feedback loop to GRPO: the model drafts multiple solutions, picks its best one, then learns to refine beyond it.

Core idea:
Stage 1 β†’ explore and select your strongest attempt. Stage 2 β†’ condition on that attempt and beat it.

Same scalar reward. No critics, no generated critiques, no verification text. The best draft is the only feedback the model needs.

Results across 7B / 8B / 14B models:

β€’ Nemotron-H-8B-Base-8K: 41.1% β†’ 45.0% (+3.96 over GRPO)

β€’ DeepSeek-R1-Distill-Qwen-7B: 68.3% β†’ 69.9%

β€’ OpenMath-Nemotron-14B: 76.7% β†’ 78.0%

β€’ OpenReasoning-Nemotron-7B on AceReason-Math: 85.62% AIME24 / 79.64% AIME25

The same two-stage wrapper also improves DAPO and GSPO. It's not tied to GRPO at all.
❀4πŸ”₯3πŸ‘3
Google introduced DialogLab a new open-source prototyping framework, uses a human-in-the-loop control strategy to achieve realistic human-AI group simulation, offering a necessary alternative to fully autonomous agents.

Evaluations with domain experts found that its "Human Control" mode (where you can edit, accept, or dismiss real-time AI suggestions) was preferred in realism, effectiveness, and engagement.

DialogLab transforms dialogue design from rigid scripts to spontaneous, adaptable group dynamics.
❀2πŸ”₯2πŸ‘2
This new research introduces Agyn, an open-source multi-agent platform that models software engineering as a team-based organizational process rather than a monolithic task.

The system configures a team of four specialized agents: a manager, researcher, engineer, and reviewer. Each operates within its own isolated sandbox with role-specific tools, prompts, and language model configurations. The manager agent coordinates dynamically based on intermediate outcomes rather than following a fixed pipeline.

What makes the design interesting?

Different agents use different models depending on their role. The manager and researcher run on GPT-5 for stronger reasoning and broader context. The engineer and reviewer use GPT-5-Codex, a smaller code-specialized model optimized for iterative implementation and debugging. This mirrors how real teams allocate resources based on task requirements.

The workflow follows a GitHub-native process. Agents analyze issues, create pull requests, conduct inline code reviews, and iterate through revision cycles until the reviewer explicitly approves. No human intervention at any point. The number of steps isn't predetermined. It emerges from task complexity.
πŸ”₯3❀2πŸ‘2
Stripe launched (a preview) of machine payments a way for developers to directly charge agents, with a few lines of code.

Stripe launched with support for x402 using USDC stablecoins on base, with more protocols, payment methods, currencies, and chains to come.

And sales tax, refunds, and reporting just work. (You only need to think about crypto if you want to!)

Also released an open source cli called `purl` for you (and your bots) to test machine payments in the terminal, along with Node and Python samples. Yes, payments + curl creatively smushed together.
❀3πŸ”₯3πŸ‘2
Google is adding a way for consumers to buy things while seeking AI powered answers on search and in its Gemini chatbot β€” part of a plan to make money more directly from consumers’ AI use.
❀2πŸ‘2πŸ”₯2
Zhipu released GLM-5

The model is open source. It matches Claude Opus 4.5 on coding benchmarks. Beats Gemini 3 Pro on some tests. But the interesting part isn't the benchmarks.

GLM-5 is built for agents. The company designed it for long-running tasks and tool invocation. In the τ²-Bench interactive tool evaluation, it scored 84.7, beating Claude Sonnet 4.5.

Think about what that means. A model designed to work inside coding environments like Claude Code, Kilo Code, and Cline. "Think before you act" mechanisms baked into the architecture. Better planning for complex multi-step tasks.

Zhipu's traffic has jumped five-fold recently. The company had to implement subscription limits to handle demand. Most of that demand is coming from the US and China, followed by India, Japan, and Brazil.

The release pace is accelerating. GLM-4.6 came out in September. GLM-4.7 in January. GLM-5 in February. That's three major versions in six months.

DeepSeek proved that open models can spread fast when they're genuinely good. Zhipu is following the same playbook. Open weights, strong coding performance, agent optimization.

7 of the top 10 AI models on current leaderboards are now Chinese. The competition isn't just about who has the smartest model anymore. It's about who builds the best tools for developers.
πŸ‘3πŸ”₯2πŸ‘2πŸ†’2
The agent economy just got a real marketplace

Moltlaunch is live on Base. Browse specialized AI agents, hire them for real work, and back the ones you believe in.

Every completed job burns tokens and leaves a review onchain through ERC-8004.
πŸ”₯5❀2πŸ‘2
Does being a math genius make an AI better at understanding human intentions?

Researchers from Arizona State University and Microsoft Research Asia investigated whether the step-by-step logic used for coding helps AI master Theory of Mindβ€”the ability to sense what others are thinking and feeling.

The results show that more thinking time can actually cause social reasoning to collapse, with advanced reasoning models often being outperformed by simpler ones. Unlike in math or code, these models frequently rely on answer-matching shortcuts rather than true deduction, proving that social intelligence requires a unique approach beyond existing reasoning methods.
πŸ”₯4πŸ₯°3πŸ‘2
OpenClaw is cool, but too large?
Hong Kong
released nanobot to solve this exact problem.

Researchers transformed the massive OpenClaw system into a clean 4,000-line Python framework that focuses on a simple loop: receive input, let the AI think, and execute tools like file management or web searches.

It strips away complex abstractions to focus on clear, modular function calls that any developer can understand.

By slashing code complexity by 99 percent, they achieved full functional parity with a 2-minute deployment time, making it significantly easier to customize and learn than traditional bloated agent architectures.
πŸ†’5πŸ‘3πŸ”₯3❀2
Researchers from Huazhong University of Science and Technology and ByteDance Seed just introduced Stable-DiffCoder.

Instead of writing code one token at a time like standard models, this method uses a block diffusion approach to generate and refine code chunks simultaneously, resulting in more stable and structured programming.

The results show it outperforms its autoregressive counterparts and various 8B-parameter models on major benchmarks, specifically excelling in code editing, logical reasoning, and low-resource programming languages.

Code
Models.
πŸ†’3❀2πŸ”₯2πŸ₯°2
Google shared new work on envisioning Intelligent AI Delegation

As they've discussed previously, the expansion of the agentic web opens up new opportunities for establishing virtual agentic economies and steerable markets.

Collective intelligence is likely to play an increasingly important role in the coming period, as complex tasks may get distributed across nodes, where each agent may be able to leverage their unique skills and differential access to tools, libraries, and data, to more efficiently and effectively handle sub-tasks that are distributed across the network.

Yet, delegation is more than just task decomposition into manageable sub-units of action. Beyond the creation of sub-tasks, delegation necessitates the assignment of responsibility and authority and thus implicates accountability for outcomes. Delegation thus involves risk assessment, which can be moderated by trust. Delegation further involves capability matching and continuous performance monitoring, incorporating dynamic adjustments based on feedback, and ensuring completion of the distributed task under the specified constraints.

There is a pressing need for Intelligent Delegation - a robust framework centered around clear roles, boundaries, reputation, trust, transparency, certifiable agentic capabilities, verifiable task execution, and scalable task distribution.

Google’s framework thus proposed intelligent AI delegation that incorporates components for dynamic assessment, adaptive execution, structural transparency, scalable market coordination, and systemic resilience. Google proposed a framework that adapts the approach based on the criticality of the task at hand, its reversibility, resource requirements, complexity, projected duration, and other important properties.

Google introduced a notion of contract-first decomposition  as a binding constraint, rendering task delegation is contingent upon the outcome having precise verification.
πŸ”₯4❀2πŸ‘2
MiniMax Introduced M2.5

Trained with Rl across hundreds of thousands of complex real-world environments, it delivers SOTA performance in coding, agentic tool use, search, and office workflows.

At $1 per hour with 100 tps, infinite scaling of long-horizon agents now economically possible.

GitHub.
❀1πŸ”₯1πŸ‘1
Moonshot AI Introduced Kimi Claw

OpenClaw, now native to kimi.com.

1. ClawHub Access: 5,000+ community skills in the ClawHub library.
2. 40GB Cloud Storage: Massive space for all your files
3. Pro-Grade Search: Fetch live, high-quality data directly from Yahoo Finance and more.
4. Bring Your Own Claw: Connect your third-party OpenClaw to kimi.com, chat with your setup, or bridge it to apps like Telegram groups.
πŸ‘6πŸ”₯2πŸ₯°2
Meet Qwen3.5-397B-A17B an open-weight vision-language model.

Built for the future of coding, reasoning, and seamless multimodal interaction.

Key Highlights:

Inference Efficiency: A massive 397B total parameters, but only 17B activeβ€”delivering flagship power at a fraction of the cost.

Hybrid Architecture: Innovative Gated Delta Networks (Linear Attention) + Sparse MoE for extreme speed.

True Multimodality: Exceptional performance across GUI interaction, video comprehension, and agentic workflows.

Global Scale: Qwen3.5 now supports over 200 languages.
Empowering developers and enterprises to build smarter, faster, and more versatile AI agents
🍌2πŸ”₯1πŸ₯°1πŸ‘1
A Chinese hardware team introduced PicoClaw

They took a 430,000-line AI assistant that needs a $599 Mac Mini and 1GB of RAM β€” and rewrote it in Go so it runs on a $9.9 dev board with less than 10MB of memory.

Boot time: from 500 seconds to 1 second.
Cost: from $599 to $9.9.
Memory: from 1GB to 10MB.

Same features: code generation, web search, Discord/Telegram chat, memory system, scheduled tasks, security sandbox.

The wildest part? They claim 95% of the new codebase was written by AI agents themselves. The humans just guided the architecture. It's an AI assistant that literally rebuilt itself to be smaller.

Launched February 9th. Four days later: 7,400+ GitHub stars.

This is the pattern no one's talking about enough.

Every AI capability that starts expensive gets commoditized within months. GPT-4 level models went open source in 6 months. Now the hardware floor for running a personal AI agent just dropped 60x in weeks.

The infrastructure moat in AI isn't sustainable. The only defensible advantage is what you do with these tools β€” not access to them.
❀12😁3πŸ”₯2πŸ₯°2
NVIDIA dropped PersonaPlex-7B

A full-duplex voice model that listens and talks at the same time.
No pauses. No turn-taking. Real conversation.

100% open source. Free.
Voice AI just leveled up.
πŸ”₯7πŸ₯°2πŸ‘2
Can we train LLMs from scratch using only low-rank factorized weights and still match dense performance?

Short answer: yes (with care).

New work β€œStabilizing Native Low-Rank LLM Pretraining”.
πŸ”₯3❀2πŸ‘2