All about AI, Web 3.0, BCI
This channel is about AI, Web 3.0, and brain-computer interfaces (BCI).

owner @Aniaslanyan
Google DeepMind shared a preprint on AMIE, its research diagnostic dialogue AI.

Researchers introduced a new asynchronous oversight paradigm, decoupling history-taking by AMIE from sharing a human-approved diagnosis.

AMIE can perform consultations with patients to gather information within guardrails (g-AMIE), abstaining from individualized medical advice. It then proposes a diagnosis and treatment plan, which licensed physicians authorize through an interface called the clinician cockpit.

The guardrailed AMIE (g-AMIE) multi-agent system consists of a multi-phase dialogue agent, a guardrail agent, and a SOAP note generation agent, based on Gemini 2.0 Flash.

The researchers evaluated this workflow in a virtual Objective Structured Clinical Examination (OSCE) study with oversight, contextualizing g-AMIE’s performance with control groups consisting of primary care physicians (PCPs) and nurse practitioners (NPs)/physician assistants/associates (PAs).

g-AMIE and the control groups (g-PCP and g-NP/PA) all operate under the same guardrails of not providing individualized medical advice during consultations and draft SOAP notes for handoff.

The work has several limitations and nuances: individualized medical advice is difficult to classify, the AI-focused workflow was unfamiliar to both control groups, oversight demands a high mental load, and the OSCE study was simulated.

Because of this, the results need to be interpreted with care and cannot be used to draw conclusions about the relative performance of the PCP, NP, and PA control groups.
Alibaba released Qwen3-Coder

This 480B-parameter Mixture-of-Experts model (35B active) natively supports 256K context and scales to 1M context with extrapolation.
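
As a quick back-of-the-envelope check on these figures, the share of parameters active per token can be computed directly. The numbers come from the post; the function name is purely illustrative.

```python
def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Fraction of a Mixture-of-Experts model's parameters used per token."""
    if total_params_b <= 0 or not (0 < active_params_b <= total_params_b):
        raise ValueError("expected 0 < active <= total")
    return active_params_b / total_params_b


# Figures quoted in the post: 480B total, 35B active.
qwen3_coder = active_fraction(480, 35)
print(f"{qwen3_coder:.1%} of parameters are active per token")
```

So only a small share of the weights participates in each forward pass, which is the usual motivation for the MoE design.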

It achieves top-tier performance across multiple agentic coding benchmarks among open models, including SWE-bench-Verified.

Alongside the model, Alibaba is also open-sourcing Qwen Code, a command-line tool for agentic coding. Forked from Gemini CLI, it includes custom prompts and function-call protocols to fully unlock Qwen3-Coder’s capabilities.
How can we teach embodied agents to think before they act?

ThinkAct is a hierarchical reasoning VLA (vision-language-action) framework that pairs a multimodal LLM (MLLM) for complex, slow reasoning with an action expert for fast, grounded execution.

Paper.
MIT introduced MEM1: RL for Memory Consolidation in Long-Horizon Agents.

Long-horizon agents (e.g., deep research, web agents) typically store all observations, actions, and intermediate thoughts in context. However, much of this information is unnecessary for subsequent reasoning, leading to inefficient memory usage and slower inference.

In MEM1, the researchers introduce an RL approach that trains the agent to maintain a dynamic internal state, which:

1. Consolidates and maintains only relevant information
2. Updates memory while reasoning
3. Discards unneeded history dynamically
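
The idea in the list above can be sketched as a toy loop. This is not MEM1's implementation: in the paper the consolidation is learned with RL, whereas here a hypothetical keyword filter (`keep_relevant`) stands in for it, and context size is measured in characters rather than tokens.

```python
from typing import Callable, List


def run_full_context(observations: List[str]) -> List[int]:
    """Baseline: every observation is appended, so context grows each step."""
    context: List[str] = []
    sizes: List[int] = []
    for obs in observations:
        context.append(obs)
        sizes.append(sum(len(c) for c in context))
    return sizes


def run_consolidated(
    observations: List[str], consolidate: Callable[[str, str], str]
) -> List[int]:
    """Context per step is only the internal state plus the newest observation."""
    state = ""
    sizes: List[int] = []
    for obs in observations:
        sizes.append(len(state) + len(obs))
        state = consolidate(state, obs)  # older history is discarded here
    return sizes


def keep_relevant(state: str, obs: str, keyword: str = "price", max_chars: int = 80) -> str:
    """Hypothetical stand-in for the learned policy: keep keyword-bearing text, capped."""
    if keyword in obs:
        state = (state + " " + obs).strip()
    return state[-max_chars:]


observations = [
    "page header and navigation",
    "price: 19.99 USD",
    "footer and cookie banner",
    "price: 17.50 USD on sale",
]
full = run_full_context(observations)
consolidated = run_consolidated(observations, keep_relevant)
print(full, consolidated)
```

The baseline context grows with every step, while the consolidated context stays bounded by the state cap plus one observation.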

The method achieves:
1. 3.7× lower memory usage & 1.78× faster inference on multi-question HotpotQA
2. 2.5× lower memory usage on WebShop

Code and model are fully open-sourced.

Paper.
Chat Annotator is a free chatbot where users can highlight parts of a response, leave a comment, and have the model incorporate that feedback into its next output. Powered by Cohere Command-A.
How do we train LLMs on real-world tasks where it’s hard to define a single verifiable answer?

Scale introduced Rubrics as Rewards (RaR) — a framework for on-policy post-training that uses structured, checklist-style rubrics as interpretable reward signals.
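
A minimal sketch of how a checklist-style rubric might be turned into a scalar reward. It assumes a simple weighted-fraction aggregation, which may differ from what RaR actually uses; `RubricItem`, `rubric_reward`, and the string checks (stand-ins for a judge model's yes/no verdicts) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RubricItem:
    description: str
    weight: float
    check: Callable[[str], bool]  # stand-in for a judge's yes/no verdict


def rubric_reward(response: str, rubric: List[RubricItem]) -> float:
    """Weighted fraction of rubric items the response satisfies, in [0, 1]."""
    total = sum(item.weight for item in rubric)
    if total <= 0:
        raise ValueError("rubric weights must sum to a positive value")
    earned = sum(item.weight for item in rubric if item.check(response))
    return earned / total


rubric = [
    RubricItem("Mentions a follow-up step", 2.0, lambda r: "follow-up" in r.lower()),
    RubricItem("States uncertainty", 1.0, lambda r: "may" in r.lower()),
    RubricItem("Under 200 characters", 1.0, lambda r: len(r) < 200),
]
reward = rubric_reward("This may need a follow-up test.", rubric)
print(reward)
```

Because each item is scored separately, the reward stays interpretable: one can see which criteria a response met and which it missed.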
ASI-Arch is presented by its authors as the first Artificial Superintelligence for AI Research, enabling fully automated neural architecture innovation.

No human-designed search space. No human in the loop.

Key Breakthroughs of ASI-Arch:

- Autonomous code generation & training
- 1,773 experiments conducted (20K+ GPU hours)
- 106 new SOTA linear attention architectures discovered
- Unveiled a scaling law for scientific discovery
Another massive open-source LLM is coming from a Chinese company. Meet Step 3 — multimodal LLM from StepFun:

1. MoE architecture (321B total params, 38B active)
2. Rivals OpenAI o3, Gemini 2.5 Pro, and Claude Opus 4 in performance
3. Optimized for China’s domestic AI chips

StepFun just announced: Step 3 will be open-sourced on July 31st!

This could be the best open-source multimodal LLM you’ll get your hands on.
Claude Code is getting a brand new feature: custom subagents

Subagents let you create teams of custom agents, each designed to handle specialized tasks.

Examples of subagents we’ve seen be useful are:

1. Software architect: helps design features elegantly and ensures appropriate layers of abstraction.

2. Code reviewer: reviews best practices in a codebase and deletes old code.

3. QA tester: runs unit tests and lints, and writes fixes.
A new world model from Meta: DINO-world, a generalist video world model that predicts the future in latent space.

Trained on uncurated videos with DINOv2, it learns diverse temporal dynamics (driving, indoors, sims), beats prior models on segmentation & depth, and even grasps intuitive physics.

It can be fine-tuned for action-conditioned planning.
Singapore's Sapient Intelligence released the Hierarchical Reasoning Model (HRM), which has a brain-inspired architecture.

With just 27M parameters and training on only 1K examples, it handles complex reasoning tasks such as extreme Sudoku and maze puzzles.

Code.
Chinese lab Z.ai released GLM-4.5 and GLM-4.5-Air, two open-source agentic models.

The 355B-parameter GLM-4.5 ranks first among open models and just behind o3 and Grok 4 overall.

It also excels at agentic tasks, with a 90% success rate in tool use.

API Pricing (per 1M tokens):
GLM-4.5: $0.6 Input / $2.2 Output
GLM-4.5-Air: $0.2 Input / $1.1 Output
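
To make the per-million-token pricing concrete, here is a small cost calculation. The prices are the ones listed above; the function name and the example token counts are illustrative assumptions.

```python
# USD per 1M tokens, as listed in the post.
PRICES_PER_MILLION = {
    "GLM-4.5": {"input": 0.6, "output": 2.2},
    "GLM-4.5-Air": {"input": 0.2, "output": 1.1},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one request at the listed per-million-token prices."""
    price = PRICES_PER_MILLION[model]
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000


# Hypothetical workload: 200K input tokens and 50K output tokens.
for name in PRICES_PER_MILLION:
    print(f"{name}: ${request_cost(name, 200_000, 50_000):.3f}")
```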

Weights
API
OpenRouter
Develop Tools.
Try them.
Eigent: the first open-source multi-agent workforce on your desktop.

Eigent is a team of AI agents collaborating to complete complex tasks in parallel.

It brings together specialized agents (developer, search, document, and multi-modal), each designed to work in parallel and adapt to your needs.

Eigent is built on CamelAI's open-source multi-agent infrastructure.

It supports:
- Running parallel tasks
- Custom workers
- Cloud version or "Bring Your Own Key" (BYOK)
- Local model deployment
- Human-in-the-loop feedback
- Model Context Protocol (MCP) tools
- Secure self-hosting
- Enterprise-level security

Eigent supports multiple deployment options:

- Cloud version with instant access and managed infrastructure
- Community edition for local hosting and customization
- Enterprise edition with SLAs, auditability, and scale
Coinbase and JPMorgan have partnered to bring crypto access to over 80 million Chase customers, introducing three methods:

- converting Chase Ultimate Rewards to USDC,
- funding Coinbase accounts with Chase credit cards,
- direct bank integration.

The integration of Ultimate Rewards to USDC offers a novel entry point, while credit card funding and direct bank links streamline transactions, potentially boosting adoption rates among mainstream users.