Moonshot AI released Kimi K2 Thinking: its open-source thinking agent model is here.
- SOTA on HLE (44.9%) and BrowseComp (60.2%)
- Executes 200–300 sequential tool calls without human intervention
- Excels in reasoning, agentic search, and coding
- 256K context window
Built as a thinking agent, K2 Thinking marks Moonshot's latest effort in test-time scaling — scaling both thinking tokens and tool-calling turns (sketched below).
Weights and code.
moonshotai.github.io
Kimi K2 Thinking
Kimi K2 Thinking, Moonshot's best open-source thinking model.
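A minimal sketch of what such a sequential tool-calling loop looks like in practice; `call_model` and `execute_tool` are hypothetical stand-ins, not Moonshot's actual API:

```python
# Hedged sketch of an interleaved thinking / tool-calling agent loop, as
# described in the post. call_model and execute_tool are hypothetical
# stand-ins, not Moonshot's actual API.

MAX_TURNS = 300  # K2 Thinking reportedly sustains 200-300 sequential tool calls

def call_model(messages):
    """Hypothetical: returns (thinking, tool_call or None, answer or None)."""
    raise NotImplementedError

def execute_tool(tool_call):
    """Hypothetical: runs the requested tool and returns an observation."""
    raise NotImplementedError

def run_agent(task):
    messages = [{"role": "user", "content": task}]
    for _ in range(MAX_TURNS):
        thinking, tool_call, answer = call_model(messages)
        messages.append({"role": "assistant", "content": thinking})
        if answer is not None:       # the model decided it is done
            return answer
        observation = execute_tool(tool_call)
        messages.append({"role": "tool", "content": observation})
    return None                      # turn budget exhausted
```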
Google to roll out Polymarket and Kalshi prediction markets data in search results.
The Block
Google Finance to roll out Polymarket and Kalshi prediction markets data in search results
Google said prediction markets data from leading platforms Polymarket and Kalshi will roll out over the coming weeks.
Sakana AI is building artificial life that can evolve: Petri Dish Neural Cellular Automata (PD-NCA) let multiple NCA agents learn and adapt during the simulation, not just after training.
Each cell updates its own parameters via gradient descent, turning morphogenesis into a living ecosystem of competing, cooperating, and ever-evolving entities—showing emergent cycles and persistent complexity growth.
GitHub
Petri Dish NCA
Petri Dish Neural Cellular Automata (PD-NCA) is a new ALife simulation substrate that replaces the fixed, non-adaptive morphogenesis of conventional NCA—where model parameters remain constant during development—with multi-agent open-ended growth, trained…
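A hedged sketch of the core idea above, each cell carrying its own parameters and taking gradient steps during the rollout rather than after it; the shapes and toy loss are assumptions, not the paper's exact setup:

```python
import torch

# Sketch of per-cell learning inside the simulation: every cell has a
# private weight matrix, updated by gradient descent at each rollout step.

H = W = 16      # grid size
STATE = 8       # per-cell state size

state = torch.randn(H, W, STATE)
# One private weight matrix per cell (H*W independent learners).
params = torch.randn(H, W, STATE, STATE, requires_grad=True)
opt = torch.optim.SGD([params], lr=1e-2)

target = torch.zeros(H, W, STATE)   # assumed morphogenesis target

for step in range(100):
    # Each cell updates its state with its own weights.
    new_state = torch.tanh(torch.einsum('hwij,hwj->hwi', params, state))
    loss = ((new_state - target) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()                       # learning happens *during* the simulation
    state = new_state.detach()       # roll the simulation forward
```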
DreamGym from Meta is a new framework that lets AI agents train via synthetic reasoning-based experiences instead of costly real rollouts.
It models environment dynamics, replays and adapts tasks, and even improves sim-to-real transfer.
Results: +30% gains on WebArena and PPO-level performance—using only synthetic interactions.
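A hedged sketch of the idea: an LLM-based experience model stands in for the real environment, so rollouts become cheap synthetic trajectories. `experience_model` and `policy` are hypothetical stand-ins for the paper's components:

```python
# Sketch of training on reasoning-based synthetic experience instead of
# costly real rollouts, per the DreamGym description above.

def experience_model(state, action):
    """Hypothetical LLM that reasons about environment dynamics and
    returns (next_state, reward, done)."""
    raise NotImplementedError

def synthetic_rollout(policy, initial_state, max_steps=50):
    state, trajectory = initial_state, []
    for _ in range(max_steps):
        action = policy(state)
        next_state, reward, done = experience_model(state, action)
        trajectory.append((state, action, reward))
        if done:
            break
        state = next_state
    # Fed to a standard RL update (e.g. PPO) in place of real interactions.
    return trajectory
```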
Google introduced Nested Learning: a new ML paradigm for continual learning that views models as nested optimization problems to enhance long-context processing.
A proof-of-concept model, Hope, shows improved performance in language modeling.
research.google
Introducing Nested Learning: A new ML paradigm for continual learning
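A toy illustration of the nested-optimization framing: two parameter groups updated at different frequencies, a fast inner level and a slow outer level. This sketches the idea only and is not Google's Hope architecture:

```python
import torch

# Two nested optimization levels: "fast" updates every step, "slow"
# accumulates gradients and updates 10x less often.

fast = torch.nn.Linear(16, 16)
slow = torch.nn.Linear(16, 16)
opt_fast = torch.optim.SGD(fast.parameters(), lr=1e-2)
opt_slow = torch.optim.SGD(slow.parameters(), lr=1e-3)

for step in range(1000):
    x = torch.randn(32, 16)
    loss = (slow(fast(x)) - x).pow(2).mean()  # toy reconstruction objective
    opt_fast.zero_grad()
    loss.backward()
    opt_fast.step()                  # inner level: update every step
    if (step + 1) % 10 == 0:
        opt_slow.step()              # outer level: accumulated, slower updates
        opt_slow.zero_grad()
```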
Alibaba introduced ReasonMed: the largest medical reasoning dataset, advancing LLM performance in clinical QA.
Comprising 370k curated examples distilled from 1.75M reasoning paths, ReasonMed is built through a multi-agent EMD (easy–medium–difficult) pipeline with generation, verification, and an Error Refiner that corrects faulty reasoning steps.
Experiments show that combining detailed CoT reasoning with concise answer summaries yields the most robust fine-tuning outcomes.
- Models trained on ReasonMed redefine the state of the art:
- ReasonMed-7B outperforms all sub-10B models by +4.17% and even beats LLaMA3.1-70B on PubMedQA (+4.60%).
- ReasonMed-14B maintains strong scaling efficiency and competitive accuracy.
HF.
GitHub.
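A hedged sketch of how the generation, verification, and Error Refiner stages described above might fit together; all three agents are hypothetical stand-ins, not Alibaba's code:

```python
# Sketch of the multi-agent pipeline: generate a reasoning path, verify
# it, and route failures through an Error Refiner until it passes.

def generate_cot(question):         # generator agent
    raise NotImplementedError

def verify(question, cot):          # verifier agent: returns (ok, errors)
    raise NotImplementedError

def refine(question, cot, errors):  # Error Refiner: fixes faulty steps
    raise NotImplementedError

def build_example(question, max_rounds=3):
    cot = generate_cot(question)
    for _ in range(max_rounds):
        ok, errors = verify(question, cot)
        if ok:
            return cot              # keep: verified reasoning path
        cot = refine(question, cot, errors)
    return None                     # discard unverifiable paths
```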
Moonshot AI: Quantization is not a compromise — it's the next paradigm.
After K2-Thinking's release, many developers have been curious about its native INT4 quantization format.
A Moonshot team member (and Zhihu contributor) shared an insider's view on why this choice matters — and why quantization today isn't just about sacrificing precision for speed.
In the context of LLMs, quantization is no longer a trade-off.
With the evolution of param-scaling and test-time-scaling, native low-bit quantization will become a standard paradigm for large model training.
Why does low-bit quantization matter?
In modern LLM inference, there are two distinct optimization goals:
• High throughput (cost-oriented): maximize GPU utilization via large batch sizes.
• Low latency (user-oriented): minimize per-query response time.
For Kimi-K2's MoE structure (with 1/48 sparsity), decoding is memory-bound — the smaller the weights, the less memory is read per token and the faster each decoding step.
FP8 weights (≈1 TB) already hit the limit of what a single high-speed interconnect GPU node can handle.
Switching to W4A16 (4-bit weights, 16-bit activations) cuts latency sharply while maintaining quality — a perfect fit for low-latency inference.
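Back-of-envelope arithmetic for why this matters in a memory-bound regime, assuming roughly 1T total parameters (an assumption consistent with the ~1 TB FP8 figure above):

```python
# Decode speed in the memory-bound regime scales with bytes read per token.
params = 1.0e12                   # ~1T total parameters (assumption)
fp8_tb  = params * 1.0 / 1e12     # FP8: 1 byte per weight   -> ~1.0 TB
int4_tb = params * 0.5 / 1e12     # INT4: 0.5 byte per weight -> ~0.5 TB
# With 1/48 sparsity only a fraction of the experts is read per token,
# but that traffic still scales linearly with bits per weight:
print(fp8_tb, int4_tb, fp8_tb / int4_tb)   # 1.0 0.5 2.0 -> 2x less traffic
```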
Why QAT over PTQ?
Post-training quantization (PTQ) worked well for shorter generations, but failed in longer reasoning chains:
• Error accumulation during long decoding degraded precision.
• Dependence on calibration data caused "expert distortion" in sparse MoE layers.
Thus, K2-Thinking adopted QAT for minimal loss and more stable long-context reasoning.
How does it work?
K2-Thinking uses weight-only QAT with fake quantization plus a straight-through estimator (STE).
The pipeline was fully integrated in just days — from QAT training → INT4 inference → RL rollout — enabling near lossless results without extra tokens or retraining.
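A minimal sketch of weight-only fake quantization with an STE, using the 1×32 scale groups mentioned further down; illustrative only, not Moonshot's training code:

```python
import torch

# Fake INT4 quantization: forward pass sees rounded weights, backward
# pass flows straight through (STE), so training adapts to the INT4 grid.

def fake_quant_int4(w, group=32):
    shape = w.shape
    g = w.reshape(-1, group)                        # one scale per 32 weights
    scale = g.abs().amax(dim=1, keepdim=True) / 7   # symmetric INT4 range
    q = (g / scale).round().clamp(-8, 7) * scale    # quantize -> dequantize
    # STE: forward uses q, backward treats the rounding as identity.
    return (g + (q - g).detach()).reshape(shape)

w = torch.randn(128, 256, requires_grad=True)
x = torch.randn(32, 256)
y = x @ fake_quant_int4(w).t()    # training sees INT4-rounded weights
y.sum().backward()                # w.grad is well-defined thanks to the STE
```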
INT4's hidden advantage in RL
Few people mention this: native INT4 doesn't just speed up inference — it accelerates RL training itself.
Because RL rollouts often suffer from "long-tail" inefficiency, INT4's low-latency profile makes those stages much faster.
In practice, each RL iteration runs 10-20% faster end-to-end.
Moreover, quantized RL brings stability: smaller representational space reduces accumulation error, improving learning robustness.
Why INT4, not MXFP4?
Kimi chose INT4 over "fancier" MXFP4/NVFP4 to better support non-Blackwell GPUs, with strong existing kernel support (e.g., Marlin).
At a quantization scale granularity of 1×32 (one scale per 32 weights), INT4 matches FP4 formats in expressiveness while being more hardware-adaptable.
Meta introduced Omnilingual Automatic Speech Recognition (ASR), a suite of models providing ASR capabilities for over 1,600 languages, including 500 low-coverage languages never before served by any ASR system.
While most ASR systems focus on a limited set of languages that are well-represented on the internet, this release marks a major step toward building a truly universal transcription system.
They've released a full suite of models and a dataset:
1. Omnilingual ASR: A suite of ASR models ranging from 300M to 7B parameters, supporting 1600+ languages
2. Omnilingual w2v 2.0: a 7B-parameter multilingual speech representation model that can be leveraged for other downstream speech-related tasks
3. Omnilingual ASR Corpus: a unique dataset spanning 350 underserved languages, curated in collaboration with their global partners
Pleias released SYNTH, a fully synthetic generalist dataset for pretraining, and two new SOTA reasoning models trained exclusively on it.
Despite having seen only 200 billion tokens, Baguettotron is currently best-in-class in its size range.
SYNTH is a radical departure from the classic pre-training recipe. At its core, it is an upsampling of Wikipedia's 50,000 "vital" articles.
SYNTH is a collection of several synthetic playgrounds: data is not generated through simple prompts but by integrating smaller fine-tuned models into workflows with seeding, constraints, and formal verifications/checks.
Synthetic playgrounds enabled a series of controlled experiments that led Pleias to favor an extreme-depth design: an 80-layer architecture for Baguettotron, with improvements across the board on memorization and logical reasoning.
Along with Baguettotron, Pleias released the smallest viable language model to date: Monad, a 56M transformer trained on the English part of SYNTH, with non-random performance on MMLU. Designing Monad was an engineering challenge that required a custom tiny tokenizer.
huggingface.co
PleIAs/SYNTH · Datasets at Hugging Face
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
Google has released a new "Introduction to Agents" guide, which discusses a "self-evolving" agentic system (Level 4).
"At this level, an agentic system can identify gaps in its own capabilities and create new tools or even new agents to fill them."
"At this level, an agentic system can identify gaps in its own capabilities and create new tools or even new agents to fill them."
AELLA is an open-science initiative to make scientific research accessible via structured summaries created by LLMs.
Available now:
- Dataset of 100K summaries
- 2 fine-tuned LLMs
- 3D visualizer.
This project spans many disciplines:
- bespoke model-training pipelines
- high-throughput inference systems
- protocols to ensure compute integrity and more.
Models.
Visualizer.
inference.net
Project OSSAS: Custom LLMs to process 100 Million Research Papers
Project OSSAS is a large-scale open-science initiative to make the world’s scientific knowledge accessible through AI-generated summaries of research papers.
ByteDance launched Doubao-Seed-Code, a model specifically designed for programming tasks.
It supports native 256K long context and has claimed the top spot on the SWE-Bench Verified leaderboard.
Volcengine
Volcengine: your AI cloud
Volcengine is ByteDance's cloud and AI services platform. In the AI era, it focuses on the Doubao large models and AI-native cloud technology, providing enterprises with one-stop services from agent development to deployment to support AI transformation and innovation.
A new paper from Yann LeCun. LeJEPA: Provable and Scalable Self-Supervised Learning Without the Heuristics. GitHub.
This could be one of LeCun's last papers at Meta (lol), but it's a really interesting one.
Quick summary:
Yann LeCun's big idea is JEPA, a self-supervised learning method. However, the approach has various failure modes, so training strong JEPA models is brittle, unstable, and quite difficult; as a result, JEPA has seen little adoption in practice.
This paper tries to directly address this, making specific design decisions that improve training stability.
The authors identify the isotropic Gaussian as the optimal distribution for a JEPA model's embeddings and design Sketched Isotropic Gaussian Regularization (SIGReg) to constrain embeddings toward that ideal distribution. This forms the LeJEPA framework, which can be implemented in ~50 lines of code.
On empirical tests, the authors demonstrate stability of training across hyperparameters, architectures, and datasets.
A result particularly interesting to me, however, is that training a LeJEPA model from scratch directly on the downstream dataset outperforms finetuning a DINOv2/v3 model on it!
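A hedged sketch of the SIGReg idea: test many random 1D projections of the embeddings against N(0, 1) and penalize deviations. The moment-matching penalty below is a simplification of the paper's actual statistic:

```python
import torch

# Push embeddings toward an isotropic Gaussian by checking random 1D
# projections against N(0, 1): variance 1, third moment 0, fourth moment 3.

def sigreg(z, num_dirs=256):
    z = z - z.mean(dim=0)                     # center the batch
    d = torch.randn(z.shape[1], num_dirs, device=z.device)
    d = d / d.norm(dim=0, keepdim=True)       # random unit directions
    p = z @ d                                 # (batch, num_dirs) projections
    var  = p.var(dim=0)
    m3   = (p ** 3).mean(dim=0)
    m4   = (p ** 4).mean(dim=0)
    return ((var - 1) ** 2 + m3 ** 2 + (m4 - 3) ** 2).mean()

emb = torch.randn(512, 128, requires_grad=True)   # stand-in encoder output
loss = sigreg(emb)     # added to the JEPA prediction loss during training
loss.backward()
```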
Japan’s first yen stablecoin issuer says stablecoin issuers could replace the Bank of Japan as major bond buyers.
Cointelegraph
Japan’s JPYC Says Stablecoins May Become Key Bond Buyers
JPYC projects yen stablecoin issuers will invest heavily in JGBs, potentially shaping liquidity and Japan’s bond-buying landscape.
Anthropic is investing $50B to build data centers in TX and NY, with sites coming online throughout 2026.
Anthropic
Anthropic invests $50 billion in American AI infrastructure
Anthropic is an AI safety and research company that's working to build reliable, interpretable, and steerable AI systems.
Last year, Google's AlphaProof and AlphaGeometry reached a key landmark in AI by achieving silver-medal-level performance at the International Mathematical Olympiad.
Today, Nature is publishing the methodology behind the AlphaProof agent.
Nature
Olympiad-level formal mathematical reasoning with reinforcement learning
Anthropic's applied AI team has a great write-up on improving Claude's frontend design via Skills.
Also with a Claude Code plugin that packages up the skill.
Claude
Improving frontend design through Skills | Claude
Best practices for building richer, more customized frontend design with Claude and Skills.
A new ByteDance + Yale + NYU + Tsinghua paper builds an LLM-based agent called AlphaResearch that searches for new algorithms instead of reusing known ones.
For each problem, AlphaResearch first writes a natural language idea for an algorithm and then turns that idea into code.
The big deal is that this setup lets an LLM push actual mathematical records using a simple loop of scoring ideas and executing code, and the same loop could also search for better algorithms in many other domains.
A reward model trained on peer review data scores each idea and filters out the weakest ones before coding.
An execution engine then runs the code, checks all constraints, and reports a numeric performance score.
The agent loops over this process, sampling old attempts, tweaking ideas and programs, and keeping any version that improves the score.
To measure progress, the authors build a benchmark of 8 open-ended algorithm problems with strong human baselines.
On this benchmark, AlphaResearch improves steadily and beats the best human constructions on 2 circle packing tasks, while still trailing people on the other 6.
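A hedged sketch of the propose, score, execute, keep loop described above; `propose_idea`, `reward_model`, and `run_and_score` are hypothetical stand-ins for the paper's components:

```python
import random

# Loop: propose an idea, filter with a learned reward model, execute the
# code, and keep whatever improves the score.

def propose_idea(problem, samples):   # LLM writes/mutates an idea + code
    raise NotImplementedError

def reward_model(idea):               # trained on peer-review data
    raise NotImplementedError

def run_and_score(code):              # execution engine: checks constraints
    raise NotImplementedError

def alpha_research(problem, iterations=1000, threshold=0.5):
    archive, best = [], None
    for _ in range(iterations):
        samples = random.sample(archive, min(4, len(archive)))
        idea, code = propose_idea(problem, samples)
        if reward_model(idea) < threshold:
            continue                  # drop weak ideas before paying for execution
        score = run_and_score(code)
        archive.append((idea, code, score))
        if best is None or score > best[2]:
            best = (idea, code, score)
    return best
```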
arXiv.org
AlphaResearch: Accelerating New Algorithm Discovery with Language Models
Large language models have made significant progress in complex but easy-to-verify problems, yet they still struggle with discovering the unknown. In this paper, we present AlphaResearch,...
The Czech National Bank has announced the establishment of a pilot digital asset portfolio totaling $1 million, comprising Bitcoin, a USD stablecoin, and a tokenized deposit.
Approved on October 30, the initiative plans to share insights within the next 2–3 years.
The central bank reportedly maintains this is the first instance of a central bank including Bitcoin on its balance sheet.
Coindesk
Bitcoin (BTC) Comes to Central Bank Balance Sheet as CNB Buys
The bank said it created a $1 million "test portfolio" of digital assets, mostly made up of bitcoin.
Google introduced SIMA 2: an agent that plays, reasons, and learns with you in virtual 3D worlds.
Powered by Gemini, it goes beyond following basic instructions to think, understand, and take actions in interactive environments – meaning you can talk to it through text, voice, or even images.
Google trained SIMA 2 to achieve high-level goals in a wide array of games – allowing it to perform complex reasoning and independently plan how to accomplish tasks.
It acts like a collaborative partner that can explain its intentions and answer questions about its behavior.
SIMA 2 is now far better at carrying out detailed instructions, even in worlds it's never seen before.
It can transfer a learned concept like “mining” in one game and apply it to “harvesting” in another – connecting the dots between similar tasks.
It even navigated unseen environments created in real time by the Genie 3 model.
SIMA 2 can teach itself new skills, learning through trial and error based on feedback from Gemini – getting better the more it plays, without additional human input.
SIMA 2 research offers a path towards applications in robotics and another step towards AGI in the real world.
Google DeepMind
SIMA 2: A Gemini-Powered AI Agent for 3D Virtual Worlds
Introducing SIMA 2, the next milestone in our research creating general and helpful AI agents. By integrating the advanced capabilities of our Gemini models, SIMA is evolving from an instruction-foll…
OpenAI developed a new way to train small AI models with internal mechanisms that are easier for humans to understand.
Language models like the ones behind ChatGPT have complex, sometimes surprising structures, and we don’t yet fully understand how they work.
In new research, the team trains “sparse” models—with fewer, simpler connections between neurons—to see whether their computations become easier to understand.
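A toy sketch of one way to train with sparse connections: keep only the largest weights by magnitude after each update so most entries stay exactly zero. This is a simplification for illustration, not OpenAI's exact method:

```python
import torch

# Train a layer while enforcing weight sparsity: after each optimizer
# step, zero all but the top 5% of weights by magnitude.

layer = torch.nn.Linear(64, 64)
opt = torch.optim.SGD(layer.parameters(), lr=1e-2)
DENSITY = 0.05                            # keep 5% of the weights

for step in range(1000):
    x = torch.randn(32, 64)
    loss = (layer(x) - x).pow(2).mean()   # toy objective
    opt.zero_grad()
    loss.backward()
    opt.step()
    with torch.no_grad():
        w = layer.weight
        k = int(DENSITY * w.numel())
        thresh = w.abs().flatten().kthvalue(w.numel() - k).values
        w.mul_((w.abs() > thresh).float())  # zero all but the largest weights
```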
OpenAI
Understanding neural networks through sparse circuits
We trained models to think in simpler, more traceable steps—so we can better understand how they work.