All about AI, Web 3.0, BCI
This channel is about AI, Web 3.0, and brain-computer interfaces (BCI)

owner @Aniaslanyan
xAI announces Grok 3. Here is everything you need to know

Elon mentioned that Grok 3 is an order of magnitude more capable than Grok 2.

Total GPUs: 200K

The capacity was doubled in 92 days!

All of this compute was used to improve Grok, which has led to Grok 3.

Grok 3 involved 10x more training than Grok 2!

Grok 3 finished pretraining in early January!

The model is still training.

Here are the benchmark numbers:

Grok 3 significantly outperforms other models in its category such as Gemini 2 Pro and GPT-4o. Even Grok 3 mini proves competitive.

Results of early Grok 3 in the Chatbot Arena (LMSYS)

It reached an Elo score of 1400, which no other model has achieved.

The model score keeps improving.

Grok 3 also has reasoning capabilities!

The Grok team has been testing these capabilities which they have unlocked using RL.

The model is good, especially in coding.

Grok 3 coding example:

Thinking traces are generated as the model tries to solve the problem.

Elon confirmed that the thinking steps are obscured to prevent them from being copied.

Grok 3 also excels at creative coding like generating creative and novel games.

Elon emphasized Grok 3's creative emergent capabilities.

You can also use Big Brain mode to apply more compute and deeper reasoning with Grok 3.

Grok 3 Reasoning performance:

The results correspond to the beta version of Grok-3 Reasoning.

It outperforms o1 and DeepSeek-R1 when given more test-time compute (allowing it to think longer).

The Grok 3 mini reasoning model is also very capable.

Grok 3 Reasoning Beta performance on AIME 2025.

Grok 3 shows generalization capabilities.

It not only does coding and math problem-solving, but it can also do other creative and useful real-world tasks.

One of the results generated with Grok 3 mini.

Bejeweled Tetris generated by Grok 3.

Grok 3 not only unlocks test-time compute; it also enables capable agents.

These capabilities have led to a new product called DeepSearch.

"Next generation of search agents to understand the universe".

More on DeepSearch:

- the model can think deeply about user intent
- decide what facts to consider
- decide how many websites to browse
- cross-validate different sources
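
The loop above could look roughly like this. This is a hypothetical sketch, not DeepSearch's actual interface: plan_queries, fetch, and extract_claims are stand-ins supplied by the caller.

```python
def cross_validate(claims_by_source, min_sources=2):
    """Keep only claims that at least min_sources independent sources agree on."""
    counts = {}
    for source, claims in claims_by_source.items():
        for claim in set(claims):
            counts[claim] = counts.get(claim, 0) + 1
    return {claim for claim, n in counts.items() if n >= min_sources}

def deep_search(intent, plan_queries, fetch, extract_claims, max_sites=5):
    steps = []                      # expose the search trace, step by step
    claims_by_source = {}
    for query in plan_queries(intent):
        for url in fetch(query)[:max_sites]:
            steps.append(f"browse {url} for '{query}'")
            claims_by_source[url] = extract_claims(url)
    return cross_validate(claims_by_source), steps

# Stubbed run: two sources agree on one claim, so only it survives validation.
facts, steps = deep_search(
    "why is the sky blue",
    plan_queries=lambda intent: ["rayleigh scattering"],
    fetch=lambda query: ["a.example", "b.example"],
    extract_claims=lambda url: ["scattering"] if url == "a.example"
                               else ["scattering", "unicorns"],
)
print(facts)  # {'scattering'}
```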

DeepSearch also exposes the steps that it takes to conduct the search itself.

Improvements will happen rapidly and almost daily according to the team.

There is also a Grok-powered voice app coming too -- about a week away!

Open-source approach:

The previous version will be open-sourced once the newest version is fully released.

Once the stable version of Grok 3 is out, Grok 2 will very likely be open-sourced (within a few months).

A dedicated SuperGrok app is also available with a polished experience.

Try it on the web as well: grok.com

The web version will include the latest Grok features.
Ilya Sutskever’s Startup Is Fundraising at $30 Billion-Plus Valuation

Ilya Sutskever is raising more than $1 billion for his startup at a valuation of over $30 billion, vaulting the nascent venture into the ranks of the world’s most valuable private technology companies.

Greenoaks Capital Partners is leading the deal for the startup, Safe Superintelligence, and plans to invest $500 million.

Greenoaks is also an investor in AI companies Scale AI and Databricks Inc.

The round marks a significant valuation jump from the $5 billion that Sutskever’s company was worth before. The financing talks are ongoing and the details could still change.

The company previously raised money from investors including Sequoia Capital and Andreessen Horowitz.

SSI focuses on developing safe AI systems. It isn’t generating revenue yet and doesn’t intend to sell AI products in the near future.

“This company is special in that its first product will be the safe superintelligence, and it will not do anything else up until then,” Sutskever told Bloomberg in June. “It will be fully insulated from the outside pressures of having to deal with a large and complicated product and having to be stuck in a competitive rat race.”

Sutskever was a key figure in the ouster of OpenAI Chief Executive Officer Sam Altman in 2023 before he helped Altman return.
#DeepSeek introduced NSA: A Hardware-Aligned and Natively Trainable Sparse Attention mechanism for ultra-fast long-context training & inference

Core components of NSA:

1. Dynamic hierarchical sparse strategy
2. Coarse-grained token compression
3. Fine-grained token selection

With optimized design for modern hardware, NSA speeds up inference while reducing pre-training costs—without compromising performance. It matches or outperforms Full Attention models on general benchmarks, long-context tasks, and instruction-based reasoning.
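
The two token stages can be sketched roughly like this. This is a toy illustration, not DeepSeek's implementation: the mean-pooling compressor, block size, and top-k values are stand-ins, and the real NSA adds a hardware-aligned hierarchical strategy on top.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sparse_attention(query, keys, values, block_size=4, top_k=2):
    """Coarse stage: compress each key block to its mean and score it.
    Fine stage: run normal attention over tokens in the top-k blocks only."""
    n, d = len(keys), len(query)
    blocks = [list(range(i, min(i + block_size, n))) for i in range(0, n, block_size)]
    # Coarse-grained token compression: one mean key per block.
    block_keys = [[sum(keys[t][j] for t in blk) / len(blk) for j in range(d)]
                  for blk in blocks]
    # Fine-grained token selection: keep only the top-k scoring blocks.
    scores = [dot(query, bk) for bk in block_keys]
    selected = sorted(range(len(blocks)), key=lambda b: scores[b], reverse=True)[:top_k]
    tokens = [t for b in selected for t in blocks[b]]
    weights = softmax([dot(query, keys[t]) for t in tokens])
    return [sum(w * values[t][j] for w, t in zip(weights, tokens))
            for j in range(len(values[0]))]

query = [1.0, 0.0]
keys = [[1.0, 0.0]] * 4 + [[-1.0, 0.0]] * 4
values = [[1.0, 1.0]] * 4 + [[0.0, 0.0]] * 4
print(sparse_attention(query, keys, values, block_size=4, top_k=1))  # [1.0, 1.0]
```

Each query touches only top_k * block_size tokens instead of all n, which is where the long-context savings come from.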
LangMem SDK: long-term memory for AI agents

Different types of memory enable different agent capabilities.

LangChain released an open-source library that implements some of these patterns so you can build agents that learn from interactions.

The LangMem SDK makes it easier to:

- Extract and manage semantic knowledge from conversations
- Auto-optimize prompt instructions and rules from interactions

Can be used with or without LangGraph.
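
The extract-and-manage pattern can be illustrated with a toy store. This is not LangMem's API: the three-word extractor below is a placeholder for what LangMem does with an LLM.

```python
class SemanticMemory:
    """Toy semantic store: (subject, predicate) -> object triples."""

    def __init__(self):
        self.facts = {}

    def extract(self, message):
        # Placeholder extractor: treat three-word messages as triples.
        # LangMem does this with an LLM instead of string splitting.
        words = message.split()
        if len(words) == 3:
            subject, predicate, obj = words
            self.facts[(subject, predicate)] = obj   # newer facts win

    def recall(self, subject):
        return {pred: obj for (subj, pred), obj in self.facts.items()
                if subj == subject}

memory = SemanticMemory()
for msg in ["Alice likes tea", "Alice uses vim", "Alice likes coffee"]:
    memory.extract(msg)
print(memory.recall("Alice"))  # {'likes': 'coffee', 'uses': 'vim'}
```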

Docs.
Microsoft presented Magma: A Foundation Model for Multimodal AI Agents

- SotA on UI navigation and robotic manipulation tasks
- Pretrained on a large dataset annotated with Set-of-Mark (SoM) for action grounding and Trace-of-Mark (ToM) for action planning.
Kimi introduced MoBA: Mixture of Block Attention for Long-Context LLMs

This innovative approach revolutionizes long-context processing in LLMs by combining the power of Mixture of Experts (MoE) with sparse attention.

MoBA achieves efficiency without sacrificing performance, making long-context tasks more scalable than ever.

Key features of MoBA:
1. Trainable block sparse attention: Capable of continued training from any current full attention model
2. Parameter-less gating mechanism: Seamlessly switches between full & sparse attention
3. Production-proven at kimi.ai, with a 6.5x speedup at 1M-token input
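
A back-of-the-envelope FLOPs comparison shows why routing each query to a few blocks pays off at long context. The block size and top-k below are illustrative, not Kimi's production config, and real wall-clock speedups (like the 6.5x above) are smaller than the raw FLOPs ratio because of memory and kernel overheads.

```python
def attention_flops(n, d):
    # Full attention: QK^T scores plus the weighted value sum, ~2 * n^2 * d MACs.
    return 2 * n * n * d

def moba_flops(n, d, block_size, top_k):
    # Each query scores one mean key per block (the gating step), then
    # attends densely only within its top_k selected blocks.
    n_blocks = n // block_size
    gating = n * n_blocks * d
    attend = 2 * n * (top_k * block_size) * d
    return gating + attend

n, d = 1_000_000, 128
full = attention_flops(n, d)
sparse = moba_flops(n, d, block_size=512, top_k=8)
print(f"FLOPs ratio: {full / sparse:.0f}x")  # ~197x on paper
```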
Baichuan-M1

- Open-sources a SotA medical LLM (Baichuan-M1-14B)
- Trained from scratch on 20T tokens with a dedicated focus on enhancing medical capabilities
Google's AI Co-Scientist: A New Era of Scientific Discovery

Google unveiled its latest innovation in artificial intelligence - the AI co-scientist. Built on Gemini 2.0, this sophisticated multi-agent system represents a significant leap forward in scientific research and discovery.

Unlike traditional AI coding assistants like GitHub Copilot, the AI co-scientist functions as a genuine research partner. This system can:

1. Generate Novel Scientific Hypotheses.

The system doesn't just analyze existing data - it proposes entirely new research directions. For instance, it can suggest innovative applications for existing drugs or identify previously unknown biological mechanisms.

2. Design Experimental Protocols. Going beyond theoretical proposals, the system can outline detailed experimental procedures to test its hypotheses, making it immediately practical for laboratory implementation.

3. Collaborate with Human Scientists.
The system engages in meaningful scientific dialogue, incorporating feedback and iteratively improving its proposals through natural language interaction with researchers.

The system has already demonstrated impressive results in several critical areas:

1. Cancer Treatment Research.
Successfully identified new applications for existing drugs in treating acute myeloid leukemia, with laboratory validation confirming the effectiveness of its proposals.

2. Liver Disease Treatment. Discovered novel epigenetic targets for liver fibrosis treatment, validated through tests on human liver organoids.

3. Antimicrobial Resistance. Independently proposed and correctly identified complex mechanisms of bacterial gene transfer, matching discoveries made through traditional laboratory research.

The AI co-scientist operates through a coalition of specialized agents:
- Generation
- Reflection
- Ranking
- Evolution
- Proximity
- Meta-review


Google is launching a Trusted Tester Program, opening access to research organizations worldwide. This initiative aims to further validate and expand the system's capabilities across various scientific domains.
Breakthrough in Robot Design: Universal Controllers Transform How We Build Robots?

Northwestern University researchers have made a significant breakthrough in robotics design, introducing a method that could revolutionize how we create and evolve robots.

Their paper "Accelerated co-design of robots through morphological pretraining" presents a novel approach that solves a decades-old challenge in robotics.

Code is coming soon.

And here are more robots

Key Innovations:

1. Universal Controller

- Developed a single controller that can work with multiple robot body types
- Pre-trained on millions of different robot morphologies
- Uses gradient-based optimization through differentiable simulation
- Can immediately adapt to new robot designs without extensive retraining

2. Zero-Shot Evolution

- Allows rapid testing of new robot body designs
- Enables immediate evaluation of design changes
- Supports successful recombination of robot parts
- Dramatically speeds up the design process

3. Diversity Maintenance

- Identified and solved "diversity collapse" - a previously unknown problem in robot co-design
- Developed methods to maintain morphological diversity while improving performance
- Enabled successful crossover between different robot designs

Technical Details:
- Controllers are trained on over 10 million distinct robot morphologies
- Uses differentiable simulation for gradient-based optimization
- Supports complex 3D environments with varying terrains
- Enables robots to perform adaptive behaviors like phototaxis (movement toward light)
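
The co-design idea can be sketched with a toy problem: one shared controller optimized against many sampled morphologies at once. Finite differences stand in for the paper's differentiable simulator, and the quadratic "gait cost" is made up for illustration.

```python
import random

def simulate_loss(controller, morphology):
    # Made-up stand-in for a rollout in a differentiable simulator:
    # each morphology has an ideal gait parameter vector, and the
    # controller pays a quadratic cost for missing it.
    return sum((c - m) ** 2 for c, m in zip(controller, morphology))

def pretrain(morphologies, dim=3, lr=0.1, steps=200, eps=1e-4):
    """One shared (universal) controller trained across all morphologies."""
    ctrl = [0.0] * dim

    def avg_loss(c):
        return sum(simulate_loss(c, m) for m in morphologies) / len(morphologies)

    for _ in range(steps):
        for i in range(dim):
            # Finite differences stand in for simulator gradients.
            bumped = ctrl[:]
            bumped[i] += eps
            grad = (avg_loss(bumped) - avg_loss(ctrl)) / eps
            ctrl[i] -= lr * grad
    return ctrl

random.seed(0)
morphologies = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(50)]
controller = pretrain(morphologies)
# The shared controller converges to the best single policy across all
# bodies, which is what lets new designs be evaluated zero-shot.
```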


Future Implications:

- Could dramatically accelerate robot design and development
- Opens new possibilities for self-reconfigurable robots
- Provides a framework for more complex multi-material robots
- May help bridge the simulation-to-reality gap in robotics
Arc Institute and Nvidia just released Evo 2, the largest AI model for biology. It is fully open source.

Evo 2 can predict which mutations in a gene are likely to be pathogenic, or even design entire eukaryotic genomes.
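
Variant-effect scoring with a sequence model typically compares the likelihoods the model assigns to the reference and the mutated sequence; a large drop flags the mutation as potentially disruptive. Here is a toy version, with a character bigram model standing in for Evo 2's actual genomic language model.

```python
import math
from collections import Counter

def train_bigram(corpus, alphabet="ACGT"):
    """Character bigram model with add-one smoothing (a toy stand-in
    for a real genomic language model)."""
    pair_counts = Counter(corpus[i:i + 2] for i in range(len(corpus) - 1))
    char_counts = Counter(corpus[:-1])
    return {a + b: (pair_counts[a + b] + 1) / (char_counts[a] + len(alphabet))
            for a in alphabet for b in alphabet}

def log_likelihood(model, seq):
    return sum(math.log(model[seq[i:i + 2]]) for i in range(len(seq) - 1))

model = train_bigram("ATGATGATGATGATGATG")
reference = "ATGATG"
mutant = "ATCATG"              # G -> C at position 3
delta = log_likelihood(model, mutant) - log_likelihood(model, reference)
print(delta < 0)  # True: the mutation makes the sequence less likely
```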

Preprint
GitHub
Nvidia BioNeMo
Evo Designer
Evo Mechanistic Interpretability Visualizer

Earlier model: Evo
HuggingFace released the "Ultra-Scale Playbook"

A free, open-source book covering everything about 5D parallelism, ZeRO, fast CUDA kernels, and how and why to overlap compute and communication. All scaling bottlenecks and tools are introduced with motivation, theory, interactive plots from 4000+ scaling experiments, and even NotebookLM podcasts to tag along with you.

- How was DeepSeek trained for only $5M?
- Why did Mistral train an MoE?
- Why is PyTorch's native Data Parallelism implementation so complex under the hood?
- What are all the parallelism techniques, and why were they invented?
- Should I use ZeRO-3 or Pipeline Parallelism when scaling, and what's the story behind both techniques?
- What is the Context Parallelism that Meta used to train Llama 3? Is it different from Sequence Parallelism?
- What is FP8? How does it compare to BF16?
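
On that last question: BF16 keeps FP32's 8-bit exponent with a 7-bit mantissa, while the two FP8 formats (E4M3, E5M2) trade precision and range for memory. The dynamic ranges follow directly from the bit layouts; per the OCP FP8 convention, E4M3 gives up inf encodings for extra range.

```python
def max_normal(exp_bits, man_bits, ieee_like=True):
    """Largest finite value of a small binary float format.
    ieee_like=True: the all-ones exponent is reserved for inf/NaN
    (BF16, FP8 E5M2). FP8 E4M3 instead reuses it for normal numbers
    and keeps a single NaN mantissa pattern, so its largest mantissa
    is one step smaller (OCP FP8 convention)."""
    bias = 2 ** (exp_bits - 1) - 1
    if ieee_like:
        max_exp = (2 ** exp_bits - 2) - bias
        max_mantissa = 2 - 2 ** -man_bits
    else:
        max_exp = (2 ** exp_bits - 1) - bias
        max_mantissa = 2 - 2 ** -(man_bits - 1)
    return max_mantissa * 2.0 ** max_exp

print(max_normal(8, 7))                   # BF16: ~3.39e38, same range as FP32
print(max_normal(5, 2))                   # FP8 E5M2: 57344.0
print(max_normal(4, 3, ieee_like=False))  # FP8 E4M3: 448.0
```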

The largest factor in democratizing AI will always be teaching everyone how to build AI, and in particular how to create, train, and fine-tune high-performance models. In other words, it means making the techniques that power all recent large language models accessible to everybody, and efficient training is possibly the most essential of them.
Sakana AI introduced the AI CUDA Engineer: Agentic CUDA Kernel Discovery, Optimization and Composition

The AI CUDA Engineer can produce highly optimized CUDA kernels, reaching 10-100x speedup over common machine learning operations in PyTorch.

The system is also able to produce highly optimized CUDA kernels that are much faster than the kernels commonly used in production.

Sakana AI believes that, fundamentally, AI systems can and should be as resource-efficient as the human brain, and that the best path to achieve this efficiency is to use AI to make AI more efficient!

The team also released a dataset of over 17,000 verified CUDA kernels produced by the AI CUDA Engineer.

Kernel archive webpage
Microsoft unveiled Muse, an AI that can generate minutes of unique game sequences from a single second of gameplay frames

It's the first World and Human Action Model that predicts 3D environments and actions for playable games.

The scale of training is mind-blowing:

— Trained on 1B+ gameplay images
— Used 7+ YEARS of continuous gameplay data
— Learned from real Xbox multiplayer matches

From a single second of gameplay + controller inputs, Muse can create multiple unique, playable sequences that follow actual game physics, mechanics, and rules.

The version shown in research was trained on just a single game (Bleeding Edge).
Wow, #DeepSeek announced Day 0: Warming up for #OpenSourceWeek

Starting next week, they'll be open-sourcing 5 repos, sharing sincere progress with full transparency.

These humble building blocks in their online service have been documented, deployed and battle-tested in production.

Daily unlocks are coming soon. No ivory towers - just pure garage-energy and community-driven innovation.
Meta presented MLGym: A New Framework and Benchmark for Advancing AI Research Agents

- The first Gym environment for ML tasks
- 13 diverse and open-ended AI research tasks from diverse domains

GitHub
Paper
Google dropped SigLIP 2, its most powerful image-text encoder

SigLIP 2 is the new version of SigLIP, Google's best open-source multimodal encoder, now on Hugging Face.

What's new?
> Improvements from new masked loss, self-distillation and dense features (better localization)
> Dynamic resolution with Naflex (better OCR).

You can use it for:
> image-to-image search
> text-to-image search
> image-to-text search
> image classification with open-ended classes
> training vision-language models
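
All of the search use cases reduce to nearest-neighbor lookup in a shared embedding space. A minimal sketch, where the toy vectors stand in for real SigLIP 2 image/text embeddings:

```python
import math

def cosine(a, b):
    num = sum(x * y for x, y in zip(a, b))
    return num / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def search(query_embedding, corpus):
    """Rank (name, embedding) pairs by similarity to the query embedding."""
    return sorted(corpus, key=lambda item: cosine(query_embedding, item[1]),
                  reverse=True)

# Toy vectors standing in for real model outputs.
images = [("cat.jpg", [0.9, 0.1, 0.0]), ("car.jpg", [0.0, 0.2, 0.9])]
text_embedding = [1.0, 0.0, 0.1]   # pretend: the encoding of "a photo of a cat"
ranked = search(text_embedding, images)
print([name for name, _ in ranked])  # ['cat.jpg', 'car.jpg']
```

Swapping the query between a text embedding and an image embedding gives text-to-image, image-to-image, and image-to-text search with the same lookup.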

SigLIP 2 comes in three sizes (base, large, giant), three patch sizes (14, 16, 32), and shape-optimized variants with Naflex.
As usual, it's supported by transformers from the get-go.

Models.
Chinese researchers introduced BEAMDOJO

It's a new reinforcement learning framework that teaches robots how to walk on uneven surfaces like stepping stones and balancing beams.

Paper.