All about AI, Web 3.0, BCI
This channel is about AI, Web 3.0, and brain-computer interfaces (BCI).

owner @Aniaslanyan
Google introduced Gemini Robotics, billed as the most advanced VLA (vision-language-action) model in the world

Gemini Robotics builds on Gemini 2.0, adding advanced vision-language-action capabilities for physically controlling robots.

The technology enables robots to understand and react to the physical world, performing tasks like desk cleanup through voice commands, as part of a broader push toward embodied AI.

Gemini Robotics-ER, a related model, enhances spatial understanding, allowing robots to adapt to dynamic environments and interact seamlessly with humans.

Tech report.
Hugging Face (LeRobot) & Yaak released the world's largest open-source self-driving dataset

To search the data, Yaak is launching Nutron, a tool for natural language search of robotics data. Check out the video to see how it works.

Natural language search of multi-modal data.
Open sourcing L2D dataset - 5,000 hours of multi-modal self-driving data.

Try Nutron.
The first quantum supremacy for a useful application: D-Wave's quantum computer performed a complex simulation in minutes, at a level of accuracy that would take nearly a million years to reach on the DOE supercomputer built with GPUs.

In addition, solving this problem on the classical supercomputer would require more than the world's annual electricity consumption.
IOSCO_AI_1741862094.pdf
1.5 MB
IOSCO Report: AI in Capital Markets - Uses, Risks, and Regulatory Responses

This report delves into the current and potential applications of AI within financial markets, outlines the associated risks and challenges, and examines how regulators and market participants are adapting to these changes.

The report cites regulatory approaches from Hong Kong, the EU, Canada, the US, Singapore, the Netherlands, the UK, Greece, Japan, Brazil, and Australia.

Key AI Applications in Financial Markets:

Decision-Making Support:
- Robo-advising (automated investment advice)
- Algorithmic trading
- Investment research and market sentiment analysis

Specific AI Use Cases.
Nasdaq:
Developed the Dynamic M-ELO AI-driven trading order, optimizing order holding time for improved execution efficiency.

Broker-Dealers.
Customer interaction via chatbots
Algorithmic trading enhancements
Fraud and anomaly detection

Asset Managers.
Automated investment advice
Investment research
Portfolio construction and optimization
Cohere introduced Command A: a new AI model that can match or outperform GPT-4o and DeepSeek-V3 on business tasks, with significantly greater efficiency.

Command A is an open-weights, 111B-parameter model with a 256k context window, focused on strong performance across agentic, multilingual, and coding use cases.

Runs on only 2 GPUs (vs. typically 32), offers 256k context length, supports 23 languages and delivers up to 156 tokens/sec.

API.
Transformers, but without normalization layers. New paper by Meta.
Baidu, the Google of China, dropped two models

1. ERNIE 4.5: beats GPT-4.5 at 1% of the price

2. Reasoning model X1: beats DeepSeek R1 at 50% of the price.

China continues to build intelligence too cheap to meter. The AI price war is on.
Microsoft has released this useful tool for performing R&D with LLM-based agents.
Xiaomi built a SOTA audio reasoning model using DeepSeek's GRPO RL algorithm, reaching 64.5% accuracy on the MMAU benchmark in just one week.

The breakthrough involves applying GRPO to the Qwen2-Audio-7B model, trained on 38,000 samples from Tsinghua University's AVQA dataset, marking a significant advancement in multimodal audio understanding.

The MMAU benchmark, introduced in 2024, tests models on complex audio tasks across speech, sound, and music, with even top models like Gemini Pro 1.5 achieving only 52.97% accuracy, highlighting the challenge Xiaomi's model addresses.
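
The core idea behind GRPO can be sketched in a few lines: several answers are sampled for the same prompt, and each answer's reward is scored relative to the rest of its group instead of against a learned value function. This is only an illustrative sketch, not Xiaomi's or DeepSeek's code. The function name, the example rewards, and the choice of population standard deviation are assumptions made for the example.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and std of its own group.

    `rewards` holds one scalar reward per sampled answer to the same prompt.
    `eps` (an assumption) avoids division by zero when all rewards are equal.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; implementations may differ here
    return [(r - mu) / (sigma + eps) for r in rewards]

# Hypothetical group of 4 sampled answers: 1.0 if correct, 0.0 otherwise.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Correct answers get a positive advantage and incorrect ones a negative advantage, so the policy update pushes toward answers that did better than their siblings.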
The total value of tokenized real-world assets has risen by ≈20% over the last 30 days. Data by RWA.xyz.

Total Real-World Asset (RWA) value continues to climb: over $18B is now tokenized onchain, excluding stablecoins.
A breakthrough in brain signal analysis that combines PCA and ANFIS to hit 99.5% accuracy in cognitive pattern recognition.

It could be a game-changer for #neuroscience, #BCI tech and clinical applications.
Mistral announced Small 3.1: multimodal, multilingual, Apache 2.0

Lightweight: Runs on a single RTX 4090 or a Mac with 32GB RAM, perfect for on-device applications.

Fast-Response Conversations: Ideal for virtual assistants and other applications where quick, accurate responses are essential.

Low-Latency Function Calling: Capable of rapid function execution within automated or agentic workflows.

Specialized Fine-Tuning: Customizable for specific domains.

Advanced Reasoning Foundation: Inspires community innovation, with models like DeepHermes 24B by Nous Research built on Mistral Small 3.
ByteDance Seed, Tsinghua, and the University of Hong Kong (HKU) open-sourced a new RL algorithm for building reasoning models.

DAPO-Zero-32B, a fully open-source RL reasoning model, surpasses DeepSeek-R1-Zero-Qwen-32B, and scores 50 on AIME 2024 with 50% fewer steps.

It is trained with zero-shot RL from the Qwen-32b pre-trained model.

Everything is fully open-sourced (algorithm, code, dataset, verifier, and model).
Cool research on open-source by Harvard

$4.15B invested in open source generates $8.8T of value for companies (that is, every $1 invested in open source creates roughly $2,000 of value).

Companies would need to spend 3.5 times more on software than they currently do if OSS did not exist.
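
The "$1 in, roughly $2,000 out" figure can be checked with quick arithmetic on the two numbers quoted above. A minimal sketch, using only the figures from the post (the variable names are illustrative):

```python
# Figures quoted in the post
invested_usd = 4.15e9   # $4.15B invested in open source
value_usd = 8.8e12      # $8.8T of value generated for companies

value_per_dollar = value_usd / invested_usd
print(f"${value_per_dollar:,.0f} of value per $1 invested")
```

The ratio comes out at a little over $2,100 per dollar, which the post rounds down to $2,000.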
Hugging Face and IBM introduced SmolDocling, an ultra-compact VLM for end-to-end multi-modal document conversion

SmolDocling is good for enterprise use cases:
- 256M parameters - cheap and easy to run locally
- Performs better than 20x larger models
- Fast inference using vLLM: average of 0.35 seconds per page on an A100 GPU
- Apache 2.0 license

Demo.
Biggest deal in Google/Alphabet history: Google is buying Wiz for $32B to beef up in cloud security

Wiz is an Israeli cloud security startup headquartered in New York City. The company was founded in January 2020.

This acquisition positions Google to better compete with AWS and Azure.
Anthropic is working on voice capabilities for Claude.

The company's chief product officer, Mike Krieger, told the Financial Times that Anthropic plans to launch experiences that let users talk to its AI models.
NVIDIA, Google DeepMind and Disney Research are collaborating to build an R2D2 style home droid.

Jensen giving the little guy voice and gesture commands live on stage.

The robot's name is Blue, and he is so cute.
Nvidia announced GR00T N1, the world's first open foundation model for humanoid robots

The power of a general robot brain, in the palm of your hand: with only 2B parameters, N1 learns from the most diverse physical action dataset ever compiled and punches above its weight:

- Real humanoid teleoperation data.
- Large-scale simulation data: we are open-sourcing 300K+ trajectories
- Neural trajectories: SOTA video generation models to "hallucinate" new synthetic data that features accurate physics in pixels. Using Jensen's words, "systematically infinite data"
- Latent actions: novel algorithms to extract action tokens from in-the-wild human videos and neural generated videos.

GR00T N1 is a single end-to-end neural net, from photons to actions:

- Vision-Language Model (System 2) that interprets the physical world through vision and language instructions, enabling robots to reason about their environment and instructions, and plan the right actions.
- Diffusion Transformer (System 1) that "renders" smooth and precise motor actions at 120 Hz, executing the latent plan made by System 2.
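
The split between a slow planner and a fast controller can be pictured as a two-rate loop. The sketch below is purely illustrative and is not NVIDIA's code: `plan` and `act` are hypothetical stand-ins for System 2 and System 1, and the 10 Hz planning rate is an assumption (the post only states the 120 Hz action rate).

```python
ACTION_HZ = 120  # System 1 action rate, as stated in the post
PLAN_HZ = 10     # ASSUMPTION: the post gives no planning rate for System 2

def plan(observation, instruction):
    """Hypothetical stand-in for System 2: produce a latent plan."""
    return {"goal": instruction, "planned_at": observation}

def act(latent_plan, observation):
    """Hypothetical stand-in for System 1: turn the plan into one action."""
    return f"tick {observation}: move toward '{latent_plan['goal']}'"

def run(seconds, instruction):
    """Re-plan at PLAN_HZ while emitting one action per tick at ACTION_HZ."""
    ticks_per_plan = ACTION_HZ // PLAN_HZ
    latent_plan = None
    plan_count = 0
    actions = []
    for tick in range(seconds * ACTION_HZ):
        observation = tick  # placeholder for camera and sensor input
        if tick % ticks_per_plan == 0:
            latent_plan = plan(observation, instruction)
            plan_count += 1
        actions.append(act(latent_plan, observation))
    return plan_count, actions

plan_count, actions = run(1, "pick up the cup")
print(plan_count, len(actions))
```

Under these assumed rates, one second of control yields 10 plans and 120 actions, so each plan is reused for 12 consecutive motor steps.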

Code.
Weights on HF.
Open Physical AI dataset release.
Blog.
Nvidia also introduced Newton, an open-source physics engine developed by NVIDIA and Google DeepMind and designed to accelerate robot learning and development.

Newton is built on NVIDIA Warp and is meant to help robots learn to handle complex tasks with greater precision. It is compatible with learning frameworks such as MuJoCo Playground and NVIDIA Isaac Lab, an open-source, unified framework for robot learning.

Disney Research will be one of the first to use Newton to advance its robotic character platform.

GitHub.