All about AI, Web 3.0, BCI
This channel is about AI, Web 3.0, and brain–computer interfaces (BCI).

owner @Aniaslanyan
Meta introduced NaturalThoughts

Data curation for general reasoning capabilities is still relatively underexplored.


Researchers systematically compare different metrics for selecting high-quality and diverse reasoning traces in terms of data efficiency in the distillation setting.

Researchers find that diversity in reasoning strategies matters more than topic diversity, and that challenging questions are more sample-efficient for distilling reasoning capabilities.

They find that the "Less Is More" approach is not sufficient for solving general reasoning tasks, while scaling up data quantity consistently brings gains.

They find that NaturalThoughts outperforms state-of-the-art reasoning datasets such as OpenThoughts3, LIMO, and s1K on general STEM domains.

They also find that distillation based on reasoning difficulty can improve the Pareto frontier of the student model's inference efficiency.

Training on a mix of full reasoning traces and condensed answers enables efficient hybrid reasoning in the student model, which adaptively switches between long chain-of-thought thinking and answering directly.
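
A minimal sketch (not Meta's released code) of the hybrid-reasoning mix described above: keep the teacher's full reasoning trace for a fraction of examples and only the condensed answer for the rest, so the distilled student learns both modes. The tag format, mixing ratio, and helper names are assumptions for illustration.

```python
import random

# Hypothetical sketch: build a mixed SFT set where a fraction of examples keep
# the teacher's full reasoning trace and the rest keep only the condensed final
# answer, so the student learns both think and direct-answer modes.
THINK_RATIO = 0.5  # assumed mixing ratio

def to_sft_example(question, reasoning_trace, final_answer, keep_trace):
    """Format one distillation example in a chat-style layout."""
    if keep_trace:
        target = f"<think>\n{reasoning_trace}\n</think>\n{final_answer}"
    else:
        target = final_answer  # condensed answer only: direct-answer mode
    return {"messages": [
        {"role": "user", "content": question},
        {"role": "assistant", "content": target},
    ]}

def build_hybrid_mix(teacher_traces, seed=0):
    """teacher_traces: iterable of (question, reasoning_trace, final_answer)."""
    rng = random.Random(seed)
    return [
        to_sft_example(q, trace, answer, keep_trace=rng.random() < THINK_RATIO)
        for q, trace, answer in teacher_traces
    ]
```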
Meta introduced research on embodied AI agents that can perceive, learn, act and interact in the virtual and physical worlds.
HeyGen launched a new Video Agent that handles content production end-to-end

Using just a doc, some footage, or even a sentence, it can find a story, write the script, select shots/generate new footage, and edit everything for final release.
Genspark just launched AI Docs, completing their suite with AI Slides and Sheets.

It's similar to the Gemini integration in Google Docs, but with a much better UX: the AI acts more like a creative partner than a one-shot generative tool, so you iterate on the output together instead of prompting once and editing the result. It also supports Markdown.
The Hong Kong Stablecoin Ordinance will officially take effect on August 1 this year

The Hong Kong Monetary Authority will open license applications. Only a single-digit number of licenses is expected to be issued, yet more than 40 companies are currently preparing to apply.

The applicants are mostly China's largest financial institutions and internet companies.
OpenAI published "Working with 400,000 teachers to shape the future of AI in schools"

OpenAI is joining the American Federation of Teachers as the founding partner of the National Academy for AI Instruction, a five-year initiative to equip 400,000 K-12 educators. OpenAI is contributing $10 million over five years ($8 million in direct funding and $2 million in in-kind resources), alongside the United Federation of Teachers, Microsoft, and Anthropic, which also support the initiative.
New Mistral Cookbook: Finetuning Pixtral on a satellite imagery dataset 🛰️

- How to call Mistral's batch inference API
- How to pass images (encoded in base64) in your API calls to Mistral's VLM (here Pixtral-12B)
- How to fine-tune Pixtral-12B on an image classification problem in order to improve its accuracy.
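
A hedged sketch of the second item: sending a base64-encoded image to a Mistral vision model over the chat completions endpoint. The model id and payload shape follow Mistral's public API docs; the batch-inference and fine-tuning steps differ, so check the cookbook for those.

```python
import base64
import os
import requests

API_KEY = os.environ["MISTRAL_API_KEY"]

# Encode a local image (e.g. a satellite tile) as base64.
with open("satellite_tile.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

payload = {
    "model": "pixtral-12b-2409",  # assumed model id
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "What land-cover class is shown in this tile?"},
            {"type": "image_url", "image_url": f"data:image/png;base64,{image_b64}"},
        ],
    }],
}

resp = requests.post(
    "https://api.mistral.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=payload,
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```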
HuggingFace released SmolLM3: a strong, smol reasoner

> SoTA 3B model
> dual mode reasoning (think/no_think)
> long context, up to 128k
> multilingual: en, fr, es, de, it, pt
> fully open source (data, code, recipes)
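
A rough usage sketch, assuming the checkpoint is published as HuggingFaceTB/SmolLM3-3B and that its chat template exposes an enable_thinking switch for the think/no_think modes; see the model card for the exact mechanism.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HuggingFaceTB/SmolLM3-3B"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Explain KV caching in two sentences."}]
prompt = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=False,
    enable_thinking=False,  # assumed flag; flip to True for extended reasoning
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```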
The biggest dataset of human-written GPU code, all open source? Yes! GPU MODE has released around 40k human-written code samples spanning Triton, HIP, and PyTorch, and it's all open. Train the next GPT to make GPTs faster.
xAI announced Grok 4

Here is everything you need to know:

Elon claims that Grok 4 is smarter than almost all grad students across all disciplines simultaneously, with 100x more training compute than Grok 2 and 10x more RL compute than any other model out there.

Performance on Humanity's Last Exam. Elon: "Grok 4 is post-grad level in everything!"

Scaling HLE with training compute (no tools): more compute, higher intelligence.

With native tool calling, Grok 4 increases the performance significantly.
It's important to give AI the right tools. The scaling is clear.

Reliable signals are key to making RL work. There is still the challenge of data. Elon: "Ultimate reasoning test is AI operating in reality."

Scaling test-time compute. More than 50% of the text-only subset of the HLE problems are solved.
The curves keep getting more ridiculous.

Grok 4 is the single-agent version.
Grok 4 Heavy is the multi-agent version. Multi-agent systems are no joke.

Grok 4 uses all kinds of references like papers, reads PDFs, and reasons about the details of the simulation and about what data to use.

Grok 4 Heavy performance is higher than Grok 4, but needs to be improved further. It's one of the weaknesses, according to the team.

Grok 4 Heavy is available as the SuperGrok Heavy tier:
$30/month for SuperGrok
$300/month for SuperGrok Heavy

Voice updates included, too!

Grok feels snappier and is designed to be more natural.
- 2x faster
- 5 voices
- 10x daily user seconds.

Grok 4 models are available via the xAI API. 256K context window. Real-time data search.
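
A hedged sketch of calling Grok 4 through the xAI API, which exposes an OpenAI-compatible endpoint; the model name "grok-4" is an assumption, so check the xAI docs for the exact identifier and context-window limits.

```python
import os
from openai import OpenAI

# xAI serves an OpenAI-compatible API; the model id below is assumed.
client = OpenAI(
    api_key=os.environ["XAI_API_KEY"],
    base_url="https://api.x.ai/v1",
)

response = client.chat.completions.create(
    model="grok-4",  # assumed model id
    messages=[
        {"role": "system", "content": "You are a concise research assistant."},
        {"role": "user", "content": "Summarize the latest results on Humanity's Last Exam."},
    ],
    max_tokens=512,
)
print(response.choices[0].message.content)
```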

Grok 4 for Gaming!
Video understanding is an area the team is improving, so it will get better.

What is next?

- Smart and fast will be the focus.

- Coding models are also a big focus.

- More capable multi-modal agents are coming too.

- Video generation models are also on the horizon.
Google introduced new models for research & development of health applications:

1. MedGemma 27B Multimodal, for complex multimodal & longitudinal EHR interpretation

2. MedSigLIP, a lightweight image & text encoder for classification, search, & related tasks.
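
A hedged zero-shot classification sketch with a SigLIP-style encoder via transformers; the repo id google/medsiglip-448 is an assumption, so check the official release for the exact checkpoint name and preprocessing settings.

```python
import torch
from PIL import Image
from transformers import AutoModel, AutoProcessor

model_id = "google/medsiglip-448"  # assumed repo id
model = AutoModel.from_pretrained(model_id)
processor = AutoProcessor.from_pretrained(model_id)

image = Image.open("chest_xray.png").convert("RGB")
labels = ["normal chest X-ray", "chest X-ray showing pneumonia"]

inputs = processor(text=labels, images=image, padding="max_length", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# SigLIP scores image-text pairs with a sigmoid over similarity logits.
scores = torch.sigmoid(outputs.logits_per_image)[0]
for label, score in zip(labels, scores.tolist()):
    print(f"{label}: {score:.3f}")
```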
Mistral announced Devstral Small and Medium 2507 with upgraded agentic coding capabilities

Available on Hugging Face.
Salesforce introduced GTA1 – a new GUI Test-time Scaling Agent that is now #1 on the OSWorld leaderboard with a 45.2% success rate, outperforming OpenAI’s CUA o3 (42.9%).