All about AI, Web 3.0, BCI
This channel is about AI, Web 3.0, and brain–computer interfaces (BCI).

owner @Aniaslanyan
Meta introduced NaturalThoughts

Data curation for general reasoning capabilities is still relatively underexplored.


Researchers systematically compare different metrics for selecting high-quality and diverse reasoning traces in terms of data efficiency in the distillation setting.

Researchers find that diversity in reasoning strategies matters more than topic diversity, and that challenging questions are more sample-efficient for distilling reasoning capabilities.

They find that the "Less Is More" approach is not sufficient for solving general reasoning tasks, while scaling up data quantity consistently brings gains.

They find that NaturalThoughts outperforms state-of-the-art reasoning datasets such as OpenThoughts3, LIMO, and s1K on general STEM domains.

They also find that distillation based on reasoning difficulty can improve the Pareto frontier of the student model's inference efficiency.

Training on a mix of full reasoning traces and condensed answers enables efficient hybrid reasoning in the student model, which adaptively switches between long chain-of-thought thinking and answering directly.
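
A minimal sketch (not Meta's released code) of the hybrid-reasoning mix described above: keep the teacher's full reasoning trace for a fraction of examples and only the condensed answer for the rest, so the distilled student learns both modes. The tag format, mixing ratio, and helper names are assumptions for illustration.

```python
import random

# Hypothetical sketch: build a mixed SFT set where a fraction of examples keep
# the teacher's full reasoning trace and the rest keep only the condensed final
# answer, so the student learns both think and direct-answer modes.
THINK_RATIO = 0.5  # assumed mixing ratio

def to_sft_example(question, reasoning_trace, final_answer, keep_trace):
    """Format one distillation example in a chat-style layout."""
    if keep_trace:
        target = f"<think>\n{reasoning_trace}\n</think>\n{final_answer}"
    else:
        target = final_answer  # condensed answer only: direct-answer mode
    return {"messages": [
        {"role": "user", "content": question},
        {"role": "assistant", "content": target},
    ]}

def build_hybrid_mix(teacher_traces, seed=0):
    """teacher_traces: iterable of (question, reasoning_trace, final_answer)."""
    rng = random.Random(seed)
    return [
        to_sft_example(q, trace, answer, keep_trace=rng.random() < THINK_RATIO)
        for q, trace, answer in teacher_traces
    ]
```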
Meta introduced research on embodied AI agents that can perceive, learn, act and interact in the virtual and physical worlds.
HeyGen launched a new Video Agent that handles content production end-to-end

Using just a doc, some footage, or even a sentence, it can find a story, write the script, select shots/generate new footage, and edit everything for final release.
Genspark just launched AI Docs, completing their suite with AI Slides and Sheets.

It's similar to the Gemini integration in Google Docs, but with a much better UX: the AI acts more like a creative partner than a one-shot generative tool, so you iterate on the output together instead of prompting once and editing the result. It also supports Markdown.
The Hong Kong Stablecoin Ordinance will officially take effect on August 1 this year

The Hong Kong Monetary Authority will open license applications. Only a single-digit number of licenses is expected to be issued, yet more than 40 companies are currently preparing to apply.

The applicants are mostly China's largest financial institutions and internet companies.
OpenAI published "Working with 400,000 teachers to shape the future of AI in schools"

OpenAI is joining the American Federation of Teachers as the founding partner of the National Academy for AI Instruction, a five-year initiative to equip 400,000 K-12 educators. OpenAI is contributing $10 million over five years ($8 million in direct funding and $2 million in in-kind resources), alongside the United Federation of Teachers, Microsoft, and Anthropic, which also support the initiative.
New Mistral Cookbook: Finetuning Pixtral on a satellite imagery dataset 🛰️

- How to call Mistral's batch inference API
- How to pass images (encoded in base64) in your API calls to Mistral's VLM (here Pixtral-12B)
- How to fine-tune Pixtral-12B on an image classification problem in order to improve its accuracy.
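
A hedged sketch of the second item: sending a base64-encoded image to a Mistral vision model over the chat completions endpoint. The model id and payload shape follow Mistral's public API docs; the batch-inference and fine-tuning steps differ, so check the cookbook for those.

```python
import base64
import os
import requests

API_KEY = os.environ["MISTRAL_API_KEY"]

# Encode a local image (e.g. a satellite tile) as base64.
with open("satellite_tile.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

payload = {
    "model": "pixtral-12b-2409",  # assumed model id
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "What land-cover class is shown in this tile?"},
            {"type": "image_url", "image_url": f"data:image/png;base64,{image_b64}"},
        ],
    }],
}

resp = requests.post(
    "https://api.mistral.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=payload,
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```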
HuggingFace released SmolLM3: a strong, smol reasoner

> SoTA 3B model
> dual mode reasoning (think/no_think)
> long context, up to 128k
> multilingual: en, fr, es, de, it, pt
> fully open source (data, code, recipes)
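
A rough usage sketch, assuming the checkpoint is published as HuggingFaceTB/SmolLM3-3B and that its chat template exposes an enable_thinking switch for the think/no_think modes; see the model card for the exact mechanism.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HuggingFaceTB/SmolLM3-3B"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Explain KV caching in two sentences."}]
prompt = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=False,
    enable_thinking=False,  # assumed flag; flip to True for extended reasoning
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```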
The biggest dataset of human-written GPU code, all open source? Yes! GPU MODE has released around 40k human-written code samples spanning Triton, HIP, and PyTorch, and it's all open. Train the next GPT to make GPTs faster.
xAI announced Grok 4

Here is everything you need to know:

Elon claims that Grok 4 is smarter than almost all grad students across all disciplines simultaneously, with 100x more training compute than Grok 2 and 10x more RL compute than any other model out there.

Performance on Humanity's Last Exam. Elon: "Grok 4 is post-grad level in everything!"

Scaling HLE with training compute (no tools): more compute, higher intelligence.

With native tool calling, Grok 4 increases the performance significantly.
It's important to give AI the right tools. The scaling is clear.

Reliable signals are key to making RL work. There is still the challenge of data. Elon: "Ultimate reasoning test is AI operating in reality."

Scaling test-time compute. More than 50% of the text-only subset of the HLE problems are solved.
The curves keep getting more ridiculous.

Grok 4 is the single-agent version.
Grok 4 Heavy is the multi-agent version. Multi-agent systems are no joke.

Grok 4 uses all kinds of references like papers, reads PDFs, and reasons about the details of the simulation and about what data to use.

Grok 4 Heavy performance is higher than Grok 4, but needs to be improved further. It's one of the weaknesses, according to the team.

Grok 4 Heavy is available as the SuperGrok Heavy tier:
$30/month for SuperGrok
$300/month for SuperGrok Heavy

Voice updates included, too!

Grok feels snappier and is designed to be more natural.
- 2x faster
- 5 voices
- 10x daily user seconds.

Grok 4 models are available via the xAI API. 256K context window. Real-time data search.
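
A hedged sketch of calling Grok 4 through the xAI API, which exposes an OpenAI-compatible endpoint; the model name "grok-4" is an assumption, so check the xAI docs for the exact identifier and context-window limits.

```python
import os
from openai import OpenAI

# xAI serves an OpenAI-compatible API; the model id below is assumed.
client = OpenAI(
    api_key=os.environ["XAI_API_KEY"],
    base_url="https://api.x.ai/v1",
)

response = client.chat.completions.create(
    model="grok-4",  # assumed model id
    messages=[
        {"role": "system", "content": "You are a concise research assistant."},
        {"role": "user", "content": "Summarize the latest results on Humanity's Last Exam."},
    ],
    max_tokens=512,
)
print(response.choices[0].message.content)
```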

Grok 4 for Gaming!
Video understanding is an area the team is improving, so it will get better.

What is next?

- Smart and fast will be the focus.

- Coding models are also a big focus.

- More capable multi-modal agents are coming too.

- Video generation models are also on the horizon.
Google introduced new models for research & development of health applications:

1. MedGemma 27B Multimodal, for complex multimodal & longitudinal EHR interpretation

2. MedSigLIP, a lightweight image & text encoder for classification, search, & related tasks.
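
A hedged zero-shot classification sketch with a SigLIP-style encoder via transformers; the repo id google/medsiglip-448 is an assumption, so check the official release for the exact checkpoint name and preprocessing settings.

```python
import torch
from PIL import Image
from transformers import AutoModel, AutoProcessor

model_id = "google/medsiglip-448"  # assumed repo id
model = AutoModel.from_pretrained(model_id)
processor = AutoProcessor.from_pretrained(model_id)

image = Image.open("chest_xray.png").convert("RGB")
labels = ["normal chest X-ray", "chest X-ray showing pneumonia"]

inputs = processor(text=labels, images=image, padding="max_length", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# SigLIP scores image-text pairs with a sigmoid over similarity logits.
scores = torch.sigmoid(outputs.logits_per_image)[0]
for label, score in zip(labels, scores.tolist()):
    print(f"{label}: {score:.3f}")
```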
Mistral announced Devstral Small and Medium 2507 with upgraded agentic coding capabilities

Available on Hugging Face.
Salesforce introduced GTA1 – a new GUI Test-time Scaling Agent that is now #1 on the OSWorld leaderboard with a 45.2% success rate, outperforming OpenAI’s CUA o3 (42.9%).