This research introduces VisPlay, a self-evolving framework where a single vision-language model splits into a "Questioner" and a "Reasoner" to generate its own training data.
It autonomously improves reasoning and reduces hallucinations across major benchmarks, pointing toward scalable, self-improving AI.
GitHub.
arXiv.org
VisPlay: Self-Evolving Vision-Language Models from Images
Reinforcement learning (RL) provides a principled framework for improving Vision-Language Models (VLMs) on complex reasoning tasks. However, existing RL approaches often rely on human-annotated...
SOTA open-source vibe coding from your home: Mistral introduced the Devstral 2 coding model family.
Two sizes, both open source.
Also, meet Mistral Vibe, a native CLI, enabling end-to-end automation.
Mistral Vibe CLI is an open-source command-line coding assistant powered by Devstral.
It explores, modifies, and executes changes across your codebase using natural language. Also under Apache 2.0.
Install via: uv tool install mistral-vibe
mistral.ai
Introducing: Devstral 2 and Mistral Vibe CLI. | Mistral AI
State-of-the-art, open-source agentic coding models and CLI agent.
Meta released Ax 1.0: an open-source platform for adaptive experimentation at scale.
Ax uses ML to automate complex, resource-intensive experiments, enabling efficient optimization for AI, infrastructure, and hardware.
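Adaptive experimentation frameworks like Ax typically run an ask-tell loop: the optimizer proposes a trial, the user runs the expensive experiment, and the result feeds back into the optimizer's model. The sketch below is a generic, hypothetical illustration of that loop in plain Python (the `ask`/`evaluate` names and the toy acquisition rule are made up, not the Ax API):

```python
import random

def evaluate(config):
    # Stand-in for an expensive experiment (e.g. a training run).
    x = config["x"]
    return -(x - 3.0) ** 2  # objective peaks at x = 3

def ask(history, low=0.0, high=10.0):
    # Toy acquisition rule: explore randomly at first, then propose
    # points near the best configuration seen so far.
    if len(history) < 3:
        return {"x": random.uniform(low, high)}
    best = max(history, key=lambda h: h[1])[0]
    return {"x": min(high, max(low, best["x"] + random.gauss(0, 0.5)))}

def optimize(n_trials=30):
    history = []
    for _ in range(n_trials):
        cfg = ask(history)            # framework proposes a trial
        score = evaluate(cfg)         # user runs the experiment
        history.append((cfg, score))  # result updates the optimizer
    return max(history, key=lambda h: h[1])

random.seed(0)
best_cfg, best_score = optimize()
```

Real systems replace the toy acquisition rule with a surrogate model (e.g. Bayesian optimization), but the loop structure is the same.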
Engineering at Meta
Efficient Optimization With Ax, an Open Platform for Adaptive Experimentation
We’ve released Ax 1.0, an open-source platform that uses machine learning to automatically guide complex, resource-intensive experimentation. Ax is used at scale across Meta to improve AI models, t…
Anthropic shipped three new updates for Claude Agent SDK to make it easier to build custom agents:
- Support for 1M context windows
- Sandboxing
- V2 of our TypeScript interface
GitHub.
Claude API Docs
Agent SDK reference - TypeScript
Complete API reference for the TypeScript Agent SDK, including all functions, types, and interfaces.
Google released the FACTS Benchmark Suite
It’s the industry’s first comprehensive test evaluating LLM factuality across four dimensions: internal model knowledge, web search, grounding, and multimodal inputs.
Google DeepMind
FACTS Benchmark Suite: a new way to systematically evaluate LLMs factuality
The FACTS Benchmark Suite provides a systematic evaluation of Large Language Models (LLMs) factuality across three areas: Parametric, Search, and Multimodal reasoning.
Travis Beals, a Google executive working on the orbital data-center effort, said it would take 10,000 satellites to recreate the compute capacity of a gigawatt data center, assuming 100-kilowatt satellites.
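The arithmetic behind that estimate checks out:

```python
satellite_power_kw = 100       # assumed per-satellite capacity
n_satellites = 10_000
total_gw = satellite_power_kw * n_satellites / 1_000_000  # kW -> GW
print(total_gw)  # 1.0 GW, the capacity of one gigawatt-class data center
```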
The Wall Street Journal
Exclusive | Bezos and Musk Race to Bring Data Centers to Space
Jeff Bezos and Elon Musk are racing to take the trillion-dollar data-center boom into orbit.
NVIDIA presents Alpamayo-R1
It's a vision-language-action model that uses "Chain of Causation" reasoning to plan.
It cuts off-road events by 35% and improves decision-making in complex scenarios, showing a promising path to more capable autonomy.
arXiv.org
Alpamayo-R1: Bridging Reasoning and Action Prediction for...
End-to-end architectures trained via imitation learning have advanced autonomous driving by scaling model size and data, yet performance remains brittle in safety-critical long-tail scenarios...
Google released the Gemini Deep Research agent for developers.
It can create a plan, spot gaps, and autonomously navigate the web to produce detailed reports.
Built on Gemini 3 Pro, it was trained using multi-step reinforcement learning to increase accuracy and reduce hallucinations.
It handles massive context – analyzing your uploaded docs alongside the web – and provides citations so you can verify every claim.
Deep Research is the first agent released on the new Interactions API – offering a single endpoint for agentic workflows.
Google
Build with Gemini Deep Research
We have reimagined Gemini Deep Research to be more powerful than ever, now accessible to developers via the new Interactions API.
OpenAI shipped a new model. GPT-5.2 showcases OpenAI's incredible post-training stack in action: significant gains in knowledge work (think building a financial model), long-context capability, and coding.
GPT-5.2 likely involved additional mid-training to refresh the cutoff date, plus significant amounts of RL.
One catch: OpenAI raised pricing 40%. Is it worth it?
SWE-Bench Pro results offer an interesting perspective. GPT-5.2 is able to reach higher scores at comparable cost to 5.1 Codex Max, while also continuing to push the capability ceiling.
This price hike will directly increase OpenAI's margins.
We saw a similar dynamic with Claude models, whereby Opus 4.5 was able to achieve comparable scores to Sonnet 4.5 at much lower cost.
This is due to models becoming increasingly token-efficient, requiring less thinking to get more done.
OpenAI
Introducing GPT-5.2
GPT-5.2 is our most advanced frontier model for everyday professional work, with state-of-the-art reasoning, long-context understanding, coding, and vision. Use it in ChatGPT and the OpenAI API to power faster, more reliable agentic workflows.
Apple briefly posted then quickly pulled an arXiv paper, but the v1 snapshot is wild.
The team reveals RLAX, a scalable RL framework on TPUs.
It's built with a parameter server design where a master trainer pushes weights and massive inference fleets pull them to generate rollouts.
With new curation and alignment tricks and preemption-friendly engineering, RLAX boosts QwQ-32B pass@8 by 12.8 percent in only 12h48m on 1,024 v5p TPUs.
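The parameter-server pattern described above can be sketched as follows. The class and method names here are illustrative stand-ins, not RLAX's actual API: the trainer pushes versioned weights, and inference workers pull the latest snapshot before generating rollouts.

```python
class ParameterServer:
    def __init__(self):
        self.version = 0
        self.weights = {"w": 0.0}

    def push(self, weights):
        # Called by the master trainer after an update step.
        self.version += 1
        self.weights = dict(weights)

    def pull(self):
        # Called by inference workers; returns a consistent snapshot.
        return self.version, dict(self.weights)

class RolloutWorker:
    def __init__(self, server):
        self.server = server
        self.version, self.weights = server.pull()

    def generate(self, prompt):
        # Refresh weights if the trainer has published a newer version.
        v, w = self.server.pull()
        if v > self.version:
            self.version, self.weights = v, w
        return f"rollout(v={self.version}, prompt={prompt})"

server = ParameterServer()
worker = RolloutWorker(server)
server.push({"w": 1.0})          # trainer publishes an update
out = worker.generate("2+2=")
```

In the real system the push/pull happens over the network across a large inference fleet, which is what makes the design preemption-friendly: workers can die and rejoin by simply pulling the latest version.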
First comprehensive framework for how AI agents actually improve through adaptation.
Researchers from many universities surveyed the rapidly expanding landscape of agentic AI adaptation.
What they found: a fragmented field with no unified understanding of how agents learn to use tools, when to adapt the agent versus the tool, and which strategies work for which scenarios.
These are all important for building production-ready AI agents.
Adaptation in agentic AI follows four distinct paradigms that most practitioners conflate or ignore entirely.
The framework organizes all adaptation strategies into two dimensions.
- Agent Adaptation (A1, A2): modifying the agent's parameters, representations, or policies.
- Tool Adaptation (T1, T2): optimizing external components like retrievers, planners, and memory modules while keeping the agent frozen.
GitHub
Awesome-Adaptation-of-Agentic-AI/paper.pdf at main · pat-jj/Awesome-Adaptation-of-Agentic-AI
Repo for "Adaptation of Agentic AI". Contribute to pat-jj/Awesome-Adaptation-of-Agentic-AI development by creating an account on GitHub.
Diffusion LLMs are the new frontier? InclusionAI has released LLaDA 2.0, the first diffusion model to scale to 100B params, matching frontier LLMs while achieving 2x faster inference.
LLaDA is 2.3x faster on average. We see unique high-TPF advantages in Coding via parallel decoding.
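The speedup from parallel decoding comes down to forward-pass counts: an autoregressive model emits one token per pass, while a diffusion-style decoder can unmask several positions per pass. A toy comparison (illustrative only, not LLaDA's actual schedule):

```python
def ar_steps(seq_len):
    # Autoregressive decoding: one forward pass per token.
    return seq_len

def diffusion_steps(seq_len, tokens_per_step):
    # Diffusion-style decoding: several positions filled per pass.
    return -(-seq_len // tokens_per_step)  # ceiling division

n = 256
print(ar_steps(n), diffusion_steps(n, 8))  # 256 vs 32 forward passes
```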
The Challenge: AR models had a 3-year head start.
GitHub.
GitHub
GitHub - inclusionAI/dFactory: Easy and Efficient dLLM Fine-Tuning
Easy and Efficient dLLM Fine-Tuning. Contribute to inclusionAI/dFactory development by creating an account on GitHub.
NVIDIA launched the open Nemotron 3 model family, starting with Nano (30B-3A), which pushes the frontier of accuracy and inference efficiency with a novel hybrid SSM Mixture of Experts architecture.
Super and Ultra are coming in the next few months.
Nemotron 3 Super (~4X bigger than Nano) and Ultra (~16X bigger than Nano) are pretrained using NVFP4, a new "Latent Mixture of Experts" architecture that allows us to use 4X more experts for the same inference cost, and Multi-Token Prediction.
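The "more experts at the same inference cost" idea rests on sparse routing: per-token compute scales with the k experts that are activated, not with the total expert count. Below is a generic top-k MoE routing sketch (an illustration of sparse routing in general, not NVIDIA's Latent MoE design):

```python
def route(scores, k=2):
    # Pick the k highest-scoring experts for this token.
    return sorted(range(len(scores)), key=lambda i: -scores[i])[:k]

def moe_forward(x, experts, scores, k=2):
    active = route(scores, k)
    # Only the selected experts run; cost is k expert calls,
    # regardless of how many experts exist in total.
    return sum(experts[i](x) for i in active) / k

# Eight tiny "experts": expert i multiplies its input by i + 1.
experts = [lambda x, m=m: m * x for m in range(1, 9)]
scores = [0.1, 0.9, 0.2, 0.8, 0.1, 0.1, 0.1, 0.1]
y = moe_forward(2.0, experts, scores, k=2)  # runs experts 1 and 3 only
```

Doubling the expert list while keeping k fixed leaves the per-token cost of `moe_forward` unchanged, which is the lever the Latent MoE claim is pulling.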
a16z released 17 crypto predictions for 2026. Most are obvious. A few are not.
The ones worth paying attention to:
1. Privacy becomes the strongest moat
Bridging tokens is easy. Bridging secrets is hard. Users on private chains are less likely to leave.
Winner-take-most dynamics emerge.
2. Know Your Agent (KYA)
Non-human identities outnumber human employees 96-to-1 in financial services.
The agent economy's bottleneck is identity.
3. AI agents are taxing the open web
They extract value from ad-supported sites while bypassing revenue streams.
The web needs real-time, usage-based compensation or content creation collapses.
a16z crypto
17 things we're excited about for crypto in 2026 - a16z crypto
DeepCode: Open Agentic Coding
DeepCode is an open agentic coding framework that treats repository synthesis as a channel-optimization problem, maximizing task-relevant signal under finite context budgets.
How does this work?
Scientific papers are high-entropy specifications with scattered multimodal constraints, equations, pseudocode, and hyperparameters. Naive approaches that concatenate raw documents with growing code history cause channel saturation, where redundant tokens mask critical algorithmic details and signal-to-noise ratio collapses.
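The channel-optimization framing can be illustrated with a toy context packer: given document chunks scored for task relevance, greedily keep the highest-signal-density chunks under a token budget instead of concatenating everything. This is an assumption-laden sketch of the general idea, not DeepCode's actual selection algorithm:

```python
def pack_context(chunks, budget):
    # chunks: list of (text, token_count, relevance_score)
    chosen, used = [], 0
    # Prefer high relevance per token (signal density), which keeps
    # the channel from saturating with low-value prose.
    for text, tokens, score in sorted(chunks, key=lambda c: -c[2] / c[1]):
        if used + tokens <= budget:
            chosen.append(text)
            used += tokens
    return chosen, used

chunks = [
    ("equation (3) definition",     40, 9.0),
    ("hyperparameter table",        30, 8.0),
    ("related-work narrative",     120, 2.0),
    ("pseudocode of Algorithm 1",   60, 9.5),
]
ctx, used = pack_context(chunks, budget=120)
# The dense, algorithm-critical chunks survive; the narrative is dropped.
```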
The results on OpenAI's PaperBench benchmark are impressive. DeepCode achieves 73.5% replication score, a 70% relative improvement over the best LLM agent baseline (o1 at 43.3%). It decisively outperforms commercial agents: Cursor at 58.4%, Claude Code at 58.7%, and Codex at 40.0%.
Most notably, DeepCode surpasses human experts. On a 3-paper subset evaluated by ML PhD students from Berkeley, Cambridge, and Carnegie Mellon, humans scored 72.4%. DeepCode scored 75.9%.
Principled information-flow management yields significantly larger performance gains than merely scaling model size or context length. The framework is fully open source.
arXiv.org
DeepCode: Open Agentic Coding
Recent advances in large language models (LLMs) have given rise to powerful coding agents, making it possible for code assistants to evolve into code engineers. However, existing methods still...
NEW Research from Meta Superintelligence Labs and collaborators
This new research introduces Parallel-Distill-Refine (PDR), a framework that treats LLMs as improvement operators rather than single-pass reasoners.
Instead of one long reasoning chain, PDR operates in phases:
- Generate diverse drafts in parallel.
- Distill them into a bounded textual workspace.
- Refine conditioned on this workspace.
- Repeat.
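The phases above can be sketched with stub functions standing in for LLM calls (an illustrative skeleton, not the paper's implementation):

```python
def generate_drafts(prompt, n):
    # Parallel phase: n independent drafts (stub for n LLM samples).
    return [f"draft{i}:{prompt}" for i in range(n)]

def distill(drafts, max_items=2):
    # Compress drafts into a bounded textual workspace. A real system
    # would summarize; here we just cap the workspace size.
    return drafts[:max_items]

def refine(prompt, workspace):
    # Refine conditioned on the bounded workspace, not full histories.
    return f"{prompt}|ws={';'.join(workspace)}"

def pdr(prompt, rounds=2, n_parallel=4):
    for _ in range(rounds):
        drafts = generate_drafts(prompt, n_parallel)  # 1. parallel
        workspace = distill(drafts)                   # 2. distill
        prompt = refine(prompt, workspace)            # 3. refine
    return prompt                                     # 4. repeat

out = pdr("solve", rounds=2, n_parallel=4)
```

Note how the workspace stays bounded at `max_items` regardless of `n_parallel`, which is how context length is decoupled from total tokens generated.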
Context length becomes controllable via degree of parallelism, no longer conflated with total tokens generated. The model accumulates wisdom across rounds through compact summaries rather than replaying full histories.
The researchers also trained an 8B model with operator-consistent RL to make training match the PDR inference interface. Mixing standard and operator RL yields an additional 5% improvement on both AIME benchmarks.
Bounded-memory iteration can substitute for long reasoning traces while holding latency fixed. Strategic parallelism and distillation are shown to beat brute-force sequence extension.
arXiv.org
Rethinking Thinking Tokens: LLMs as Improvement Operators
Reasoning training incentivizes LLMs to produce long chains of thought (long CoT), which among other things, allows them to explore solution strategies with self-checking. This results in higher...
Demis Hassabis, CEO of Google DeepMind, laid out the clearest roadmap to AGI yet:
1/ AGI won’t come from scaling alone. Demis Hassabis says it’s 50% scaling, 50% innovation. Bigger models matter, but new ideas matter just as much.
2/ Today’s AI is powerful but jagged. Gold-medal level at Olympiad math. Yet still fails basic logic and consistency tests. That gap is why we’re not at AGI.
3/ The missing ingredient isn’t intelligence. It’s reliability, reasoning, and self-awareness of uncertainty. AI needs to know what it doesn’t know.
4/ Hallucinations aren’t random. They often happen because models are forced to answer when they should say “I’m not confident.”
5/ AlphaFold showed the playbook. Solve a root problem once, unlock entire industries downstream. Now DeepMind is targeting materials, fusion, and climate.
6/ Fusion is the ultimate root node. Clean, abundant energy would reshape water, food, climate, and even space travel. AI could help crack it.
7/ Language models surprised us. They understand more about the world than expected. But language alone isn’t enough.
8/ That’s why world models matter. To understand physics, space, causality, and action, AI must experience worlds, not just read about them.
9/ Simulation is the next frontier. If an AI can generate a realistic world, it likely understands its mechanics.
10/ Drop agents into those worlds and let curiosity drive learning. Now you have infinite training data, created on the fly.
11/ This could be how AI learns like humans do. Exploration first. Understanding second. Generalization last.
12/ Hassabis believes simulation may also unlock science. Weather. Biology. Materials. Even the origins of life.
13/ Why simulations matter philosophically: If you can simulate something, you’ve understood it.
14/ That leads to the deepest question. Is there anything in the universe that’s non-computable?
15/ So far, we haven’t found one. Protein folding. Go. Complex biology. All computable.
16/ Consciousness might be next. AGI could become a mirror that shows us what, if anything, is unique about the human mind.
17/ If creativity, emotion, or dreaming are computable, machines may have them too. If not, we’ll finally learn where the boundary is.
18/ AGI isn’t just a tech problem. It’s an economic, social, and philosophical one.
19/ The industrial revolution took a century. AGI may unfold in a decade. The disruption will be faster and bigger.
20/ Hassabis’ core belief: The universe runs on information. And intelligence may be the ultimate way to understand it.
YouTube
The future of intelligence | Demis Hassabis (Co-founder and CEO of DeepMind)
In our final episode of the season, Professor Hannah Fry sits down with Google DeepMind Co-founder and CEO Demis Hassabis for their annual check-in. Together, they look beyond the product launches to the scientific and technological questions that will define…
Anthropic will add 5 different starting points to its upcoming Tasks Mode: Research, Analyse, Write, Build, and Do More, plus tons of granular controls.
A new sidebar for tracking tasks' progress and working with Claude's context has also been added.
TestingCatalog
Anthropic preparing new Agentic Tasks Mode for Claude
Anthropic testing Claude's Agent mode with a new interface for tasks, to introduce new modes for research, analysis, writing, and building.
Meta introduced SAM Audio, the first unified model that isolates any sound from complex audio mixtures using text, visual, or span prompts.
This is a cool model, because it has always been hard to find good scenarios combining audio and vision where audio plays a larger role than just "like language in VLMs, but as a sound wave instead".
VC firms are building their own AI tools to compete for the best startup deals. And for founders, that's changing the relationship game.
This summer, venture capitalist Aubrie Pagano snagged the chance to invest in a buzzy funding round with a major assist from AI.
For their crucial pitch meeting with a frontier science lab, Pagano brought a list of 50 high-value prospects – academics, pharma execs and former FDA leaders – and the exact route by which her firm, Alpaca VC, could connect its founders to each.
The startup made room for Alpaca to invest $1 million. It was only afterward that its founders found out that Pagano had used an agent from the firm’s proprietary AI system, known internally as Gordon, to help secure the deal.
Seemingly every VC firm has that partner (or several) who drafts LinkedIn ‘thought leadership’ posts in Claude, runs meeting notes from Granola through NotebookLM, or calculates market projections in a custom GPT.
But Alpaca, investing out of a $78 million fund, and a growing number of boutique and emerging VC firms are looking to compete – and punch above their weight with founders – by making outsized bets on building and investing through their own advanced AI tools.
These firms are fine-tuning their own models and setting up MCP servers, and managing long-running agents that automate entire processes, from back office reporting to their investment memos and content calendars.
At DVC, a $75 million fund that’s an early backer of Perplexity, AI recommendations have helped the firm write preemptive checks into some of the firm’s fastest-growing companies, like Higgsfield AI, just before revenue or valuations soared.
And at Topology Ventures, a frontier tech firm that raised a $75 million fund last year, managing partner Casey Caruso believes her internal AI CRM, called Fiber, is so good at predicting founder movements that it could raise millions in its own right.
Upstartsmedia
Deep Dive: As Smaller VC Firms Build AI Tools To Compete, Founders Should (Mostly) Benefit
Challengers like Alpaca VC, DVC and Topology Ventures are going beyond ChatGPT: “We want to give everybody their own personal AI analyst, so they can look at companies the way a partner at a16z would do.”
Gemini 3 Flash is out. It's the fast mode in the Gemini app's model picker, and it's shockingly speedy AND smart.
What an OP model. Also mind-blowing that even Flash is competitive with the best models.