Microsoft released GitHub Spark — a new tool in Copilot that turns your ideas into full-stack apps, entirely in natural language.
The GitHub Blog
GitHub Spark in public preview for Copilot Pro+ subscribers - GitHub Changelog
Stuck between idea and implementation? Spending weeks on mockups or docs that never ship? GitHub Spark takes you from idea to deployed app in minutes. Build and ship full-stack…
🔥4
MIT introduced MEM1: RL for Memory Consolidation in Long-Horizon Agents.
Long-horizon agents (e.g., deep research, web agents) typically store all observations, actions, and intermediate thoughts in context. However, much of this information is unnecessary for subsequent reasoning, leading to inefficient memory usage and slower inference.
In MEM1, the researchers introduced an RL approach that trains the agent to maintain a dynamic internal state, which:
1. Consolidates and maintains only relevant information
2. Updates memory while reasoning
3. Discards unneeded history dynamically
The new method achieves:
1. 3.7× lower memory usage & 1.78× faster inference on multi-question HotpotQA
2. 2.5× lower memory usage on WebShop
Code and model are fully open-sourced:
Paper.
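The consolidation idea can be illustrated with a minimal sketch. Everything below (names, the truncation heuristic) is an assumption for illustration; in MEM1 the consolidation behavior is learned with RL rather than hard-coded.

```python
# Minimal sketch of MEM1-style memory consolidation (illustrative only).
# In MEM1 the consolidation policy is *learned*; here a stub `consolidate`
# function stands in for the trained model.
from dataclasses import dataclass, field

@dataclass
class AgentState:
    memory: str = ""                            # consolidated internal state (bounded)
    trace: list = field(default_factory=list)   # kept only for logging, not for reasoning

def consolidate(memory: str, new_observation: str, max_chars: int = 2000) -> str:
    """Stand-in for the learned step: merge the latest observation into memory
    and drop overflow. A trained policy keeps *relevant* info, not just a suffix."""
    merged = (memory + "\n" + new_observation).strip()
    return merged[-max_chars:]

def step(state: AgentState, observation: str, act) -> str:
    # The model reasons over a bounded internal state instead of the full history.
    state.memory = consolidate(state.memory, observation)
    action = act(state.memory)                  # e.g. an LLM call conditioned on memory only
    state.trace.append((observation, action))
    return action

state = AgentState()
step(state, "Obs: page 1 mentions the answer to Q1",
     act=lambda mem: f"next action given: {mem[:40]}")
```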
mit-mi.github.io
❤5
Chat Annotator — a free chatbot where users can highlight parts of responses, leave a comment, and have the model incorporate that feedback into its next output. Powered by Cohere Command-A.
How do we train LLMs on real-world tasks where it’s hard to define a single verifiable answer?
Scale introduced Rubrics as Rewards (RaR) — a framework for on-policy post-training that uses structured, checklist-style rubrics as interpretable reward signals.
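The core idea can be sketched in a few lines: score a response against a weighted checklist and use the weighted fraction of satisfied criteria as the reward. The rubric items and the stub judge below are hypothetical; in RaR the rubrics are generated and scored with LLMs.

```python
# Illustrative rubric-as-reward scoring (not Scale's implementation).
# A real setup would use an LLM judge per criterion; here `judge` is a stub.

RUBRIC = [
    # (criterion, weight) -- hypothetical checklist items
    ("states the final answer explicitly", 2.0),
    ("cites at least one source", 1.0),
    ("avoids unsupported claims", 1.5),
]

def judge(criterion: str, response: str) -> bool:
    """Stand-in for an LLM judge that decides whether the response satisfies the criterion."""
    return criterion.split()[0] in response.lower()   # placeholder logic

def rubric_reward(response: str) -> float:
    total = sum(w for _, w in RUBRIC)
    score = sum(w for c, w in RUBRIC if judge(c, response))
    return score / total    # reward in [0, 1], used as the on-policy RL training signal
```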
👍5
ASI-Arch is the first Artificial Superintelligence for AI Research enabling fully automated neural architecture innovation.
No human-designed search space. No human in the loop.
Key Breakthroughs of ASI-Arch:
- Autonomous code generation & training
- 1,773 experiments conducted (20K+ GPU hours)
- 106 new SOTA linear attention architectures discovered
- Unveiled a scaling law for scientific discovery
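At a high level the system is a closed loop: propose new architecture code, train and evaluate it, and feed the results back into the next proposal. The sketch below is a schematic of that loop with stub functions, not ASI-Arch's actual pipeline.

```python
# Schematic of a fully automated architecture-discovery loop (illustrative stubs only).

def propose_architecture(history):
    """Stand-in for the LLM that writes new model code given past results."""
    return {"name": f"candidate_{len(history)}", "code": "..."}

def train_and_evaluate(arch):
    """Stand-in for compiling the generated code, training it, and scoring it."""
    return {"arch": arch, "score": 0.0}

history, best = [], None
for _ in range(1773):   # the paper reports 1,773 autonomous experiments
    result = train_and_evaluate(propose_architecture(history))
    history.append(result)
    if best is None or result["score"] > best["score"]:
        best = result
```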
arXiv.org
AlphaGo Moment for Model Architecture Discovery
While AI systems demonstrate exponentially improving capabilities, the pace of AI research itself remains linearly bounded by human cognitive capacity, creating an increasingly severe development...
🔥7🤯2
Another massive open-source LLM is coming from a Chinese company. Meet Step 3 — a multimodal LLM from StepFun:
1. MoE architecture (321B total params, 38B active)
2. Rivals OpenAI o3, Gemini 2.5 Pro, and Claude Opus 4 in performance
3. Optimized for China’s domestic AI chips
StepFun just announced: Step 3 will be open-sourced on July 31st!
This could be the best open-source multimodal LLM you’ll get your hands on.
🔥11
Claude Code is getting a brand new feature: custom subagents
Subagents let you create teams of custom agents, each designed to handle specialized tasks.
Examples of subagents that have proven useful:
1. Software architect: helps design features elegantly and ensures appropriate layers of abstraction.
2. Code reviewer: reviews the codebase against best practices and deletes old code.
3. QA tester: runs unit tests and lints, and writes fixes.
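Subagents are defined as Markdown files with YAML frontmatter stored under `.claude/agents/` in a project (see the docs linked below). Here is a minimal sketch of creating the code-reviewer example programmatically; the frontmatter fields follow the docs but should be double-checked there.

```python
# Sketch: create a project-level custom subagent for Claude Code.
# Subagents live as Markdown files with YAML frontmatter under .claude/agents/;
# the frontmatter fields below follow the docs but may need adjusting.
from pathlib import Path

agents_dir = Path(".claude/agents")
agents_dir.mkdir(parents=True, exist_ok=True)

(agents_dir / "code-reviewer.md").write_text("""---
name: code-reviewer
description: Reviews changes against the project's best practices and flags dead code.
tools: Read, Grep, Glob
---
You are a careful code reviewer. Check new code against the project's
conventions, point out risky patterns, and suggest removing unused code.
""")
```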
Claude Code Docs
Create custom subagents - Claude Code Docs
Create and use specialized AI subagents in Claude Code for task-specific workflows and improved context management.
🔥6❤🔥2👍2
Hunyuan released 3D World Model 1.0
It's the industry's first open-source 3D world generation model, compatible with CG pipelines for full editability and simulation. It's set to transform game development, VR, digital content creation, and more.
Try it.
Tencent
Tencent Hunyuan 3D
The Tencent Hunyuan 3D generation model is based on diffusion technology and supports generating 3D assets from text and images. It combines carefully designed text and image encoders, a diffusion model, and a 3D decoder, enabling multi-view generation, reconstruction, and single-view generation. Hunyuan 3D can quickly produce high-quality 3D objects for a wide range of downstream applications.
A new world model from Meta - DINO-world: a generalist video world model that predicts the future in latent space.
Trained on uncurated videos with DINOv2, it learns diverse temporal dynamics (driving, indoors, sims), beats prior models on segmentation & depth, and even grasps intuitive physics.
It can be fine-tuned for action-conditioned planning.
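A minimal sketch of the general recipe (frozen DINOv2 features plus a small latent predictor trained to forecast the next frame's embedding). This is illustrative only, not Meta's released code; the actual DINO-world predictor operates on patch-level latents and is far larger.

```python
# Illustrative latent-space video world model: a frozen DINOv2 encoder plus a tiny
# causal transformer that predicts the next frame's embedding. Not Meta's code.
import torch
import torch.nn as nn

# Downloads DINOv2 ViT-S/14 weights from torch hub; used frozen as the encoder.
encoder = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14").eval()

class LatentPredictor(nn.Module):
    def __init__(self, dim=384, heads=6, layers=4):
        super().__init__()
        block = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.backbone = nn.TransformerEncoder(block, layers)
        self.head = nn.Linear(dim, dim)

    def forward(self, latents):                          # latents: (B, T, dim)
        T = latents.size(1)
        causal = torch.triu(torch.full((T, T), float("-inf")), diagonal=1)
        return self.head(self.backbone(latents, mask=causal))

@torch.no_grad()
def encode(frames):                                      # frames: (B, T, 3, 224, 224)
    B, T = frames.shape[:2]
    return encoder(frames.flatten(0, 1)).view(B, T, -1)  # per-frame global embeddings

predictor = LatentPredictor()
frames = torch.randn(2, 8, 3, 224, 224)                  # dummy clip
z = encode(frames)
loss = nn.functional.mse_loss(predictor(z)[:, :-1], z[:, 1:])  # predict the next latent
```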
🆒5🔥2
Singapore's Sapient Intelligence dropped the Hierarchical Reasoning Model (HRM), a brain-inspired architecture.
Trained on just 1K examples with only 27M params, it handles complex reasoning tasks like extreme Sudoku and maze puzzles.
Code.
❤4🍌2
PayPal to let U.S. merchants accept payment in more than 100 cryptocurrencies.
Fortune
PayPal to let U.S. businesses accept payment in more than 100 cryptocurrencies | Fortune
The fintech giant will charge a transaction rate of 0.99% for the first year and then up it to 1.5%.
🔥3
The Step-3 tech report is now on arXiv
Step-3 is Large yet Affordable: Model-system Co-design for Cost-effective Decoding.
arXiv.org
Step-3 is Large yet Affordable: Model-system Co-design for...
Large language models (LLMs) face low hardware efficiency during decoding, especially for long-context reasoning tasks. This paper introduces Step-3, a 321B-parameter VLM with hardware-aware...
🔥3
Chinese lab Z.ai dropped GLM-4.5 and GLM-4.5-Air, two open-source agentic models.
The 4.5 variant, with 355B params, tops open models worldwide and ranks just behind o3 and Grok 4.
It also excels at agentic tasks, with a 90% success rate in tool use.
API Pricing (per 1M tokens):
GLM-4.5: $0.6 Input / $2.2 Output
GLM-4.5-Air: $0.2 Input / $1.1 Output
Weights
API
OpenRouter
Develop Tools.
Try them.
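For a quick test drive, here is a minimal sketch of calling GLM-4.5 through OpenRouter's OpenAI-compatible endpoint. The model slug below is an assumption; check OpenRouter or the Z.AI docs linked below for the exact identifiers and pricing.

```python
# Minimal sketch: calling GLM-4.5 through OpenRouter's OpenAI-compatible API.
# The model slug "z-ai/glm-4.5" is an assumption -- verify it on openrouter.ai.
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_OPENROUTER_KEY",
)

resp = client.chat.completions.create(
    model="z-ai/glm-4.5",   # or the cheaper "z-ai/glm-4.5-air" variant
    messages=[{"role": "user", "content": "Summarize what an agentic model is."}],
)
print(resp.choices[0].message.content)
```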
Overview - Z.AI DEVELOPER DOCUMENT
GLM-4.5 - Overview - Z.AI DEVELOPER DOCUMENT
❤3🔥3🆒3
OpenAI introduced study mode in ChatGPT — step-by-step guidance for students rather than quick answers.
OpenAI
Introducing study mode
A new way to learn in ChatGPT that offers step-by-step guidance instead of quick answers.
Eigent — the first open source multi-agent workforce on your desktop.
Eigent is a team of AI agents collaborating to complete complex tasks in parallel.
It brings together specialized agents (developer, search, document, multi-modal), each designed to work in parallel and adapt to your needs.
Eigent is built on CAMEL-AI's open-source multi-agent infrastructure.
It supports:
- Running parallel tasks
- Custom workers
- Cloud version or "Bring Your Own Key" (BYOK)
- Local model deployment
- Human-in-the-loop feedback
- Model Context Protocol (MCP) tools
- Secure self-hosting
- Enterprise-level security
Eigent supports multiple deployment options:
- Cloud version with instant access and managed infrastructure
- Community edition for local hosting and customization
- Enterprise edition with SLAs, auditability, and scale
eigent.ai
Open Source Cowork: the open source cowork desktop
Open Source Cowork is a desktop multi-agent workforce that connects to your context and can control the browser and desktop apps to automate real work.
🔥3
Coinbase and JPMorgan have partnered to bring crypto access to over 80 million Chase customers, introducing three methods:
- converting Chase Ultimate Rewards to USDC,
- funding Coinbase accounts with Chase credit cards,
- direct bank integration.
The integration of Ultimate Rewards to USDC offers a novel entry point, while credit card funding and direct bank links streamline transactions, potentially boosting adoption rates among mainstream users.
Coinbase
Coinbase and JPMorgan Chase join forces to make it even easier to access crypto
We’re partnering with JPMorgan Chase to offer 3 new ways to participate in crypto: the ability to transfer Chase Ultimate Rewards to USDC, the ability to use Chase credit cards to fund your Coinbase account, and a new direct bank integration.
🔥4
BlockDL is a free, open-source GUI that lets you visually design Keras neural networks and learn ML.
🆒5
LangChain introduced Deep Agents
The team created a new Python package that makes it easy to build your own Deep Agents.
The core algorithm for Deep Agents is actually the same: an LLM running in a loop calling tools (a schematic of this loop follows the list below). The differences are:
1. Planning tool
2. Sub agents
3. File system
4. A detailed system prompt (prompting is not dead!)
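To make the "LLM running in a loop calling tools" point concrete, here is a schematic of that core loop with stub functions. It is a generic illustration, not the API of the new deepagents package; see the LangChain post below for the real thing.

```python
# Schematic of the core agent loop the post describes: an LLM called repeatedly,
# executing any tool calls it emits until it produces a final answer.
# `call_llm` and TOOLS are stubs -- not the deepagents package API.

TOOLS = {
    "write_todos": lambda args: "noted",          # planning tool
    "read_file": lambda args: "<file contents>",  # file-system access
}

def call_llm(messages):
    """Stand-in for a chat model that may return tool calls or a final answer."""
    return {"content": "done", "tool_calls": []}

def run_agent(task, system_prompt):
    messages = [{"role": "system", "content": system_prompt},
                {"role": "user", "content": task}]
    while True:
        reply = call_llm(messages)
        messages.append({"role": "assistant", **reply})
        if not reply["tool_calls"]:
            return reply["content"]               # no more tools -> final answer
        for call in reply["tool_calls"]:
            result = TOOLS[call["name"]](call.get("args", {}))
            messages.append({"role": "tool", "name": call["name"], "content": result})

print(run_agent("Refactor the auth module", "You are a deep agent with a plan and a file system."))
```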
LangChain Blog
Deep Agents
Using an LLM to call tools in a loop is the simplest form of an agent. This architecture, however, can yield agents that are “shallow” and fail to plan and act over longer, more complex tasks. Applications like “Deep Research”, “Manus”, and “Claude Code”…
🔥5❤4
Google introduced AlphaEarth Foundations, an AI model that integrates petabytes of satellite data into a single digital representation of Earth.
It will give scientists a nearly real-time view of the planet at incredible spatial precision and help with critical issues like food security, deforestation, and water resources.
Google DeepMind
AlphaEarth Foundations helps map our planet in unprecedented detail
New AI model integrates petabytes of Earth observation data to generate a unified data representation that revolutionizes global mapping and monitoring
🔥3🆒3
Deep Cogito released four hybrid reasoning models, of sizes 70B, 109B MoE, 405B, and 671B MoE, under an open license.
The models build on Deep Cogito's work on building superintelligence using Iterated Distillation and Amplification (IDA). In particular, the team scales the model's intelligence prior by having the model internalize the reasoning process through iterative policy improvement, rather than simply searching longer at inference time.
This seems to be a novel scaling paradigm in which the models develop more “intuition”, and it serves as a strong proof of concept for self-improvement. Since the Cogito models develop a better intuition of which trajectory to take while searching at inference time, their reasoning chains are 60% shorter than DeepSeek R1's.
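The IDA loop itself can be summarized as: amplify (spend extra inference-time compute, such as search or longer reasoning, to get better outputs), then distill (train the policy to produce those outputs directly), and repeat. The sketch below is a schematic of that general recipe with stub functions, not Deep Cogito's implementation.

```python
# Schematic of Iterated Distillation and Amplification (illustrative stubs only).

def amplify(policy, prompt):
    """Stand-in for spending extra inference-time compute (search, long reasoning)
    on top of the current policy to produce a higher-quality answer."""
    return policy(prompt) + " [refined by search]"

def distill(policy, examples):
    """Stand-in for fine-tuning the policy to produce the amplified answers directly."""
    return lambda prompt: dict(examples).get(prompt, policy(prompt))

policy = lambda prompt: "draft answer"
prompts = ["q1", "q2"]

for _ in range(3):   # each round internalizes the amplified reasoning into the policy
    examples = [(p, amplify(policy, p)) for p in prompts]
    policy = distill(policy, examples)
```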
Deepcogito
Deep Cogito
Building general superintelligence
🔥4