Notion released AI for Work, a suite of work-centered AI features, including:
— AI Meeting Notes
— Enterprise Search to find answers across tools
— Research Mode to draft docs
— Access to models, including GPT-4.1 & Claude 3.7
Notion
Is Notion's Business or Enterprise plan right for you?
Bringing AI into daily workflows is one of the biggest challenges companies face. Our Business and Enterprise plans meet that head-on with powerful AI tools, custom workflows, and rock-solid security— all in one connected workspace. Find the plan that fits…
❤4👍2👏2
Google introduced AlphaEvolve, an AI coding agent.
It’s able to:
1. Design faster matrix multiplication algorithms
2. Find new solutions to open math problems
3. Make data centers, chip design and AI training more efficient across Google.
AlphaEvolve uses:
- LLMs: To synthesize information about problems as well as previous attempts to solve them - and to propose new versions of algorithms
- Automated evaluation: To address the broad class of problems where progress can be clearly and systematically measured.
- Evolution: Iteratively improving the best algorithms found, and re-combining ideas from different solutions to find even better ones.
Google applied AlphaEvolve to a fundamental problem in computer science: discovering algorithms for matrix multiplication. It managed to identify multiple new algorithms.
This significantly advances on Google DeepMind's previous model, AlphaTensor, which AlphaEvolve outperforms with a more general approach.
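The three ingredients listed above (proposal, automated evaluation, evolution) can be illustrated with a toy loop. This is a minimal sketch and not AlphaEvolve's actual implementation: `propose` is a stand-in for the LLM step (here just recombination plus random noise), and `evaluate`, `evolve`, and the toy objective are hypothetical names chosen for illustration.

```python
import random


def evaluate(candidate):
    """Automated evaluator: higher is better. Toy objective with its optimum at (3, -1)."""
    x, y = candidate
    return -((x - 3.0) ** 2 + (y + 1.0) ** 2)


def propose(parent_a, parent_b, rng):
    """Stand-in for the LLM step: recombine two parents, then perturb the result."""
    child = [rng.choice(pair) for pair in zip(parent_a, parent_b)]
    return [value + rng.gauss(0.0, 0.3) for value in child]


def evolve(generations=200, population_size=20, seed=0):
    rng = random.Random(seed)
    population = [[rng.uniform(-5, 5), rng.uniform(-5, 5)]
                  for _ in range(population_size)]
    for _ in range(generations):
        # Selection: keep the best half, as scored by the evaluator.
        population.sort(key=evaluate, reverse=True)
        survivors = population[: population_size // 2]
        # Evolution: refill the population with recombined, perturbed children.
        children = [propose(rng.choice(survivors), rng.choice(survivors), rng)
                    for _ in range(population_size - len(survivors))]
        population = survivors + children
    return max(population, key=evaluate)


best = evolve()
print(best, evaluate(best))
```

The key property is that the evaluator is fully automatic, so the loop can run unattended; that matches the post's point that the approach targets problems where progress can be measured systematically.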
Google DeepMind
AlphaEvolve: A Gemini-powered coding agent for designing advanced algorithms
New AI agent evolves algorithms for math and practical applications in computing by combining the creativity of large language models with automated evaluators
❤6🔥3👏2
Meta just released new models, benchmarks, and datasets that will transform the way researchers approach molecular property prediction, language processing, and neuroscience.
1. Open Molecules 2025 (OMol25): A dataset for molecular discovery with simulations of large atomic systems.
2. Universal Model for Atoms: A machine learning interatomic potential for modeling atom interactions across a wide range of materials and molecules.
3. Adjoint Sampling: A scalable algorithm for training generative models based on scalar rewards.
4. FAIR and the Rothschild Foundation Hospital partnered on a large-scale study that reveals striking parallels between language development in humans and LLMs.
Meta AI
Sharing new breakthroughs and artifacts supporting molecular property prediction, language processing, and neuroscience
Meta FAIR is sharing new research artifacts that highlight our commitment to advanced machine intelligence (AMI) through focused scientific and academic progress.
❤6🔥6👍4
#DeepSeek presents: Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures
Elaborates on hardware architecture and model design in achieving cost-efficient large-scale training and inference.
❤3🔥3👏2👍1
Google introduced a notion of sufficient context to examine retrieval augmented generation (RAG) systems, developing a method to classify instances, analyzing failures of RAG systems & proposing a way to reduce hallucinations.
research.google
Deeper insights into retrieval augmented generation: The role of sufficient context
👍4🔥3❤2
Today "a milestone in the evolution of personalized therapies for rare & ultra-rare inborn errors of metabolism"
—the 1st human to undergo custom genome editing
—outgrowth of decades of NIH funded research.
Paper.
NY Times
Baby Is Healed With World’s First Personalized Gene-Editing Treatment
The technique used on a 9½-month-old boy with a rare condition has the potential to help people with thousands of other uncommon genetic diseases.
❤4🔥4👏2🤔1
Agents from scratch
This repo covers the basics of building agents:
+ Fundamentals
+ Build an agent
+ Agent eval
+ Agent w/ human-in-the-loop
+ Agent w/ long-term memory
Code (all open source).
Building agents - Combining a workflow (router) with an agent that can call email tools.
Notebook.
Slides.
Agent evals - Unit tests (Pytest) for the triage decision and tool calls (testing structured outputs with heuristic evals), plus LLM-as-judge to evaluate email responses.
Notebook.
Slides.
Human-in-the-loop - Add a human in the loop for approval / editing of specific tool calls.
Notebook.
Memory - Add memory, so the agent learns email response preferences from human feedback.
Notebook.
Agent can be hooked into Gmail by swapping out the tools used. Components are also general and can be used w/ various tools / MCP servers.
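The unit-test style of agent eval described above can be sketched roughly as follows. This is not code from the repo: `triage_email` is a hypothetical rule-based stand-in for the LLM router, the labels `respond` / `notify` / `ignore` and the tool names are assumptions made for illustration, and the LLM-as-judge part is not covered.

```python
def triage_email(email):
    """Hypothetical stand-in for the router; a real agent would call an LLM here."""
    subject = email["subject"].lower()
    body = email["body"].lower()
    if "newsletter" in subject or "unsubscribe" in body:
        return "ignore"
    if "outage" in subject or "deploy" in subject:
        return "notify"
    return "respond"


def tool_calls_cover(tool_calls, expected_names):
    """Heuristic eval: every expected tool name appears in the structured calls."""
    called = {call["name"] for call in tool_calls}
    return set(expected_names) <= called


def test_triage_decision():
    cases = [
        ({"subject": "Weekly newsletter", "body": "Top stories"}, "ignore"),
        ({"subject": "Deploy finished", "body": "Build 42 is live"}, "notify"),
        ({"subject": "Question about the API", "body": "Can we meet?"}, "respond"),
    ]
    for email, expected in cases:
        assert triage_email(email) == expected


def test_tool_calls():
    # Structured output as the agent might emit it (hypothetical tool names).
    calls = [
        {"name": "check_calendar", "args": {"day": "Tuesday"}},
        {"name": "write_email", "args": {"to": "alice@example.com"}},
    ]
    assert tool_calls_cover(calls, ["check_calendar", "write_email"])
    assert not tool_calls_cover(calls, ["schedule_meeting"])
```

Saved in a file named like `test_agent.py`, these functions would be collected by Pytest; the same checks work unchanged once `triage_email` is swapped for a real model call.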
Google Docs
Building Ambient Agents: Teaser + LangGraph 101
Building ambient agents with
❤7🔥4👏2
Qwen introduced Parallel Scaling Law for Language Models
"We introduce the third and more inference-efficient scaling paradigm: increasing the model’s parallel computation during both training and inference time."
"We draw inspiration from classifier-free guidance (CFG)".
"In this paper, we hypothesize that the effectiveness of CFG lies in its double computation."
"We propose a proof-of-concept scaling approach called parallel scaling (PARSCALE) to validate this hypothesis on language models. "
"parallelizing into P streams equates to scaling the model parameters by O(log P)".
"for a 1.6B model, when scaling to P = 8 using PARSCALE, it uses 22× less memory increase and 6× less latency increase compared to parameter scaling that achieves the same model capacity".
GitHub
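The idea quoted above, running P parallel streams through the same parameters and combining their outputs, can be illustrated with a toy forward pass. This is only a sketch under stated assumptions: the quotes do not say how inputs are transformed or how outputs are aggregated, so the additive per-stream shifts, the softmax mixing weights, and all function names here are hypothetical, and nothing in the sketch is trained.

```python
import math
import random


def shared_model(x, weights):
    """One shared 'model': a single tanh layer reused by every stream."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def parallel_forward(x, weights, stream_shifts, stream_scores):
    """Run P differently shifted copies of x through the same weights, then mix them."""
    outputs = [shared_model([xi + si for xi, si in zip(x, shift)], weights)
               for shift in stream_shifts]
    mix = softmax(stream_scores)  # one aggregation weight per stream
    dim = len(outputs[0])
    return [sum(m * out[j] for m, out in zip(mix, outputs)) for j in range(dim)]


rng = random.Random(0)
d, P = 4, 8
weights = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(d)]
stream_shifts = [[rng.gauss(0, 0.1) for _ in range(d)] for _ in range(P)]
stream_scores = [0.0] * P  # equal scores give a uniform average
x = [1.0, -0.5, 0.25, 2.0]
y = parallel_forward(x, weights, stream_shifts, stream_scores)
print(y)
```

Note that the parameter count of `weights` does not depend on P; only the amount of computation grows, which is the trade-off the quoted memory and latency comparison is about.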
arXiv.org
Parallel Scaling Law for Language Models
It is commonly believed that scaling language models should commit a significant space or time cost, by increasing the parameters (parameter scaling) or output tokens (inference-time scaling). We...
🔥4👍3🥰2
OpenAI introduced Codex, an AI agent.
It is a software engineering agent that runs in the cloud and does tasks for you, like writing a new feature or fixing a bug.
You can run many tasks in parallel. It is starting to roll out today to ChatGPT Pro, Enterprise, and Team users.
OpenAI
Introducing Codex
Introducing Codex: a cloud-based software engineering agent that can work on many tasks in parallel, powered by codex-1. With Codex, developers can simultaneously deploy multiple agents to independently handle coding tasks such as writing features, answering…
❤3👏3💯2👍1
Arc is hiring a Chief Scientific Officer!
We think Arc is the most dynamic and ambitious biology research organization in the world.
They want to build the world's first virtual cell + use it to develop cures to complex diseases like Alzheimer's.
Agentic_AI_for_Financial_Services_1747618078.pdf
2.2 MB
IBM has released the report "Agentic AI in Financial Services".
Highlights:
1. AI agents enable a paradigm shift from responsive to adaptive technology services, creating more accessible, personalised #banking services & experiences for customers.
2. Organisations must adopt a measured, phased approach balancing #innovation with comprehensive risk management, through thorough assessments, clear governance structures, talent development, & continuous monitoring.
3. Successfully managing agentic AI requires seamless cooperation across business functions, supported by transparent governance structures and communication channels.
4. Understanding where new risks emerge and implementing appropriate controls is essential, recognising agentic AI as a fundamentally different technological paradigm requiring new approaches to #governance & controls.
5. Early integration of #compliance considerations validates use cases before significant investment while ensuring alignment with organisational risk appetite.
6. Organisations must develop holistic AI literacy programs, including philosophy, linguistics, law, and anthropology, to formulate responsible #AIstrategies and address potential model biases.
❤4👍4👏2🥴1
The UK tax authority has announced that, starting from January 1, 2026, crypto asset companies operating in the UK must comprehensively report user and transaction data, including user identity, address, tax identification number, and details of each transaction, in compliance with the global Crypto Asset Reporting Framework (CARF) to combat tax evasion and enhance transparency.
Violators will face a maximum fine of £300 per user.
DL News
UK crypto firms told to report every user and transaction or risk stiff penalties
Crypto firms must report user and transaction data starting in 2026. Penalties of up to £300 per user apply for faulty or missing data. The UK’s open approach differs slightly from the EU’s MiCA model.
🦄5🔥3👏2
Adobe introduced HUMOTO, a 4D dataset for human-object interaction, developed with a combination of wearable motion capture, SOTA 6D pose estimation vision models, LLMs, and professional refinement by multiple animation studios.
HUMOTO features:
1. Over 700 diverse daily activities
2. Interactions with 60+ objects, 70+ articulated parts.
3. Fine-grained text annotations
4. Detailed hand and finger movements.
jiaxin-lu.github.io
HUMOTO Project Page
HUMOTO dataset.
❤4🥰2👏2
Microsoft introduced Magentic-UI — an experimental human-centered web agent
It automates your web tasks while keeping you in control —through co-planning, co-tasking, action guards, and plan learning.
Fully open-source.
GitHub
GitHub - microsoft/magentic-ui: A research prototype of a human-centered web agent
A research prototype of a human-centered web agent - microsoft/magentic-ui
❤4🔥3👍2
Google is launching their own coding agent, Jules, at I/O today
It lets you make changes to your GitHub repos with English prompts in a VM using Gemini 2.5 Pro. It's Google’s version of Devin.
Here's the leaked official repo of prompts it can do.
jules.google
Jules - An Autonomous Coding Agent
Jules is an Autonomous agent that gets out of your way. It lets you focus on the coding you want to do, meanwhile picking up all the other random tasks that you rather not do.
🔥7❤3👏2
ByteDance presents the Pre-trained Model Averaging strategy, a novel framework for model merging during LLM pre-training.
The authors found that merging checkpoints trained with constant learning rates not only achieves significant performance improvements but also enables accurate prediction of annealing behavior.
Code might be posted there
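At its simplest, merging checkpoints means averaging their parameters. The sketch below shows that basic operation on plain Python dicts; it is an illustration under assumptions, not the paper's code: `merge_checkpoints`, the optional `weights` argument, and the toy parameter names are hypothetical, and the paper's exact averaging scheme is not described in this post.

```python
def merge_checkpoints(checkpoints, weights=None):
    """Weighted average of parameter dicts that share the same keys and lengths."""
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    if weights is None:
        weights = [1.0] * len(checkpoints)  # plain (uniform) average by default
    if len(weights) != len(checkpoints):
        raise ValueError("need exactly one weight per checkpoint")
    total = sum(weights)
    merged = {}
    for name, reference in checkpoints[0].items():
        merged[name] = [
            sum(w * ckpt[name][i] for w, ckpt in zip(weights, checkpoints)) / total
            for i in range(len(reference))
        ]
    return merged


# Three toy checkpoints saved at different points of a training run.
ckpts = [
    {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]},
    {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]},
    {"layer.weight": [5.0, 6.0], "layer.bias": [2.0]},
]
merged = merge_checkpoints(ckpts)
print(merged)
```

Passing non-uniform `weights` (for example, favouring later checkpoints) turns the same function into a weighted moving average over the run.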
arXiv.org
Model Merging in Pre-training of Large Language Models
Model merging has emerged as a promising technique for enhancing large language models, though its application in large-scale pre-training remains relatively unexplored. In this paper, we present...
❤5👏3🔥2
Claude 4 is here! Anthropic prepares to launch Claude 4
The lineup will reportedly include 2 versions:
1. Claude Sonnet 4 - a faster version for everyday tasks.
2. Claude Opus 4 - a powerful model for complex problems and creative work.
Based on leaked information, these models are currently in a closed testing environment marked "not intended for production use" and subject to strict rate limitations.
Several intriguing features have been discovered in the configuration files:
"show_raw_thinking" / "show_raw_thinking_mechanism" - functionality that potentially allows users to observe the AI's thought formation process.
The models appear to operate in an environment based on a popular web framework.
A sophisticated "feature gates" system is being employed to precisely control which capabilities are available to different users.
Configuration includes specific parameters for various interaction methods and model performance metrics.
The technical JSON contains numerous rule sets and session parameters suggesting enhanced personalization capabilities.
archive.is
https://claude-ai.staging.ant.dev/api/bootstrap
archived 21 May 2025 06:51:19 UTC
👍8
Google DeepMind introduced Gemma 3n, a model that runs on as little as 2GB of RAM
It shares the same architecture as Gemini Nano, and is engineered for incredible performance. Researchers added audio understanding, so now it’s multimodal, fast and lean, and runs on-device (no cloud connection required!).
Google AI for Developers
Gemma 3n model overview | Google AI for Developers
🆒4
Coming soon: AI avatars in Google Vids
Just write a script and choose an avatar to deliver your message. It’s a fast, consistent way to create polished video content — for onboarding, announcements, product explainers, and more.
👍4❤1
Agents at home. Mistral released Devstral, a SOTA open model designed specifically for coding agents and developed with All Hands AI.
mistral.ai
Devstral | Mistral AI
Introducing the best open-source model for coding agents.
❤4
P-1 AI shared its first paper: "On the Evaluation of Engineering AGI"
A paradigm shift has occurred in the AI field over the past year: the focus has shifted from algorithms to ever more complex and diverse evals and RL environments.
Together with domain experts, the researchers developed in-house an "equivalent" of SWE-Bench for design engineers who work at industrial OEMs around the world.
They informally call it the "Archie IQ" eval.
They developed a rich taxonomy of evaluation questions, spanning from methodological knowledge to real-world design problems, and test Archie on very complex design workflows that encompass design synthesis, evaluation, etc.
arXiv.org
On the Evaluation of Engineering Artificial General Intelligence
We discuss the challenges and propose a framework for evaluating engineering artificial general intelligence (eAGI) agents. We consider eAGI as a specialization of artificial general intelligence...
❤4