Sarvam AI announced Sarvam 1, India's first home-grown large language model.
A 2B parameter model trained on 10 Indic languages + English, representing a major leap in Indian language AI.
Key findings:
- 2 trillion tokens of synthetic Indic data, reported as equivalent to 6-8T tokens under a standard tokenizer thanks to a highly token-efficient Indic tokenizer (see the tokenizer sketch below)
- The translated benchmarks are now open-source, so researchers can properly evaluate Indic language models on MMLU, ARC-Challenge, TriviaQA, and BoolQ in 10 Indic languages.
- On standard evals translated to Indic languages (MMLU, ARC, TriviaQA, BoolQ), Sarvam 1 matches or beats much larger models such as Llama 3.1 8B.
Sarvam-2T, the pretraining corpus, sets new standards: 2x longer documents, 3x higher quality, and 8x more scientific content than existing Indic datasets.
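A minimal sketch of how one could check the tokenizer-efficiency claim, assuming the tokenizers are published on the Hugging Face hub; the model IDs below are assumptions to verify, not taken from the announcement:

```python
# Hedged sketch: compare token counts for the same Hindi sentence under two
# tokenizers. Model IDs are assumptions to verify on the hub; the Llama repo
# is gated and needs license acceptance plus authentication.
from transformers import AutoTokenizer

text = "भारत एक विशाल और विविधतापूर्ण देश है।"  # "India is a vast and diverse country."

for model_id in ["sarvamai/sarvam-1", "meta-llama/Llama-3.1-8B"]:
    tok = AutoTokenizer.from_pretrained(model_id)
    n_tokens = len(tok.encode(text, add_special_tokens=False))
    print(f"{model_id}: {n_tokens} tokens")

# Fewer tokens for the same text means each token carries more Indic content,
# which is what the 6-8T-equivalent framing rests on.
```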
Ideogram launched Canvas, a creative platform for AI image generation and editing.
The system features Magic Fill for precise regional editing and Extend for expanding images beyond borders
Both tools maintain a consistent style across modifications.
about.ideogram.ai
Ideogram Canvas, Magic Fill, and Extend
Ideogram Canvas is an infinite creative board for organizing, generating, editing, and combining images. Bring your face or brand visuals to Ideogram Canvas and use industry-leading Magic Fill and Extend to blend them with creative, AI-generated content.
These researchers report an attempt to replicate the capabilities of OpenAI's o1 model.
Their "journey learning" technique reportedly encourages the model to learn not just shortcuts but the complete exploration process, including trial and error, reflection, and backtracking.
They claim that with only 327 training samples, journey learning surpassed shortcut learning by 8.0% on the MATH dataset.
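To make the distinction concrete, here is an illustrative sketch of the two supervision styles; the field names and trace text are invented for illustration and are not taken from the paper:

```python
# Illustrative only: the field names and trace text are invented, not taken
# from the paper. The point is the contrast between the two supervision styles.

shortcut_example = {
    "problem": "Compute 17 * 24.",
    "target": "408",  # supervise only the polished final answer
}

journey_example = {
    "problem": "Compute 17 * 24.",
    "target": (
        "Try 17 * 24 = 17 * 20 + 17 * 4. 17 * 20 = 340. 17 * 4 = 68. "
        "Wait, double-check 17 * 4: 10 * 4 = 40, 7 * 4 = 28, so 68. Correct. "
        "340 + 68 = 408. Answer: 408."
    ),  # supervise the full trace: partial steps, self-checks, corrections
}

# Journey learning fine-tunes on long traces like the second example, rewarding
# the search process (trial and error, reflection, backtracking) rather than
# just reproduction of the final answer.
```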
arXiv.org
O1 Replication Journey: A Strategic Progress Report -- Part 1
This paper introduces a pioneering approach to artificial intelligence research, embodied in our O1 Replication Journey. In response to the announcement of OpenAI's groundbreaking O1 model, we...
Researchers announced Centaur - the first foundation model of human cognition.
Centaur can predict and simulate human behavior in any experiment expressible in natural language.
The researchers fine-tuned a state-of-the-art language model (Llama 3.1 70B) on a large dataset of human behavioral experiments and found that the resulting model predicts the behavior of unseen participants better than existing cognitive models in almost every experiment.
The model is available on Hugging Face.
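A rough sketch of how such a model could be queried; the Hugging Face model ID and the prompt format below are assumptions for illustration, not the paper's exact setup:

```python
# Hedged sketch: the model ID and prompt format are assumptions, not the
# paper's exact setup.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="marcelbinz/Llama-3.1-Centaur-70B",  # assumed ID; verify on the hub
    device_map="auto",
)

# Describe an experiment trial in plain language and ask for the participant's
# next choice, matching the framing "any experiment expressible in natural
# language".
prompt = (
    "You are choosing repeatedly between two slot machines, J and F.\n"
    "So far: you chose J and won 50 points, chose J and won 0 points,\n"
    "then chose F and won 70 points. You now choose machine "
)
out = generator(prompt, max_new_tokens=1, do_sample=False)
print(out[0]["generated_text"][len(prompt):])  # simulated choice, e.g. "F"
```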
OSF
Centaur: a foundation model of human cognition
Establishing a unified theory of cognition has been a major goal of psychology. While there have been previous attempts to instantiate such theories by building computational models, we currently do not have one model that captures the human mind in its entirety.…
⚡️Meta develops AI search engine to lessen reliance on Google, Microsoft
The Facebook owner is working on a search engine that crawls the web to provide conversational answers about current events to people using its Meta AI chatbot.
In doing so, Meta hopes to lower its reliance on Google Search and Microsoft’s Bing, which currently provide information about news, sports and stocks to people using Meta AI.
It could also give Meta a backup option if Google or Microsoft withdrew from these arrangements.
Meta recently struck a deal with news agency Reuters to help Meta AI answer questions about current events and news.
It isn’t clear whether Meta pays Google or Microsoft for powering answers to questions to its chatbot. Zuckerberg said in an interview in April that “there’s not a ton of money flowing either way” between Meta and Google, without elaborating.
The Information
Meta Develops AI Search Engine to Lessen Reliance on Google, Microsoft
As Meta Platforms tries to keep up with OpenAI in developing artificial intelligence, the Facebook owner is working on a search engine that crawls the web to provide conversational answers about current events to people using its Meta AI chatbot. In doing…
OpenAI will likely promote its Chrome extension for setting ChatGPT as the default search engine alongside a SearchGPT launch.
This extension first appeared with the standalone SearchGPT and still points to it.
Google
ChatGPT search - Chrome Web Store
Change default search engine to ChatGPT search.
the-next-big-arenas-of-competition_final.pdf
17.7 MB
Future Market Leaders: McKinsey's Vision for 2040
McKinsey Global Institute has identified 18 key market arenas that could reshape the global economy by 2040.
Here's what you need to know:
🔹 Total Market Potential:
- Revenue growth from $7.25T (2022) to $29-48T (2040)
- Projected profits of $2-6T by 2040
- Combined CAGR of 8-11% (see the quick arithmetic check below the lists)
🔝 Top 5 Markets by 2040 Revenue:
1. E-commerce: $14-20T (currently $4T)
2. AI Software & Services: $1.5-4.6T (from $85B)
3. Cloud Services: $1.6-3.4T (from $220B)
4. Electric Vehicles: $2.5-3.2T (from $450B)
5. Digital Advertising: $2.1-2.9T (from $520B)
🚀 Fastest Growing Sectors (CAGR):
- AI Software & Services: 17-25%
- Robotics: 13-23%
- Cloud Services: 12-17%
- Batteries: 12-14%
💰 Highest Profit Margins:
- Obesity Drugs: 25-35%
- Semiconductors: 20-25%
- AI Software & Services: 15-20%
- Digital Advertising: 15-20%
🌟 Emerging Technologies:
- Future Air Mobility
- Shared Autonomous Vehicles
- Industrial & Consumer Biotech
- Nuclear Fission Power Plants
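A quick arithmetic check of that combined CAGR range, using the revenue endpoints above and an 18-year horizon (2022 to 2040):

```python
# Sanity check of the "combined CAGR of 8-11%" figure, using the revenue
# endpoints above and an 18-year horizon (2022 -> 2040).
def cagr(start: float, end: float, years: int) -> float:
    return (end / start) ** (1 / years) - 1

start_revenue = 7.25              # $T, 2022
for end_revenue in (29.0, 48.0):  # $T, projected range for 2040
    print(f"${end_revenue}T by 2040 -> CAGR ~ {cagr(start_revenue, end_revenue, 18):.1%}")
# Prints roughly 8.0% and 11.1%, consistent with the stated 8-11% range.
```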
How effective is human-AI collaboration?
A meta-analysis of 106 studies, just published in Nature Human Behaviour, reports an interesting result:
On average, there was no synergy: Human–AI combinations did not perform better than both humans and AI.
In particular, when the AI alone outperformed the human alone, the human–AI combination led to performance losses, likely because humans were unable to integrate the suggestions provided by the AI.
Conversely, when the human outperformed the AI alone, there was some synergy and the human–AI combination led to performance gains, likely because this time humans were better at integrating the AI suggestions.
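A simplified sketch of the synergy criterion behind these statements; the numbers are made up, and the paper works with standardized effect sizes rather than raw scores:

```python
# Made-up numbers; the criterion is whether the human-AI team beats the
# BETTER of the two solo baselines, not just their average.
def synergy(human: float, ai: float, team: float) -> str:
    best_solo = max(human, ai)
    return "synergy" if team > best_solo else "no synergy"

# When the AI alone is stronger, teams often land below the AI baseline:
print(synergy(human=0.60, ai=0.80, team=0.74))  # -> no synergy
# When the human alone is stronger, teams tend to exceed both baselines:
print(synergy(human=0.85, ai=0.70, team=0.90))  # -> synergy
```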
Nature
When combinations of humans and AI are useful: A systematic review and meta-analysis
Nature Human Behaviour - Vaccaro et al. present a systematic review and meta-analysis of the performance of human–AI combinations, finding that on average, human–AI combinations...
❗️ OpenAI builds first chip with Broadcom and TSMC, scales back foundry ambition
The company has dropped its ambitious foundry plans for now, due to the cost and time needed to build a network of foundries, and will instead focus on in-house chip design efforts.
Reuters
Exclusive: OpenAI builds first chip with Broadcom and TSMC, scales back foundry ambition
OpenAI is working with Broadcom and TSMC to build its first in-house chip designed to support its artificial intelligence systems, while adding AMD chips alongside Nvidia chips to meet its surging infrastructure demands, sources told Reuters.
Osmo digitized scent! A fresh summer plum was the first fruit and scent to be fully digitized and reprinted with no human intervention.
Osmo is revolutionizing fragrance creation with AI!
Three new scent molecules, GLOSSINE, FRACTALINE, and QUASARINE, offer perfumers a fresh and innovative palette.
Osmo also introduced Inspire, a GenAI tool that turns your imagination and memories directly into fragrance.
Anthropic published a repo with courses on how to use LLMs.
GitHub
GitHub - anthropics/courses: Anthropic's educational courses
Anthropic's educational courses. Contribute to anthropics/courses development by creating an account on GitHub.
How do we represent 3D world knowledge for spatial intelligence in next-generation robots? An extensive survey paper covers this emerging topic and the recent state of the art (a minimal neural-field sketch follows below).
GitHub.
arXiv.org
Neural Fields in Robotics: A Survey
Neural Fields have emerged as a transformative approach for 3D scene representation in computer vision and robotics, enabling accurate inference of geometry, 3D semantics, and dynamics from posed...
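For readers new to the topic, here is a minimal sketch of the core idea behind a neural field, not code from the survey: a small MLP maps a 3D coordinate to scene properties, and real systems add positional encodings, view directions, and rendering on top.

```python
# Toy neural field: an MLP that maps a 3D coordinate to scene properties
# (here an occupancy value and an RGB color). Illustration only.
import torch
import torch.nn as nn

class TinyNeuralField(nn.Module):
    def __init__(self, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # 1 occupancy logit + 3 color channels
        )

    def forward(self, xyz: torch.Tensor):
        out = self.net(xyz)
        occupancy = torch.sigmoid(out[..., :1])
        rgb = torch.sigmoid(out[..., 1:])
        return occupancy, rgb

field = TinyNeuralField()
points = torch.rand(1024, 3)   # arbitrary query coordinates in the scene
occ, rgb = field(points)       # a continuous, differentiable 3D representation
print(occ.shape, rgb.shape)    # torch.Size([1024, 1]) torch.Size([1024, 3])
```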
Jailbreaking LLM-Controlled Robots
Recent research has uncovered a concerning vulnerability in AI-powered robots that should make us all pause and think.
While robots controlled by LLMs like ChatGPT represent an exciting technological advancement, they may also pose unexpected security risks.
The Rise of AI Robots
We're already seeing AI-powered robots in our world. Boston Dynamics' Spot robot dog ($75,000) is being used by SpaceX and NYPD. The more affordable Unitree Go2 ($3,500) is commercially available to consumers. These robots can now be controlled through voice commands or text, thanks to integration with LLMs like ChatGPT.
While LLMs are programmed to refuse harmful requests (like providing instructions for building explosives), researchers have discovered they can be "jailbroken" - tricked into bypassing these safety measures.
What's particularly alarming is that this vulnerability extends to robots controlled by these AI systems.
The RoboPAIR Discovery
A research team developed a method called RoboPAIR that demonstrated how alarmingly easy it is to bypass these robots' safety protocols.
In controlled experiments, they successfully manipulated:
- Self-driving vehicle systems to ignore safety protocols
- Robot platforms to enter restricted areas
- Mobile robots to execute potentially dangerous actions
This isn't just about theoretical risks. These robots are already being deployed in various settings - from construction sites to law enforcement. The ability to bypass their safety measures poses real-world risks that need immediate attention.
The researchers emphasize that we urgently need:
1. Robust defense mechanisms specifically designed for robotic systems
2. Better understanding of context-dependent safety protocols
3. Collaboration between robotics and AI safety experts
Machine Learning Blog | ML@CMU | Carnegie Mellon University
Jailbreaking LLM-Controlled Robots
Summary. Recent research has shown that large language models (LLMs) such as ChatGPT are susceptible to jailbreaking attacks, wherein malicious users fool an LLM into generating toxic content (e.g., bomb-building instructions). However, these attacks are…
RELAI agents for real-time hallucination detection in popular LLMs.
relai.ai
RELAI: Optimized Agentic AI on Your Data
Rely on RELAI agents for your AI reliability needs, from model evaluation and debugging to leveraging state-of-the-art system-level and user-facing safeguards.
New paper from OpenAI: SimpleQA is a newly open-sourced factuality benchmark that contains 4,326 short, fact-seeking questions that are challenging for frontier models.
- High correctness via robust data quality verification / human agreement rates.
- Good researcher UX. Easy to grade, easy to run (see the grading sketch after this list).
- Challenging for frontier models. GPT-4o and Claude both score less than 50%.
- Diversity. SimpleQA contains questions from a wide range of topics, including history, science & technology, art, geography, TV shows, etc.
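A simplified sketch of how SimpleQA-style grading can work; the official harness lives in OpenAI's simple-evals repository and uses an LLM grader, and the file name and column names below are assumptions for illustration:

```python
# Simplified stand-in for SimpleQA grading; file name and column names are
# assumptions for illustration, not the official harness.
import csv
from openai import OpenAI

client = OpenAI()

def grade(question: str, gold: str, predicted: str) -> str:
    """Ask a grader model whether the prediction matches the gold answer."""
    prompt = (
        f"Question: {question}\nGold answer: {gold}\nPredicted answer: {predicted}\n"
        "Reply with exactly one word: CORRECT, INCORRECT, or NOT_ATTEMPTED."
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini", messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content.strip()

grades = []
with open("simple_qa_test_set.csv") as f:  # assumed local copy of the benchmark
    for row in csv.DictReader(f):
        answer = client.chat.completions.create(
            model="gpt-4o", messages=[{"role": "user", "content": row["problem"]}]
        ).choices[0].message.content
        grades.append(grade(row["problem"], row["answer"], answer))

print("accuracy:", grades.count("CORRECT") / len(grades))
```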
OpenAI
Introducing SimpleQA
A factuality benchmark called SimpleQA that measures the ability for language models to answer short, fact-seeking questions.
OpenAI introduced ChatGPT search.
ChatGPT can now search the web far better than before, so you get fast, timely answers with links to relevant web sources.
Plus and Team users will get access to Search today.
OpenAI used synthetic data to fine-tune the search model: 'The search model is a fine-tuned version of GPT-4o, post-trained using novel synthetic data generation techniques, including distilling outputs from OpenAI o1-preview.'
Search is also coming soon to both Advanced Voice and Canvas.
OpenAI
Introducing ChatGPT search
Get fast, timely answers with links to relevant web sources
Breakthrough: Physical Intelligence (π) Has Created a "GPT for Robots"
Meet π0 (pi-zero), a generalist policy that lets robots follow human commands much like ChatGPT does for text, but for physical tasks.
Watch it fold laundry, clean tables, and assemble boxes like a pro.
The future where we can simply tell robots what to do - and they'll figure out how to do it - is finally here.
www.pi.website
Our First Generalist Policy
Our first generalist policy, π0, a prototype model that combines large-scale multi-task and multi-robot data collection with a new network architecture to enable the most capable and dexterous generalist robot policy to date.
Meta FAIR announced three new cutting-edge developments in robotics and touch perception, plus a new benchmark for human-robot collaboration to enable future work in this space.
1. Meta Sparsh is the first general-purpose encoder for vision-based tactile sensing that works across many tactile sensors and many tasks. Trained on 460K+ tactile images using self-supervised learning.
2. Meta Digit 360 is a breakthrough artificial fingertip-based tactile sensor, equipped with 18+ sensing features to deliver detailed touch data with human-level precision and touch-sensing capabilities.
3. Meta Digit Plexus is a standardized platform for robotic sensor connections and interactions. It provides a hardware-software solution to integrate tactile sensors on a single robot hand and enables seamless data collection, control and analysis over a single cable.
Also Meta released PARTNR: a benchmark for Planning And Reasoning Tasks in humaN-Robot collaboration. Built on Habitat 3.0, it’s the largest benchmark of its kind to study and evaluate human-robot collaboration in household activities.
Meta AI
Advancing embodied AI through progress in touch perception, dexterity, and human-robot interaction
Today, Meta FAIR is publicly releasing several new research artifacts that advance robotics and support our goal of reaching advanced machine intelligence (AMI).
Hugging Face built an 11T-token training set and used it to train three new state-of-the-art models.
The best 135M, 360M, and 1.7B models to date.
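A quick sketch of trying the 1.7B instruct variant with transformers; the model ID follows the SmolLM2 collection naming and should be verified on the hub:

```python
# Assumed model ID based on the collection naming; verify on the hub. Chat-style
# pipeline inputs require a recent transformers version.
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="HuggingFaceTB/SmolLM2-1.7B-Instruct",
    device_map="auto",
)
messages = [{"role": "user", "content": "In one sentence, why do small LLMs matter on-device?"}]
out = pipe(messages, max_new_tokens=64)
print(out[0]["generated_text"][-1]["content"])  # the assistant's reply
```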
huggingface.co
SmolLM2 - a HuggingFaceTB Collection
State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M
Systems of Agents will become the new source of truth for enterprises.
As they capture and act on information at its source, understand the full context of business communication, and continuously learn from every interaction, they'll generate richer, more accurate data than traditional systems ever could.
The old boundaries between data entry, engagement, and analysis won't just blur - they'll become irrelevant.
The winners in this new era won't be those who build better UX or smarter "predictive" analytics. They'll be those who create systems that think, learn, and act with the fluidity of human teams while operating at machine scale, and who recognize that the future of enterprise software isn't about making better tools but about creating digital workers that truly understand and enhance how business gets done.
The era of separated systems is ending. The age of Systems of Agents has begun.
Foundation Capital
A System of Agents brings Service-as-Software to life - Foundation Capital
Software stands at the threshold of the most profound change in its history.