Amazon announced it has developed its own quantum chip, Ocelot.
This follows Microsoft's reveal last week of their Majorana 1.
'We believe that scaling Ocelot to a full-fledged quantum computer capable of transformative societal impact would require as little as one-tenth as many resources as common approaches, helping bring closer the age of practical quantum computing.'
'We believe that Ocelot's architecture, with its hardware-efficient approach to error correction, positions us well to tackle the next phase of quantum computing: learning how to scale.'
Paper.
OpenAI will soon introduce the new GPT-4.5 model. Here's what is known about it.
"GPT-4.5 is not a frontier model, but it is OpenAI's largest LLM, improving on GPT-4's computational efficiency by more than 10x."
It offers:
— increased world knowledge
— improved writing ability
— refined personality
A 2-7% lift over GPT-4o on SWE-Bench.
"GPT-4.5 is not a frontier model, but it is OpenAI's largest LLM, improving on GPT-4's computational efficiency by more than 10x."
It offers:
— increased world knowledge
— improved writing ability
— refined personality
2-7% lift on 4o at SWE-Bench
🔥5❤2👍2
GPT-4.5 is out! Knowledge is still stuck in October 2023; it's not going to blow your mind, but it might befriend you.
It's more like a personality, communication, and creativity upgrade than a huge intelligence leap. It's like OpenAI is pivoting its base model from "bland assistant" to "AI bestie."
What it does do well:
- OpenAI says it scores 64% on SimpleQA (double GPT-4's score)
- Much better writing with cleaner, better structured, more human-like prose
- Genuinely warmer and more emotionally intelligent (gave me some good advice!)
- Less robotic, more opinionated responses
4.5 is more extroverted, agreeable, and less neurotic than 4o.
It's sometimes worse at following instructions, partly because it's less sycophantic and more creative.
The model received approximately 10x more computational resources during pre-training compared to GPT-4. Training occurred simultaneously across multiple data centers.
Pricing: $75 per million input tokens and $150 per million output tokens – 15-30x more expensive than GPT-4o! This pricing reflects the model's scale and resource requirements.
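As a quick back-of-the-envelope sketch of what that multiplier means in practice (the GPT-4o prices below are not stated in the post; they are the assumption implied by the 15-30x figure):

```python
# Rough cost comparison using the published GPT-4.5 prices.
# GPT-4o prices are an assumption implied by the "15-30x" claim, not a quoted figure.
gpt45 = {"input": 75.00, "output": 150.00}  # $ per 1M tokens
gpt4o = {"input": 2.50, "output": 10.00}    # $ per 1M tokens (assumed)

for kind in ("input", "output"):
    print(f"{kind}: {gpt45[kind] / gpt4o[kind]:.0f}x more expensive")  # 30x / 15x

# Example request: 2,000 input tokens, 500 output tokens
cost = 2_000 / 1e6 * gpt45["input"] + 500 / 1e6 * gpt45["output"]
print(f"per-request cost on GPT-4.5: ${cost:.3f}")  # ≈ $0.225
```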
Performance and context: Generation is noticeably slower than its predecessors, and context length remains at 128K tokens. The knowledge cutoff stays at October 2023, which is disappointing for many users.
Functionality: Supports Canvas, search, and file uploads. Currently lacks multimodal features like voice mode or video.
Availability:
Already available to Pro users and developers of all API tiers
Coming to Plus subscribers ($20) next week
OpenAI plans to add "tens of thousands of GPUs" next week to expand access
Independent Benchmark Results:
Aider Polyglot Coding Benchmark: Recent tests show that GPT-4.5 Preview significantly outperforms its predecessor but lags behind specialized models:
Claude 3.7 Sonnet with thinking mode (32k tokens) — 65%
Claude 3.7 Sonnet without thinking mode — 60%
DeepSeek V3 — 48%
GPT-4.5 Preview — 45%
ChatGPT-4o — 27%
GPT-4o — 23%
OpenAI
Introducing GPT-4.5
We’re releasing a research preview of GPT‑4.5—our largest and best model for chat yet. GPT‑4.5 is a step forward in scaling up pre-training and post-training.
#DeepSeek built a new file system to train their AI models more efficiently
Fire-Flyer File System (3FS) - a parallel file system that utilizes the full bandwidth of modern SSDs and RDMA networks.
- 6.6 TiB/s aggregate read throughput in a 180-node cluster
- 3.66 TiB/min throughput on GraySort benchmark in a 25-node cluster
- 40+ GiB/s peak throughput per client node for KVCache lookup
- Disaggregated architecture with strong consistency semantics.
It is used for training data preprocessing, dataset loading, checkpoint saving/reloading, embedding vector search, and KVCache lookups for inference in V3/R1.
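For a sense of per-node scale, here is the simple arithmetic behind the headline numbers above (derived figures, not additional published data):

```python
# Per-node throughput implied by the published aggregates.
GIB_PER_TIB = 1024

read_per_node = 6.6 * GIB_PER_TIB / 180        # 6.6 TiB/s aggregate over 180 nodes
sort_per_node = 3.66 * GIB_PER_TIB / 60 / 25   # 3.66 TiB/min over 25 nodes

print(f"aggregate read: ~{read_per_node:.1f} GiB/s per node")  # ~37.5 GiB/s
print(f"GraySort:       ~{sort_per_node:.1f} GiB/s per node")  # ~2.5 GiB/s
```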
3FS
Smallpond - a data processing framework built on 3FS.
GitHub
GitHub - deepseek-ai/3FS: A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
Reasoning models lack atomic thought: unlike humans, who reason over independent units, they carry the full reasoning history at every step.
Researchers introduced Atom of Thoughts (AoT), which lifts gpt-4o-mini to 80.6% F1 on HotpotQA, surpassing o3-mini and DeepSeek-R1.
Code.
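A minimal sketch of the idea as I read it (the decompose/contract loop paraphrases the paper's description; the `llm` callable and prompts are placeholders, not the authors' code):

```python
# Hedged sketch of an Atom-of-Thoughts-style loop: each iteration rewrites the
# problem into a smaller self-contained ("atomic") question instead of carrying
# the full reasoning history forward.
from typing import Callable, List

def atom_of_thoughts(question: str, llm: Callable[[str], str], max_steps: int = 5) -> str:
    state = question  # Markov-style state: always a self-contained question
    for _ in range(max_steps):
        # 1) Decompose the current question into independent subquestions.
        subqs: List[str] = [q for q in llm(
            f"Decompose into independent subquestions, one per line:\n{state}"
        ).splitlines() if q.strip()]
        if not subqs:
            break
        # 2) Answer the subquestions directly.
        answers = [llm(f"Answer briefly: {q}") for q in subqs]
        # 3) Contract: fold the resolved parts into a new, simpler atomic question.
        state = llm(
            "Rewrite the problem as one simpler self-contained question, "
            f"given these resolved facts:\n{list(zip(subqs, answers))}\n\nProblem: {state}"
        )
    return llm(f"Answer: {state}")  # final state is small enough to answer directly
```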
Nvidia presented Sim-to-Real Reinforcement Learning for Vision-Based Dexterous Manipulation on Humanoids
Learning humanoid dexterous manipulation using sim-to-real RL, achieving robust generalization and high performance without the need for human demonstrations.
#DeepSeek introduced DeepSeek-V3/R1 Inference System Overview
Optimized throughput and latency via:
1. Cross-node EP-powered batch scaling
2. Computation-communication overlap (see the toy sketch below)
3. Load balancing
Statistics of DeepSeek's Online Service:
- 73.7k/14.8k input/output tokens per second per H800 node
- Cost profit margin 545%
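A toy illustration of point 2 above (not DeepSeek's implementation; durations are made up): with two micro-batches in flight, the all-to-all communication of one can be hidden behind the expert computation of the other.

```python
# Toy timing model of computation-communication overlap across two micro-batches.
import time
from concurrent.futures import ThreadPoolExecutor

COMPUTE_S, COMM_S, STEPS = 0.02, 0.015, 50

def compute(step):     time.sleep(COMPUTE_S)  # stands in for expert FFN compute
def communicate(step): time.sleep(COMM_S)     # stands in for all-to-all dispatch/combine

# Sequential baseline: compute, then communicate, every step.
t0 = time.time()
for s in range(STEPS):
    compute(s); communicate(s)
sequential = time.time() - t0

# Overlapped: while micro-batch A computes, micro-batch B communicates.
t0 = time.time()
with ThreadPoolExecutor(max_workers=2) as pool:
    for s in range(STEPS):
        fa, fb = pool.submit(compute, s), pool.submit(communicate, s)
        fa.result(); fb.result()
overlapped = time.time() - t0

print(f"sequential {sequential:.2f}s vs overlapped {overlapped:.2f}s")  # comm mostly hidden
```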
GitHub
open-infra-index/202502OpenSourceWeek/day_6_one_more_thing_deepseekV3R1_inference_system_overview.md at main · deepseek-ai/open…
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation - deepseek-ai/open-infra-index
New BIG-Bench Extra Hard benchmark released by Google DeepMind: average accuracy for general-purpose models is 9.8%, and 44.8% for reasoning models.
arXiv.org
BIG-Bench Extra Hard
Large language models (LLMs) are increasingly deployed in everyday applications, demanding robust general reasoning capabilities and diverse reasoning skillset. However, current LLM reasoning...
Deutsche Telekom and Perplexity announced new ‘AI Phone’ priced at under $1K
Deutsche Telekom said that it is building an “AI Phone,” a low-cost handset created in close collaboration with Perplexity, along with Picsart and others, plus a new AI assistant app it’s calling “Magenta AI.”
TechCrunch
Deutsche Telekom and Perplexity announce new 'AI Phone' priced at under $1K | TechCrunch
It was inevitable that this year at MWC in Barcelona, at least one carrier would announce a major effort at building a smartphone with a top AI company.
ReSearch: Teaching LLMs to Make Better Decisions Through Search
Baichuan AI has unveiled an exciting open-source project called ReSearch.
This innovative system teaches Large Language Models to improve their reasoning capabilities by actively searching for information when needed.
How ReSearch Works:
ReSearch combines Reinforcement Learning (RL) with Retrieval-Augmented Generation (RAG) to empower LLMs with a crucial skill: determining when to search for external information.
Similar to how humans look up facts when uncertain, these enhanced models learn to:
- Identify knowledge gaps requiring external information
- Formulate effective search queries
- Execute multi-step, multi-hop searches for complex problems
- Integrate search results into their reasoning process.
What makes this approach particularly impressive is that the model learns these search patterns without direct supervision.
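A hedged sketch of what such a rollout can look like (the tag format, `llm_generate`, and `retrieve` are hypothetical stand-ins, not Baichuan's released interface): the model interleaves free-form reasoning with search calls, and during RL only the final answer is rewarded.

```python
# Sketch of a search-interleaved rollout of the kind ReSearch trains with RL.
import re
from typing import Callable

def rollout(question: str,
            llm_generate: Callable[[str], str],  # generates until </search> or a final <answer>
            retrieve: Callable[[str], str],      # returns passages for a search query
            max_searches: int = 4) -> str:
    context = f"Question: {question}\n"
    for _ in range(max_searches + 1):
        text = llm_generate(context)
        context += text
        answer = re.search(r"<answer>(.*?)</answer>", text, re.S)
        if answer:                               # model decided it knows enough
            return answer.group(1).strip()
        query = re.search(r"<search>(.*?)</search>", text, re.S)
        if query:                                # model asked for external information
            context += f"\n<result>{retrieve(query.group(1).strip())}</result>\n"
    return ""  # no answer; such a rollout would earn zero reward during training
```

Because only the outcome is scored, the model has to discover on its own when a search is worth issuing and how to phrase the query, which matches the "without direct supervision" point above.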
GitHub
GitHub - Agent-RL/ReCall: ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning & ReCall: Learning to Reason with Tool Call for LLMs via Reinforcement Learning
Sophgo has introduced the first RISC-V servers that support #DeepSeek R1 models (1.5B to 70B).
They reach 11.8 tokens/s on the 70B model.
The SRA3-40 compute server uses Sophgo's latest SG2044 64-core server CPU.
Sophgo also released the SRB3-40 storage server and the SRM3-40 convergence server, both based on the SG2044.
ITHome
Sophgo launches the SRA3-40: the world's first many-core RISC-V server supporting DeepSeek - ITHome
The SRA3-40 is a compute server based on the SG2044, a new-generation server-grade 64-core RISC-V processor developed by Sophgo's SOPHON team.
A huge VLM release from Cohere For AI just landed.
Aya-Vision is a new VLM family based on SigLIP and Aya, and it outperforms many larger models.
> 8B and 32B models covering 23 languages, plus two new benchmark datasets
> supported by HF transformers from the get-go
huggingface.co
Cohere Labs Aya Vision - a CohereLabs Collection
Aya Vision is a state-of-the-art family of vision models that brings multimodal capabilities to 23 languages.
Forwarded from kurilo.md (Dmitri)
Looking for exceptionally strong engineers.
iOS (native, Swift, Obj-C)
Backend (GCP, Node.js, NestJS, Nx, Kubernetes)
Location: Europe. Remote is OK.
DM for more info @masterrr
Products I'm hiring for:
https://bereal.com/
https://carrotcare.health/
carrotcare.health
Carrot Care – Bloodwork Tracker for iPhone
Turn lab results into clear health insights. Track bloodwork, biomarkers, and trends with Carrot Care.
Cohere released Aya Vision on Hugging Face
Aya Vision outperforms the leading open-weight models in multilingual text generation and image understanding.
In its parameter class, Aya Vision 8B achieves the best performance in combined multilingual multimodal tasks, outperforming Qwen2.5-VL 7B, Gemini Flash 1.5 8B, Llama-3.2 11B Vision, and Pangea 7B by up to 70% win rates on AyaVisionBench and 79% on m-WildVision.
Aya Vision 32B sets a new frontier in multilingual vision open-weights models, outperforming Llama-3.2 90B Vision, Molmo 72B and Qwen2-VL 72B by up to 64% win rates on AyaVisionBench and 72% win rates on m-WildVision.
huggingface.co
Cohere Labs Aya Vision - a CohereLabs Collection
Aya Vision is a state-of-the-art family of vision models that brings multimodal capabilities to 23 languages.
Commerce Secretary confirms the US Bitcoin Strategic Reserve is likely on the cards:
“A Bitcoin strategic reserve is something the President’s interested in and I think you’re going to see it executed on Friday.”
Trump will unveil the Bitcoin reserve strategy at the White House Crypto Summit. "So Bitcoin is one thing, and then the other currencies, the other crypto tokens, I think, will be treated differently—positively, but differently."
The Pavlovic Today
Howard Lutnick Reveals: Trump to Unveil Bitcoin Reserve Strategy at White House Crypto Summit - The Pavlovic Today
Commerce Secretary Howard Lutnick tells The Pavlovic Today that President Trump will unveil a Bitcoin reserve strategy at the White House Crypto Summit, marking a major shift in U.S. crypto policy.
The 2024 Turing Award, the Nobel for Computer Science, goes to the inventors of reinforcement learning.
The work of Andrew Barto and his former PhD student Rich Sutton (famous for his essay "The Bitter Lesson") is foundational for ChatGPT post-training, AlphaGo, robotics, and more.
awards.acm.org
Andrew Barto and Richard Sutton are the recipients of the 2024 ACM A.M. Turing Award for developing the conceptual and algorithmic…
Full list of confirmed attendees for the White House Crypto Summit this Friday with Trump
Alibaba released QwQ-32B, a new reasoning model with 32B parameters that rivals cutting-edge reasoning models such as DeepSeek-R1.
HF
ModelScope
Demo
Chat.
Qwen
QwQ-32B: Embracing the Power of Reinforcement Learning
Scaling Reinforcement Learning (RL) has the potential to enhance model performance beyond conventional pretraining and post-training methods. Recent studies have demonstrated that RL can significantly improve…
Emirates NBD, a wholly owned bank of the Dubai government, launched the Liv X app on March 6, offering cryptocurrency buying and selling services.
The service is based on the infrastructure of Aquanow and is hosted by Zodia, which is supported by Standard Chartered.
Coindesk
Dubai Government-Owned Bank Emirates NBD Offers Crypto Trading Through Liv X App
Liv is offering its crypto service using infrastructure operated by Aquanow, a digital asset platform licensed by Dubai's VARA.
Today Anthropic submitted their recommendations to the OSTP for the U.S. AI Action Plan
Anthropic predicts powerful AI systems will appear by late 2026 or early 2027, with intellectual abilities matching Nobel Prize winners, able to autonomously handle digital tasks (text, audio, video, internet browsing), reason independently over hours or weeks, and control physical equipment digitally
They recommend stronger national security actions, including government testing of AI models for security risks, stricter export controls on key chips like the H20, and secure communication channels between AI labs and intelligence agencies
They suggest the government build 50 gigawatts of additional power capacity dedicated to AI by 2027, speed up AI adoption across federal agencies, and improve economic data collection to prepare for AI’s impact on jobs and society
Anthropic
Anthropic’s Recommendations to OSTP for the U.S. AI Action Plan
Anthropic is an AI safety and research company that's working to build reliable, interpretable, and steerable AI systems.