A super interesting talk on Ring Attention, probably the magic behind Gemini's 1 million context window
You organize your devices (GPU/TPU) in a ring, each computing a part of the final attention output
Each device needs to see all keys/values to produce its part. The idea is that the attention output can be computed blockwise (by splitting along the sequence dimension): each device holds a chunk of queries and incrementally updates that chunk's attention output as key/value blocks are sent and received around the ring
This is a great repo to understand it in code.
GitHub
ring-flash-attention/test/test_ring_flash_attn_func.py at main · zhuzilin/ring-flash-attention
Ring attention implementation with flash attention - zhuzilin/ring-flash-attention
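The blockwise idea can be sketched in plain NumPy: a toy single-head simulation where each "host" owns one query block and receives the key/value blocks one at a time, merging partial results with the running-max/log-sum-exp trick. This is an illustrative simplification, not the repo's actual flash-attention kernels or real device communication.

```python
import numpy as np

rng = np.random.default_rng(0)
seq, d, hosts = 16, 8, 4
Q = rng.normal(size=(seq, d))
K = rng.normal(size=(seq, d))
V = rng.normal(size=(seq, d))

def full_attention(Q, K, V):
    s = Q @ K.T / np.sqrt(d)
    p = np.exp(s - s.max(-1, keepdims=True))
    return (p / p.sum(-1, keepdims=True)) @ V

def ring_attention(Q, K, V, hosts):
    qs, ks, vs = np.split(Q, hosts), np.split(K, hosts), np.split(V, hosts)
    outs = []
    for h in range(hosts):
        q = qs[h]
        m = np.full((q.shape[0], 1), -np.inf)  # running max of scores
        l = np.zeros((q.shape[0], 1))          # running softmax denominator
        acc = np.zeros_like(q)                 # running weighted sum of V
        for step in range(hosts):              # "receive" each K/V block in turn
            j = (h + step) % hosts
            s = q @ ks[j].T / np.sqrt(d)
            m_new = np.maximum(m, s.max(-1, keepdims=True))
            p = np.exp(s - m_new)
            scale = np.exp(m - m_new)          # rescale old partial results
            l = l * scale + p.sum(-1, keepdims=True)
            acc = acc * scale + p @ vs[j]
            m = m_new
        outs.append(acc / l)
    return np.concatenate(outs)

# Blockwise result matches full attention exactly (up to float error)
assert np.allclose(ring_attention(Q, K, V, hosts), full_attention(Q, K, V))
```

The key point: no host ever materializes the full seq × seq score matrix, only one q-block × kv-block tile at a time, which is why the scheme scales to very long contexts.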
Intelligent fabrics, which can sense and communicate information scalably and unobtrusively, can fundamentally change how people interact with the world.
Science
Intelligent textiles are looking bright
Flexible fiber electronics couple with the human body for wireless tactile sensing
Apple presents Ferret-UI
Grounded Mobile UI Understanding with Multimodal LLMs
Recent advancements in multimodal large language models (MLLMs) have been noteworthy, yet, these general-domain MLLMs often fall short in their ability to comprehend and interact effectively with user interface (UI) screens.
⚡️AutoCodeRover is an autonomous software engineer from Singapore
It takes in a GitHub issue (bug fix or feature addition) and resolves it in a few minutes, with minimal LLM cost of ~$0.50
GitHub
auto-code-rover/preprint.pdf at main · nus-apr/auto-code-rover
A project structure aware autonomous software engineer aiming for autonomous program improvement. Resolved 37.3% tasks (pass@1) in SWE-bench lite and 46.2% tasks (pass@1) in SWE-bench verified with...
Google released CodeGemma, a new version of the Gemma line of models fine-tuned for code generation and completion that achieves state-of-the-art results. Available in 2B and 7B sizes.
HF is here.
Intel's CEO announced Lunar Lake with over 100 TOPS of platform AI performance, showing off a Lunar Lake SoC on stage and saying to expect significant gains.
Gemma is expanding.... Google announced CodeGemma, a version of Gemma tuned for code generation. And bonus... Gemma is now bumped to v1.1, addressing lots of feedback we got.
Googleblog
Google for Developers Blog - News about Web, Mobile, AI and Cloud
Meta confirmed its GPT-4 competitor, Llama 3, is coming within the month.
At an event in London, Meta confirmed that it plans an initial release of Llama 3, its GPT-4 competitor, within the next month.
The company did not disclose Llama 3's parameter count, but it's expected to be about 140 billion.
TechCrunch
Meta confirms that its Llama 3 open source LLM is coming in the next month
Meta's Llama families, built as open-source products, represent a different philosophical approach to how AI should develop as a wider technology.
DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset
Distributed RObot Interaction Dataset: A diverse robot manipulation dataset with 76k demonstrations, collected across 564 scenes and 84 tasks over the course of a year.
Paper.
arXiv.org
DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset
The creation of large, diverse, high-quality robot manipulation datasets is an important stepping stone on the path toward more capable and robust robotic manipulation policies. However, creating...
GE HealthCare’s Vscan Air SL with Caption AI software provides real-time guidance that shows healthcare professionals how to maneuver the probe to capture diagnostic-quality standard cardiac images.
With the help of on-device AI, there's now a way for handheld ultrasound users to confidently acquire cardiac views for rapid assessments at the point of care.
Meta announced its 2nd-gen inference chip, MTIAv2
- 708TF/s Int8 / 353TF/s BF16
- 256MB SRAM, 128GB memory
- 90W TDP. 24 chips per node, 3 nodes per rack.
- standard PyTorch stack (Dynamo, Inductor, Triton) for flexibility
Fabbed on TSMC's 5nm process, it's fully programmable via the standard PyTorch stack, driven via Triton for software kernels.
This chip is an inference powerhouse, and the software work is entirely driven by the PyTorch team, making usability a first; it's been great to see it in action on various Meta workloads.
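The rack-level numbers implied by those specs are just arithmetic (accelerator TDP only; this ignores host CPUs, networking, and cooling):

```python
# Back-of-envelope rack-level math from the listed MTIAv2 specs
chips_per_node, nodes_per_rack = 24, 3
int8_tf, bf16_tf, tdp_w = 708, 353, 90

chips_per_rack = chips_per_node * nodes_per_rack  # 72 chips per rack
rack_int8_pf = chips_per_rack * int8_tf / 1000    # 50.976 PFLOP/s Int8 per rack
rack_bf16_pf = chips_per_rack * bf16_tf / 1000    # 25.416 PFLOP/s BF16 per rack
rack_power_kw = chips_per_rack * tdp_w / 1000     # 6.48 kW accelerator power per rack

print(chips_per_rack, rack_int8_pf, rack_bf16_pf, rack_power_kw)
```

At ~7.9 Int8 TFLOP/s per watt, the 90W TDP is the standout figure next to contemporary GPUs.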
Meta
Our next generation Meta Training and Inference Accelerator
We are sharing details of our next generation chip in our Meta Training and Inference Accelerator (MTIA) family. MTIA is a long-term bet to provide the most efficient architecture for Meta’s unique workloads.
New paper from Berkeley on Autonomous Evaluation and Refinement of Digital Agents
VLM/LLM-based evaluators can significantly improve the performance of agents for web browsing and device control, advancing the state of the art by 29% to 75%.
arXiv.org
Autonomous Evaluation and Refinement of Digital Agents
We show that domain-general automatic evaluators can significantly improve the performance of agents for web navigation and device control. We experiment with multiple evaluation models that trade...
The music industry just had its 'ChatGPT for music' moment with Udio.
A new AI-powered music creation app called Udio from former Google DeepMind researchers just launched.
It allows users to generate full audio tracks in under 40 seconds with simple prompts, and has secured early funding from a16z, will.i.am, Common, and more.
Udio
Udio | AI Music Generator - Official Website
Discover, create, and share music with the world. Use the latest technology to create AI music in seconds.
A newly revealed patent from Microsoft Bing detailed ‘Visual Search’.
The patent describes a reverse image search with personal results tailored to user preferences and interests.
Integrated data visualization and analysis for workhorse biological assays
Make publication-quality figures in under a minute. No more fiddling with Excel, Prism, ggplot, and PowerPoint.
Google presents Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
A 1B model fine-tuned on passkey instances of up to 5K sequence length solves the 1M-length passkey task.
The future of attention just dropped, and it looks a lot like a state space model (finite size, continual updates)
Little doubt now that a mixture of architectures will support the long-term, gradually conditioned memory needed for highly capable agents
arXiv.org
Leave No Context Behind: Efficient Infinite Context Transformers...
This work introduces an efficient method to scale Transformer-based Large Language Models (LLMs) to infinitely long inputs with bounded memory and computation. A key component in our proposed...
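The "finite size, continual updates" point can be seen in a deliberately simplified NumPy sketch of the compressive-memory idea: segments stream through a linear-attention-style associative memory whose size never grows. The `elu1` feature map and the plain additive update are simplifications of the paper's formulation, not its exact equations.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
M = np.zeros((d, d))   # fixed-size associative memory
z = np.zeros((d, 1))   # running normalization term

def elu1(x):
    # ELU + 1: keeps kernel features strictly positive
    return np.where(x > 0, x + 1.0, np.exp(x))

def write(K, V):
    """Absorb one segment's keys/values into the fixed-size memory."""
    global M, z
    M = M + elu1(K).T @ V
    z = z + elu1(K).sum(0, keepdims=True).T

def read(Q):
    """Retrieve from memory for a block of queries."""
    return (elu1(Q) @ M) / (elu1(Q) @ z + 1e-6)

# Stream 10 segments of length 16; memory stays d x d throughout
for _ in range(10):
    K = rng.normal(size=(16, d))
    V = rng.normal(size=(16, d))
    write(K, V)

out = read(rng.normal(size=(4, d)))
```

That constant-size state is exactly what makes it read like a state space model: compute and memory per token are bounded no matter how long the stream is (Infini-attention additionally mixes this with local softmax attention per segment, omitted here).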
MR_for_autism_1712848311.pdf
Eye tracking as a window onto conscious and non-conscious processing in the brain.
The goals are two-fold and relatively simple:
1. propose and test a method for familiarizing individuals with severe autism spectrum disorder (ASD) with the HoloLens 2 headset and the use of MR technology through a tutorial.
2. obtain quantitative learning indicators in MR, such as execution speed and eye tracking, by comparing individuals with ASD to neurotypical individuals.
Over 80% of individuals with ASD successfully familiarized themselves with MR after several sessions.
In addition, the visual activity of individuals with ASD did not differ from that of neurotypical individuals when they successfully familiarized themselves.
This opens a lot of doors for potential learning opportunities in this population.
Apple nears production of its first M4 chips with AI upgrades and plans to bring the new chips to all of its Macs, including new MacBook Pros and Airs, the Mac Pro, Mac Studio, Mac mini, and iMac, between late this year and 2025.
Bloomberg.com
Apple Plans to Overhaul Entire Mac Line With AI-Focused M4 Chips
Apple Inc., aiming to boost sluggish computer sales, is preparing to overhaul its entire Mac line with a new family of in-house processors designed to highlight artificial intelligence.
Sanctuary AI announced a partnership with European automaker Magna.
The collaboration covers Sanctuary AI's development of general-purpose AI robots for deployment in Magna's manufacturing operations.
Bloomberg.com
Robotics Startup Sanctuary Signs Deal for Factory Tests, Funds
Humanoid robot-making startup Sanctuary AI has struck a deal with a major auto-parts manufacturer for deployment in its factories and additional equity.