All about AI, Web 3.0, BCI
This channel is about AI, Web 3.0, and brain-computer interfaces (BCI).

owner @Aniaslanyan
How it started. How it's going.

Visualization-of-Thought Elicits Spatial Reasoning in LLMs.
A super interesting talk on Ring Attention, probably the magic behind Gemini's 1-million-token context window.

You organize your devices (GPU/TPU) in a ring, each computing a part of the final attention output

Each device needs to see all keys/values to produce its part. The idea is that the attention output can be computed blockwise (by splitting along the sequence dimension). Each device holds the queries for one chunk of the sequence and computes that chunk's attention output, sending its current key/value block to the next device and receiving a new one from the previous device at every step.
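
To make the blockwise idea concrete, here is a minimal single-process sketch in plain Python. It is not the code from the talk or the repo: the function names, the non-causal (unmasked) attention, and the simulated "devices" are all assumptions for illustration. Each simulated device keeps a running max, softmax denominator, and weighted sum per query, so the result matches ordinary attention once every key/value block has gone around the ring.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def full_attention(Q, K, V):
    """Reference: ordinary (non-causal) softmax attention over the whole sequence."""
    scale = 1.0 / math.sqrt(len(Q[0]))
    out = []
    for q in Q:
        scores = [dot(q, k) * scale for k in K]
        m = max(scores)
        w = [math.exp(s - m) for s in scores]
        z = sum(w)
        out.append([sum(wi * v[j] for wi, v in zip(w, V)) / z
                    for j in range(len(V[0]))])
    return out


def ring_attention(Q, K, V, n_dev):
    """Simulate n_dev devices in a ring; each owns one chunk of the sequence."""
    n = len(Q)
    assert n % n_dev == 0, "sequence length must divide evenly across devices"
    c = n // n_dev
    d_v = len(V[0])
    scale = 1.0 / math.sqrt(len(Q[0]))
    chunks = [slice(i * c, (i + 1) * c) for i in range(n_dev)]
    # kv[dev] is the key/value block currently resident on device dev.
    kv = [(K[s], V[s]) for s in chunks]
    # Per-query running state: (max score, softmax denominator, weighted sum).
    state = [[(-math.inf, 0.0, [0.0] * d_v) for _ in range(c)]
             for _ in range(n_dev)]
    for _ in range(n_dev):
        for dev in range(n_dev):
            Kb, Vb = kv[dev]
            for qi, q in enumerate(Q[chunks[dev]]):
                m, l, acc = state[dev][qi]
                scores = [dot(q, k) * scale for k in Kb]
                m_new = max(m, max(scores))
                corr = math.exp(m - m_new)  # rescales earlier partial sums
                w = [math.exp(s - m_new) for s in scores]
                l = l * corr + sum(w)
                acc = [a * corr + sum(wi * v[j] for wi, v in zip(w, Vb))
                       for j, a in enumerate(acc)]
                state[dev][qi] = (m_new, l, acc)
        # Ring step: every device hands its block to the next device.
        kv = [kv[(dev - 1) % n_dev] for dev in range(n_dev)]
    return [[a / l for a in acc] for dev_state in state
            for (_, l, acc) in dev_state]


# Small demo with made-up data: 8 tokens, 4 simulated devices.
random.seed(0)
Q, K, V = ([[random.gauss(0, 1) for _ in range(4)] for _ in range(8)]
           for _ in range(3))
ref = full_attention(Q, K, V)
out = ring_attention(Q, K, V, n_dev=4)
print(max(abs(a - b) for r, o in zip(ref, out) for a, b in zip(r, o)))
```

In a real multi-device setup the ring step would be an actual send/receive that can overlap with the block computation; here it is just a list rotation.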

This is a great repo to understand it in code.
Intelligent fabrics, which can sense and communicate information scalably and unobtrusively, can fundamentally change how people interact with the world.
Apple presents Ferret-UI

Grounded Mobile UI Understanding with Multimodal LLMs

Recent advancements in multimodal large language models (MLLMs) have been noteworthy, yet these general-domain MLLMs often fall short in their ability to comprehend and interact effectively with user interface (UI) screens.
Google released CodeGemma, a new version of the Gemma line of models fine-tuned on code generation and completion, that achieves state-of-the-art results. Available in sizes 2B and 7B.

HF is here.
Intel's CEO announced Lunar Lake with over 100 TOPS of platform AI performance. He showed off a Lunar Lake SoC on stage and said to expect significant gains.
Gemma is expanding.... Google announced CodeGemma, a version of Gemma tuned for code generation. And bonus... Gemma is now bumped to v1.1, addressing lots of feedback we got.
Meta confirmed its GPT-4 competitor, Llama 3, is coming within the month.

At an event in London, Meta confirmed that it plans an initial release of Llama 3, its GPT-4 competitor, within the next month.

The company did not disclose the number of parameters in Llama 3, but it's expected to have about 140 billion.
DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset

Distributed RObot Interaction Dataset: A diverse robot manipulation dataset with 76k demonstrations, collected across 564 scenes and 84 tasks over the course of a year.

Paper.
GE HealthCare’s Vscan Air SL with Caption AI software provides real-time guidance that shows healthcare professionals how to maneuver the probe to capture diagnostic-quality standard cardiac images.

With the help of on-device AI, there's now a way for handheld ultrasound users to confidently acquire cardiac views for rapid assessments at the point of care.
Meta announced 2nd-gen inference chip MTIAv2

- 708TF/s Int8 / 353TF/s BF16
- 256MB SRAM, 128GB memory
- 90W TDP. 24 chips per node, 3 nodes per rack.
- standard PyTorch stack (Dynamo, Inductor, Triton) for flexibility

Fabbed on TSMC's 5nm process, it's fully programmable via the standard PyTorch stack, with Triton used for software kernels.

This chip is an inference powerhouse, and the software work is driven entirely by the PyTorch team, which puts usability first. It's been great to see it in action on various Meta workloads.
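
For a rough sense of scale, the per-chip figures above can be multiplied out to a full rack. This is back-of-the-envelope arithmetic only: it assumes the listed throughput, memory, and TDP are all per chip, and it counts accelerator TDP alone (no host CPUs, networking, or cooling). The variable names are made up for the example.

```python
# Figures as listed in the post (assumed to be per chip).
chips_per_node = 24
nodes_per_rack = 3
tdp_watts = 90
int8_tflops = 708
bf16_tflops = 353
memory_gb = 128

chips_per_rack = chips_per_node * nodes_per_rack

rack = {
    "chips": chips_per_rack,
    "accelerator_power_w": chips_per_rack * tdp_watts,
    "int8_tflops": chips_per_rack * int8_tflops,
    "bf16_tflops": chips_per_rack * bf16_tflops,
    "memory_gb": chips_per_rack * memory_gb,
}

for name, value in rack.items():
    print(f"{name}: {value}")
```

Under those assumptions a rack comes to 72 chips, about 6.5 kW of accelerator TDP, and roughly 51 PF/s of peak Int8 throughput.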
New paper from Berkeley on Autonomous Evaluation and Refinement of Digital Agents

VLM/LLM-based evaluators can significantly improve the performance of agents for web browsing and device control, improving on the previous state of the art by 29% to 75%.
The music industry just had its 'ChatGPT for music' moment with Udio.

A new AI-powered music creation app called Udio from former Google DeepMind researchers just launched.

It allows users to generate full audio tracks in under 40 seconds from simple prompts, and it has secured early funding from a16z, will.i.am, Common, and others.
A newly revealed patent from Microsoft Bing detailed ‘Visual Search’.

The patent describes a reverse image search with personal results tailored to user preferences and interests.
Integrated data visualization and analysis for workhorse biological assays

Make publication-quality figures in under a minute. No more fiddling with Excel, Prism, ggplot, and PowerPoint.
Google presents Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention

A 1B model fine-tuned on passkey instances of up to 5K sequence length solves the 1M-length problem.

The future of attention just dropped, and it looks a lot like a state space model (finite size, continual updates)

Little doubt now that a mixture of architectures will support the long-term, gradually conditioned memory needed for highly capable agents
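
The "finite size, continual updates" point can be sketched in a few lines. Below is a simplified, single-head version of a compressive memory in the spirit of the paper's linear-attention-style update, written in plain Python. It is an illustration under assumptions, not the paper's implementation: the class and method names are made up, and the local attention, the gating between local and memory outputs, and the delta-rule variant are all left out.

```python
import math


def elu_plus_one(x):
    """ELU(x) + 1, which keeps every feature strictly positive."""
    return x + 1.0 if x > 0 else math.exp(x)


class CompressiveMemory:
    """Fixed-size associative memory: a d_key x d_value matrix plus a normalizer."""

    def __init__(self, d_key, d_value):
        self.M = [[0.0] * d_value for _ in range(d_key)]
        self.z = [0.0] * d_key

    def update(self, keys, values):
        """Fold one segment's keys/values into memory; its size never grows."""
        for k, v in zip(keys, values):
            sk = [elu_plus_one(x) for x in k]
            for i, ski in enumerate(sk):
                self.z[i] += ski
                row = self.M[i]
                for j, vj in enumerate(v):
                    row[j] += ski * vj

    def retrieve(self, query):
        """Read a value estimate for one query from everything stored so far."""
        sq = [elu_plus_one(x) for x in query]
        denom = sum(a * b for a, b in zip(sq, self.z))
        d_value = len(self.M[0])
        if denom == 0.0:  # nothing stored yet
            return [0.0] * d_value
        return [sum(sq[i] * self.M[i][j] for i in range(len(sq))) / denom
                for j in range(d_value)]


# Demo with made-up numbers: stream two segments, then query the memory.
mem = CompressiveMemory(d_key=2, d_value=3)
mem.update(keys=[[1.0, -0.5]], values=[[0.2, 0.4, 0.6]])
mem.update(keys=[[-1.0, 2.0]], values=[[1.0, 0.0, -1.0]])
print(mem.retrieve([1.0, -0.5]))
```

However many segments are streamed in, the state stays at d_key x d_value numbers plus a d_key normalizer, which is what makes the comparison to a state space model natural.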
MR_for_autism_1712848311.pdf
2.6 MB
Eye tracking as a window onto conscious and non-conscious processing in the brain.

The goals are two-fold and relatively simple:

1. Propose and test a method for familiarizing individuals with severe autism spectrum disorder (ASD) with the HoloLens 2 headset and the use of MR technology through a tutorial.

2. Obtain quantitative learning indicators in MR, such as execution speed and eye tracking, by comparing individuals with ASD to neurotypical individuals.

Over 80% of individuals with ASD successfully familiarized themselves with MR after several sessions.

In addition, the visual activity of individuals with ASD did not differ from that of neurotypical individuals when they successfully familiarized themselves.

This opens a lot of doors for potential learning opportunities in this population.
Apple is nearing production of its first M4 chips with AI upgrades. It plans to bring the new chips to all of its Macs, including new MacBook Pros and Airs, Mac Pro, Mac Studio, Mac mini, and iMac, from the end of this year through 2025.
Sanctuary AI announced a partnership with European automaker Magna.

The collaboration covers Sanctuary AI’s development of general-purpose AI robots for deployment in Magna’s manufacturing operations.