For Developers
Forwarded from HN Best Comments
Re: The Era of 1-bit LLMs: ternary parameters for cost...

Fun to see ternary weights making a comeback. This was hot back in 2016 with BinaryConnect and the TrueNorth chip from IBM Research (disclosure: I was one of the lead chip architects there).

The authors seem to have missed the history. They should at least cite BinaryConnect or straight-through estimators (not my work).

Helpful hint to the authors: you can get down to 0.68 bits/weight using a similar technique, and there is a good chance this will work for LLMs too.

https://arxiv.org/abs/1606.01981

This was a passion project of mine in my last few months at IBM Research :).

I am convinced there is a deep connection between understanding why backprop is unreasonably effective and the result that you can train low-precision DNNs. For those not familiar, the technique is to compute the loss gradient with respect to the low-precision parameters (e.g. project to ternary) but apply that gradient to a high-precision copy of the parameters (known as the straight-through estimator). This is a biased estimator and there is no theoretical underpinning for why this should work, but in practice it works well.
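A minimal sketch of the straight-through estimator described above, in plain Python. Everything specific here is an assumption made for illustration: the fixed 0.5 threshold, the squared-error loss, the clipping range, and the names `ternarize` and `train_ste` are not taken from the linked paper or from BinaryConnect.

```python
import random


def ternarize(w, threshold=0.5):
    """Project a real-valued weight onto {-1, 0, +1}.

    The fixed threshold is an assumed hyperparameter, chosen for illustration.
    """
    if w > threshold:
        return 1.0
    if w < -threshold:
        return -1.0
    return 0.0


def train_ste(data, n_features, lr=0.05, epochs=200, seed=0):
    """Fit a linear model whose forward pass only ever sees ternary weights.

    data: list of (x, y) pairs, with x a list of n_features floats.
    Returns (high_precision_weights, ternary_weights).
    """
    rng = random.Random(seed)
    # High-precision "shadow" copy of the parameters; this is what gets updated.
    w_hp = [rng.uniform(-1.0, 1.0) for _ in range(n_features)]
    for _ in range(epochs):
        for x, y in data:
            # Forward pass uses the projected (low-precision) weights.
            w_q = [ternarize(w) for w in w_hp]
            pred = sum(wq * xi for wq, xi in zip(w_q, x))
            err = pred - y
            for i in range(n_features):
                # Gradient of 0.5 * err**2 with respect to w_q[i] is err * x[i].
                # Straight-through step: apply it to w_hp[i] unchanged, as if
                # the projection were the identity function.
                w_hp[i] -= lr * err * x[i]
                # Keep the shadow weights bounded (assumed range [-1, 1]).
                w_hp[i] = max(-1.0, min(1.0, w_hp[i]))
    return w_hp, [ternarize(w) for w in w_hp]
```

Note that the projection has zero gradient almost everywhere, so without the straight-through step the high-precision copy would never move; that shortcut is also where the bias in the estimator comes from.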

My best guess is that it encourages the network to choose good underlying subnetworks to solve the problem, similar to the Lottery Ticket Hypothesis. With ternary weights it is just about who connects to whom (i.e. a graph), and not about the individual weight values anymore.
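To make the "it is just a graph" point concrete, here is a small Python sketch under assumed names (`ternary_to_edges` and `graph_matvec` are hypothetical helpers, not from any library). It stores a ternary weight matrix as signed edge lists and computes the layer output with additions and subtractions only.

```python
def ternary_to_edges(W):
    """Turn a ternary matrix (one row per output unit) into signed edge lists.

    For each output, returns (plus, minus): the input indices connected with
    weight +1 and with weight -1. Zero weights simply mean "no edge".
    """
    edges = []
    for row in W:
        plus = [j for j, w in enumerate(row) if w == 1]
        minus = [j for j, w in enumerate(row) if w == -1]
        edges.append((plus, minus))
    return edges


def graph_matvec(edges, x):
    """Compute W @ x from the edge lists, with no multiplications."""
    return [
        sum(x[j] for j in plus) - sum(x[j] for j in minus)
        for plus, minus in edges
    ]
```

Once the matrix is in this form, the only information left is which inputs feed which outputs and with what sign, which is the connectivity view the comment describes.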

paul_mk1, 9 hours ago
Interpretable medical image Visual Question Answering via multi-modal relationship graph learning

https://www.sciencedirect.com/science/article/abs/pii/S1361841524002044
Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey

https://arxiv.org/pdf/2402.05391
KAN 2.0: Kolmogorov-Arnold Networks Meet Science

https://arxiv.org/pdf/2408.10205
Forwarded from SpaceX Feed
We’re excited to team up with T-Mobile to bring our Starlink Direct to Cell capability to the US!
Source: RT @Gwynne_Shotwell, @TMobile