Gradient Dude – Telegram

Gradient Dude

2.65K subscribers

180 photos

50 videos

2 files

169 links

TL;DR for DL/CV/ML/AI papers from an author of publications at top-tier AI conferences (CVPR, NIPS, ICCV, ECCV).

Most ML feeds go for fluff, we go for the real meat.

YouTube: youtube.com/c/gradientdude
IG: instagram.com/gradientdude

📌 Facts and figures: Though the authors say they’ve trained a 10-billion and a 100-billion parameter model, they mostly report performance statistics for the 10-billion one. The 100B is a mixture-of-experts model, while the 10B is based on NVIDIA’s Megatron training code. The model’s size and sophistication are notable – this feels like a symptom of the maturing capabilities of various Chinese AI organizations. I wonder when we’ll get an M6-scale system from people affiliated with India, or regions like Europe or Africa.

🤷🏼‍♂️ Why this matters: M6 is notable for being a non-English model at equivalent scale to some of the largest primarily-English ones. We’re entering an era where there will be multiple, gigantic AI models, with variations stemming from the organizations that trained them. It’s also interesting to consider how these models proliferate, and who will get access to them. Will students and researchers at Tsinghua get access to M6, or just Alibaba’s researchers, or both? And how might access schemes develop in other countries, as well?

🌀 A word about bias: There’s no discussion of bias in the paper (or ethics), which isn’t typical for papers of this type but is typical of papers that come out of Chinese research organizations 😉

📝 ArXiv Paper link

—
Source: https://jack-clark.net/

565 views18:32

The results, honestly, are quite good. Especially enjoyed the humble opinion about "The Great Wall" 😄

562 views18:32

We are on the eve of the Matrix: a constantly elevated dopamine level in the VR world, or poverty and fighting robots in reality.

Scientists from the University of Helsinki used GANs to create personalized attractive faces. To gradually increase face attractiveness, they recorded the electrical activity of the participant's brain while changing the synthetic faces through a random walk in the GAN latent space. In effect, we get a GAN in which a living person acts as the discriminator, so the generated faces are more likable for that person.

I thought about a similar idea a couple of years ago. We could analyze users' preferences in male/female appearance from their likes on social media and then use them to generate personalized ads with the faces those users find most attractive. This seems like a more feasible scenario than using electroencephalograms 🧠.

1.76K views16:38

Now imagine that, with the help of such techniques, one can create an ideal virtual partner. To go even further, think about how personalized porn could be created with the face/appearance of the most attractive person (who may not even exist).

The terrible new world is almost ready 😅.

📝 Paper
🌐 Blogpost

Brain-computer interface for generating personally attractive images

Spapé, M., Davis, K., Kangassalo, L., Ravaja, N., Sovijärvi-Spapé, Z., & Ruotsalo, T. (2021). Brain-computer interface for generating personally attractive images. IEEE Transactions on Affective Computing, in press.

1.73K views16:38

Learning High Fidelity Depths of Dressed Humans by Watching TikTok Dance Videos

The single-frame depth is refined in a self-supervised way, leveraging local transformations of body parts to enforce geometric consistency across different poses.

636 views17:24

First, a depth and normal estimation network is pretrained on synthetic 3D data (RenderPeople). Then this network is refined using geometric consistency between pairs of different frames. The transformation of each body part is modeled independently as a rigid transformation, so the estimated 3D coordinates of the points on each body part can be warped onto a different frame, and the disparity can be used as a loss function.
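A rough sketch of the warping idea, not the authors' code: assuming we already have 3D points for one body part in two frames plus a rigid transform (R, t) between them, the consistency loss is just the distance between the warped points and the points predicted for the other frame. The function names and toy numbers here are made up for illustration.

```python
import numpy as np

def rigid_warp(points, rotation, translation):
    """Apply a rigid transform (R, t) to an (N, 3) array of 3D points."""
    return points @ rotation.T + translation

def part_consistency_loss(points_a, points_b, rotation, translation):
    """Mean distance between part points of frame A warped into frame B
    and the corresponding points estimated for frame B."""
    warped = rigid_warp(points_a, rotation, translation)
    return float(np.mean(np.linalg.norm(warped - points_b, axis=1)))

# Toy body part: rotated 90 degrees about the z axis and shifted (assumed values).
rotation = np.array([[0.0, -1.0, 0.0],
                     [1.0, 0.0, 0.0],
                     [0.0, 0.0, 1.0]])
translation = np.array([0.1, 0.0, 0.5])
points_a = np.array([[1.0, 0.0, 2.0], [0.0, 1.0, 2.0], [1.0, 1.0, 2.5]])

# Frame B agrees exactly with the warp, so the loss is zero.
points_b = rigid_warp(points_a, rotation, translation)
loss_consistent = part_consistency_loss(points_a, points_b, rotation, translation)

# A small depth error in frame B shows up as a non-zero loss.
loss_noisy = part_consistency_loss(points_a, points_b + 0.05, rotation, translation)
```

In the real setup the points come from the predicted depth maps, so a gradient of this loss would push the depth network toward predictions that agree across poses.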

📝 Paper
🛠 Code (will be released soon)

550 views17:24

A gentle introduction to offline RL (18 min) by Sergey Levine of UC Berkeley, one of the leading experts in the field.

▶️ Video

Thanks @ml_for_curious

A Gentle Introduction to Offline Reinforcement Learning

This short talk presents a non-technical introduction to offline reinforcement learning: algorithms for learning to make decisions from data.

Slides available here: https://drive.google.com/file/d/1Ip9CaAr8bF-nnvbmU63c9CPnIr2FOyPG/view?usp=sharing

568 views06:30

NeX: Real-time View Synthesis with Neural Basis Expansion

An amazing new approach to novel view synthesis: a combination of multiplane images (MPI) and neural basis expansion (NeRF-like networks). It can reproduce spectacular, complex view-dependent effects (see video).

Unlike traditional MPI, which uses a set of simple RGBα planes, this technique models view-dependent effects by parameterizing each pixel as a linear combination of basis functions learned by a neural network.
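To make the "linear combination of basis functions" concrete, here is a minimal sketch under my own assumptions: each pixel stores a view-independent base color plus N coefficient vectors, and the basis values depend only on the viewing direction. In the paper the basis is learned by a network; `toy_basis` below is a hypothetical hand-written stand-in, and the shapes are illustrative.

```python
import numpy as np

def toy_basis(view_dir, num_basis):
    """Hypothetical stand-in for the learned basis functions H_n(v).
    view_dir is a 2-vector of viewing angles; returns an (N,) array."""
    freqs = np.arange(1, num_basis + 1)
    return np.sin(freqs * view_dir[0]) + np.cos(freqs * view_dir[1])

def pixel_colors(base_color, coeffs, view_dir):
    """base_color: (H, W, 3); coeffs: (H, W, N, 3); returns (H, W, 3).
    Color = base + sum_n coeffs[..., n, :] * H_n(view_dir)."""
    basis = toy_basis(view_dir, coeffs.shape[2])
    return base_color + np.einsum("hwnc,n->hwc", coeffs, basis)

rng = np.random.default_rng(0)
base = rng.random((4, 5, 3))
coeffs = 0.1 * rng.standard_normal((4, 5, 6, 3))

# Same pixels, two viewing directions: only the basis values change.
colors_front = pixel_colors(base, coeffs, np.array([0.0, 0.0]))
colors_side = pixel_colors(base, coeffs, np.array([0.3, -0.2]))
```

The basis values are shared by all pixels and only the per-pixel coefficients differ, so changing the view is one small evaluation plus a weighted sum per pixel, which is presumably part of why rendering is so cheap.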

It is stunningly fast to render! The first real-time neural rendering: 60 FPS, 1000x faster than NeRF.
However, training NeX still takes a long time and may require a higher number of input views to replicate view-dependent effects.

—
By the way, this is the first paper I have seen from Thailand!

📝 Paper
▶️ Video from authors
🌐 Project page
🛠 Code will come soon

573 views16:45

https://youtu.be/HyfkF7Z-ddA

[CVPR 2021] NeX: Real-time View Synthesis with Neural Basis Expansion

This is a supplementary video for
NeX: Real-time View Synthesis with Neural Basis Expansion
by Suttisak Wizadwongsa*, Pakkapon Phongthawee*, Jiraphon Yenphraphai*, Supasorn Suwajanakorn. (* first co-authors)

https://nex-mpi.github.io/

Abstract
We present…

532 views16:45

There is also an Interactive online demo.

519 views16:45

Unsupervised Semantic Segmentation by Contrasting Object Mask Proposals
ETH, Luc Van Gool

TL;DR is below ⬇️

📝 Arxiv
🛠 Code

541 viewsedited 20:51

Forwarded from Self Supervised Boy

Yet another simple approach leading to unsupervised segmentation. Mostly useful as pre-training, though.

The proposed pipeline first mines salient object regions (with any available framework, possibly supervised) and then applies contrastive learning to pixel embeddings inside those regions. In the second step, each pixel embedding is attracted to the mean embedding of its own object and pushed away from the mean embeddings of other objects. This additional detail distinguishes it from some previously proposed pipelines and allows training at a larger scale, because the number of loss pairs grows more slowly.
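A minimal sketch of the second step, under my own assumptions (softmax over cosine similarities with a temperature, which may differ from the paper's exact loss): every pixel is compared only against per-object mean embeddings, not against every other pixel.

```python
import numpy as np

def pixel_to_mean_contrastive_loss(embeddings, object_ids, temperature=0.5):
    """embeddings: (P, D) pixel embeddings; object_ids: (P,) mask id per pixel.
    Cross-entropy that pulls each pixel toward its own object's mean embedding
    and pushes it away from the means of the other objects."""
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    ids = np.unique(object_ids)
    means = np.stack([emb[object_ids == i].mean(axis=0) for i in ids])
    means = means / np.linalg.norm(means, axis=1, keepdims=True)
    logits = emb @ means.T / temperature             # (P, num_objects)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    targets = np.searchsorted(ids, object_ids)       # column of each pixel's own object
    return float(-log_prob[np.arange(len(emb)), targets].mean())

# Four toy pixels: two point roughly along x, two roughly along y.
pixels = np.array([[1.0, 0.1], [1.0, -0.1], [0.1, 1.0], [-0.1, 1.0]])
loss_good_masks = pixel_to_mean_contrastive_loss(pixels, np.array([0, 0, 1, 1]))
loss_bad_masks = pixel_to_mean_contrastive_loss(pixels, np.array([0, 1, 0, 1]))
```

The logits matrix has one column per object rather than one per pixel, so the number of compared pairs grows with pixels times objects instead of pixels squared.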

Less briefly and with some external links here.
Source here.

swanky-pleasure-bcf on Notion

Unsupervised Semantic Segmentation by Contrasting Objects Mask Proposals | Notion

Paper proposes versatile two-step approach of pixel-level embeddings training which could be used both for unsupervised segmentation, or as pre-training for semi-supervised segmentation. Authors argue, that the mid-range prior for training embeddings is better…

518 views20:51

Forwarded from Self Supervised Boy

An ICLR 2021 spotlight from Schmidhuber's group. It proposes an unsupervised keypoint localization method and applies it to RL on Atari.

A very clear and simple idea:
1. Compress the image with a VAE and later use the features from some intermediate layer of the encoder.
2. Try to predict each feature vector from its surrounding vectors. If the prediction error is high, we have found an important object.
3. Compress the error map of the image into a mixture of Gaussians with fixed covariance, where each center represents one keypoint.
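Step 2 can be illustrated with a toy sketch. Two loud assumptions: the learned predictor is replaced with a plain average of the 8 neighbouring feature vectors, and the Gaussian-mixture fit from step 3 is replaced with a simple argmax over the error map. Names and sizes are made up.

```python
import numpy as np

def neighbour_mean(features):
    """Average of the 8 surrounding feature vectors at every location.
    features: (H, W, C) float array; borders use edge padding."""
    h, w, _ = features.shape
    padded = np.pad(features, ((1, 1), (1, 1), (0, 0)), mode="edge")
    total = np.zeros_like(features)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue  # skip the centre: it must be predicted, not copied
            total += padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
    return total / 8.0

def error_map(features):
    """Squared error between each feature vector and its neighbour-based prediction."""
    prediction = neighbour_mean(features)
    return np.sum((features - prediction) ** 2, axis=-1)

# Flat background with a single "object" that the neighbours cannot predict.
features = np.zeros((7, 7, 4))
features[3, 5] = 1.0
errors = error_map(features)
keypoint = np.unravel_index(np.argmax(errors), errors.shape)
```

The background is perfectly predictable from its surroundings, so the error concentrates on the object, which is exactly where a keypoint should land.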

SoTA on Atari games, more robust to input noise.

It could probably also be used outside the simple Atari setting, if you have enough data to train on and take later layers of the encoder.

With colorful images here: https://www.notion.so/Unsupervised-Object-Keypoint-Learning-Using-Local-Spatial-Predictability-ddcf36a856ff4e389050b3089cd710bc
Source here: https://openreview.net/pdf?id=GJwMHetHc73

swanky-pleasure-bcf on Notion

Unsupervised Object Keypoint Learning Using Local Spatial Predictability | Notion

In this paper authors proposed the new approach to the unsupervised keypoint learning. Previous SoTA approach, Transporter, was guided by the movement between slices to learn keypoints. In current paper authors shown possible flaws of such training procedure…

538 views20:55

Involution: Inverting the Inherence of Convolution for Visual Recognition
ByteDance AI Lab

Convolution has been the core ingredient of modern neural networks. Now the authors propose a novel atomic operation for deep neural networks by inverting the design principles of convolution.

The proposed involution-based models improve over the convolution-based ResNet-50 baselines:
- by up to 1.6% top-1 accuracy on ImageNet classification,
- by 2.5% detection AP on COCO,
- by 2.4% on COCO segmentation,
- by 4.7% mean IoU on Cityscapes segmentation.
Moreover, the computational cost is reduced by ~60%.

To really understand involution, it's better to read the paper, though.
I don't know, but maybe it will turn out to be something as universal as GroupNorm and will improve performance on almost any task?

📝 Paper
🛠 Code

671 views06:01

Results on ImageNet. RedNet is their novel backbone ⚙️ architecture.

618 views06:01

It has been less than a week since Mark Zuckerberg promised face tracking in Oculus devices, and HTC has already announced the VIVE Facial Tracker, which seamlessly tracks 38 facial movements across the lips, jaw, teeth, tongue, chin, and cheeks.
Amazing how this seemingly simple technology significantly improves the virtual experience.

With VR becoming more profitable, companies like Valve and Facebook continue to invest in the technology. And now rumors are swirling that Apple is working on a mixed-reality headset as well.

This is my approximate interpretation of the Russian post from @ai_newz

HTC Vive has a new VR trick – full facial tracking

It'll require a new accessory though, with improved body tracking incoming too via a new add-on.

553 viewsedited 13:56

Example of HTC VIVE Face tracking in action.

582 viewsedited 14:02