Gradient Dude – Telegram

Gradient Dude

2.65K subscribers

180 photos

50 videos

2 files

169 links

TL;DR for DL/CV/ML/AI papers from an author of publications at top-tier AI conferences (CVPR, NIPS, ICCV, ECCV).

Most ML feeds go for fluff; we go for the real meat.

YouTube: youtube.com/c/gradientdude
IG: instagram.com/gradientdude

Hi guys! A productive Sunday is when you feel like you have learned something new.
To learn more about our 3rd-place solution in the Kaggle competition "Lyft Motion Prediction for Autonomous Vehicles", check out my Medium blog post.

760 views, 12:36

Self-Attention models are gaining popularity in Computer Vision.
DETR applies transformers to end-to-end detection, VideoBERT learns a joint visual-linguistic representation for videos, ViT uses self-attention to achieve SOTA classification results on ImageNet, etc.

PapersWithCode has created a taxonomy of modern self-attention models for vision and discusses recent progress. You can read it here.
I'm planning to delve deeper into this topic and it looks like it is a perfect place to start 🤓!

736 views, edited 19:43

New full-frame video stabilization method. Looking forward to having it on my Google Pixel phone! There is hope as one of the authors is at Google.

The core idea is a learning-based fusion approach to aggregate warped contents from multiple neighboring frames (see pipeline figure below).
This method is several orders of magnitude slower than the built-in warp stabilizer in Adobe Premiere Pro 2020. However, unlike the warp stabilizer, it does not aggressively crop the frame borders and hence better preserves the original content.

✏️ Paper
🧾 Project page

782 views, 14:52

858 views, 14:52

Graph Representation Learning Book 🦾

A brief but comprehensive introduction to graph representation learning, including methods for embedding graph data, graph neural networks, and deep generative models of graphs.

https://cs.mcgill.ca/~wlh/grl_book/

790 views, 21:56

#beginners_guide
Learn About Transformers: A Recipe

A blog post summarizing key study material for learning about Transformer models (theory + code).
Tasty!

4.07K views, 06:00

Hi guys! New video on my YouTube channel!

In this video I give the intuition behind self-supervised representation learning (also easy to understand for beginners).

You will learn how to learn useful representations from just a bunch of unlabeled images.
I will explain the CliqueCNN method, which builds compact cliques for classification as a pretext task, and give an overview of other recent self-supervised learning approaches.

https://youtu.be/DEm6pDyYbt4

CliqueCNN: Self-supervised image representation learning

How to learn useful representation from just a bunch of unlabeled images?
I will give a high-level overview of what is self-supervised learning and explain the CliqueCNN method.
We will also briefly talk about a bunch of other important self-supervised learning…

783 views, 15:32

670 views, 15:32

Google open-sourced its AutoML framework for model architecture search at scale.
It helps to find the right model architecture for classification problems (e.g., a CNN with different types of layers).
Now you can write fit(); predict() and call it a day! Provided you have enough GPUs, of course 🙊😅

You can define your own model building blocks to use for search as well.
The framework uses Bayesian optimization to find proper hyperparameters and can build an ensemble of the models.
Works for both tabular and image data.

https://github.com/google/model_search

1.67K views, 13:28

How does Bayesian optimization help to find the proper hyperparameters for a machine learning model?

Bayesian optimization works by constructing a posterior distribution over the objective function (typically a Gaussian process) and using it to select the most promising hyperparameters to evaluate next.
As the number of observations grows, the posterior distribution improves, and the algorithm becomes more certain of which regions in the parameter space are worth exploring, and which are not.
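
To make the loop above concrete, here is a minimal pure-Python sketch. It is not the code of any framework mentioned in this channel. It assumes a zero-mean Gaussian process with an RBF kernel, a finite grid of candidate values for a single hyperparameter, and an upper-confidence-bound rule for picking the next point. The names `rbf`, `solve`, `gp_posterior`, `bayes_opt`, the toy `score` objective and all constants are made up for illustration.

```python
import math

def rbf(a, b, length=0.2):
    """Squared-exponential kernel for scalar inputs (length scale is an assumption)."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def gp_posterior(xs, ys, x_new, noise=1e-4):
    """Posterior mean and std of a zero-mean GP at x_new, given observations (xs, ys)."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k_star = [rbf(a, x_new) for a in xs]
    alpha = solve(K, ys)       # K^-1 y
    v = solve(K, k_star)       # K^-1 k*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = max(rbf(x_new, x_new) - sum(k * w for k, w in zip(k_star, v)), 0.0)
    return mean, math.sqrt(var)

def bayes_opt(objective, candidates, n_init=3, n_iter=10, kappa=2.0):
    """Maximize objective over a candidate grid; returns (best_x, best_y, tried)."""
    step = max(1, len(candidates) // n_init)
    xs = candidates[::step][:n_init]          # evenly spaced starting points
    ys = [objective(x) for x in xs]
    for _ in range(n_iter):
        remaining = [c for c in candidates if c not in xs]
        if not remaining:
            break

        def ucb(c):
            # High mean = promising, high std = unexplored.
            m, s = gp_posterior(xs, ys, c)
            return m + kappa * s

        x_next = max(remaining, key=ucb)
        xs.append(x_next)
        ys.append(objective(x_next))
    best = max(range(len(xs)), key=lambda i: ys[i])
    return xs[best], ys[best], xs

def score(x):
    # Hypothetical validation score that peaks at x = 0.625.
    return -(x - 0.625) ** 2

grid = [i / 40 for i in range(41)]
best_x, best_y, tried = bayes_opt(score, grid)
print(best_x, best_y, len(tried))
```

In a real setup `score` would train a model and return a validation metric, which is exactly why you want to spend evaluations only on promising points.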

Good blogposts to learn about Bayesian optimization: [at towardsdatascience] [at research.fb.com]

633 views, edited 13:34

A talk on Theoretical Foundations of Graph Neural Networks by Petar Veličković from DeepMind.

In this talk, Petar derives GNNs from first principles, motivates their use in the sciences, and explains how they emerged along several research lines.
Should be very interesting for those who want to learn about GNNs but have not found a good starting point.

Video: https://youtu.be/uF53xsT7mjc
Slides: https://petar-v.com/talks/GNN-Wednesday.pdf

741 views, 15:30

The guys from RunwayML created an awesome, user-friendly demo of our approach "Adaptive Style Transfer".

You can play around with it and easily stylize your own photos. One important thing: the larger the input image, the crisper the stylization.

Run Models for 8 different artists
Run Picasso model
Run Van Gogh model

Method source code on Github: https://github.com/CompVis/adaptive-style-transfer

658 views, edited 09:33

Stable View Synthesis (by Vladlen Koltun from Intel)

Given a set of source images depicting a scene from arbitrary viewpoints, SVS synthesizes new views of the scene.

The method operates on a geometric scaffold computed via structure-from-motion and multi-view stereo. Each point on this 3D scaffold is associated with view rays and corresponding feature vectors that encode the appearance of this point in the input images. The core of SVS is view-dependent on-surface feature aggregation, in which directional feature vectors at each 3D point are processed to produce a new feature vector for a ray that maps this point into the new target view. The target view is then rendered by a convolutional network from a tensor of features synthesized in this way for all pixels. The method is trained end-to-end.
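
To get a feeling for what "view-dependent on-surface feature aggregation" means, here is a toy sketch for a single 3D point. It is not the SVS implementation: the paper learns the aggregation end-to-end, while this sketch swaps in a fixed softmax over the cosine similarity between source view rays and the target ray. The function name, the `temperature` parameter and the sample values are assumptions for illustration only.

```python
import math

def aggregate_features(source_dirs, source_feats, target_dir, temperature=0.1):
    """Blend per-view feature vectors at one 3D point into a feature for the target ray.

    Weights are a softmax over the cosine similarity between each source view
    ray and the target ray. This fixed rule is an assumption; it stands in for
    the learned aggregation described in the post.
    Returns (aggregated_feature, weights).
    """
    def normalize(v):
        n = math.sqrt(sum(x * x for x in v))
        return [x / n for x in v]

    t = normalize(target_dir)
    sims = [sum(a * b for a, b in zip(normalize(d), t)) for d in source_dirs]
    m = max(sims)  # subtract the max for a numerically stable softmax
    exps = [math.exp((s - m) / temperature) for s in sims]
    z = sum(exps)
    weights = [e / z for e in exps]
    dim = len(source_feats[0])
    feature = [sum(w * f[i] for w, f in zip(weights, source_feats)) for i in range(dim)]
    return feature, weights

# Two source views looking along x and y, each with a 2-D feature vector.
dirs = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
feats = [[1.0, 0.0], [0.0, 1.0]]
feature, weights = aggregate_features(dirs, feats, target_dir=(1.0, 0.0, 0.0))
print(feature, weights)
```

Source views whose rays are close to the target ray dominate the result. In the actual method, a convolutional network then renders the image from such per-pixel features.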

The results are magnificent!

Source code
Paper

712 views, 15:00

VectorNet: Encoding HD Maps and Agent Dynamics from Vectorized Representation

I continue discussing deep learning approaches for self-driving cars.
Future motion prediction is a task of paramount importance for autonomous driving. For a self-driving car to operate safely, it is crucial to be able to anticipate the actions of other agents on the road.

In this video, I explain VectorNet, a method for future motion prediction that is based on a vectorized representation of the scene instead of RGB images.
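
A rough illustration of what "vectorized" means here: map elements and agent trajectories are stored as polylines, and each polyline is broken into short vectors instead of being rasterized into an image. The sketch below assumes a simplified layout of (start point, end point, type code, polyline id) per vector. The type codes, the function name and the sample coordinates are hypothetical, and the exact feature layout in the paper may differ.

```python
from typing import List, Tuple

Point = Tuple[float, float]
# (x_start, y_start, x_end, y_end, type_id, polyline_id) -- assumed layout
Vector = Tuple[float, float, float, float, int, int]

def polyline_to_vectors(points: List[Point], type_id: int, polyline_id: int) -> List[Vector]:
    """Turn consecutive points of one polyline into start/end vectors with attributes."""
    return [
        (x0, y0, x1, y1, type_id, polyline_id)
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    ]

LANE, AGENT = 0, 1  # hypothetical type codes

lane = [(0.0, 0.0), (5.0, 0.0), (10.0, 0.5)]                          # lane centerline samples
agent_track = [(2.0, -1.0), (3.0, -1.0), (4.1, -0.9), (5.3, -0.7)]    # past agent positions

scene = polyline_to_vectors(lane, LANE, 0) + polyline_to_vectors(agent_track, AGENT, 1)
for vec in scene:
    print(vec)
```

A model can then process the vectors of each polyline together and reason about interactions between polylines, without ever rendering the scene as an RGB image.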

▶️YouTube Video
📝Paper

594 views, edited 06:31

This is just awesome 😀

515 views, 00:56

Forwarded from Технологии | Нейросети | Боты

😅

530 views, 00:56