Gradient Dude

Example of HTC VIVE Face tracking in action.

Some psychedelic neural art. The first one is pretty awesome and indeed worth printing on a t-shirt. Thanks @krasniy_doshik.

Gradient Dude

MIT 6.S192: Deep Learning for Art, Aesthetics, and Creativity

Privet guys!

As you may have noticed, I'm fond of neural art and artistic style transfer and have even published some papers on this topic (ECCV18, CVPR19, ICCV19). That's why I'm very happy to share an awesome mini-course from MIT on Neural Art and Creativity👩🏼‍🎨. The course has a lineup of great invited speakers like Phillip Isola (MIT), Alyosha Efros (UC Berkeley), Jeff Clune (OpenAI), etc. The video lectures are free and available online.

🌀 http://deepcreativity.csail.mit.edu

Gradient Dude

Transformers Comprise the Fourth Pillar of Deep Learning

ARK Invest is one of the biggest asset-management companies focused on disruptive technologies. They are convinced that Transformers are the next big thing, and since recent language models with billions of parameters are very computationally demanding, ARK Invest is betting heavily on the growth of the AI chip market 🦾.

According to their research, Deep Learning added a mind-blowing $1 trillion in equity market capitalization to companies like Alphabet, Amazon, Nvidia, and TSMC as of year-end 2019, and perhaps another $250-500 billion in 2020. They predict that AI will contribute roughly $30 trillion to global equity market cap creation over the next 20 years.

🗣 Source post

Gradient Dude

Google and Facebook Datacenter AI Workloads as of 2018 (before the rise of Transformers 😀). Multi-layer perceptrons (MLPs) here are responsible for ranking and recommendations for search and content feeds like Instagram, Netflix, and YouTube.
—
Have you seen any recent stats on this anywhere? It would be very interesting to compare.

Gradient Dude

Self-training Improves Pre-training for Natural Language Understanding
Facebook AI & Stanford

Most semi-supervised NLP approaches require in-domain unlabeled data: for the best results, the unlabeled data used for semi-supervised training must come from the same domain as the annotated dataset.

This paper proposes SentAugment - a method that constructs task-specific in-domain unannotated datasets on the fly from a large external bank of sentences. So for any new NLP task where we have only a small dataset, we no longer need to collect a very similar unannotated dataset if we want to use semi-supervised training.
Now we can sort of cheat to improve the performance of an NLP model on almost any downstream task using Self-training (which is also called Teacher-Student training):
1. We retrieve the most relevant sentences (a few million of them) for the current downstream task from the external bank. For retrieval we use the embedding space of a sentence encoder - a Transformer pre-trained with masked language modeling and finetuned to maximize cosine similarity between similar sentences.
2. We train the teacher model - a RoBERTa-Large model finetuned on the downstream task.
3. Then we use the teacher model to annotate the retrieved unlabeled in-domain sentences. We perform additional filtering by keeping only the ones with high-confidence predictions.
4. As our student model, we then finetune a new RoBERTa-Large using KL-divergence on the synthetic data by considering the post-softmax class probabilities as labels (i.e., not only the most confident class but the entire class distribution is used as a label for every sentence).
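
Steps 3 and 4 can be sketched in a few lines of plain Python. This is only a toy illustration, not the authors' code: the logits are made up, and the fixed confidence threshold of 0.9 is my assumption (the paper's exact filtering rule may differ). It shows that the student is trained against the teacher's full class distribution, not just the argmax label.

```python
import math

def softmax(logits):
    """Turn raw logits into class probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def filter_confident(teacher_probs, threshold=0.9):
    """Step 3: keep indices of sentences whose top teacher probability
    passes the threshold (the threshold value is a hypothetical choice)."""
    return [i for i, p in enumerate(teacher_probs) if max(p) >= threshold]

def kl_divergence(teacher, student, eps=1e-12):
    """Step 4: KL(teacher || student) for one sentence.
    The whole teacher distribution acts as a soft label."""
    return sum(
        t * math.log((t + eps) / (s + eps))
        for t, s in zip(teacher, student)
        if t > 0
    )

# Made-up teacher logits for three retrieved sentences, two classes.
teacher_logits = [[4.0, -1.0], [0.2, 0.1], [-3.0, 2.5]]
teacher_probs = [softmax(l) for l in teacher_logits]
kept = filter_confident(teacher_probs, threshold=0.9)

# Made-up student logits for the same sentences.
student_logits = [[1.0, 0.0], [0.0, 0.0], [-1.0, 1.0]]
student_probs = [softmax(l) for l in student_logits]

# Average KL loss over the sentences that survived filtering.
loss = sum(kl_divergence(teacher_probs[i], student_probs[i]) for i in kept) / len(kept)
print(kept, round(loss, 4))
```

The second sentence is dropped because the teacher is nearly undecided on it, so it contributes nothing to the student loss.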

This self-training procedure significantly boosts performance compared to the baseline, and the positive effect is larger when fewer ground-truth annotated sentences are available.

As a large-scale external bank of unannotated sentences, the authors use CommonCrawl. In particular, they use a corpus with 5 billion sentences (100B words). Because of its scale and diversity, the sentence bank contains data from various domains and in different styles, which makes it possible to retrieve relevant data for many downstream tasks. To retrieve the most relevant sentences for a specific downstream task, we need to obtain an embedding for the task. Several options exist: (1) average the embeddings of all sentences in the training set; (2) average the embeddings for every class; (3) keep the original embedding of every sentence.
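
To make the three task-embedding options concrete, here is a rough sketch with tiny 2-D vectors standing in for sentence embeddings. The mode names (`all`, `per_class`, `per_sentence`), the toy data, and the rule of scoring a bank sentence by its best match over all queries are my assumptions for illustration, not necessarily how the paper implements retrieval.

```python
import math

def mean_vector(vectors):
    """Element-wise average of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def task_queries(train_embs, labels, mode):
    """Build the query embedding(s) that represent the downstream task."""
    if mode == "all":            # option (1): one average over the training set
        return [mean_vector(train_embs)]
    if mode == "per_class":      # option (2): one average per class
        by_label = {}
        for emb, y in zip(train_embs, labels):
            by_label.setdefault(y, []).append(emb)
        return [mean_vector(v) for _, v in sorted(by_label.items())]
    if mode == "per_sentence":   # option (3): every training sentence is a query
        return list(train_embs)
    raise ValueError(f"unknown mode: {mode}")

def retrieve(queries, bank_embs, k):
    """Indices of the k bank sentences most similar to any of the queries."""
    scores = [max(cosine(q, b) for q in queries) for b in bank_embs]
    order = sorted(range(len(bank_embs)), key=lambda i: scores[i], reverse=True)
    return order[:k]

# Toy training set: two sentences of class 0, one of class 1.
train_embs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
labels = [0, 0, 1]
# Toy sentence bank.
bank = [[1.0, 0.05], [-1.0, 0.0], [0.1, 1.0], [0.7, 0.7]]

top_all = retrieve(task_queries(train_embs, labels, "all"), bank, k=2)
top_per_class = retrieve(task_queries(train_embs, labels, "per_class"), bank, k=2)
print(top_all, top_per_class)
```

In this toy setup the single averaged query prefers the "in-between" bank sentence, while per-class queries pick one clear neighbor for each class, which hints at why the choice of task embedding matters.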

📝 Paper
🛠 Code

#paper_explained #nlp

Gradient Dude

Results🦾. ST stands for Self-training.

Gradient Dude

What happens if you augment your training dataset with a load of stylized images as well?

Someone trained a StyleGAN2-ada on the images augmented with style transfer and synced the output with audio 🎶.

YouTube

StyleGAN2-ada-pytorch audio reactive weirdness

So what happens if you augment your dataset with a load of style-transfer images as well? Well, I guess it sort of seems to work. Now I think I need to up my dataset size from 3000 to over 9000! I should probably test with 256x256 images first, right? Think…

Gradient Dude

Designing, Visualizing and Understanding Deep Neural Networks, CS182

Sergey Levine has released new lectures for his deep learning class, CS182! This is an introductory deep learning course (advanced undergraduate + graduate) covering a broad range of deep learning topics. Prof. Levine is an Assistant Professor at UC Berkeley and the head of the Robotic Artificial Intelligence and Learning Lab. I posted about him a few months ago.

🔖 Course website
▶️ Lectures playlist

Gradient Dude

Neural Corgi 🤓

StyleGAN2-ADA trained on cute Corgi images. Looks amazing!

1. Scrape 350k Corgi images from Instagram.
2. Detect dogs using YOLOv3.
3. Remove small detections and dogs not facing the camera.
4. Remove duplicates and crop the images. Around 130k images were obtained at this step.
5. Upsample crops to 1024 x 1024.
6. Train StyleGAN2-ADA for 5 million iterations for 18 days on a Tesla V100.
7. Profit ?!
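
The geometric part of steps 3 and 4 could look roughly like the sketch below. It is my own guess at the logic, not the project's code: the `min_side` threshold and the box coordinates are hypothetical, and the "facing the camera" check, deduplication, and upsampling are left out.

```python
def keep_detection(box, min_side=256):
    """Drop detections whose shorter side is below min_side pixels.
    The 256-pixel threshold is a hypothetical value."""
    x0, y0, x1, y1 = box
    return min(x1 - x0, y1 - y0) >= min_side

def square_crop(box, img_w, img_h):
    """Grow the box into a square around its center, shifted and
    clamped so that the crop stays inside the image."""
    x0, y0, x1, y1 = box
    side = min(max(x1 - x0, y1 - y0), img_w, img_h)
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    left = min(max(cx - side / 2, 0), img_w - side)
    top = min(max(cy - side / 2, 0), img_h - side)
    return (int(left), int(top), int(left + side), int(top + side))

# Made-up detector output for one 1080x1080 photo: (x0, y0, x1, y1) boxes.
detections = [(100, 200, 500, 900), (10, 10, 100, 120)]
crops = [square_crop(b, 1080, 1080) for b in detections if keep_detection(b)]
print(crops)
```

Square crops matter here because StyleGAN2-ADA is trained on fixed-size square images, so every surviving detection has to be turned into one before upsampling.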

🌀 Colab
🛠Code and dataset

Gradient Dude

MacaquePose: A Novel “In the Wild” Macaque Monkey Pose Dataset

Computer vision for animals has been gaining traction recently. Several works on this topic have already been discussed in this channel: post [1], post [2], post [3].

❓ Why?
Pose estimation is fundamental for analyzing the relationship between the animal’s behaviors and its brain functions and malfunctions. And Macaque monkeys are excellent non-human primate models, especially for studying neuroscience.
Another possible application is Instagram / Snapchat masks and effects for your cute four-legged friends.

🏛 Dataset
This dataset provides keypoints for macaques in naturalistic scenes. It consists of 13k images and 16k monkey instances.
- 17 keypoints and instance segmentation for each monkey in COCO format.
- Annotations are of high quality because crowd-sourced annotations were curated and refined by 8 researchers working specifically with macaques.
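
For those who have not worked with COCO-style keypoints: each instance stores a flat list of (x, y, v) triplets, where v = 0 means not labeled, v = 1 labeled but not visible, and v = 2 labeled and visible. Below is a minimal parsing sketch. The JSON snippet is invented for illustration (file name, coordinates, and only 3 keypoint names instead of the dataset's 17), so treat the field values as assumptions.

```python
import json

# Invented COCO-style snippet; the real dataset has 17 keypoints per monkey.
coco_json = """
{
  "images": [{"id": 1, "file_name": "macaque_0001.jpg", "width": 640, "height": 480}],
  "categories": [{"id": 1, "name": "monkey",
                  "keypoints": ["nose", "left_eye", "right_eye"]}],
  "annotations": [{
    "id": 10, "image_id": 1, "category_id": 1,
    "bbox": [100, 80, 200, 240],
    "num_keypoints": 2,
    "keypoints": [150, 120, 2, 140, 110, 1, 0, 0, 0]
  }]
}
"""

def named_keypoints(annotation, names):
    """Split the flat keypoint list into (x, y, v) triplets keyed by name."""
    flat = annotation["keypoints"]
    triplets = [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]
    return dict(zip(names, triplets))

data = json.loads(coco_json)
names = data["categories"][0]["keypoints"]
ann = data["annotations"][0]
kps = named_keypoints(ann, names)

# Keypoints with v > 0 are labeled; their count should match num_keypoints.
labeled = [name for name, (_, _, v) in kps.items() if v > 0]
print(kps, labeled)
```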

🙈 Interesting findings
The most challenging task for both human annotators and the DeepLabCut baseline is predicting the positions of shoulders and hips. Another failure mode for neural networks is self-occlusion.

📝 Paper
🏛 Dataset
⚙️ Pretrained models in DeepLabCut Model Zoo
📓 Colab

Gradient Dude

I totally need glasses that move with my eyebrows. (c) Yann LeCun

The quality is poor because of the pesky Twitter compression.

Gradient Dude

CLIP + StyleGAN. Searching in the StyleGAN latent space using a description embedded with CLIP.

Queries: "A pony that looks like Beyonce", "... like Billie Eilish", "... like Rihanna"

📐 The basic idea
Generate an image with StyleGAN and pass it to CLIP to compute a loss against the CLIP representation of the text query. You then backprop through both (frozen) networks and optimize the latent vector in StyleGAN.
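
Here is a toy sketch of that optimization loop. Everything in it is a stand-in: `generator` and `image_encoder` are small random maps playing the roles of StyleGAN and the CLIP image encoder, `text_emb` is a random vector instead of a real CLIP text embedding, and finite-difference gradients with a simple backtracking step replace real backprop. Only the structure (frozen networks, loss on embeddings, updates applied to the latent alone) mirrors the idea.

```python
import math
import random

def generator(z, Wg):
    """Stand-in for StyleGAN: maps a latent vector to an 'image' vector."""
    return [math.tanh(sum(w * zi for w, zi in zip(row, z))) for row in Wg]

def image_encoder(x, We):
    """Stand-in for the CLIP image encoder: maps an 'image' to an embedding."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in We]

def clip_loss(z, Wg, We, text_emb):
    """1 - cosine similarity between image embedding and text embedding."""
    e = image_encoder(generator(z, Wg), We)
    dot = sum(a * b for a, b in zip(e, text_emb))
    norm = math.sqrt(sum(a * a for a in e)) * math.sqrt(sum(b * b for b in text_emb))
    return 1.0 - dot / (norm + 1e-12)

def numeric_grad(f, z, h=1e-5):
    """Central finite differences; a real setup would use autograd instead."""
    grad = []
    for i in range(len(z)):
        zp = list(z)
        zm = list(z)
        zp[i] += h
        zm[i] -= h
        grad.append((f(zp) - f(zm)) / (2 * h))
    return grad

def optimize_latent(z, f, steps=50, lr=0.5):
    """Gradient descent on the latent only; network weights stay frozen.
    A step is accepted only if it lowers the loss, else it is halved."""
    loss = f(z)
    for _ in range(steps):
        g = numeric_grad(f, z)
        step = lr
        while step > 1e-8:
            cand = [zi - step * gi for zi, gi in zip(z, g)]
            cand_loss = f(cand)
            if cand_loss < loss:
                z, loss = cand, cand_loss
                break
            step /= 2
    return z, loss

rng = random.Random(0)
Wg = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(6)]   # frozen "StyleGAN"
We = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(3)]   # frozen "CLIP"
text_emb = [rng.gauss(0, 1) for _ in range(3)]                 # fake text query
z0 = [rng.gauss(0, 1) for _ in range(4)]

def objective(z):
    return clip_loss(z, Wg, We, text_emb)

initial_loss = objective(z0)
z_opt, final_loss = optimize_latent(z0, objective)
print(initial_loss, final_loss)
```

With the real models you would swap the stand-ins for StyleGAN and CLIP and let an autograd framework compute the gradient with respect to the latent.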

🤬 Drawbacks
1) It only works on text it knows.
2) It needs some cherry-picking: only about 1/5 of the results are really good.

Source tweet.
