I use Lex Fridman's podcasts as an opportunity to listen in on conversations with very intelligent and clever people while having breakfast. These conversations always give me the motivation to keep up with my own research work as well.
I have just finished listening to Lex's conversation with Prof. Sergey Levine. Very insightful!
Sergey is a brilliant researcher in the field of Deep RL and Computer Vision and a very humble and genuine person. I was lucky to meet him in person and talk to him a little bit at my first big scientific conference, NeurIPS 2016.
A piece of advice for students from Sergey Levine:
"It is important to not be afraid to spend time imagining the kind of outcome that you might like to see. If someone who is a student considering a career in AI takes a little while, sits down and thinks like "What do I really want to see a machine do? What do I want to see a robot do? What do I want to see a natural language system do?". Imagine it almost like a commercial for a future product or something that you'd like to see in the world. And then actually sit down and think about the steps that are necessary to get there. And hopefully, that thing is not a better number on ImageNet classification, it's probably like an actual thing that we can't do today. That would be really AWESOME.
Whether it's a robot butler or an awesome healthcare decision-making support system. Whatever it is that you find inspiring. And I think that thinking about that and then backtracking from there and imagining the steps needed to get there will actually do much better research, it will lead to rethinking the assumptions, it will lead to working on the bottlenecks other people aren't working on."
YouTube
Sergey Levine: Robotics and Machine Learning | Lex Fridman Podcast #108
Sergey Levine is a professor at Berkeley and a world-class researcher in deep learning, reinforcement learning, robotics, and computer vision, including the development of algorithms for end-to-end training of neural network policies that combine perception…
Hi guys! New video on my YouTube channel!
Computer Vision for animals is a fast-growing and very promising sub-field.
In this video I will explain how to reconstruct a 3D model of an animal from a single photo.
The method is based on a cycle-consistency loss between image pixels and vertices on a 3D mesh.
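To make the idea concrete, here is a minimal sketch of such a loss in PyTorch: a pixel is mapped to a point on the canonical mesh surface, that point is projected back into the image, and the round trip should land where it started. The function and tensor names are hypothetical, not taken from the papers' code:

```python
import torch

def cycle_consistency_loss(pixels, pixel_to_surface, project_to_image):
    # pixels: (N, 2) sampled image coordinates
    # pixel_to_surface: network mapping pixels to 3D points on the canonical mesh (hypothetical)
    # project_to_image: camera projection of 3D points back to 2D (hypothetical)
    surface_points = pixel_to_surface(pixels)        # (N, 3) points on the mesh
    reprojected = project_to_image(surface_points)   # (N, 2) back in image space
    # Penalize the round-trip error: pixel -> surface -> pixel should be the identity.
    return ((reprojected - pixels) ** 2).sum(dim=-1).mean()
```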
Reference papers:
1) "Articulation-aware Canonical Surface Mapping", Kulkarni et al., CVPR 2020
2) "Canonical Surface Mapping via Geometric Cycle Consistency", Kulkarni et al., ICCV 2019
Method source code: GitHub repo.
YouTube
How to Reconstruct 3D Model of an Animal from a single Photo via Cycle Consistency [Deep Learning]
Computer Vision for Animals is one of the growing sub-fields with huge potential. In this video, I explain 2 papers for reconstructing 3D meshes of animals just from photos.
Timecodes:
0:00 Intro
0:24 Intuitive example
0:48 Goal
0:58 Geometrical priors…
Hi guys! A productive Sunday is when you feel like you have learned something new.
To learn more details about our 3rd place solution for the Kaggle "Lyft Prediction for Autonomous Vehicles" competition, you can check out my Medium blogpost.
Self-Attention models are gaining popularity in Computer Vision.
DETR applies transformers to end-to-end object detection, VideoBERT learns a joint visual-linguistic representation for videos, ViT uses self-attention to achieve SOTA classification results on ImageNet, etc.
PapersWithCode created a taxonomy of modern self-attention models for vision and discusses recent progress. You can read it here.
I'm planning to delve deeper into this topic and it looks like it is a perfect place to start 🤓!
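All of these models share the same core operation. Here is a minimal sketch of single-head scaled dot-product self-attention in PyTorch, with masking and batching omitted for clarity:

```python
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    # x: (seq_len, d_model); w_q, w_k, w_v: (d_model, d_k) projection matrices
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = (q @ k.T) / (k.shape[-1] ** 0.5)  # scaled dot-product similarities
    weights = F.softmax(scores, dim=-1)        # each position attends to all positions
    return weights @ v                         # weighted sum of value vectors
```

In ViT, for example, x is a sequence of flattened image-patch embeddings rather than word embeddings.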
New full-frame video stabilization method. Looking forward to having it on my Google Pixel phone! There is hope, as one of the authors is at Google.
The core idea is a learning-based fusion approach that aggregates warped content from multiple neighboring frames (see the pipeline figure below).
This method is several orders of magnitude slower than the built-in warp stabilizer in Adobe Premiere Pro 2020. However, it does not aggressively crop the frame borders and hence preserves the original content much better.
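To give a rough idea of the fusion step, here is a minimal sketch with hypothetical tensor names (in the paper a network predicts the weights; this only shows the aggregation):

```python
import torch

def fuse_warped_frames(warped_frames, fusion_logits):
    # warped_frames: (K, C, H, W) neighboring frames warped to the target view
    # fusion_logits: (K, 1, H, W) per-pixel scores from a fusion CNN (hypothetical)
    weights = torch.softmax(fusion_logits, dim=0)  # normalize across the K frames
    # Per-pixel blend: neighbors fill regions a single frame cannot cover,
    # which is why the frame borders need no aggressive cropping.
    return (weights * warped_frames).sum(dim=0)    # (C, H, W) fused output frame
```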
✏️ Paper
🧾 Project page
Graph Representation Learning Book 🦾
A brief but comprehensive introduction to graph representation learning, including methods for embedding graph data, graph neural networks, and deep generative models of graphs.
https://cs.mcgill.ca/~wlh/grl_book/
#beginners_guide
Learn About Transformers: A Recipe
A blogpost summarizing key study material to learn about the Transformer models (theory + code).
Tasty!
Hi guys! New video on my YouTube channel!
In this video I give the intuition behind self-supervised representation learning (also easy to follow for beginners).
You will learn how to learn useful representations from just a bunch of unlabeled images.
I will explain the CliqueCNN method, which builds compact cliques for classification as a pretext task, and give an overview of other recent self-supervised learning approaches.
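To illustrate the pretext-task idea (my own sketch, not the authors' code): group mutually similar unlabeled images into compact cliques, then train an ordinary classifier to predict each image's clique:

```python
import torch
import torch.nn as nn

def pretext_step(model, images, clique_labels, optimizer):
    # clique_labels: surrogate classes obtained by grouping mutually similar
    # samples into compact cliques; no human annotations are involved.
    logits = model(images)
    loss = nn.functional.cross_entropy(logits, clique_labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The features the CNN learns while solving this surrogate classification task then transfer to downstream tasks.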
https://youtu.be/DEm6pDyYbt4
YouTube
CliqueCNN: Self-supervised image representation learning
How to learn useful representation from just a bunch of unlabeled images?
I will give a high-level overview of what is self-supervised learning and explain the CliqueCNN method.
We will also briefly talk about a bunch of other important self-supervised learning…
Google open-sourced its AutoML framework for model architecture search at scale.
It helps to find the right model architecture for any classification problem (e.g., a CNN with different types of layers).
Now you can write
fit(); predict()
and call it a day! Of course, only if you have enough GPUs 🙊😅 You can define your own model building blocks to use for search as well.
The framework uses Bayesian optimization to find proper hyperparameters and can build an ensemble of models.
Works for both tabular and image data.
https://github.com/google/model_search
How does Bayesian optimization help to find the proper hyperparameters for a machine learning model?
Bayesian optimization works by constructing a posterior distribution over the objective function (a Gaussian process) and using it to select the most promising hyperparameters to evaluate.
As the number of observations grows, the posterior distribution improves, and the algorithm becomes more certain of which regions in the parameter space are worth exploring, and which are not.
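Here is a toy, self-contained sketch of that loop: a GP surrogate plus the expected-improvement acquisition function, tuning a single learning rate against a made-up stand-in for a validation score:

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def objective(lr):                        # stand-in for a real validation score
    return -(np.log10(lr) + 2.0) ** 2     # peaks at lr = 1e-2

candidates = np.logspace(-5, 0, 200).reshape(-1, 1)
X = np.array([[1e-4], [1e-1]])            # two initial evaluations
y = np.array([objective(x[0]) for x in X])

for _ in range(10):
    # Posterior over the objective, fit in log-space of the learning rate.
    gp = GaussianProcessRegressor(normalize_y=True).fit(np.log10(X), y)
    mu, sigma = gp.predict(np.log10(candidates), return_std=True)
    # Expected Improvement: favors candidates likely to beat the best score so far.
    gain = mu - y.max()
    z = gain / np.maximum(sigma, 1e-9)
    ei = gain * norm.cdf(z) + sigma * norm.pdf(z)
    x_next = candidates[np.argmax(ei)]
    X = np.vstack([X, x_next])
    y = np.append(y, objective(x_next[0]))

print("best learning rate found:", X[np.argmax(y)][0])
```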
Good blogposts to learn about Bayesian optimization: [at towardsdatascience] [at research.fb.com]
A talk on Theoretical Foundations of Graph Neural Networks by Petar Veličković from DeepMind.
In this talk Petar derives GNNs from first principles, motivates their use in the sciences, and explains how they emerged along several research lines.
Should be very interesting for those who want to learn about GNNs but have not found a good starting point.
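If you want something runnable next to the talk, here is a minimal message-passing layer of my own sketching (mean aggregation over neighbors followed by a learned update), not taken from the slides:

```python
import torch
import torch.nn as nn

class SimpleGNNLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.linear = nn.Linear(2 * dim, dim)

    def forward(self, node_feats, adj):
        # node_feats: (N, dim) node states; adj: (N, N) binary adjacency matrix
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        neighbor_mean = (adj @ node_feats) / deg           # aggregate neighbor messages
        combined = torch.cat([node_feats, neighbor_mean], dim=-1)
        return torch.relu(self.linear(combined))           # update node states
```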
Video: https://youtu.be/uF53xsT7mjc
Slides: https://petar-v.com/talks/GNN-Wednesday.pdf
Guys from RunwayML created an awesome user-friendly demo for our approach "Adaptive Style Transfer".
You can play around with it and easily stylize your own photos. One important thing: the larger the input image, the crisper the stylization.
Run Models for 8 different artists
Run Picasso model
Run Van Gogh model
Method source code on GitHub: https://github.com/CompVis/adaptive-style-transfer