Neural Networks | Нейронные сети
Everything about machine learning

For all questions: @notxxx1
​ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
Lan et al. Google
arxiv.org/abs/1909.11942

🔗 ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
Increasing model size when pretraining natural language representations often results in improved performance on downstream tasks. However, at some point further model increases become harder due to GPU/TPU memory limitations, longer training times, and unexpected model degradation. To address these problems, we present two parameter-reduction techniques to lower memory consumption and increase the training speed of BERT. Comprehensive empirical evidence shows that our proposed methods lead to models that scale much better compared to the original BERT. We also use a self-supervised loss that focuses on modeling inter-sentence coherence, and show it consistently helps downstream tasks with multi-sentence inputs. As a result, our best model establishes new state-of-the-art results on the GLUE, RACE, and SQuAD benchmarks while having fewer parameters compared to BERT-large.
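The two parameter-reduction techniques are factorized embedding parameterization and cross-layer parameter sharing. A back-of-the-envelope sketch of the savings (the dimensions below are the common base-config values; the per-layer count is a rough illustration, not an exact transformer parameter count):

```python
# Rough parameter accounting for ALBERT's two reduction techniques.
# The per-layer count below is an illustrative approximation.

V = 30_000   # vocabulary size
H = 768      # hidden size
E = 128      # ALBERT's small embedding size (E << H)
L = 12       # number of transformer layers

# 1) Factorized embedding parameterization:
#    BERT ties the embedding size to H; ALBERT factorizes V*H into V*E + E*H.
bert_embeddings = V * H               # 23,040,000
albert_embeddings = V * E + E * H     # 3,938,304 (~5.9x fewer)

# 2) Cross-layer parameter sharing:
#    ALBERT reuses one transformer block L times instead of L distinct blocks.
per_layer = 12 * H * H                # crude per-block estimate (attention + FFN)
bert_blocks = L * per_layer
albert_blocks = per_layer             # one shared block for all L layers

print(bert_embeddings, albert_embeddings)
print(bert_blocks // albert_blocks)   # 12x fewer block parameters
```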
DeepMind Measures 7 Capabilities Every AI Should Have

video: https://www.youtube.com/watch?v=zrF5_O92ELQ

📝 The paper "Behaviour Suite for Reinforcement Learning"

https://arxiv.org/abs/1908.03568

code https://github.com/deepmind/bsuite
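bsuite standardizes the agent-environment evaluation loop across a suite of small diagnostic environments and scores agents on each capability. A self-contained toy version of such a loop (the environment and agents here are my own stand-ins, not bsuite's API; the real library exposes dm_env-style environments):

```python
import random

# Self-contained toy version of the standardized agent-environment loop
# that bsuite is built around. ToyBandit and both agents are stand-ins.

class ToyBandit:
    """Two-armed bandit: arm 1 pays out with prob 0.8, arm 0 with prob 0.2."""
    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def step(self, action):
        p = 0.8 if action == 1 else 0.2
        return 1.0 if self.rng.random() < p else 0.0

def evaluate(agent, env, episodes=1000):
    """Average reward over a fixed number of one-step episodes."""
    return sum(env.step(agent()) for _ in range(episodes)) / episodes

agent_rng = random.Random(7)
random_agent = lambda: agent_rng.choice([0, 1])  # uniform over both arms
greedy_agent = lambda: 1                         # always pulls the better arm

print(evaluate(random_agent, ToyBandit(seed=42)))   # close to 0.5
print(evaluate(greedy_agent, ToyBandit(seed=42)))   # close to 0.8
```

The point of the real suite is the same separation of concerns: the agent only emits actions, the environment only emits rewards, and the harness aggregates scores per capability.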

🎥 How to Become a Deep Learning Expert
👁 1 view · 1431 sec.
In this video you will learn how to level up in your deep learning expertise. I share the path I took, and give you my guidelines on how to think about expertise.

You have to recognize that expertise is a sliding scale, rather than a state of being. Even the deep learning pioneers are learning more each day, and are gaining in expertise over time.

The key is to gradually increase your skills in mathematics and in implementing cutting-edge solutions at the forefront of deep learning. Always be striving for…
🎥 Recitation 5 | Training Convolutional Neural Networks
👁 3 views · 2542 sec.
Carnegie Mellon University
Course: 11-785, Intro to Deep Learning
Offering: Fall 2019

For more information, please visit: http://deeplearning.cs.cmu.edu/

Contents:
• Convolutional Neural Networks (CNNs)
• Arriving at the convolutional model…
Neural-network parameterization of a physical model for topology optimization
A paper with the not-especially-intriguing title "Neural reparameterization improves structural optimization" [arXiv:1909.04240] was recently uploaded to arXiv.org. It turns out, however, that the authors have essentially devised and described a rather nontrivial way of using a neural network to solve structural/topology optimization problems for physical models (though the authors themselves say the method is more general). The approach is very interesting and effective, and apparently entirely new (I cannot vouch for the last point, but neither the paper's authors, nor the ODS community, nor I could recall an analogue), so it may be worth knowing for anyone interested both in applying neural networks and in solving various optimization problems.
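The core idea, optimizing the weights of a network that outputs the design rather than the design variables themselves, can be shown on a toy problem (my own minimal sketch with a hand-written gradient; the paper uses a CNN and differentiates through a real physics solver):

```python
import numpy as np

# Minimal sketch of neural reparameterization (a toy of my own, not the
# paper's code): instead of running gradient descent on the design variables
# x directly, express x = net(W) for a tiny tanh "network" and descend on
# the weights W. The quadratic objective stands in for the physics
# (compliance) loss that the paper differentiates through.

rng = np.random.default_rng(0)
target = np.linspace(0.0, 0.9, 8)     # stand-in for the optimal density field

z = rng.normal(size=4)                # fixed latent input to the network
W = 0.1 * rng.normal(size=(8, 4))     # network weights = what we optimize

def net(W):
    return np.tanh(W @ z)             # the design produced by the network

for _ in range(500):
    x = net(W)
    residual = x - target             # dL/dx for L = 0.5 * ||x - target||^2
    # Chain rule through tanh: dx_i/dW_ij = (1 - x_i^2) * z_j
    grad_W = ((1.0 - x**2) * residual)[:, None] * z[None, :]
    W -= 0.5 * grad_W                 # plain gradient descent on W

loss = float(np.sum((net(W) - target) ** 2))
print(round(loss, 6))                 # near zero after convergence
```

The interesting empirical finding of the paper is that this indirect parameterization changes the optimization landscape and often yields better designs than descending on x directly.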

🔗 Neural-network parameterization of a physical model for topology optimization
🎥 How does Machine Learning Change Software Development Practices?
👁 1 view · 3100 sec.
The rapid development of machine learning technologies and the broad success of systems built on them are leading to their adoption across the most varied areas of science and industry. This makes it possible to observe and study the changes these methods have brought to the internal processes of software development by comparing developers' experiences.

In the first seminar of the new academic year we explore this topic, presenting an overview of two recent papers that aim to study the changes…
DCTD: Deep Conditional Target Densities for Accurate Regression

Authors: Fredrik K. Gustafsson, Martin Danelljan, Goutam Bhat, Thomas B. Schön

Abstract: While deep learning-based classification is generally addressed using standardized approaches, a wide variety of techniques are employed for regression. In computer vision, one particularly popular such technique is that of confidence-based regression, which entails predicting a confidence value for each input-target pair (x, y). While this approach has demonstrated impressive results, it requires important task-dependent design choices, and the predicted confidences often lack a natural probabilistic meaning. We address these issues by proposing Deep Conditional Target Densities (DCTD), a novel and general regression method with a clear probabilistic interpretation.

https://arxiv.org/abs/1909.12297

🔗 DCTD: Deep Conditional Target Densities for Accurate Regression
While deep learning-based classification is generally addressed using standardized approaches, a wide variety of techniques are employed for regression. In computer vision, one particularly popular such technique is that of confidence-based regression, which entails predicting a confidence value for each input-target pair (x, y). While this approach has demonstrated impressive results, it requires important task-dependent design choices, and the predicted confidences often lack a natural probabilistic meaning. We address these issues by proposing Deep Conditional Target Densities (DCTD), a novel and general regression method with a clear probabilistic interpretation. DCTD models the conditional target density p(y|x) by using a neural network to directly predict the un-normalized density from (x, y). This model of p(y|x) is trained by minimizing the associated negative log-likelihood, approximated using Monte Carlo sampling. We perform comprehensive experiments on four computer vision regression tasks. Our app…
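The training signal can be illustrated with a one-dimensional toy (my own sketch: here f is a fixed quadratic so that log Z has a closed form to check against; in DCTD, f would be a neural network evaluated on (x, y) pairs):

```python
import numpy as np

# Toy illustration of unnormalized-density regression in the spirit of DCTD:
# the model outputs an unnormalized log-density f(x, y), and the intractable
# log-partition log Z(x) in the negative log-likelihood is approximated by
# importance sampling. f below is a fixed Gaussian energy, not a network.

rng = np.random.default_rng(0)
sigma = 0.5

def f(y):
    # Unnormalized log-density with a known normalizer, standing in for f_theta.
    return -y**2 / (2.0 * sigma**2)

true_log_Z = 0.5 * np.log(2.0 * np.pi * sigma**2)

# Monte Carlo estimate of log Z with proposal q = N(0, tau^2):
tau = 1.0
samples = rng.normal(0.0, tau, size=100_000)
log_q = -samples**2 / (2.0 * tau**2) - 0.5 * np.log(2.0 * np.pi * tau**2)
log_Z_mc = np.log(np.mean(np.exp(f(samples) - log_q)))

# Approximate NLL of an observed target, the quantity minimized in training:
y_obs = 0.3
nll = -f(y_obs) + log_Z_mc

print(round(float(true_log_Z), 3), round(float(log_Z_mc), 3))
```

With enough samples the Monte Carlo estimate of log Z matches the closed form, which is what makes the approximated NLL a usable training loss.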