Data Science by ODS.ai 🦜
First Telegram Data Science channel. Covering all technical and popular stuff about anything related to Data Science: AI, Big Data, Machine Learning, Statistics, general Math and the applications of the former. To reach the editors, contact: @malev
🥇 Parameter optimization in neural networks.

Play with three interactive visualizations and develop your intuition for optimizing model parameters.
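
If you prefer code to sliders, here is a minimal sketch (our illustration, not taken from the notes) of two of the update rules such visualizations cover: plain gradient descent and gradient descent with momentum, on a toy quadratic loss.

```python
# Toy comparison of plain gradient descent vs. momentum on f(w) = w^2.
def grad(w):
    return 2 * w  # derivative of w^2

w_gd, w_mom, v = 5.0, 5.0, 0.0
lr, beta = 0.1, 0.9

for step in range(200):
    w_gd -= lr * grad(w_gd)        # vanilla update
    v = beta * v + grad(w_mom)     # momentum accumulates past gradients
    w_mom -= lr * v                # update along the accumulated direction

print(f"plain GD: {w_gd:.6f}, momentum: {w_mom:.6f}")  # both approach the minimum at 0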

Link: https://www.deeplearning.ai/ai-notes/optimization/

#interactive #demo #optimization #parameteroptimization #novice #entrylevel #beginner #goldcontent #nn #neuralnetwork
And the Bit Goes Down: Revisiting the Quantization of Neural Networks

Researchers at Facebook AI Research found a way to compress neural networks with minimal sacrifice in accuracy.

For now it works only on fully connected and convolutional layers.
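
As a rough illustration of the general idea (this is plain codebook quantization with k-means, not the paper's product-quantization method with its activation-aware reconstruction objective):

```python
# Toy codebook quantization of one layer's weights: store a small codebook
# of centroids plus a per-weight index instead of the full float32 tensor.
import numpy as np
from sklearn.cluster import KMeans

W = np.random.randn(256, 128).astype(np.float32)  # stand-in for a fully connected layer

k = 16  # codebook size -> 4-bit indices instead of 32-bit floats
km = KMeans(n_clusters=k, n_init=10).fit(W.reshape(-1, 1))
codes = km.labels_.astype(np.uint8)                         # what you store: indices
codebook = km.cluster_centers_.ravel().astype(np.float32)   # and centroids

W_hat = codebook[codes].reshape(W.shape)  # reconstructed weights at inference
print("reconstruction MSE:", float(np.mean((W - W_hat) ** 2)))
```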

Link: https://arxiv.org/abs/1907.05686

#nn #DL #minimization #compression
27,600 V100 GPUs, 0.5 PB data, and a neural net with 220,000,000 weights

In case you are wondering, all of that was used to tackle a scientific inverse problem in materials imaging.

ArXiV: https://arxiv.org/pdf/1909.11150.pdf

#ItIsNotAboutSize #nn #dl
Applying deep learning to Airbnb search

The story of how the #Airbnb research team moved from #GBDT (gradient boosting) to #NN (neural networks) for search, with all the metrics and hypotheses.

Link: https://blog.acolyer.org/2019/10/09/applying-deep-learning-to-airbnb-search/
Free eBook from Stanford: Introduction to Applied Linear Algebra โ€“ Vectors, Matrices, and Least Squares

The base material you need to understand how neural networks and other #ML algorithms work.
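
For a taste of the least-squares part in code (a tiny sketch, not from the book itself): fit y ≈ Xw by solving the normal equations.

```python
# Least squares: minimize ||X w - y||^2 by solving (X^T X) w = X^T y.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                   # 100 observations, 3 features
w_true = np.array([1.5, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=100)     # noisy measurements

w_hat = np.linalg.solve(X.T @ X, X.T @ y)       # closed-form solution
print(w_hat)                                    # close to w_true
```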

Link: https://web.stanford.edu/~boyd/vmls/

#Stanford #MOOC #WhereToStart #free #ebook #algebra #linalg #NN
Characterising Bias in Compressed Models

Popular compression techniques turned out to amplify bias in deep neural networks.
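
A sketch of the kind of audit this suggests: check accuracy per subgroup before and after compression, not just the aggregate number. (`model`, `pruned_model` and the group labels below are placeholders for illustration, not artifacts from the paper.)

```python
# Compare per-subgroup accuracy of a dense model and its compressed version.
import numpy as np

def per_group_accuracy(predict, X, y, groups):
    preds = predict(X)
    return {g: float(np.mean(preds[groups == g] == y[groups == g]))
            for g in np.unique(groups)}

# acc_dense  = per_group_accuracy(model.predict,        X_test, y_test, group_labels)
# acc_pruned = per_group_accuracy(pruned_model.predict, X_test, y_test, group_labels)
# Aggregate accuracy can stay flat while the gap between groups widens.
```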

ArXiV: https://arxiv.org/abs/2010.03058

#NN #DL #bias
Towards Causal Representation Learning

Work on how neural networks derive causal variables from low-level observations.

Link: https://arxiv.org/abs/2102.11107

#causallearning #bengio #nn #DL
Deep Neural Nets: 33 years ago and 33 years from now

Great post by Andrej Karpathy on the progress #CV made in 33 years.

The author's ideas on what a time traveler from 2055 would think about the performance of current networks:

* 2055 neural nets are basically the same as 2022 neural nets on the macro level, except bigger.
* Our datasets and models today look like a joke. Both are somewhere around 10,000,000X larger.
* One can train 2022 state-of-the-art models in ~1 minute by training naively on a personal computing device, as a weekend fun project.
* Today's models are not optimally formulated; just by changing some details of the model, loss function, augmentation or the optimizer, one could roughly halve the error.
* Our datasets are too small, and modest gains would come from scaling up the dataset alone.
* Further gains are actually not possible without expanding the computing infrastructure and investing into some R&D on effectively training models on that scale.


Website: https://karpathy.github.io/2022/03/14/lecun1989/
OG Paper link: http://yann.lecun.com/exdb/publis/pdf/lecun-89e.pdf

#karpathy #archeology #cv #nn
๐Ÿ‘48๐Ÿ˜ข4๐Ÿคฎ3๐Ÿฅฐ2๐Ÿ˜2โค1
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

This paper from the #DeepSeek team has generated significant attention in the AI community.

The work addresses the enhancement of reasoning capabilities in Large Language Models (LLMs) through reinforcement learning. The authors introduce a novel framework, DeepSeek-R1, which aims to improve LLM reasoning by building incentives for logical reasoning directly into training. This allows LLMs to go beyond basic linguistic processing and develop sophisticated reasoning that boosts performance across a wide array of complex applications.

This approach has caused a lot of discussion in different communities, but it definitely opens up a whole new direction of research.
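
To make the training signal concrete, here is a toy sketch of the kind of rule-based reward the paper describes (a format reward for the <think>/<answer> structure plus an accuracy reward for the final answer); the actual reward design and the RL training loop are in the paper, not reproduced here.

```python
# Illustrative rule-based reward for RL on reasoning tasks (not the paper's code).
import re

def reward(completion: str, reference_answer: str) -> float:
    r = 0.0
    # Format reward: reasoning inside <think>...</think>, result inside <answer>...</answer>.
    if re.search(r"<think>.*</think>\s*<answer>.*</answer>", completion, re.DOTALL):
        r += 0.5
    # Accuracy reward: the extracted final answer must match the reference.
    m = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
    if m and m.group(1).strip() == reference_answer.strip():
        r += 1.0
    return r

print(reward("<think>2 + 2 = 4</think><answer>4</answer>", "4"))  # 1.5
```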

Source: https://arxiv.org/abs/2501.12948

#nn #LLM

@opendatascience
๐Ÿ‘24โค6