Data Science by ODS.ai 🦜
First Telegram Data Science channel. Covering all technical and popular stuff about anything related to Data Science: AI, Big Data, Machine Learning, Statistics, general Math and the applications of the former. To reach the editors, contact: @malev
🥇 Parameter optimization in neural networks.

Play with three interactive visualizations and develop your intuition for optimizing model parameters.
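
If you prefer code to sliders, here is a minimal sketch (our illustration, not taken from the notes) of two of the update rules such visualizations cover: plain gradient descent and gradient descent with momentum, on a toy quadratic loss.

```python
# Toy comparison of plain gradient descent vs. momentum on f(w) = w^2.
def grad(w):
    return 2 * w  # derivative of w^2

w_gd, w_mom, v = 5.0, 5.0, 0.0
lr, beta = 0.1, 0.9

for step in range(200):
    w_gd -= lr * grad(w_gd)        # vanilla update
    v = beta * v + grad(w_mom)     # momentum accumulates past gradients
    w_mom -= lr * v                # update along the accumulated direction

print(f"plain GD: {w_gd:.6f}, momentum: {w_mom:.6f}")  # both approach the minimum at 0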

Link: https://www.deeplearning.ai/ai-notes/optimization/

#interactive #demo #optimization #parameteroptimization #novice #entrylevel #beginner #goldcontent #nn #neuralnetwork
And the Bit Goes Down: Revisiting the Quantization of Neural Networks

Researchers at Facebook AI Research found a way to compress neural networks with minimal sacrifice in accuracy.

For now it works only on fully connected and convolutional layers.
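
As a rough illustration of the general idea (this is plain codebook quantization with k-means, not the paper's product-quantization method with its activation-aware reconstruction objective):

```python
# Toy codebook quantization of one layer's weights: store a small codebook
# of centroids plus a per-weight index instead of the full float32 tensor.
import numpy as np
from sklearn.cluster import KMeans

W = np.random.randn(256, 128).astype(np.float32)  # stand-in for a fully connected layer

k = 16  # codebook size -> 4-bit indices instead of 32-bit floats
km = KMeans(n_clusters=k, n_init=10).fit(W.reshape(-1, 1))
codes = km.labels_.astype(np.uint8)                         # what you store: indices
codebook = km.cluster_centers_.ravel().astype(np.float32)   # and centroids

W_hat = codebook[codes].reshape(W.shape)  # reconstructed weights at inference
print("reconstruction MSE:", float(np.mean((W - W_hat) ** 2)))
```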

Link: https://arxiv.org/abs/1907.05686

#nn #DL #minimization #compression
27,600 V100 GPUs, 0.5 PB data, and a neural net with 220,000,000 weights

In case you are wondering, all of that was used to tackle a scientific inverse problem in materials imaging.

ArXiV: https://arxiv.org/pdf/1909.11150.pdf

#ItIsNotAboutSize #nn #dl
Applying deep learning to Airbnb search

The story of how the #Airbnb research team moved from #GBDT (gradient boosting) to #NN (neural networks) for search, with all the metrics and hypotheses.

Link: https://blog.acolyer.org/2019/10/09/applying-deep-learning-to-airbnb-search/
Free eBook from Stanford: Introduction to Applied Linear Algebra โ€“ Vectors, Matrices, and Least Squares

The base material you need to understand how neural networks and other #ML algorithms work.
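
For a taste of the least-squares part in code (a tiny sketch, not from the book itself): fit y ≈ Xw by solving the normal equations.

```python
# Least squares: minimize ||X w - y||^2 by solving (X^T X) w = X^T y.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                   # 100 observations, 3 features
w_true = np.array([1.5, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=100)     # noisy measurements

w_hat = np.linalg.solve(X.T @ X, X.T @ y)       # closed-form solution
print(w_hat)                                    # close to w_true
```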

Link: https://web.stanford.edu/~boyd/vmls/

#Stanford #MOOC #WhereToStart #free #ebook #algebra #linalg #NN
Characterising Bias in Compressed Models

Popular compression techniques turned out to amplify bias in deep neural networks.
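
A sketch of the kind of audit this suggests: check accuracy per subgroup before and after compression, not just the aggregate number. (`model`, `pruned_model` and the group labels below are placeholders for illustration, not artifacts from the paper.)

```python
# Compare per-subgroup accuracy of a dense model and its compressed version.
import numpy as np

def per_group_accuracy(predict, X, y, groups):
    preds = predict(X)
    return {g: float(np.mean(preds[groups == g] == y[groups == g]))
            for g in np.unique(groups)}

# acc_dense  = per_group_accuracy(model.predict,        X_test, y_test, group_labels)
# acc_pruned = per_group_accuracy(pruned_model.predict, X_test, y_test, group_labels)
# Aggregate accuracy can stay flat while the gap between groups widens.
```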

ArXiV: https://arxiv.org/abs/2010.03058

#NN #DL #bias
Towards Causal Representation Learning

Work on how neural networks derive causal variables from low-level observations.

Link: https://arxiv.org/abs/2102.11107

#causallearning #bengio #nn #DL
Deep Neural Nets: 33 years ago and 33 years from now

Great post by Andrej Karpathy on the progress #CV made in 33 years.

The author's ideas on what a time traveler from 2055 would think about the performance of current networks:

* 2055 neural nets are basically the same as 2022 neural nets on the macro level, except bigger.
* Our datasets and models today look like a joke. Both are somewhere around 10,000,000X larger.
* One can train 2022 state-of-the-art models in ~1 minute by training naively on a personal computing device, as a weekend fun project.
* Today's models are not optimally formulated; just by changing some details of the model, loss function, augmentation or the optimizer, one could roughly halve the error.
* Our datasets are too small, and modest gains would come from scaling up the dataset alone.
* Further gains are actually not possible without expanding the computing infrastructure and investing into some R&D on effectively training models on that scale.


Website: https://karpathy.github.io/2022/03/14/lecun1989/
OG Paper link: http://yann.lecun.com/exdb/publis/pdf/lecun-89e.pdf

#karpathy #archeology #cv #nn
๐Ÿ‘48๐Ÿ˜ข4๐Ÿคฎ3๐Ÿฅฐ2๐Ÿ˜2โค1
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

This paper from the #DeepSeek team has generated significant attention in the AI community.

The work addresses the enhancement of reasoning capabilities in Large Language Models (LLMs) through reinforcement learning. The authors introduce a novel framework, DeepSeek-R1, which aims to improve LLM reasoning by building incentives for logical reasoning directly into training. This allows LLMs to go beyond basic linguistic processing and develop sophisticated reasoning that boosts performance across a wide array of complex applications.

This approach has caused a lot of discussion in different communities, but it definitely opens up a whole new direction of research.
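
To make the training signal concrete, here is a toy sketch of the kind of rule-based reward the paper describes (a format reward for the <think>/<answer> structure plus an accuracy reward for the final answer); the actual reward design and the RL training loop are in the paper, not reproduced here.

```python
# Illustrative rule-based reward for RL on reasoning tasks (not the paper's code).
import re

def reward(completion: str, reference_answer: str) -> float:
    r = 0.0
    # Format reward: reasoning inside <think>...</think>, result inside <answer>...</answer>.
    if re.search(r"<think>.*</think>\s*<answer>.*</answer>", completion, re.DOTALL):
        r += 0.5
    # Accuracy reward: the extracted final answer must match the reference.
    m = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
    if m and m.group(1).strip() == reference_answer.strip():
        r += 1.0
    return r

print(reward("<think>2 + 2 = 4</think><answer>4</answer>", "4"))  # 1.5
```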

Source: https://arxiv.org/abs/2501.12948

#nn #LLM

@opendatascience
๐Ÿ‘24โค6