Spark in me
Lost like tears in rain. DS, ML, a bit of philosophy and math. No bs or ads.
RTX 3090 + Multi-Instance-GPU

So, ~2x faster than the 2080 Ti, which in turn is ~30% faster than the 1080 Ti.
2x VRAM.

The only real question for me is, will it support Multi-Instance-GPU?

Let me explain why this is important. Usually when you train a network, you increase your batch size to fill the VRAM and monitor your IO and GPU load to ensure saturation.

But if a GPU has 2x the VRAM and is 2-3x faster than a 1080 Ti, then maybe you can host multiple instances of your model on your GPU (this matters only for models that do not scale easily to large batch sizes).
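Back-of-the-envelope, with made-up memory figures (the helper and numbers below are hypothetical, not measurements):

```python
# Sketch: how many independent model instances fit on a card,
# given VRAM per instance. All figures are illustrative.

def instances_that_fit(vram_gb, per_instance_gb, reserve_gb=1.0):
    """Number of model replicas that fit, keeping some VRAM in reserve."""
    usable = vram_gb - reserve_gb
    return max(0, int(usable // per_instance_gb))

# Suppose one training instance (weights + activations + optimizer state)
# needs ~9 GB: a 1080 Ti (11 GB) fits one, a 3090 (24 GB) fits two.
print(instances_that_fit(11, 9))  # -> 1
print(instances_that_fit(24, 9))  # -> 2
```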

The only problem is that:

- You cannot use DDP in PyTorch (usually it is faster than DP for 4+ devices), because:

 DDP processes can be placed on the same machine or across machines, but GPU devices cannot be shared across processes.


- So you would have to invent something, change your code, or maybe even use their bleeding-edge RPC functions;
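The quoted constraint boils down to a strict one-to-one mapping from process rank to device. A minimal illustration (the helper below is hypothetical, not PyTorch API):

```python
# Hypothetical helper illustrating the DDP constraint quoted above:
# each DDP worker process must own its own device; two processes
# cannot share the same physical GPU.

def assign_ranks_to_devices(world_size, device_ids):
    """Map each process rank to its own device, as DDP requires."""
    if world_size > len(device_ids):
        raise ValueError(
            "DDP cannot share a GPU across processes: "
            f"{world_size} ranks but only {len(device_ids)} devices"
        )
    return {rank: device_ids[rank] for rank in range(world_size)}

# 4 processes over 4 physical GPUs is fine:
print(assign_ranks_to_devices(4, [0, 1, 2, 3]))
# 2 processes over 1 physical GPU fails -- which is exactly why
# MIG-style virtual GPUs (if exposed as separate devices) would help.
```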

If this feature is available on the 3090 ... then you could turn your GPU into 2-3 virtual GPUs and use them accordingly. That would be truly epic, especially for production use cases (yeah, I know about their data-center SLA)! It would also be great for team work.

#hardware
Notebooks + Spreadsheets

Notebooks and spreadsheets (Excel or Google Sheets) have always been the two most useful and helpful instruments I have ever used. Whole companies were built on pseudo-relational Excel databases (though ofc this does not scale well).

Now there is a new Python library, ipysheet, that integrates a JS table library seamlessly with ipywidgets and notebooks. It is new and predictably sucks a little bit (as most interactive JS tables do).

It goes without saying that it opens up a lot of possibilities for ML annotation - you can essentially combine tables and ipywidgets easily.

As far as I can see, it does not have an option to embed arbitrary HTML, but an Audio widget recently appeared in ipywidgets (buried somewhere in the release notes).

So you can just use this to load audio into ipysheet:

from ipywidgets import Audio

# read the raw wav bytes and wrap them in the new Audio widget
with open('test.wav', 'rb') as f:
    wavb = f.read()

audio = Audio(value=wavb, format='wav', autoplay=False)

#data_science
Notes from captain obvious:

Comparing two GPUs with Tensor Cores, one of the single best indicators of each GPU's performance is its memory bandwidth;

Most of the computation time on GPUs is spent on memory access;

A100 compared to the V100 is 1.70x faster for NLP and 1.45x faster for computer vision;

3-Slot design of the RTX 3090 makes 4x GPU builds problematic. Possible solutions are 2-slot variants or the use of PCIe extenders;

4x RTX 3090 cards will need more power than any standard power supply unit on the market can provide right now (this is BS, but power connectors may be an issue - I have a 2000W PSU);

With BF16 precision, training might be more stable than with FP16 precision while providing the same speedups;

The new fan design for the RTX 30 series features both a blower fan and a push/pull fan;

350W TDP;

Compared to an RTX 2080 Ti, the RTX 3090 yields a speedup of 1.57x for convolutional networks and 1.5x for transformers while having a 15% higher release price. Thus the Ampere RTX 30s delivers a pretty substantial improvement over the Turing RTX 20s series;

PCIe 4.0 and PCIe lanes do not matter in 2x GPU setups. For 4x GPU setups, they still do not matter much;

NVLink is not useful for ordinary workstations - it is only useful for GPU clusters;

No info about the power connector yet. But I believe the first gaming GPUs will use 2x 6-pin, plus maybe some adapter;

Despite heroic software engineering efforts, AMD GPUs + ROCm will probably not be able to compete with NVIDIA for at least 1-2 years, due to the lacking community and the absence of a Tensor Core equivalent;

You will need 50+ Gbit/s network cards to gain speedups if you want to parallelize across machines;

So if you expect to run deep learning models for more than ~300 days, it is better to buy a desktop than to use AWS spot instances (also, fuck off AWS and Nvidia with your data-center SLAs);
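A quick sanity check of the memory-bandwidth point above, with approximate spec-sheet numbers (quoted from memory, so treat as ballpark):

```python
# Rough check of the "memory bandwidth predicts speedup" claim, using
# approximate spec-sheet bandwidths:
#   RTX 2080 Ti ~616 GB/s, RTX 3090 ~936 GB/s.

bw_2080ti = 616  # GB/s, approximate
bw_3090 = 936    # GB/s, approximate

ratio = bw_3090 / bw_2080ti
print(f"bandwidth ratio: {ratio:.2f}x")  # ~1.52x

# Close to the measured 1.5-1.57x speedups quoted above: for
# memory-bound workloads, bandwidth alone is a decent predictor.
```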
Silero Speech-To-Text Models V1 Released

We are proud to announce that we have released our high-quality (i.e. on par with premium Google models) speech-to-text models for the following languages:

- English
- German
- Spanish

Why this is a big deal:

- STT research is typically focused on huge compute budgets
- Pre-trained models and recipes did not generalize well, were difficult to use even as-is, and relied on obsolete tech
- Until now the STT community lacked easy-to-use, high-quality, production-grade STT models

How we solve it:

- We publish a set of pre-trained high-quality models for popular languages
- Our models are embarrassingly easy to use
- Our models are fast and can be run on commodity hardware
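"Embarrassingly easy" in practice - a sketch of loading a model via Torch Hub. The entry point below is quoted from memory of the repo's documentation, so check the repo for the exact signature:

```python
# Loading a Silero STT model via Torch Hub -- the hub entry point is
# quoted from memory of the linked repo, so verify before relying on it.
import torch

def load_silero_stt(language="en", device="cpu"):
    """Download and return (model, decoder, utils) from Torch Hub."""
    return torch.hub.load(
        repo_or_dir="snakers4/silero-models",
        model="silero_stt",
        language=language,
        device=torch.device(device),
    )

if __name__ == "__main__":
    # Downloads weights on first call, so guarded behind __main__.
    model, decoder, utils = load_silero_stt()
```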

Even if you do not work with STT, please give us a star / share!

Links

- https://github.com/snakers4/silero-models
Silero Models on Torch Hub

TLDR -
https://pytorch.org/hub/snakers4_silero-models_stt/

Also Soumith Chintala himself commented on this release.

PS
Upvote on HackerNews
https://news.ycombinator.com/item?id=24565831
2020 DS / ML Digest 10

Highlights:

- Silero STT models release on Torch hub
- Oculus Quest 2, now $100 cheaper at $300
- Nvidia 30?0 close to release (see above a detailed commentary on Tim's post)
- Fast.ai book
- Benchmarking Deep Learning Optimizers
- Language-Agnostic BERT Sentence Embedding + new Transformer LASER
- Are we done with ImageNet?

Please like / share / repost!

https://spark-in.me/post/2020_ds_ml_digest_11

#digest
Microsoft ... Stepping up its Game in ML?

Wow, wow, do not close yet! I am a big MS-hater myself.

Well, it looks like an outsider / legacy player in cutting-edge tech / ML may, with a series of well-placed decisions, earn its place under the ML sun?

Yes, you heard me right. I am a big Microsoft hater, but just check this out:

- https://github.com/microsoft/onnxjs
- https://github.com/microsoft/onnxruntime#binaries
- https://github.com/microsoft/DeepSpeed
- ... OpenAI deal is just for hype I guess, no-one takes OpenAI seriously, right? ( ͡° ͜ʖ ͡°)
- Also, I recently used Azure datasets ... it was clunky compared to S3, but beggars cannot be choosers. Download speeds were slow, but their VS Code-like desktop app was OK ... though some features just did not work

It used to be a standard narrative that "TF = production". But I guess a more correct one would be "Google has invested billions in marketing and has a huge captive audience".

Lately I spent some time reading TF tutorials ... and they are so needlessly difficult - they fucking invent a protocol for everything! What PyTorch Hub achieves in 4 steps, TF Hub requires you to read 10 markdown docs for ... written in corporate language.

So, why is this important? Because proper competition makes everything shine brighter.

Why TF 2.0? Because PyTorch 1.0 ( ͡° ͜ʖ ͡°). Now it looks like Google and Nvidia have real new competitors in the ML inference market, together with Intel (which afaik is losing in general, but that is another story).

Nice!

#deep_learning
#machine_learning
#rant
Also about competition.

.... why does Microsoft of all people want to train an effing TRILLION parameter transformer?

... ( ͡° ͜ʖ ͡°) because they license a 100bn one from another company.

PS
I may be wrong in the exact figures.
Forwarded from NVIDIA Inception
NVIDIA TALK "Fast training with AMP/TF32 using TensorCores on NVIDIA GPU" at Data Fest + Q&A SESSION

Denis Timonin, AI Solutions Architect at NVIDIA, will talk about one of the most effective ways to speed up neural network training and inference - mixed precision. In his talk Denis will walk through the "Mixed Precision Training" paper by NVIDIA and Baidu Research and cover the details of working with the TensorFloat32 format. We will also discuss the algorithms used in mixed-precision training and the hardware that makes these data formats fast in neural networks.
In the first part of the talk we will cover floating-point numbers, the motivation behind mixed-precision training, and Tensor Cores, and we will train a complex neural network - StarGAN V2 (CVPR 2020) - in Automatic Mixed Precision (AMP) mode.
In the second part we will dive into getting the most out of Tensor Cores: tricks for fast training in high-level frameworks and the C++ API, and how to pick the right data and layer sizes for the fastest training.

The talk is in English.

The talk is already available on the ODS YouTube channel: https://bit.ly/3kPAvPA

The Q&A session will take place on Saturday, September 26, from 12:00 to 14:00 here: https://spatial.chat/s/ods The access password is available here: https://bit.ly/2GbDB1j
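For the curious, the mixed-precision idea from the talk can be sketched with PyTorch's autocast API (CPU + bfloat16 here so it runs without a GPU; on CUDA you would use FP16 autocast plus a GradScaler):

```python
# Mixed precision in one screen: ops inside the autocast region run in a
# lower-precision dtype where it is safe, full FP32 elsewhere.
import torch

x = torch.randn(8, 16)
linear = torch.nn.Linear(16, 4)

with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    # The matmul inside linear() should run in bfloat16 in this region.
    y = linear(x)

print(y.shape, y.dtype)
```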