An autonomous AI racecar using NVIDIA Jetson Nano
Data science usually means desk work; only rare cases involve physical interaction. This kit by #NVidia lets you build a $400–$600 toy car capable of #selfdriving.
#JetRacer comes with a couple of examples to get you up and running. The examples are Jupyter Notebooks — interactive documents that combine text, code, and visualization. Once you've completed the notebooks, start tweaking them to create your own racing software!
Github: https://github.com/NVIDIA-AI-IOT/jetracer
#autonomousvehicle #rl #jupyter #physical
OpenCV ‘dnn’ with NVIDIA GPUs: 1,549% faster YOLO, SSD, and Mask R-CNN
- Object detection and segmentation
- Working Python implementations of each
- Includes pre-trained models
tutorial: https://t.co/Wt0IrJObcE?amp=1
#OpenCV #dl #nvidia
Nvidia AI Noise Reduction
#Nvidia launches a #KrispAI competitor: AI-powered noise reduction on RTX video cards.
It seems to work significantly better than other tools of this kind, but it officially requires an Nvidia RTX card.
It is possible, however, to run it on older cards: instructions are linked below, or you can just download an already-patched executable (also below).
Setup Guide: https://www.nvidia.com/en-us/geforce/guides/nvidia-rtx-voice-setup-guide/
The instruction: https://forums.guru3d.com/threads/nvidia-rtx-voice-works-without-rtx-gpu-heres-how.431781/
Executable (use at your own risk): https://mega.nz/file/CJ0xDYTB#LPorY_aPVqVKfHqWVV7zxK8fNfRmxt6iw6KdkHodz1M
#noisereduction #soundlearning #dl #noise #sound #speech #nvidia
Learning to Simulate Dynamic Environments with GameGAN
#Nvidia designed a GAN that is able to recreate games without any game engine. To train it, the authors use experience collected by reinforcement learning and other techniques.
GameGAN successfully reconstructed all the mechanics of #Pacman. Moreover, the trained model can generate new mazes that never appeared in the original game. It can even replace the background (static objects) and foreground (dynamic objects) with different images!
As the authors say, applying reinforcement learning algorithms to real-world tasks requires an accurate simulation of that task. Designing such simulations by hand is currently expensive and time-consuming; using neural networks instead may help solve these problems.
Paper: https://cdn.arstechnica.net/wp-content/uploads/2020/05/Nvidia_GameGAN_Research.pdf
Blog: https://blogs.nvidia.com/blog/2020/05/22/gamegan-research-pacman-anniversary/
Github Page: https://nv-tlabs.github.io/gameGAN/
#GAN #RL
Nvidia announced a new card: the RTX 3090
The RTX 3090 is roughly 2 times more powerful than the 2080.
There is probably no point in getting the 3080, because its RAM is only 10 GB.
But what really matters is how it was presented: a purely technological product aimed mostly at professionals, tech enthusiasts, and gamers was presented with absolute brilliance. That is much more exciting than the release itself.
YouTube: https://www.youtube.com/watch?v=E98hC9e__Xs
#Nvidia #GPU #techstack
#NVidia performance per dollar
NVidia released a technology to change face alignment in video
Nvidia has unveiled AI face-alignment that means you're always looking at the camera during video calls. Its new Maxine platform uses GANs to reconstruct the unseen parts of your head — just like a deepfake.
Link: https://www.theverge.com/2020/10/5/21502003/nvidia-ai-videoconferencing-maxine-platform-face-gaze-alignment-gans-compression-resolution
#NVidia #deepfake #GAN
Unsupervised 3D Neural Rendering of Minecraft Worlds
Work on an unsupervised neural rendering framework for generating photorealistic images of Minecraft (or any large 3D block world).
Why this is cool: this is a step towards better graphics for games.
Project Page: https://nvlabs.github.io/GANcraft/
YouTube: https://www.youtube.com/watch?v=1Hky092CGFQ&t=2s
#GAN #Nvidia #Minecraft
14 seconds of #Nvidia CEO's April keynote speech were generated in silico
Why this is important: demand for the 3080 and newer GPU models might also be driven by CGI artists and researchers working in VR/AR tech.
It also raises the bar for #speechsynthesis / #speechgeneration, and definitely for photorealistic rendering.
YouTube making of video: https://www.youtube.com/watch?v=1qhqZ9ECm70&t=1430s
Vice article on the subject: https://www.vice.com/en/article/88nbpa/nvidia-reveals-its-ceo-was-computer-generated-in-keynote-speech
🔥Alias-Free Generative Adversarial Networks (StyleGAN3) release
The king is dead, long live the king! #StyleGAN2 was #SOTA and the de-facto standard for image generation. #Nvidia has released an updated version, which will lead to more realistic images generated by the community.
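As background (not part of this release): StyleGAN-family generators trade sample diversity for fidelity with the truncation trick, pulling intermediate latents toward the average latent. A minimal numpy sketch of that idea — illustrative only, not Nvidia's code:

```python
import numpy as np

def truncate(w, w_avg, psi=0.7):
    """Truncation trick: interpolate a latent w toward the average latent w_avg.
    psi=1.0 leaves w unchanged (max diversity); psi=0.0 collapses to w_avg."""
    return w_avg + psi * (w - w_avg)

w_avg = np.zeros(4)                  # stand-in for the tracked mean latent
w = np.array([1.0, -2.0, 0.5, 3.0])  # stand-in for a mapped latent vector
print(truncate(w, w_avg, psi=0.5))   # [ 0.5  -1.    0.25  1.5 ]
```

In the official repos this corresponds to the `truncation_psi` knob exposed by the generation scripts.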
Article: https://nvlabs.github.io/stylegan3/
GitHub: https://github.com/NVlabs/stylegan3
Colab: https://colab.research.google.com/drive/1BXNHZBai-pXtP-ncliouXo_kUiG1Pq7M
#GAN #dl
EditGAN: High-Precision Semantic Image Editing
Nvidia researchers built an approach for editing segments of a picture, with supposedly real-time image updates following the segment alterations. No demo is available yet, though.
Photoshop power users can relax: the appearance of such tools means less work for them, not that demand for manual retouching will cease.
Website: https://nv-tlabs.github.io/editGAN/
ArXiV: https://arxiv.org/abs/2111.03186
#GAN #Nvidia
🔥 Say Goodbye to LoRA, Hello to DoRA 🤩🤩
DoRA consistently outperforms LoRA across tasks (LLM, LVLM, etc.) and backbones (LLaMA, LLaVA, etc.)
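The core idea: DoRA decomposes a pretrained weight into magnitude and direction, trains the magnitude directly, and updates the direction with a LoRA branch. A tiny numpy sketch of the merged weight — my reading of the paper's formulation, not the official NVlabs code:

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 6, 4, 2

W0 = rng.standard_normal((d_out, d_in))        # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01      # low-rank factors: only A, B, m train
B = np.zeros((d_out, r))                       # B starts at zero, as in LoRA
m = np.linalg.norm(W0, axis=0, keepdims=True)  # magnitude, init = column norms of W0

V = W0 + B @ A                                 # the LoRA branch updates the direction
W = m * V / np.linalg.norm(V, axis=0, keepdims=True)  # merged weight

# With B = 0, the merged weight reproduces W0 exactly.
print(np.allclose(W, W0))  # True
```

This also shows why initialization is safe: training starts from the pretrained behavior, and only the magnitude/direction decomposition changes what gradient updates can express.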
[Paper] https://arxiv.org/abs/2402.09353
[Code] https://github.com/NVlabs/DoRA
#Nvidia
#icml #PEFT #lora #ML #ai
@opendatascience
Forwarded from Machinelearning
Sana is a family of models for generating images at resolutions up to 4096x4096 pixels. Sana's main advantage is fast inference with low resource requirements: the models can run even on a laptop.
The secret of Sana's efficiency is its architecture, which consists of several innovative components:
- A deep-compression autoencoder that compresses the image 32x, drastically reducing the number of latent tokens, which in turn improves training efficiency and makes 4K generation possible.
- Linear attention in place of conventional attention, speeding up 4K generation by 1.7x.
- A Linear DiT in which the MLP-FFN module is replaced by Mix-FFN, combining a 3x3 convolution with a Gated Linear Unit (GLU); Mix-FFN makes it possible to drop positional encoding without losing quality.
- A text encoder based on the Gemma LLM, which understands user prompts better and conveys their meaning more precisely during generation. For precise text-image alignment, the encoder was trained with "complex human instructions" (CHI), which taught Gemma to take the prompt's context into account.
Sana was built with a distinctive training and sampling strategy. During training, several VLMs (VILA, InternVL2) produce multiple captions for each image, and the most suitable text-image pairs are then selected by CLIP score.
Training proceeded progressively, from 512x512 resolution up to 4096x4096, while the Flow-DPM-Solver algorithm sped up sampling by reducing the number of steps compared to Flow-Euler-Solver.
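The linear attention used in Sana's Linear DiT can be sketched generically: instead of the O(N²) softmax over all token pairs, a feature map φ lets you precompute φ(K)ᵀV once and reuse it for every query. An illustrative numpy version with a ReLU feature map (Sana's exact formulation may differ):

```python
import numpy as np

def linear_attention(Q, K, V, eps=1e-6):
    """Kernelized attention: out_i = phi(q_i) @ (phi(K).T @ V) / (phi(q_i) @ sum_j phi(k_j)).
    Cost is linear in sequence length N, not quadratic."""
    phi = lambda x: np.maximum(x, 0.0)  # simple non-negative feature map
    Qf, Kf = phi(Q), phi(K)
    KV = Kf.T @ V                       # (d, d_v), shared across all queries
    Z = Kf.sum(axis=0)                  # (d,) normalizer term
    return (Qf @ KV) / (Qf @ Z + eps)[:, None]

rng = np.random.default_rng(1)
N, d = 8, 4
out = linear_attention(rng.standard_normal((N, d)),
                       rng.standard_normal((N, d)),
                       rng.standard_normal((N, d)))
print(out.shape)  # (8, 4)
```

At 4K resolution the token count N is huge, which is exactly where dropping the N×N attention matrix pays off.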
Sana's benchmark results are impressive.
⚠️ Local inference requires 9 GB of VRAM for the 0.6B model and 12 GB for the 1.6B model.
# official online demo
DEMO_PORT=15432 \
python app/app_sana.py \
--config=configs/sana_config/1024ms/Sana_1600M_img1024.yaml \
--model_path=hf://Efficient-Large-Model/Sana_1600M_1024px/checkpoints/Sana_1600M_1024px.pth
@ai_machinelearning_big_data
#AI #ML #Diffusion #SANA #NVIDIA