Google announced the updated YouTube-8M dataset
The updated set now includes a subset with verified labels at the 5-second segment level, along with the 3rd Large-Scale Video Understanding Challenge and Workshop at #ICCV19.
Link: https://ai.googleblog.com/2019/06/announcing-youtube-8m-segments-dataset.html
#Google #YouTube #CV #DL #Video #dataset
Simultaneous food and facial recognition at a Foxconn factory canteen in Shenzhen, China
#video #foodlearning #facerecognition #dl #cv #foxconn
Facebook open sourced video alignment algorithms that detect identical and near identical videos to build more robust defenses against harmful visual content.
Project page: https://newsroom.fb.com/news/2019/08/open-source-photo-video-matching/
Code: https://github.com/facebookresearch/videoalignment
#Facebook #video #cv #dl
Deep Fake Challenge by Facebook team
#Facebook launches a competition to fight deep fakes. Ironically, the results of this competition will likely also be used to create better fakes, to the delight of anyone wishing to watch The Matrix starring Bruce Lee, as well as for more questionable deep-fake applications.
Link: https://ai.facebook.com/blog/deepfake-detection-challenge/
#deepfake #video #cv #dl
FSGAN: Subject Agnostic Face Swapping and Reenactment
New paper on #DeepFakes creation
YouTube demo: https://www.youtube.com/watch?v=duo-tHbSdMk
Link: https://nirkin.com/fsgan/
ArXiV: https://arxiv.org/pdf/1908.05932.pdf
#FaceSwap #DL #Video #CV
How Tesla's self-driving AI sees the world
#Tesla #selfdriving #cv #dl #video #Autonomous
Barack Obama's deep fake video used as the intro to MIT 6.S191 class
A brilliant idea to capture the students' attention and to demonstrate, at the very beginning of the course, one of the applications of the material they are about to study.
YouTube: https://www.youtube.com/watch?v=l82PxsKHxYc
#DL #DeepFake #MIT #video
Barack Obama: Intro to Deep Learning | MIT 6.S191
MIT Introduction to Deep Learning 6.S191 (2020)
DISCLAIMER: The following video is synthetic and was created using deep learning with simultaneous speech-to-speech translation as well as video dialogue replacement (CannyAI).
Castle in the Sky
Dynamic Sky Replacement and Harmonization in Videos
Fascinating and ready for practical use (with a Colab notebook).
The authors propose a method for replacing the sky in videos that works well at high resolution. The results are very impressive: the method runs in real time and produces video almost without glitches and artifacts. It can also add effects such as lightning and glow to the target video.
The pipeline is quite involved and consists of several components:
- A sky matting network for pixel-wise sky segmentation on video frames
- A motion estimator for objects in the sky
- A skybox blending module, where the sky and the rest of the scene are relit and recolored
In a nutshell, the authors propose a new framework for sky augmentation in outdoor videos. The solution is purely vision-based and can be applied in both online and offline scenarios.
But let's take a closer look.
The sky matting module is a ResNet-like encoder with a several-layer upsampling decoder that solves pixel-wise sky segmentation, followed by a refinement stage with guided image filtering.
A motion estimator directly estimates the motion of the objects in the sky. The motion patterns are modeled by an affine matrix and optical flow.
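As a rough illustration of the affine part of this motion model (a hypothetical NumPy sketch, not the authors' code), a 2x3 affine matrix estimated between frames can be used to warp the sky template by inverse mapping:

```python
import numpy as np

def warp_affine(image, M, out_shape):
    """Warp an image with a 2x3 affine matrix using inverse
    mapping and nearest-neighbour sampling (pure NumPy)."""
    h, w = out_shape
    ys, xs = np.mgrid[0:h, 0:w]
    ones = np.ones_like(xs)
    coords = np.stack([xs, ys, ones]).reshape(3, -1)  # homogeneous pixel grid
    # Invert the transform: for each output pixel, find its source location.
    A = np.vstack([M, [0, 0, 1]])
    src = np.linalg.inv(A)[:2] @ coords
    sx = np.clip(np.round(src[0]).astype(int), 0, image.shape[1] - 1)
    sy = np.clip(np.round(src[1]).astype(int), 0, image.shape[0] - 1)
    return image[sy, sx].reshape(h, w)

# Pure horizontal translation by 3 pixels: the sky drifts to the right.
sky = np.arange(25.0).reshape(5, 5)
M = np.array([[1.0, 0.0, 3.0],
              [0.0, 1.0, 0.0]])
warped = warp_affine(sky, M, (5, 5))
```

In the paper the affine motion handles the slow global drift of the sky, while optical flow covers local deformations.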
The sky image blending module is a decoder that models a linear combination of the target sky matte and the aligned sky template.
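The blending step boils down to a per-pixel linear combination weighted by the predicted sky matte. A minimal sketch (hypothetical names, not the authors' implementation):

```python
import numpy as np

def blend_sky(frame, sky_template, alpha):
    """Per-pixel linear blend: alpha=1 takes the sky template pixel,
    alpha=0 keeps the original frame pixel."""
    alpha = alpha[..., None]  # broadcast the matte over colour channels
    return alpha * sky_template + (1.0 - alpha) * frame

frame = np.zeros((2, 2, 3))       # original frame (black)
template = np.ones((2, 2, 3))     # target sky (white)
alpha = np.array([[1.0, 0.5],     # predicted sky matte
                  [0.0, 1.0]])
out = blend_sky(frame, template, alpha)
```

The soft matte values around the horizon are what let the replaced sky blend without hard seams.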
Overall, the network architecture is a ResNet-50 encoder and a decoder with CoordConv upsampling layers and skip connections, implemented in PyTorch.
The result is presented in a very cool video https://youtu.be/zal9Ues0aOQ
site: https://jiupinjia.github.io/skyar/
paper: https://arxiv.org/abs/2010.11800
github: https://github.com/jiupinjia/SkyAR
#sky #CV #video #cool #resnet
Preprint: Castle in the Sky: Zhengxia Zou, Dynamic Sky Replacement and Harmonization in Videos, 2020.
Forwarded from Machinelearning
Tencent released HunyuanCustom, a framework that not only generates video from given conditions but also preserves subject consistency, whether the subject is a person, an animal, or an object. The model handles even multi-subject scenes: in the demo clips, people interact naturally with objects, and text on packaging does not drift between frames.
At the core of the model is an improved mechanism for fusing text and images via LLaVA. For example, if you upload a photo of a woman in a dress and the text "dancing in the rain", the system analyzes both inputs, linking the description to the visual details.
The key component, though, is the temporal concatenation module: it "stretches" image features along the temporal axis of the video using a 3D-VAE. This helps avoid "jumping" faces or sudden background shifts, a problem that plagues even top video-generation models.
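The "stretching" of image features along the time axis can be pictured as repeating a single-image latent across frames before it is concatenated with the video latents. A toy NumPy sketch (shapes and the function name are assumptions, not Tencent's code):

```python
import numpy as np

def temporal_stretch(image_latent, num_frames):
    """Repeat a single-image latent along a new time axis so it can be
    concatenated with video latents of shape (C, T, H, W)."""
    # image_latent: (C, H, W)  ->  (C, T, H, W)
    return np.repeat(image_latent[:, None], num_frames, axis=1)

latent = np.random.rand(4, 8, 8)               # identity features of the subject
video_like = temporal_stretch(latent, num_frames=16)
```

Because every frame sees the same identity features, the subject's appearance stays anchored across the clip.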
Tencent also reworked the audio pipeline. To synchronize sound with lip movements or on-screen actions, HunyuanCustom uses AudioNet, a module that aligns audio and video features via spatial cross-attention.
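Cross-attention between audio and video features can be sketched as a single attention head in NumPy (a toy illustration only; AudioNet's actual architecture is not reproduced here):

```python
import numpy as np

def cross_attention(video_tokens, audio_tokens):
    """Single-head cross-attention: video tokens query audio tokens.
    video_tokens: (Nv, d), audio_tokens: (Na, d) -> (Nv, d)."""
    d = video_tokens.shape[-1]
    scores = video_tokens @ audio_tokens.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # softmax over audio tokens
    return weights @ audio_tokens                  # audio-informed video features

rng = np.random.default_rng(0)
video = rng.normal(size=(6, 32))   # e.g. spatial tokens of one frame
audio = rng.normal(size=(10, 32))  # e.g. audio tokens for the same time window
fused = cross_attention(video, audio)
```

Each video token ends up as a weighted mixture of the audio tokens, which is what lets sound drive the mouth region without disturbing the rest of the frame.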
The framework also supports replacing an object in an existing clip (say, swapping in a new sneaker model in an ad): the model compresses the source video into latent space, aligns it with the noised data, and integrates the changes without artifacts at the boundaries.
Experiments show that HunyuanCustom beats its competitors on key metrics. For example, Tencent's Face-Sim score (face identity preservation) is 0.627 versus 0.526 for Hailuo, and the gap to Keling, Vidu, Pika, and Skyreels is even larger.
Note: the model requires at least 24 GB of VRAM for 720p clips, but to unlock its full capabilities the developers recommend 80 GB of VRAM.
The code and checkpoints are already publicly available, and the repository includes launch examples both for multi-GPU setups and for a memory-saving mode on consumer GPUs.
@ai_machinelearning_big_data
#AI #ML #Video #HunyuanCustom #Tencent