AI with Papers - Artificial Intelligence & Deep Learning
15.5K subscribers
145 photos
256 videos
14 files
1.35K links
All the AI with papers. Fresh daily updates on #DeepLearning, #MachineLearning, LLMs and #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#artificialintelligence #machinelearning #ml #AI
♊️ Doppelgangers in Structures ♊️

👉A novel learning-based approach for visual disambiguation: distinguishing illusory matches to produce correct, disambiguated #3D reconstructions

😎Review https://t.ly/9yLot
😎Paper arxiv.org/pdf/2309.02420.pdf
😎Code github.com/RuojinCai/Doppelgangers
😎Project doppelgangers-3d.github.io/
🔥8👍3🤯2👏1
🧄FreeMan: towards #3D Humans 🧄

👉FreeMan: the first large-scale, real-world, multi-view dataset for #3D human pose estimation. 11M frames!

😎Review https://t.ly/ICxpA
😎Paper arxiv.org/pdf/2309.05073.pdf
😎Project wangjiongw.github.io/freeman
👏6🤯4🥰1
Decaf: 3D Face-Hand Interactions

👉The first learning-based MoCap method to track human hands interacting with human faces in #3D from a single monocular RGB video

😎Review https://t.ly/070Tj
😎Paper arxiv.org/pdf/2309.16670.pdf
😎Project vcai.mpi-inf.mpg.de/projects/Decaf
👍8🤯8🔥31👏1
🧊 Depth Conditioning 🧊

👉LooseControl brings generalized depth conditioning to generative image modeling: scene layout is controlled by boundaries, and objects by #3D boxes (approximate bounding boxes at object locations)

👉Review https://t.ly/9y72m
👉Paper https://arxiv.org/pdf/2312.03079.pdf
👉Project https://shariqfarooq123.github.io/loose-control/
👉Repo https://github.com/shariqfarooq123/LooseControl
🔥146🤯4👍1🥰1
🪮HAAR: Text-Driven Generative Hairstyles🪮

👉 HAAR: a new strand-based generative model for #3D human hairstyles driven by textual input.

👉Review https://t.ly/L38iD
👉Project https://haar.is.tue.mpg.de/
👉Paper https://arxiv.org/pdf/2312.11666.pdf
👉Repo coming
🤯4🍾3👍2🔥1
😻 GARField: Group Anything 😻

👉 GARField is a novel approach for decomposing #3D scenes into a hierarchy of semantically meaningful groups from posed image inputs.

👉Review https://t.ly/6Hkeq
👉Paper https://lnkd.in/d28mfRcZ
👉Project https://lnkd.in/dzYdRNKy
👉Repo (coming) https://lnkd.in/d2VeRJCS
👍83🥰1🤩1
🍦Geometry Guided Depth🍦

👉Depth estimation and #3D reconstruction that can take as input, where available, previously made estimates of the scene’s geometry

👉Review https://lnkd.in/dMgakzWm
👉Paper https://arxiv.org/pdf/2406.18387
👉Repo (empty) https://github.com/nianticlabs/DoubleTake
👍7🔥71🥰1
🧤GigaHands: Massive #3D Hands🧤

👉Novel massive #3D bimanual activities dataset: 34 hours of activities, 14k hand motion clips paired with 84k text annotations, 183M+ unique hand images

👉Review https://t.ly/SA0HG
👉Paper www.arxiv.org/pdf/2412.04244
👉Repo github.com/brown-ivl/gigahands
👉Project ivl.cs.brown.edu/research/gigahands.html
7👍1🤩1
❤️‍🔥 Uncommon object in #3D ❤️‍🔥

👉#META releases uCO3D, a new object-centric dataset for 3D AI: the largest publicly available collection of HD videos of objects with 3D annotations, ensuring full 360° coverage. Code & data under CCA 4.0💙

👉Review https://t.ly/Z_tvA
👉Paper https://arxiv.org/pdf/2501.07574
👉Project https://uco3d.github.io/
👉Repo github.com/facebookresearch/uco3d
112😍2👍1👏1🤩1🍾1
🏓LATTE-MV: #3D Table Tennis🏓

👉UC Berkeley unveils at #CVPR2025 a novel system that reconstructs table tennis gameplay in 3D from monocular video, with an uncertainty-aware controller that anticipates opponent actions. Code & dataset announced, to be released💙

👉Review https://t.ly/qPMOU
👉Paper arxiv.org/pdf/2503.20936
👉Project sastry-group.github.io/LATTE-MV/
👉Repo github.com/sastry-group/LATTE-MV
🔥8👍2👏1🤯1
🧊BoxDreamer Object Pose🧊

👉BoxDreamer is a generalizable RGB-based approach for #3D object pose estimation in the wild, specifically designed to address challenges in sparse-view settings. Code coming, demo released💙

👉Review https://t.ly/e-vX9
👉Paper arxiv.org/pdf/2504.07955
👉Project https://lnkd.in/djz8jqn9
👉Repo https://lnkd.in/dfuEawSA
🤗Demo https://lnkd.in/dVYaWGcS
3🔥3👏2👍1
🥊 Pose in Combat Sports 🥊

👉A novel SOTA framework for accurate physics-based #3D human pose estimation in combat sports with a sparse multi-camera setup. Dataset to be released soon💙

👉Review https://t.ly/EfcGL
👉Paper https://lnkd.in/deMMrKcA
👉Project https://lnkd.in/dkMS_UrH
👍13🔥43🤯2
🍏PartField #3D Part Segmentation🍏

👉#Nvidia unveils PartField, a feedforward approach for learning part-based 3D features that capture the general concept of parts and their hierarchy. Suitable for single-shape decomposition, co-segmentation, correspondence & more. Code & models released under Nvidia License💙

👉Review https://t.ly/fGb2O
👉Paper https://lnkd.in/dGeyKSzG
👉Code https://lnkd.in/dbe57XGH
👉Project https://lnkd.in/dhEgf7X2
2🔥2🤯2