AI with Papers - Artificial Intelligence & Deep Learning
15.6K subscribers
145 photos
260 videos
14 files
1.36K links
All the AI with papers. Every day fresh updates about #DeepLearning #MachineLearning #LLM & #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#AI #chatGPT
🪰 #3D Auto-Reconstruction 🪰

👉AutoRecon: automated discovery & reconstruction of objects from multi-view pics.

😎Review https://bit.ly/3MxI0f4
😎Paper arxiv.org/pdf/2305.08810.pdf
😎Project zju3dv.github.io/autorecon/
😎Code github.com/zju3dv/AutoRecon
👁️ Scene Five: Through Her Eyes 👁️

👉 #3D scene reconstruction of what a person is observing, using only the reflections in their eyes

😎Review https://t.ly/uBO6
😎Paper arxiv.org/pdf/2306.09348.pdf
😎Project https://world-from-eyes.github.io/
👩‍🚀 HD Avatar via Text & Pose 👩‍🚀

👉 Generating expressive #3D avatars from nothing but text descriptions & pose guidance

😎Review https://t.ly/wrSMH
😎Paper arxiv.org/pdf/2308.03610.pdf
😎Project avatarverse3d.github.io
🌵POCO: 3D HPS + Confidence🌵

👉 Novel framework for human pose and shape (HPS) estimation: #3D human body + confidence in a single feed-forward pass

😎Review https://t.ly/cDePe
😎Paper arxiv.org/pdf/2308.12965.pdf
😎Project https://poco.is.tue.mpg.de
♊️ Doppelgangers in Structures ♊️

👉A novel learning-based approach for visual disambiguation: distinguishing illusory matches to produce correct, disambiguated #3D reconstructions

😎Review https://t.ly/9yLot
😎Paper arxiv.org/pdf/2309.02420.pdf
😎Code github.com/RuojinCai/Doppelgangers
😎Project doppelgangers-3d.github.io/
🧄FreeMan: towards #3D Humans 🧄

👉FreeMan: the first large-scale, real-world, multi-view dataset for #3D human pose estimation. 11M frames!

😎Review https://t.ly/ICxpA
😎Paper arxiv.org/pdf/2309.05073.pdf
😎Project wangjiongw.github.io/freeman
☕Decaf: 3D Face-Hand Interactions☕

👉The first learning-based MoCap to track human hands interacting with human faces in #3D from single monocular RGB videos

😎Review https://t.ly/070Tj
😎Paper arxiv.org/pdf/2309.16670.pdf
😎Project vcai.mpi-inf.mpg.de/projects/Decaf
🧊 Depth Conditioning 🧊

👉LooseControl controls the generative image modeling process: layout via boundaries and #3D box control via object locations (approximate bounding boxes)

👉Review https://t.ly/9y72m
👉Paper https://arxiv.org/pdf/2312.03079.pdf
👉Project https://shariqfarooq123.github.io/loose-control/
👉Repo https://github.com/shariqfarooq123/LooseControl
🪮HAAR: Text-Driven Generative Hairstyles🪮

👉 HAAR: new strand-based generative model for #3D human hairstyles driven by textual input.

👉Review https://t.ly/L38iD
👉Project https://haar.is.tue.mpg.de/
👉Paper https://arxiv.org/pdf/2312.11666.pdf
👉Repo coming
😻 GARField: Group Anything 😻

👉 GARField is a novel approach for decomposing #3D scenes into a hierarchy of semantically meaningful groups from posed image inputs.

👉Review https://t.ly/6Hkeq
👉Paper https://lnkd.in/d28mfRcZ
👉Project https://lnkd.in/dzYdRNKy
👉Repo (coming) https://lnkd.in/d2VeRJCS
๐ŸฆGeometry Guided Depth๐Ÿฆ

๐Ÿ‘‰Depth and #3D reconstruction which can take as input, where available, previously-made estimates of the sceneโ€™s geometry

๐Ÿ‘‰Review https://lnkd.in/dMgakzWm
๐Ÿ‘‰Paper https://arxiv.org/pdf/2406.18387
๐Ÿ‘‰Repo (empty) https://github.com/nianticlabs/DoubleTake
🧤GigaHands: Massive #3D Hands🧤

👉Novel massive #3D bimanual activities dataset: 34 hours of activities, 14k hand motion clips paired with 84k text annotations, 183M+ unique hand images

👉Review https://t.ly/SA0HG
👉Paper www.arxiv.org/pdf/2412.04244
👉Repo github.com/brown-ivl/gigahands
👉Project ivl.cs.brown.edu/research/gigahands.html
❤️‍🔥 Uncommon object in #3D ❤️‍🔥

👉#META releases uCO3D, a new object-centric dataset for 3D AI. It is the largest publicly available collection of HD videos of objects with 3D annotations, and it ensures full 360° coverage. Code & data under CCA 4.0💙

👉Review https://t.ly/Z_tvA
👉Paper https://arxiv.org/pdf/2501.07574
👉Project https://uco3d.github.io/
👉Repo github.com/facebookresearch/uco3d
🏓LATTE-MV: #3D Table Tennis🏓

👉UC Berkeley unveils at #CVPR2025 a novel system for reconstructing monocular video of table tennis in 3D, with an uncertainty-aware controller that anticipates opponent actions. Code & Dataset announced, to be released💙

👉Review https://t.ly/qPMOU
👉Paper arxiv.org/pdf/2503.20936
👉Project sastry-group.github.io/LATTE-MV/
👉Repo github.com/sastry-group/LATTE-MV
🧊BoxDreamer Object Pose🧊

👉BoxDreamer is a generalizable RGB-based approach for #3D object pose estimation in the wild, specifically designed to address challenges in sparse-view settings. Code coming, demo released💙

👉Review https://t.ly/e-vX9
👉Paper arxiv.org/pdf/2504.07955
👉Project https://lnkd.in/djz8jqn9
👉Repo https://lnkd.in/dfuEawSA
🤗Demo https://lnkd.in/dVYaWGcS
🥊 Pose in Combat Sports 🥊

👉A novel SOTA framework for accurate physics-based #3D human pose estimation in combat sports with a sparse multi-camera setup. Dataset to be released soon💙

👉Review https://t.ly/EfcGL
👉Paper https://lnkd.in/deMMrKcA
👉Project https://lnkd.in/dkMS_UrH
PartField #3D Part Segmentation

👉#Nvidia unveils PartField, a feed-forward approach for learning part-based 3D features that captures the general concept of parts and their hierarchy. Suitable for single-shape decomposition, co-segmentation, correspondence & more. Code & Models released under Nvidia License💙

👉Review https://t.ly/fGb2O
👉Paper https://lnkd.in/dGeyKSzG
👉Code https://lnkd.in/dbe57XGH
👉Project https://lnkd.in/dhEgf7X2