Geometry-Aware 4D Head
GeoDiff4D is a novel framework that reconstructs animatable 4D head avatars from a single portrait image through geometry-aware diffusion. Code announced
Review https://t.ly/J9L-t
Paper https://lnkd.in/ddpv-78g
Project https://lnkd.in/d-vhukyj
Repo https://lnkd.in/dzd6mnFv

Fully Offline Mobile-VTON
A novel, high-quality, privacy-preserving framework that enables fully offline virtual try-on on commodity mobile devices using only a single user image and a garment image. Repo announced, code to be released
Review https://t.ly/dsrIn
Paper arxiv.org/pdf/2603.00947
Project zhenchenwan.github.io/Mobile-VTON/
Repo https://github.com/tmllab/2026_CVPR_Mobile-VTON

All Point Clouds-One Encoder
Utonia is a step toward a one-from-all, one-for-all point cloud encoder. It pretrains a single encoder on diverse point cloud data and reuses it as a reliable backbone for downstream tasks. Code under Apache 2.0
Review https://t.ly/yqSyZ
Paper https://arxiv.org/pdf/2603.03283
Project pointcept.github.io/Utonia/
Repo https://github.com/Pointcept/Utonia

DuoMo: Dual Motion Diffusion
DuoMo by Meta is a novel generative method that recovers human motion in world-space coordinates from unconstrained videos with noisy or incomplete observations. Code announced
Review https://t.ly/dnA3K
Paper arxiv.org/pdf/2603.03265
Project yufu-wang.github.io/duomo/
Repo TBA

Any Resolution, Any Geometry
Ultra Resolution Geometry Transformer (URGT) for depth-normal estimation at arbitrary resolutions (e.g. 4K, 6K, 8K). New SOTA. Repo under MIT
Review https://t.ly/HXg1n
Paper arxiv.org/pdf/2603.03026
Project dreamaker-mrc.github.io/Any-Resolution-Any-Geometry/
Repo github.com/Dreamaker-MrC/Any-Resolution-Any-Geometry

Would it be useful for you to see a few (verified) AI job postings in this channel?
Anonymous Poll
63% YES, why not?!
37% NO, only damn AI & Papers

Monocular 3D Clothed Human
MultiGO++ is a novel framework for monocular 3D clothed human reconstruction via geometry-texture collaboration. New SOTA but no code announced
Review https://t.ly/YKY44
Paper arxiv.org/pdf/2603.04993
Project 3dagentworld.github.io/multigo++

SOTA Arbitrary Tracking
TAPFormer is a novel SOTA transformer-based framework that performs asynchronous, temporally consistent fusion of frames and events for robust, high-frequency point tracking. Repo & Dataset under MIT
Review https://t.ly/-q4wm
Paper https://arxiv.org/pdf/2603.04989
Project http://tapformer.github.io/
Repo https://github.com/ljx1002/TAPFormer

Real-Time Scene Graph
REACT++ by Umeå University is the new state-of-the-art model for real-time scene graph generation (SGG): 20% faster, with an average gain of 10% in relation prediction accuracy. Code under MIT
Review https://t.ly/c12VX
Paper https://arxiv.org/pdf/2603.06386
Repo https://github.com/Maelic/SGG-Benchmark

Holistic 3D Spatial Intelligence
Holi-Spatial is the first fully automated pipeline capable of converting raw video streams into holistic 3D spatial annotations without human intervention. Code/Data announced
Review https://t.ly/PDpr9
Paper https://lnkd.in/dTbMuZCm
Project https://lnkd.in/d66CYB4q
Repo https://lnkd.in/dAGzShXj

Surface Light Tokenizer
Apple unveils LITO, a novel latent flow matching model that enables HQ image-to-3D. Its latent representation encodes a surface light field into a compact set of latent vectors. Impressive results but no code
Review https://t.ly/xcWNe
Paper https://lnkd.in/dYHwY4YX
Project https://lnkd.in/dtJT8bXy

OmniStream Backbone
Novel unified streaming visual backbone that effectively perceives, reconstructs, and acts from diverse visual inputs. Repo/Models announced
Review https://t.ly/_zZMO
Paper arxiv.org/pdf/2603.12265
Project go2heart.github.io/omnistream/
Repo github.com/Go2Heart/OmniStream

New SOTA Video Depth
DVD is the new SOTA in video depth estimation, with the full training suite available under Apache 2.0
Review https://t.ly/gpCkG
Paper https://arxiv.org/pdf/2603.12250
Project https://dvd-project.github.io/
Repo github.com/EnVision-Research/DVD

Physically-Plausible Human
PhysMoDPO is a novel direct preference optimization framework for humanoid motion generation. Repo under MIT
Review https://t.ly/clf8w
Paper https://arxiv.org/pdf/2603.13228
Project https://mael-zys.github.io/PhysMoDPO/
Repo https://github.com/Mael-zys/PhysMoDPO

10,000× faster SAM-3D
Fast SAM 3D Body achieves up to a 10.9× speedup and over 10,000× faster MHR-to-SMPL conversion, enabling real-time humanoid control from RGB. Repo available
Review https://t.ly/uHx84
Paper https://arxiv.org/pdf/2603.15603
Project yangtiming.github.io/Fast-SAM-3D-Body-Page/
Repo https://github.com/yangtiming/Fast-SAM-3D-Body

Material-Aware Grouping
Material Magic Wand (Adobe) is a tool for material-aware grouping of parts in untextured 3D meshes. Given one selected part, it automatically retrieves the other parts of the same shape that share its material. Repo announced
Review https://t.ly/q00SU
Paper https://arxiv.org/pdf/2603.17370
Project umangi-jain.github.io/material-magic-wand/
Repo TBA

OccAny: Universal 3D Occupancy
OccAny by Valeo is a novel unified framework for generalized unconstrained urban 3D occupancy prediction. Repo under Apache 2.0
Review https://t.ly/FFiU0
Paper https://arxiv.org/pdf/2603.23502
Project https://valeoai.github.io/OccAny/
Repo https://github.com/valeoai/OccAny

Pose-Appearance-Motion for HOI
PAM is a novel Pose-Appearance-Motion Engine for SOTA controllable hand-object interaction video generation. Repo/models available
Review https://t.ly/JU4MD
Paper arxiv.org/pdf/2603.22193
Project gasaiyu.github.io/PAM.github.io/
Repo https://github.com/GasaiYU/PAM

GaussianGPT 3D GSC
From TUM, GaussianGPT: transformer-based generation of 3D Gaussians via next-token prediction, producing full, complex 3D indoor scenes. Repo announced
Review https://t.ly/bj-lL
Paper arxiv.org/pdf/2603.26661
Project nicolasvonluetzow.github.io/GaussianGPT/
Repo TBA