TTSC for 3D Generative

SpaceControl is the new SOTA training-free, test-time method for explicit spatial control of 3D generation. Repo announced.

Review: https://t.ly/1zrah
Paper: https://lnkd.in/dEWh3vep
Project: https://lnkd.in/dScftUmm
Repo: TBA
Layered PSD Diffusion

OmniPSD produces layered PSD files with transparent alpha channels, separating text, foreground elements, and background into clean RGBA layers that can be edited directly in standard tools. Online demo available.

Review: https://t.ly/YNRAC
Paper: arxiv.org/pdf/2512.09247
Project: showlab.github.io/OmniPSD/
Demo: https://www.lovart.ai/it
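To see why clean RGBA layers matter, here is a minimal sketch (not OmniPSD's pipeline, just the standard "over" compositing that such layers support) stacking background, foreground, and text layers back into one image:

```python
import numpy as np

def composite_over(dst, src):
    """Alpha-composite src RGBA over dst RGBA (float arrays in [0, 1])."""
    sa = src[..., 3:4]
    da = dst[..., 3:4]
    out_a = sa + da * (1.0 - sa)
    out_rgb = src[..., :3] * sa + dst[..., :3] * da * (1.0 - sa)
    # Un-premultiply; guard against fully transparent pixels
    out_rgb = np.where(out_a > 0, out_rgb / np.maximum(out_a, 1e-8), 0.0)
    return np.concatenate([out_rgb, out_a], axis=-1)

# Background, foreground, and text as separate RGBA layers (4x4 toy canvas)
h, w = 4, 4
background = np.zeros((h, w, 4)); background[...] = [0.2, 0.4, 0.8, 1.0]
foreground = np.zeros((h, w, 4)); foreground[1:3, 1:3] = [1.0, 0.0, 0.0, 1.0]
text = np.zeros((h, w, 4)); text[2, 2] = [1.0, 1.0, 1.0, 1.0]

canvas = composite_over(composite_over(background, foreground), text)
```

Because each layer keeps its own alpha, editing the text layer never disturbs the pixels underneath it.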
Pixel Art Volumetric Rendering

Voxify3D is a novel differentiable two-stage framework bridging 3D mesh optimization with 2D pixel art supervision. Repo announced.

Review: https://t.ly/qPyNl
Paper: https://lnkd.in/du5ikJGN
Project: https://lnkd.in/dpiAjj5m
Repo: TBA
MoCapAnything is out

MoCapAnything is a novel reference-guided, factorized framework that first predicts 3D joint trajectories and then recovers asset-specific rotations via constraint-aware IK fitting. No code announced.

Review: https://t.ly/_Tw6t
Paper: arxiv.org/pdf/2512.10881
Project: animotionlab.github.io/MoCapAnything
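The paper's constraint-aware IK over full skeletons is far richer, but the core idea of "recovering rotations from joint positions" can be illustrated with the classical analytic IK for a planar two-link arm (all names and link lengths here are illustrative):

```python
import math

def two_link_ik(x, y, l1=1.0, l2=1.0):
    """Analytic IK for a planar 2-link arm: joint angles that reach (x, y)."""
    d2 = x * x + y * y
    # Cosine rule for the elbow angle; clamp for numerical safety
    c2 = max(-1.0, min(1.0, (d2 - l1 * l1 - l2 * l2) / (2 * l1 * l2)))
    theta2 = math.acos(c2)                          # elbow rotation
    k1, k2 = l1 + l2 * c2, l2 * math.sin(theta2)
    theta1 = math.atan2(y, x) - math.atan2(k2, k1)  # shoulder rotation
    return theta1, theta2

def forward(theta1, theta2, l1=1.0, l2=1.0):
    """Forward kinematics: end-effector position from joint angles."""
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y

t1, t2 = two_link_ik(1.2, 0.8)
```

Feeding the recovered angles back through forward kinematics reproduces the target joint position, which is exactly the consistency an IK fitting stage enforces.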
MatAnyone 2 is out!

MatAnyone 2 is the most advanced human video matting framework: it preserves fine details by avoiding segmentation-like boundaries, while also showing enhanced robustness under challenging real-world conditions. Repo & dataset announced.

Review: https://t.ly/vxOBO
Paper: arxiv.org/pdf/2512.11782
Project: pq-yang.github.io/projects/MatAnyone2
Repo: github.com/pq-yang/MatAnyone2
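What a matting framework outputs is a soft alpha matte rather than a hard segmentation boundary; a minimal sketch of the standard matting equation I = alpha * F + (1 - alpha) * B (not MatAnyone 2's model, just the downstream compositing it enables):

```python
import numpy as np

def composite_with_matte(fg, bg, alpha):
    """Matting equation: I = alpha * F + (1 - alpha) * B.
    fg, bg: (H, W, 3) float images in [0, 1]; alpha: (H, W) matte in [0, 1]."""
    a = alpha[..., None]
    return a * fg + (1.0 - a) * bg

fg = np.ones((2, 2, 3)) * [1.0, 0.0, 0.0]      # red foreground
bg = np.ones((2, 2, 3)) * [0.0, 1.0, 0.0]      # green background
alpha = np.array([[1.0, 0.5], [0.0, 0.25]])    # soft matte (e.g. hair edges)
out = composite_with_matte(fg, bg, alpha)
```

The fractional alpha values are what preserve fine detail such as hair; a binary segmentation mask would snap them to 0 or 1.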
SOTA Zero-Shot Stereo Matching

Fast-FoundationStereo by #Nvidia is a novel family of architectures that achieves, for the first time, strong zero-shot generalization at real-time frame rates via divide-and-conquer acceleration. Code & data announced.

Review: https://t.ly/XD6pO
Paper: https://lnkd.in/d9_YKW2A
Project: https://lnkd.in/dKDxm7EX
Repo: https://lnkd.in/dR4-PdsW
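Fast-FoundationStereo is a learned foundation model; as a point of reference, the underlying problem it solves is disparity estimation, sketched here with classical SAD block matching (a toy baseline, not the paper's method):

```python
import numpy as np

def block_match_disparity(left, right, max_disp=8, patch=3):
    """Classical SAD block matching: for each left pixel, find the horizontal
    shift into the right image that minimizes patch difference."""
    h, w = left.shape
    r = patch // 2
    disp = np.zeros((h, w), dtype=np.int64)
    for y in range(r, h - r):
        for x in range(r, w - r):
            best, best_d = np.inf, 0
            ref = left[y - r:y + r + 1, x - r:x + r + 1]
            for d in range(0, min(max_disp, x - r) + 1):
                cand = right[y - r:y + r + 1, x - d - r:x - d + r + 1]
                cost = np.abs(ref - cand).sum()  # sum of absolute differences
                if cost < best:
                    best, best_d = cost, d
            disp[y, x] = best_d
    return disp

# Synthetic rectified pair: left view is the right view shifted by 2 px
rng = np.random.default_rng(0)
right = rng.random((12, 16))
left = np.zeros_like(right)
left[:, 2:] = right[:, :-2]   # true disparity = 2 wherever defined
disp = block_match_disparity(left, right, max_disp=4)
```

The brute-force cost volume here is exactly what makes naive stereo slow, which is why real-time zero-shot generalization is notable.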
DriverGaze360: Driver SOTA

DriverGaze360 is a large-scale 360° field-of-view driver attention dataset containing ~1M gaze-labeled frames. Code & dataset announced.

Review: https://t.ly/ZcoUw
Paper: arxiv.org/pdf/2512.14266
Project: av.dfki.de/drivergaze360/
Repo: github.com/dfki-av/drivergaze360
Data: av.dfki.de/drivergaze360/dataset
FlexAvatar: 3D Heads

TUM introduces FlexAvatar, a novel method for creating high-quality, complete 3D head avatars from a single image. Code announced.

Review: https://t.ly/Rkdtd
Paper: arxiv.org/pdf/2512.15599
Project: tobias-kirschstein.github.io/flexavatar/
Repo: TBA
Depth Any Panoramas

DAP is the new SOTA foundation model for panoramic depth estimation, released with a large-scale dataset. Data & repo under MIT license.

Review: https://t.ly/LaUmd
Paper: arxiv.org/pdf/2512.16913
Project: https://lnkd.in/dvqNV9jx
Repo: https://lnkd.in/dmNzhb-7
Demo: https://lnkd.in/dDwjMF3u
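A panoramic depth map differs from a pinhole one in how it back-projects to 3D. A minimal sketch, assuming the common equirectangular convention (longitude across width, latitude over height; DAP's exact convention may differ):

```python
import numpy as np

def equirect_to_points(depth):
    """Back-project an equirectangular depth map (H, W) to 3D points (H, W, 3).
    Longitude spans [-pi, pi) across width, latitude [pi/2, -pi/2] over height."""
    h, w = depth.shape
    lon = (np.arange(w) + 0.5) / w * 2 * np.pi - np.pi   # azimuth per column
    lat = np.pi / 2 - (np.arange(h) + 0.5) / h * np.pi   # elevation per row
    lon, lat = np.meshgrid(lon, lat)
    # Unit ray directions on the sphere, then scale by per-pixel depth
    dirs = np.stack([np.cos(lat) * np.sin(lon),
                     np.sin(lat),
                     np.cos(lat) * np.cos(lon)], axis=-1)
    return dirs * depth[..., None]

pts = equirect_to_points(np.full((64, 128), 2.0))   # constant 2 m panorama
```

A constant-depth panorama back-projects to a sphere, which is a quick sanity check for the ray construction.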
Generative Refocusing is out

Generative Refocusing is a two-step process that uses DeblurNet to recover all-in-focus images from various inputs and BokehNet to create controllable bokeh (in semi-supervised mode). Repo under Apache 2.0 license.

Review: https://t.ly/8t7PA
Paper: arxiv.org/pdf/2512.16923
Project: generative-refocusing.github.io/
Repo: github.com/rayray9999/Genfocus
Demo: huggingface.co/spaces/nycu-cplab/Genfocus-Demo
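BokehNet is a generative network, but the optical effect it controls is the classical disc point-spread function. A minimal sketch of that baseline (uniform-radius blur only; real bokeh varies the radius with depth):

```python
import numpy as np

def disc_kernel(radius):
    """Circular (disc) averaging kernel: the classic bokeh point-spread function."""
    y, x = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    k = (x * x + y * y <= radius * radius).astype(float)
    return k / k.sum()

def bokeh_blur(img, radius):
    """Convolve a grayscale image with a disc kernel (zero padding)."""
    k = disc_kernel(radius)
    r = radius
    padded = np.pad(img, r)
    out = np.zeros_like(img, dtype=float)
    h, w = img.shape
    for i in range(2 * r + 1):         # naive shift-and-add convolution
        for j in range(2 * r + 1):
            out += k[i, j] * padded[i:i + h, j:j + w]
    return out

img = np.zeros((9, 9)); img[4, 4] = 1.0          # a single bright point light
blurred = bokeh_blur(img, radius=2)
```

A point light spreads into a uniform disc, the signature "bokeh ball" that a controllable-bokeh model lets you resize and reshape.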
TOP 5 Papers you loved in 2025

In 2025, novel architectures redefined efficiency and accuracy, and almost every day brought a new SOTA in image understanding, tracking, and GenAI. It's been an inspiring ride, and 2026 will be even wilder. This community (LinkedIn + Telegram) is now around 80,000+ people.

Papers (by your preference):
- 3D LLM https://t.ly/ejr1s
- DynOMo https://t.ly/t5pCf
- Track Transf. https://t.ly/NPyW4
- YOLOv12 https://t.ly/jj1oR
- G-Surface Tracking https://t.ly/udpMq

Thank you all!
Depth as Neural Implicit

InfiniDepth represents depth as neural implicit fields, enabling "infinite" (i.e., 16K) resolution and fine geometric detail. Repo under Apache 2.0 license.

Review: https://t.ly/4we5t
Paper: https://lnkd.in/dpiHQExj
Project: https://lnkd.in/dy3JxKye
Repo: https://lnkd.in/dAXbnK5z
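The "resolution-free" property comes from depth being a continuous function of coordinates rather than a pixel grid. A toy sketch with a random-weight MLP (nothing like InfiniDepth's actual architecture or training, just the querying pattern):

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny MLP f(x, y) -> depth; weights would normally be fit to an image,
# here they are random purely for illustration.
W1, b1 = rng.normal(size=(2, 32)), rng.normal(size=32)
W2, b2 = rng.normal(size=(32, 1)), rng.normal(size=1)

def depth_field(coords):
    """Query depth at continuous (x, y) coords in [0, 1]^2, any resolution."""
    h = np.tanh(coords @ W1 + b1)
    return (h @ W2 + b2).squeeze(-1)

def render(h, w):
    """Sample the same continuous field on an arbitrary pixel grid."""
    ys, xs = (np.arange(h) + 0.5) / h, (np.arange(w) + 0.5) / w
    grid = np.stack(np.meshgrid(xs, ys), axis=-1).reshape(-1, 2)
    return depth_field(grid).reshape(h, w)

low = render(16, 16)      # the same field ...
high = render(256, 256)   # ... sampled at 16x the resolution
```

One set of weights yields a depth map at any sampling rate, which is what makes 16K output a querying choice rather than a storage cost.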
Label Any Object in 3D

LabelAny3D is a novel analysis-by-synthesis framework that reconstructs holistic 3D scenes from 2D inputs to efficiently produce high-quality 3D bounding-box annotations. Repo under CC-BY-4.0 license.

Review: https://t.ly/bO93j
Paper: https://lnkd.in/dYb97zWG
Project: https://lnkd.in/dJ9UKERb
Repo: https://lnkd.in/d9SxtmiA
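For readers new to 3D annotation: once a scene is reconstructed, a box annotation is just a parameterization over the object's points. The sketch below shows the simplest (axis-aligned) variant; annotation pipelines like this one typically produce oriented boxes:

```python
import numpy as np

def aabb_from_points(points):
    """Axis-aligned 3D bounding box of an (N, 3) point cloud,
    returned as (center, size) -- one common annotation format."""
    lo, hi = points.min(axis=0), points.max(axis=0)
    return (lo + hi) / 2.0, hi - lo

pts = np.array([[0.0, 0.0, 0.0],
                [2.0, 1.0, 0.5],
                [1.0, 0.5, 0.25]])
center, size = aabb_from_points(pts)
```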
New #AI Startups in 2026?

In 2026, which area would you focus on?
- Agents: workflows, copilots, etc.
- Vertical AI: pharma, automotive, energy ...
- Infrastructure: MLOps, security, cost control ...
- AI for Creators/Media: video, avatars, content ...

Please help me understand what's next with this poll on LinkedIn :)
https://www.linkedin.com/posts/visionarynet_ai-ai-deeplearning-activity-7415377341779996672-sQO1

LUV U \m/

LinkedIn
#ai #deeplearning #aiwithpapers #metaverse | Alessandro Ferrari
New #AI Startups in 2026?
Looking ahead to 2026, the question is no longer "can we build it?" but "where does it actually create durable value?" in the AI field. So, if you were to launch an AI startup in 2026, which area would you focus on?
Agents…
Orient Anything V2 is out

Orient Anything V2 is a foundation model for unified understanding of object 3D orientation and rotation from single or paired images. Repo under CC-BY-4.0 license.

Review: https://t.ly/Ht7Xd
Paper: arxiv.org/pdf/2601.05573
Project: orient-anythingv2.github.io/
Repo: github.com/SpatialVision/Orient-Anything-V2
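The distinction between "orientation" (per image) and "rotation" (between paired images) is worth spelling out. A yaw-only toy example (the model's actual rotation representation is richer):

```python
import numpy as np

def rot_z(yaw):
    """Rotation matrix about the z-axis (yaw in radians)."""
    c, s = np.cos(yaw), np.sin(yaw)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def relative_yaw(R1, R2):
    """Angle of the relative rotation R2 @ R1.T -- the quantity a
    paired-image rotation estimator predicts (yaw-only case here)."""
    R = R2 @ R1.T
    return np.arctan2(R[1, 0], R[0, 0])

R_a = rot_z(np.deg2rad(30.0))   # object orientation in image A
R_b = rot_z(np.deg2rad(75.0))   # same object in image B
delta = np.rad2deg(relative_yaw(R_a, R_b))
```

Two per-image orientation estimates compose into one relative rotation, which is why a unified model can handle both single and paired inputs.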
Active Object Reconstruction

ObjSplat (Beijing) autonomously plans viewpoints and progressively reconstructs an unknown object into a high-fidelity Gaussian model and watertight mesh, enabling direct use in physics simulations. Tough paper; repo announced.

Review: https://t.ly/au6HE
Paper: arxiv.org/pdf/2601.06997
Project: li-yuetao.github.io/ObjSplat-page/
Repo: https://github.com/Li-Yuetao/ObjSplat
In 2026, who should we keep an eye on?
Vote: https://www.linkedin.com/posts/visionarynet_ai-deeplearning-aiwithpapers-activity-7416886610795077632-qQeP/
Games Workshop (Warhammer) is banning the use of AI in its creative and design processes to protect IP and human creativity, a decision that runs against the current hype of widespread AI adoption.

And what about your organization? I need your help!
Vote: https://www.linkedin.com/posts/visionarynet_ai-activity-7417106327019196417-TpGL
Segment Anything Geometry

3AM (NYCU + #Nvidia) offers cross-view correspondence even under large viewpoint changes, cluttered scenes, and varying capture conditions, enabling robust object tracking from both videos & casual multi-view images. Repo (coming) & demo available.

Review: https://t.ly/olZwE
Paper: https://arxiv.org/pdf/2601.08831
Project: https://jayisaking.github.io/3AM-Page/
Repo: https://github.com/jayisaking
Demo: https://huggingface.co/spaces/nycu-cplab/3AM
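3AM's contribution is in the learned, geometry-aware descriptors; the generic matching step on top of any such descriptors is usually mutual nearest neighbors, sketched here on synthetic features (a toy baseline, not the paper's matcher):

```python
import numpy as np

def mutual_nn_matches(feats_a, feats_b):
    """Mutual nearest-neighbor matching between two sets of L2-normalized
    descriptors (Na, D) and (Nb, D): keep only pairs that pick each other."""
    sim = feats_a @ feats_b.T                 # cosine similarity matrix
    ab = sim.argmax(axis=1)                   # best b for each a
    ba = sim.argmax(axis=0)                   # best a for each b
    return [(i, int(j)) for i, j in enumerate(ab) if ba[j] == i]

rng = np.random.default_rng(0)
feats_a = rng.normal(size=(5, 8))
feats_a /= np.linalg.norm(feats_a, axis=1, keepdims=True)
feats_b = feats_a[::-1] + rng.normal(scale=0.05, size=(5, 8))  # reordered + noise
feats_b /= np.linalg.norm(feats_b, axis=1, keepdims=True)
matches = mutual_nn_matches(feats_a, feats_b)
```

The mutual check discards one-sided matches, a cheap way to suppress outliers before any geometric verification.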