AI with Papers - Artificial Intelligence & Deep Learning
All the AI with papers. Every day fresh updates about #DeepLearning #MachineLearning #LLM & #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#AI #chatGPT

TTSC for 3D Generative

👉SpaceControl is the new SOTA training-free test-time method for explicit spatial control of 3D generation. Repo announced💙

👉Review https://t.ly/1zrah
👉Paper https://lnkd.in/dEWh3vep
👉Project https://lnkd.in/dScftUmm
👉Repo TBA

🎷Layered PSD Diffusion🎷

👉OmniPSD produces layered PSD files with transparent alpha channels, separating text, foreground elements, and background into clean RGBA layers that can be directly edited in tools. Online Demo💙

👉Review https://t.ly/YNRAC
👉Paper arxiv.org/pdf/2512.09247
👉Project showlab.github.io/OmniPSD/
👉Demo https://www.lovart.ai/it

🧱Pixel Art Volumetric Rendering🧱

👉Voxify3D is a novel differentiable two-stage framework bridging 3D mesh optimization with 2D pixel art supervision. Repo announced💙

👉Review https://t.ly/qPyNl
👉Paper https://lnkd.in/du5ikJGN
👉Project https://lnkd.in/dpiAjj5m
👉Repo TBA

🫎 MoCapAnything is out 🫎

👉MoCapAnything is a novel reference-guided, factorized framework that first predicts 3D joint trajectories and then recovers asset-specific rotations via constraint-aware IK fitting. No code announced 🥲

👉Review https://t.ly/_Tw6t
👉Paper arxiv.org/pdf/2512.10881
👉Project animotionlab.github.io/MoCapAnything

💚 MatAnyone 2 is out! 💚

👉MatAnyone 2 is the most advanced human video matting framework that preserves fine details by avoiding segmentation-like boundaries, while also showing enhanced robustness under challenging real-world conditions. Repo & Dataset announced💙

👉Review https://t.ly/vxOBO
👉Paper arxiv.org/pdf/2512.11782
👉Project pq-yang.github.io/projects/MatAnyone2
👉Repo github.com/pq-yang/MatAnyone2

💷 SOTA Zero-Shot Stereo Matching💷

👉Fast-FoundationStereo by #Nvidia is a novel family of architectures that achieve, for the first time, strong zero-shot generalization at real-time frame rate via divide-&-conquer acceleration. Code & Data announced💙

👉Review https://t.ly/XD6pO
👉Paper https://lnkd.in/d9_YKW2A
👉Project https://lnkd.in/dKDxm7EX
👉Repo https://lnkd.in/dR4-PdsW

👀DriverGaze360: Driver SOTA👀

👉DriverGaze360 is a large-scale 360° field of view driver attention dataset, containing ~1M gaze-labeled frames. Code & Dataset announced💙

👉Review https://t.ly/ZcoUw
👉Paper arxiv.org/pdf/2512.14266
👉Project av.dfki.de/drivergaze360/
👉Repo github.com/dfki-av/drivergaze360
👉Data av.dfki.de/drivergaze360/dataset

🫠FlexAvatar: 3D Heads🫠

👉TUM introduces FlexAvatar, a novel method for creating HQ and complete 3D head avatars from a single image. Code announced💙

👉Review https://t.ly/Rkdtd
👉Paper arxiv.org/pdf/2512.15599
👉Project tobias-kirschstein.github.io/flexavatar/
👉Repo TBA

🏜️ Depth Any Panoramas 🏜️

👉DAP is the new SOTA foundation model for panoramic depth estimation with a large scale dataset. Data & Repo under MIT💙

👉Review https://t.ly/LaUmd
👉Paper arxiv.org/pdf/2512.16913
👉Project https://lnkd.in/dvqNV9jx
👉Repo https://lnkd.in/dmNzhb-7
👉Demo https://lnkd.in/dDwjMF3u

🎯Generative Refocusing is out🎯

👉Generative Refocusing is a two-step process that uses DeblurNet to recover all-in-focus images from various inputs and BokehNet for creating controllable bokeh (in semi-supervised mode). Repo under Apache2.0💙

👉Review https://t.ly/8t7PA
👉Paper arxiv.org/pdf/2512.16923
👉Project generative-refocusing.github.io/
👉Repo github.com/rayray9999/Genfocus
👉Demo huggingface.co/spaces/nycu-cplab/Genfocus-Demo

⭐TOP 5 Papers you loved in 2025⭐

👉 In 2025, novel architectures redefined efficiency and accuracy, and almost every day brought a new SOTA in image understanding, tracking, and GenAI. It's been an inspiring ride, and 2026 will be even wilder. This community (LinkedIn + Telegram) is now more than 80,000 people.

Papers (by your preference):
⭐3D LLM https://t.ly/ejr1s
⭐DynOMo https://t.ly/t5pCf
⭐Track Transf. https://t.ly/NPyW4
⭐YOLOv12 https://t.ly/jj1oR
⭐G-Surface Tracking https://t.ly/udpMq

Thank you all💙

🦙 Depth as Neural Implicit 🦙

👉InfiniDepth represents depth as neural implicit fields, enabling "infinite" (i.e., 16K) resolution and fine geometric detail. Repo under Apache 2.0💙

👉Review https://t.ly/4we5t
👉Paper https://lnkd.in/dpiHQExj
👉Project https://lnkd.in/dy3JxKye
👉Repo https://lnkd.in/dAXbnK5z

🔥 Back from Holidays mood 🔥

Label Any Object in 3D

👉LabelAny3D is a novel analysis-by-synthesis framework that reconstructs holistic 3D scenes from 2D images to efficiently produce HQ 3D bounding-box annotations. Repo under CC-BY-4.0 license💙

👉Review https://t.ly/bO93j
👉Paper https://lnkd.in/dYb97zWG
👉Project https://lnkd.in/dJ9UKERb
👉Repo https://lnkd.in/d9SxtmiA

🔥 New #AI Startups in 2026? 🔥

In 2026, which area would you focus on?
🤖Agents → workflows, copilots, etc.
🏭Vertical AI → Pharma, Automotive, Energy ...
🧠Infrastructure → MLOps, Security, Cost Control ...
🎨AI for Creators/Media → Video, avatars, content ...

Please help me understand what's next with this poll on LinkedIn :)

https://www.linkedin.com/posts/visionarynet_ai-ai-deeplearning-activity-7415377341779996672-sQO1

LUV U \m/

🔥Orient Anything V2 is out🔥

👉Orient Anything V2 is a foundation model for unified understanding of object 3D orientation and rotation from single or paired images. Repo under CC-BY-4.0💙

👉Review https://t.ly/Ht7Xd
👉Paper arxiv.org/pdf/2601.05573
👉Project orient-anythingv2.github.io/
👉Repo github.com/SpatialVision/Orient-Anything-V2

🫛Active Object Reconstruction🫛

👉ObjSplat (Beijing) autonomously plans viewpoints and progressively reconstructs an unknown object into a Hi-Fi Gaussian model and water-tight mesh, enabling direct use in physics simulations. Tough paper and repo announced💙

👉Review https://t.ly/au6HE
👉Paper arxiv.org/pdf/2601.06997
👉Project li-yuetao.github.io/ObjSplat-page/
👉Repo https://github.com/Li-Yuetao/ObjSplat

👉Games Workshop (Warhammer) is banning the use of AI in creative and design processes to protect IP and human creativity. A decision that goes against the current hype of widespread AI adoption.

And what about your organization? I need your help👇

Vote: https://www.linkedin.com/posts/visionarynet_ai-activity-7417106327019196417-TpGL

💚Segment Anything Geometry💚

👉3AM (NYCU + #Nvidia) offers cross-view correspondence even under large viewpoint changes, cluttered scenes, and variations in capture conditions, enabling robust object tracking from both videos & casual multi-view images. Repo (coming) & Demo available💙

👉Review https://t.ly/olZwE
👉Paper https://arxiv.org/pdf/2601.08831
👉Project https://jayisaking.github.io/3AM-Page/
👉Repo https://github.com/jayisaking
👉Demo https://huggingface.co/spaces/nycu-cplab/3AM