Interactive Humanoid Generation
FlowAct-R1 by ByteDance is a novel framework that enables lifelike, responsive, and high-fidelity humanoid video generation for seamless real-time interaction. No code, but impressive results (see the video with audio).
Review https://t.ly/aQhol
Paper arxiv.org/pdf/2601.10103
Project grisoon.github.io/FlowAct-R1/
3D Human Gen-Seg
CoMoVi takes an input image with a text description and generates a 3D human motion & video sequence synchronously within a single diffusion denoising loop. Repo & dataset to be released.
Review https://t.ly/khSkm
Paper arxiv.org/pdf/2601.10632
Project igl-hkust.github.io/CoMoVi/
Repo github.com/IGL-HKUST/CoMoVi
Data huggingface.co/datasets/AfterJourney/CoMoVi-Dataset
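The "single denoising loop" idea can be pictured with a toy sketch: two latents (motion and video) step through the same diffusion timesteps with a shared denoiser, which is what keeps them synchronized. Everything below is hypothetical stand-in code, not CoMoVi's implementation; the real denoiser is a learned network conditioned on the input image and text.

```python
import numpy as np

rng = np.random.default_rng(0)

def joint_denoiser(motion, video, t):
    # Hypothetical stand-in for a joint network: predicts a noise estimate
    # for both modalities at the shared timestep t.
    return 0.1 * motion, 0.1 * video

# Start both modalities from pure noise.
motion = rng.standard_normal((16, 24))   # e.g. 16 frames x 24 pose params
video = rng.standard_normal((16, 8, 8))  # e.g. 16 tiny latent frames

# One shared loop: motion and video are denoised together, step by step.
for t in range(10, 0, -1):
    eps_m, eps_v = joint_denoiser(motion, video, t)
    motion = motion - eps_m
    video = video - eps_v

print(motion.shape, video.shape)
```

The point of the sketch is only the control flow: a single loop over timesteps touching both latents, rather than two separate generation passes.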
SOTA Part-level Generator
FrankenMotion is a novel text-to-motion model that learns to compose complex motions through hierarchical conditioning on part-, action- & sequence-level text, enabling fine-grained control over body parts & timing. Code, models & dataset to be released.
Review https://t.ly/leB_R
Paper arxiv.org/pdf/2601.10909
Project coral79.github.io/frankenmotion/
Repo github.com/Coral79/FrankenMotion-Code
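A minimal sketch of what "hierarchical conditioning" could look like, assuming a simple additive fusion of the three text levels per body part. The `embed` function and the fusion rule are toy placeholders; the paper's actual encoders and fusion are learned.

```python
import numpy as np

DIM = 8

def embed(text):
    # Deterministic toy "text encoder" (a real system would use a
    # pretrained language model); hypothetical, for illustration only.
    rng = np.random.default_rng(sum(text.encode()))
    return rng.standard_normal(DIM)

# One embedding per level of the hierarchy.
sequence = embed("a person waves while walking forward")
actions = {"walk": embed("walk forward"), "wave": embed("wave right hand")}
parts = {"legs": embed("legs: walking gait"),
         "right_arm": embed("right arm: raised, waving")}

# Toy composition: each part's condition mixes its part-level text, the
# action it belongs to, and the global sequence description.
cond = {
    "legs": parts["legs"] + actions["walk"] + sequence,
    "right_arm": parts["right_arm"] + actions["wave"] + sequence,
}
print({k: v.shape for k, v in cond.items()})
```

The takeaway is the structure: different body parts receive different conditions, all grounded in one shared sequence-level description, which is what enables per-part and per-timing control.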
#META 3D Casual Captures
#META unveils ShapeR, a novel approach for conditional 3D object shape generation from casually captured sequences. Impressive results. Repo under CC BY-NC 4.0.
Review https://t.ly/j08sJ
Paper arxiv.org/pdf/2601.11514
Project facebookresearch.github.io/ShapeR/
Repo github.com/facebookresearch/ShapeR
Foundation Medical SAM3
Medical SAM3: a foundation model for universal prompt-driven medical image segmentation, built by fully fine-tuning SAM3 on large-scale, heterogeneous 2D/3D medical imaging datasets with paired segmentation masks and text prompts. Repo & demo announced.
Review https://t.ly/C6jcy
Paper https://arxiv.org/pdf/2601.10880
Project chongcongjiang.github.io/MedicalSAM3/#
Repo github.com/AIM-Research-Lab/Medical-SAM3
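The post doesn't detail the fine-tuning objective. For intuition, here is the soft Dice loss, a standard mask-supervision term in medical segmentation; treating it as part of Medical SAM3's recipe is an assumption, and the text-prompt conditioning is not shown.

```python
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    # Soft Dice loss: 1 - 2|P∩T| / (|P| + |T|), computed on a
    # predicted probability map vs. a binary ground-truth mask.
    inter = (pred * target).sum()
    return 1.0 - (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

# Toy example: the model predicts 0.9 everywhere on an all-foreground mask.
pred = np.full((64, 64), 0.9)
target = np.ones((64, 64))

loss = dice_loss(pred, target)
print(round(loss, 4))  # -> 0.0526
```

Dice-style losses are popular here because medical foregrounds (tumors, organs) are often tiny, and overlap-based objectives handle that class imbalance better than plain pixel-wise cross-entropy.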
Mask-Guided Matting
VideoMaMa is a novel diffusion-based model that converts binary masks into continuous alpha mattes. Repo, dataset & demo available.
Review https://t.ly/l_0f8
Paper arxiv.org/pdf/2601.14255
Project cvlab-kaist.github.io/VideoMaMa
Repo github.com/cvlab-kaist/VideoMaMa
Demo huggingface.co/spaces/SammyLim/VideoMaMa
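Why a continuous matte beats a binary mask: compositing uses the standard equation C = alpha*F + (1 - alpha)*B per pixel, and only a fractional alpha can represent soft boundaries like hair or motion blur. A toy single-row example (the intensities are made up):

```python
import numpy as np

# alpha in [0, 1] blends foreground F and background B per pixel;
# a binary mask could only ever produce the two extreme values.
alpha = np.array([[0.0, 0.5, 1.0]])  # pure bg, soft edge, pure fg
F = np.full((1, 3), 200.0)           # foreground intensity
B = np.full((1, 3), 20.0)            # background intensity

C = alpha * F + (1.0 - alpha) * B
print(C)  # 20 = pure bg, 110 = half blend, 200 = pure fg
```

VideoMaMa's job, in these terms, is estimating that fractional `alpha` for every pixel of every frame, starting from nothing more than a hard 0/1 mask.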
MoRo: Human Motion
Masked modeling for human motion Recovery under Occlusions. Given a monocular video captured from a static camera, MoRo (by ETHZ & META) robustly reconstructs accurate, physically plausible human motion, even under challenging occlusions. Repo released.
Review https://t.ly/kK_je
Paper arxiv.org/pdf/2601.16079
Project mikeqzy.github.io/MoRo/
Repo github.com/mikeqzy/MoRo
BBoxMaskPose v2 is fire
BBoxMaskPose v2 by ČVUT offers SOTA performance in detection, segmentation & 2D pose estimation in crowded scenes. It enables 3D human reconstruction even in scenes with complex interactions. Code, models & data available.
Review https://t.ly/GkkDl
Paper arxiv.org/pdf/2601.15200
Project https://lnkd.in/dQ_3hxjC
Repo https://lnkd.in/dVqwD3jN
Generalized-Scale Counting
GeCo2 (Ljubljana) is a novel end-to-end SOTA few-shot counting method that explicitly addresses object-scale issues. Repo & demo available.
Review https://t.ly/2_7I8
Paper https://arxiv.org/pdf/2511.08048
Repo https://github.com/jerpelhan/GECO2
Demo huggingface.co/spaces/jerpelhan/GECO2-demo
Super-Hard Poll, folks
This dilemma is driving me crazy. Vote: https://www.linkedin.com/posts/visionarynet_activity-7421974594917588992-YNAG
(and of course comment here)
MLLMs Fine Segmentation
SimpleSeg: MLLMs with native pixel-level perception. Repo & model available.
Review https://t.ly/eVguh
Paper arxiv.org/pdf/2601.19228
Project simpleseg.github.io/
Repo github.com/songtianhui/SimpleSeg
DeepSeek-OCR 2 is out
DeepSeek-AI announced the new version of its powerful SOTA OCR: a new architectural approach with the potential to achieve genuine 2D reasoning. Code & weights available.
Review https://t.ly/gX4bX
Paper https://arxiv.org/pdf/2601.20552
Repo github.com/deepseek-ai/DeepSeek-OCR-2
SOTA Style Transfer
TeleAI unveils TeleStyle, a lightweight yet effective model for image/video stylization. Built upon Qwen-Image-Edit, TeleStyle leverages the base model's robust capabilities in content preservation & style customization. Code & model released.
Review https://t.ly/viVR0
Paper arxiv.org/pdf/2601.20175
Project tele-ai.github.io/TeleStyle/
Repo github.com/Tele-AI/TeleStyle
Metric Anything is out
Metric Anything (Li Auto Inc.) is a simple and scalable pretraining framework that learns metric depth from noisy, diverse 3D sources without manually engineered prompts, camera-specific modeling, or task-specific architectures. Impressive. Code announced.
Review https://t.ly/54Ccr
Paper arxiv.org/pdf/2601.22054
Project metric-anything.github.io/metric-anything-io/
Repo github.com/metric-anything/metric-anything
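What "metric" buys you: with depth in meters (not just relative ordering) and known camera intrinsics, every pixel lifts to a real-scale 3D point via standard pinhole back-projection. The intrinsics below are a hypothetical camera, not anything from the paper.

```python
import numpy as np

def unproject(u, v, z, fx, fy, cx, cy):
    # Pinhole back-projection: pixel (u, v) with metric depth z maps to
    # a 3D point (x, y, z) in camera coordinates.
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z])

# Hypothetical intrinsics; a pixel observed at 2 m depth.
p = unproject(u=400.0, v=300.0, z=2.0,
              fx=500.0, fy=500.0, cx=320.0, cy=240.0)
print(p)  # x = 0.32 m, y = 0.24 m, z = 2.0 m
```

With only relative depth, the same pixel could be anywhere along its ray; metric depth pins down actual distances, which is what downstream tasks like planning and mapping need.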
Segment Any Events by Language
SEAL (by NUS) is the first semantic-aware Segment Any Events framework that addresses open-vocabulary event instance segmentation. Code announced.
Review https://t.ly/1ZMF0
Paper https://arxiv.org/pdf/2601.23159
Project https://0nandon.github.io/SEAL/
Repo https://github.com/0nandon/SEAL
RAM prices skyrocketing
Me acting like a rich kid.
Let's talk: https://www.linkedin.com/posts/visionarynet_ai-ram-ddr5-activity-7424127924020072448-NbaO
CoWTracker: Track-Warping
CoWTracker (VGG + META) is a novel dense point tracker that eschews cost volumes in favor of warping. Code & models under the FAIR NC license.
Review https://t.ly/6bAn9
Paper https://arxiv.org/pdf/2602.04877
Project https://cowtracker.github.io/
Repo https://github.com/facebookresearch/cowtracker
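The cost-volume-free idea in one toy picture: instead of correlating every pair of locations across frames, a tracker can backward-warp one frame's features along a predicted flow. A nearest-neighbor numpy sketch, assuming a known flow field; the real model uses differentiable bilinear sampling on learned features, and this is not CoWTracker's code.

```python
import numpy as np

def warp(img, flow):
    # Backward warping: out[y, x] samples img at (x + flow_x, y + flow_y).
    # Nearest-neighbor for brevity; out-of-bounds samples stay zero.
    h, w = img.shape
    out = np.zeros_like(img)
    for y in range(h):
        for x in range(w):
            xs = int(round(x + flow[y, x, 0]))
            ys = int(round(y + flow[y, x, 1]))
            if 0 <= xs < w and 0 <= ys < h:
                out[y, x] = img[ys, xs]
    return out

img = np.arange(16.0).reshape(4, 4)
flow = np.zeros((4, 4, 2))
flow[..., 0] = 1.0  # every pixel looks one column to the right

print(warp(img, flow))
```

The appeal of warping over cost volumes is memory: a full cost volume scales with (H*W)^2 location pairs, while a warp touches each pixel once.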
TrajVG: Trajectory-Geometry
TrajVG is a novel reconstruction framework that makes cross-frame 3D correspondence an explicit prediction by estimating camera-coordinate 3D trajectories. Code announced.
Review https://t.ly/yVi01
Paper arxiv.org/pdf/2602.04439
Project xingy038.github.io/TrajVG/
Repo github.com/xingy038/TrajVG