AI with Papers - Artificial Intelligence & Deep Learning
All the AI with papers. Fresh updates every day on #DeepLearning, #MachineLearning, #LLM & #ComputerVision

Curated by Alessandro Ferrari | https://www.linkedin.com/in/visionarynet/

#AI #chatGPT
🌈Segment Any Events by Language🌈

👉SEAL (by NUS) is the first Semantic-aware Segment Any Events framework that addresses Open-Vocabulary Event Instance Segmentation. Code announced💙

👉Review https://t.ly/1ZMF0
👉Paper https://arxiv.org/pdf/2601.23159
👉Project https://0nandon.github.io/SEAL/
👉Repo https://github.com/0nandon/SEAL
👉RAM prices skyrocketing

👉Me acting like a rich kid.

Let's talk: https://www.linkedin.com/posts/visionarynet_ai-ram-ddr5-activity-7424127924020072448-NbaO
🐮CoWTracker: Track-Warping🐮


👉CoWTracker (VGG + META) is a novel dense point tracker that eschews cost volumes in favor of warping. Code/Models under FAIR NC💙

👉Review https://t.ly/6bAn9
👉Paper https://arxiv.org/pdf/2602.04877
👉Project https://cowtracker.github.io/
👉Repo https://github.com/facebookresearch/cowtracker
🌈TrajVG Trajectory-Geometry🌈

👉TrajVG is a novel reconstruction framework that makes cross-frame 3D correspondence an explicit prediction by estimating camera-coordinate 3D trajectories. Code announced💙

👉Review https://t.ly/yVi01
👉Paper https://arxiv.org/pdf/2602.04439
👉Project https://xingy038.github.io/TrajVG/
👉Repo https://github.com/xingy038/TrajVG
🪙MOMENTUM #NeurIPS 2025 🪙

👉MOMENTUM by Google (H/T Huguens Jean, Ph.D.) is a production multimodal agent architecture built on the Google ADK. It orchestrates 22 specialized tools, including Gemini for reasoning, Imagen 4.0 for image generation, and Veo 3.1 for video synthesis. Code announced💙

👉Review https://t.ly/06h7Q
👉Paper https://momentum-project-page-232993426383.us-central1.run.app/momentum_paper.pdf
👉Project https://momentum-project-page-232993426383.us-central1.run.app/
👉Repo TBA
😶‍🌫️ SOTA Full-Head Synthesis 😶‍🌫️

👉HyPlaneHead is the new SOTA in full-head image synthesis, delivering HQ results with significantly fewer artifacts than existing 3D-aware models. Repo announced💙

👉Review https://t.ly/WYfP3
👉Paper https://arxiv.org/pdf/2509.16748
👉Project https://lhyfst.github.io/hyplanehead/
👉Repo https://github.com/lhyfst/HyPlaneHead
🍟 AnyTouch 2 is out 🍟

👉AnyTouch 2 is a general tactile representation learning framework for diverse optical tactile sensors that unifies object-level understanding with fine-grained, force-aware dynamic perception. Repo, Model & Data💙

👉Review https://t.ly/fP4dP
👉Paper https://arxiv.org/pdf/2602.09617
👉Project https://gewu-lab.github.io/AnyTouch2/
👉Repo https://github.com/GeWu-Lab/AnyTouch2
🍌 AGENT BANANA (SOTA) 🍌

👉Agent Banana is a novel SOTA agentic system for HD, native-resolution image editing through reasoning-based NL interaction, where each edit is context-aware, logically dependent, and locally precise. Code announced💙

👉Review https://t.ly/EXaCH
👉Paper https://arxiv.org/pdf/2602.09084
👉Project https://agent-banana.github.io/
👉Repo https://github.com/taco-group/agent-banana
🛠️ IndustryShapes 6D Pose 🛠️

👉IndustryShapes by NTUA is a new RGB-D dataset of industrial tools, designed for both instance-level and novel object 6D pose estimation. Dataset available💙

👉Review https://t.ly/KKcuH
👉Paper https://arxiv.org/pdf/2602.05555
👉Project https://pose-lab.github.io/IndustryShapes/
👉Dataset https://huggingface.co/datasets/POSE-Lab/IndustryShapes