AI & ML Papers
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Reconstruction-Guided Slot Curriculum: Addressing Object Over-Fragmentation in Video Object-Centric Learning

📝 Summary:
SlotCurri addresses video object over-fragmentation using a reconstruction-guided slot curriculum. It progressively allocates slots, employs a structure-aware loss for sharp boundaries, and uses cyclic inference for temporal consistency. This method significantly improves object decomposition.
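The idea of growing the slot budget in response to reconstruction quality can be illustrated with a toy schedule. This is only a sketch under stated assumptions, not the paper's procedure: the function name `grow_slots`, the mean-error criterion, the threshold, and the one-slot-per-stage step are all hypothetical choices made for illustration.

```python
def grow_slots(num_slots: int, recon_errors: list[float],
               threshold: float, max_slots: int) -> int:
    """Return the slot budget for the next training stage.

    Hypothetical rule: add one slot when the mean reconstruction error
    of the current stage is still above `threshold`, up to `max_slots`.
    """
    if not recon_errors:
        return num_slots
    mean_error = sum(recon_errors) / len(recon_errors)
    if mean_error > threshold and num_slots < max_slots:
        return num_slots + 1
    return num_slots


# Toy run: start with few slots and grow only while reconstruction is poor.
slots = 2
for stage_errors in ([0.9, 0.8], [0.6, 0.7], [0.2, 0.1]):
    slots = grow_slots(slots, stage_errors, threshold=0.5, max_slots=8)
```

Starting from a small budget and adding slots only when reconstruction remains poor is one way to discourage a single object from being split across many slots; the released code at the GitHub link is the reference for the actual criterion.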

🔹 Publication Date: Mar 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.22758
• PDF: https://arxiv.org/pdf/2603.22758
• Github: https://github.com/wjun0830/SlotCurri

🔹 Models citing this paper:
https://huggingface.co/WJ0830/SlotCurri

==================================

For more data science resources:
https://xn--r1a.website/DataScienceT

#VideoAI #ObjectCentricLearning #ComputerVision #DeepLearning #ObjectSegmentation
🔥 SAM 3: Segment Anything with Concepts

💡 The paper introduces Segment Anything Model 3 (SAM 3), a unified model that detects, segments, and tracks objects in images and videos from concept prompts: short noun phrases, image exemplars, or a combination of both. For each prompt, the model returns segmentation masks and unique identities for all matching object instances. It achieves state-of-the-art performance in promptable concept segmentation and tracking with a single architecture that decouples recognition from localization.

To advance promptable concept segmentation, the authors built a scalable data engine that produces a high-quality dataset with 4 million unique concept labels, including hard negatives, across images and videos. The model consists of an image-level detector and a memory-based video tracker that share a single backbone. A presence head decouples recognition from localization, which boosts detection accuracy.
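One way to picture the decoupling is a global "is this concept present at all?" score that gates the per-instance scores. The sketch below assumes the final confidence is the product of a presence probability and each candidate's own score; the function names, the product rule, and the threshold are illustrative assumptions, not the released SAM 3 API.

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def combine_scores(presence_logit: float, query_logits: list[float],
                   threshold: float = 0.5) -> list[tuple[int, float]]:
    """Gate per-candidate scores by a global presence probability.

    presence_logit: hypothetical recognition output (is the concept in the image?)
    query_logits:   hypothetical per-candidate localization outputs
    Returns (candidate index, combined score) pairs at or above `threshold`.
    """
    presence = sigmoid(presence_logit)
    scored = [(i, presence * sigmoid(q)) for i, q in enumerate(query_logits)]
    return [(i, s) for i, s in scored if s >= threshold]


# Concept clearly present: only the confident candidate survives.
kept = combine_scores(10.0, [10.0, -10.0])

# Concept judged absent: even a confident candidate is suppressed.
suppressed = combine_scores(-10.0, [10.0])
```

Under this assumption, a candidate that looks plausible locally is still rejected when the concept is judged absent from the image, which is one intuition for why hard negatives in the training data matter.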

The results show that Segment Anything Model 3 doubles the accuracy of existing systems in both image and video promptable concept segmentation and improves on previous capabilities in visual segmentation tasks. The authors also open-source Segment Anything Model 3 along with a new benchmark for promptable concept segmentation, called Segment Anything with Concepts.

The paper's main contributions are threefold: a unified architecture that achieves state-of-the-art promptable concept segmentation and tracking, a large-scale dataset of unique concept labels, and a new benchmark for evaluating promptable concept segmentation models. Together, these enable more accurate and efficient detection, segmentation, and tracking of objects in images and videos from concept prompts.


📅 Published on Nov 20, 2025

🔗 Links:
• GitHub: https://github.com/huggingface
• arXiv: https://arxiv.org/abs/2511.16719
• PDF: https://arxiv.org/pdf/2511.16719
• Project Page: https://ai.meta.com/sam3/

🤖 Models citing this paper:
https://huggingface.co/AllanVester/SAM3.1-CoreML-FP16
https://huggingface.co/AllanVester/SAM3.1-CoreML
https://huggingface.co/embedl/sam3

🚀 Spaces citing this paper:
https://huggingface.co/spaces/kith777/rag_agent

━━━━━━━━━━━━━━━━━━━━━━━━
📢 By: https://xn--r1a.website/PaperNexus

#ComputerVision #ObjectSegmentation #ConceptLearning #ImageTracking #PromptableSegmentation