AI & ML Papers

✨VQ-Seg: Vector-Quantized Token Perturbation for Semi-Supervised Medical Image Segmentation

📝 Summary:
VQ-Seg introduces vector quantization to replace dropout with a controllable perturbation module for semi-supervised medical image segmentation. It uses a dual-branch architecture and foundation model guidance to maintain performance. VQ-Seg outperforms state-of-the-art methods on various medical...
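As a rough illustration only: the summary does not say how VQ-Seg perturbs tokens, so the sketch below shows one hypothetical way a perturbation in a quantized space can be made controllable. Features are snapped to their nearest codeword, and each resulting index is then resampled with probability `rate`. The codebook values, the `perturb_indices` helper, and the uniform resampling rule are all assumptions for illustration, not the paper's method.

```python
import random

def nearest_index(vec, codebook):
    """Return the index of the codeword closest to vec (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda k: sq_dist(vec, codebook[k]))

def perturb_indices(indices, codebook_size, rate, rng):
    """Hypothetical rule: resample each index uniformly with probability `rate`."""
    return [rng.randrange(codebook_size) if rng.random() < rate else i
            for i in indices]

# Toy 2-D codebook and features (made-up values).
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
features = [[0.1, 0.2], [0.9, 0.1], [0.2, 0.8], [0.8, 0.9]]

# Quantize: each feature becomes a discrete token index.
indices = [nearest_index(f, codebook) for f in features]

# Perturb in index space; `rate` is the single knob controlling strength.
rng = random.Random(0)
perturbed = perturb_indices(indices, len(codebook), rate=0.5, rng=rng)
quantized = [codebook[i] for i in perturbed]
```

Unlike dropout, which zeroes activations, every perturbed token here is still a valid codeword, and `rate=0.0` leaves the tokens untouched.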

🔹 Publication Date: Jan 15

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.10124
• PDF: https://arxiv.org/pdf/2601.10124
• Project Page: https://github.com/script-Yang/VQ-Seg
• Github: https://github.com/script-Yang/VQ-Seg

✨ Datasets citing this paper:
• https://huggingface.co/datasets/yscript/ACDC-PNG

==================================

For more data science resources:
✓ https://xn--r1a.website/DataScienceT

#MedicalImageSegmentation #SemiSupervisedLearning #VectorQuantization #DeepLearning #ComputerVision

🔥 GEAR: Guided End-to-End AutoRegression for Image Synthesis

💡 GEAR trains a vector-quantized tokenizer and an autoregressive generator jointly, end to end, for image synthesis. These models are usually trained in two stages: the tokenizer is trained and frozen, then the generator is trained on its output. The drawback is that the tokenizer never learns what the generator finds easy to model.

GEAR removes this limitation by training the tokenizer and generator jointly, guided by representation alignment. The key obstacle is that the tokenizer's output is non-differentiable, so gradients cannot pass through it directly. GEAR therefore reads the tokenizer output in two ways: a hard, one-hot branch trains the autoregressive generator, while a differentiable soft branch carries a representation-alignment loss that guides the tokenizer.
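The two read-outs can be sketched in a few lines. This is a minimal forward-pass illustration, not GEAR's implementation: it assumes the soft branch is a softmax over negative squared distances to the codewords, and the `dual_readout` and `mix` helpers, the `temperature` parameter, and the toy codebook are all hypothetical. Plain Python has no autograd, so the sketch only shows which quantities each branch would produce.

```python
import math

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def dual_readout(vec, codebook, temperature=1.0):
    """Return (hard one-hot, soft assignment) over the codebook for one feature."""
    dists = [sq_dist(vec, c) for c in codebook]

    # Hard branch: one-hot at the nearest codeword. The argmin is discrete,
    # so no gradient would pass through this read-out.
    k = min(range(len(codebook)), key=dists.__getitem__)
    hard = [1.0 if j == k else 0.0 for j in range(len(codebook))]

    # Soft branch (assumed form): softmax over negative distances, which is
    # smooth in `vec` and could therefore carry an alignment loss.
    logits = [-d / temperature for d in dists]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    soft = [e / total for e in exps]
    return hard, soft

def mix(weights, codebook):
    """Weighted combination of codewords under an assignment vector."""
    dim = len(codebook[0])
    return [sum(w * c[d] for w, c in zip(weights, codebook)) for d in range(dim)]

codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hard, soft = dual_readout([0.9, 0.1], codebook)

token_for_generator = hard.index(1.0)     # discrete index the generator would model
embedding_for_loss = mix(soft, codebook)  # smooth embedding for an alignment loss
```

Lowering `temperature` pushes the soft assignment toward the one-hot vector, so the two branches agree more closely at the cost of sharper gradients.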

This lets the autoregressive generator steer the tokenizer toward an index distribution that is easier to predict. As a result, the tokenizer's features become less complex, while the generator's features become more complex and more semantic. The paper reports that GEAR speeds up convergence by up to 10× relative to a strong baseline and learns better patch-level, spatially coherent features. GEAR also generalizes across different quantizers and extends to text-to-image generation. The authors report state-of-the-art results in image synthesis.

📅 Published on Jun 30

🔗 Links:
• GitHub: https://github.com/huggingface
• arXiv: https://arxiv.org/abs/2606.32039
• PDF: https://arxiv.org/pdf/2606.32039
• Project Page: https://linb203.github.io/gear

🤖 Models citing this paper:
• https://huggingface.co/BinLin203/Warmup-LFQ
• https://huggingface.co/BinLin203/Warmup-IBQ
• https://huggingface.co/BinLin203/GEAR-VQ

━━━━━━━━━━━━━━━━━━━━━━━━
📢 By: https://xn--r1a.website/PaperNexus

#ImageSynthesis #AutoRegression #VectorQuantization #EndToEndLearning #AutoregressiveGenerators
