davidbau/rewriting
Rewriting a Deep Generative Model, ECCV 2020 (oral). Interactive tool to directly edit the rules of a GAN to synthesize scenes with objects added, removed, or altered. Change StyleGANv2 to make extravagant eyebrows, or horses wearing hats.
Language: Python
#deep_learning #gans #graphics #hci #machine_learning #research #vision
Stars: 107 Issues: 0 Forks: 10
https://github.com/davidbau/rewriting
  
lucidrains/bottleneck-transformer-pytorch
Implementation of Bottleneck Transformer - Pytorch
Language: Python
#artificial_intelligence #attention_mechanism #deep_learning #image_classification #transformers #vision
Stars: 122 Issues: 1 Forks: 7
https://github.com/lucidrains/bottleneck-transformer-pytorch
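The core BoTNet idea is swapping the 3x3 convolution inside a ResNet bottleneck block for self-attention over the spatial grid. A minimal single-head NumPy sketch of that operation (illustrative only: the real block uses multiple heads and relative position embeddings, and all weight names here are made up):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def bottleneck_self_attention(feat, w_q, w_k, w_v):
    # Flatten the H x W feature map into a sequence of spatial tokens,
    # attend over all positions, then fold back into a feature map.
    # This is the operation that replaces the 3x3 conv in a BoT block.
    h, w, c = feat.shape
    x = feat.reshape(h * w, c)
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    attn = softmax(q @ k.T / np.sqrt(k.shape[-1]))  # (hw, hw) attention map
    return (attn @ v).reshape(h, w, -1)

rng = np.random.default_rng(0)
feat = rng.normal(size=(4, 4, 8))                      # toy 4x4 map, 8 channels
w_q, w_k, w_v = (rng.normal(size=(8, 8)) * 0.1 for _ in range(3))
out = bottleneck_self_attention(feat, w_q, w_k, w_v)
print(out.shape)  # (4, 4, 8)
```

Because attention is global, every output position can draw on the whole feature map, which is why the paper applies it only in the low-resolution final stage of the ResNet.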
  
zihangJiang/TokenLabeling
Pytorch implementation of "Training a 85.4% Top-1 Accuracy Vision Transformer with 56M Parameters on ImageNet"
Language: Python
#imagenet #transformer #vision
Stars: 110 Issues: 1 Forks: 6
https://github.com/zihangJiang/TokenLabeling
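Token labeling augments the usual class-token loss with a dense per-patch objective: each patch token gets its own soft, location-specific label from a pre-trained annotator, and a token-level cross-entropy is averaged over patches. A hedged NumPy sketch of the combined objective (the 0.5 weighting and all names are illustrative, not the paper's exact recipe):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def token_labeling_loss(cls_logits, cls_label, token_logits, token_labels, beta=0.5):
    # Class-token cross-entropy (the standard ViT classification loss) ...
    cls_loss = -np.log(softmax(cls_logits)[cls_label])
    # ... plus a dense per-patch cross-entropy against soft token labels.
    token_log_p = np.log(softmax(token_logits, axis=-1))
    token_loss = -(token_labels * token_log_p).sum(axis=-1).mean()
    return cls_loss + beta * token_loss

# Toy example: 4 patch tokens, 3 classes, one-hot token labels.
cls_logits = np.array([2.0, 0.1, -1.0])
token_logits = np.tile(cls_logits, (4, 1))
token_labels = np.tile(np.eye(3)[0], (4, 1))
loss = token_labeling_loss(cls_logits, 0, token_logits, token_labels)
print(round(float(loss), 4))
```

The dense term gives the model a training signal at every spatial location rather than only through the single class token.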
  
lucidrains/mlp-mixer-pytorch
An All-MLP solution for Vision, from Google AI
Language: Python
#deep_learning #vision
Stars: 159 Issues: 1 Forks: 8
https://github.com/lucidrains/mlp-mixer-pytorch
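A Mixer layer alternates two small MLPs: one applied across the patch dimension (token mixing) and one across the channel dimension (channel mixing), each behind a pre-LayerNorm with a residual connection. A minimal NumPy sketch (ReLU stands in for the paper's GELU; weight shapes and names are illustrative):

```python
import numpy as np

def layer_norm(x):
    mu = x.mean(-1, keepdims=True)
    sd = x.std(-1, keepdims=True) + 1e-6
    return (x - mu) / sd

def mlp(x, w1, w2):
    return np.maximum(x @ w1, 0) @ w2

def mixer_block(x, tok_w1, tok_w2, ch_w1, ch_w2):
    # Token mixing: transpose so the MLP acts across the patch axis,
    # letting spatial locations exchange information without attention.
    y = x + mlp(layer_norm(x).T, tok_w1, tok_w2).T
    # Channel mixing: an ordinary per-token MLP across the channel axis.
    return y + mlp(layer_norm(y), ch_w1, ch_w2)

rng = np.random.default_rng(1)
patches, channels, hidden = 16, 32, 64
x = rng.normal(size=(patches, channels))
tok = (rng.normal(size=(patches, hidden)) * 0.1, rng.normal(size=(hidden, patches)) * 0.1)
ch = (rng.normal(size=(channels, hidden)) * 0.1, rng.normal(size=(hidden, channels)) * 0.1)
out = mixer_block(x, *tok, *ch)
print(out.shape)  # (16, 32)
```

The transpose in the token-mixing step is the whole trick: the same two matrices mix all spatial positions, replacing self-attention with a fixed-size MLP.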
  
rishikksh20/MLP-Mixer-pytorch
Unofficial implementation of MLP-Mixer: An all-MLP Architecture for Vision
Language: Python
#computer_vision #transformer #vision #image_classification #mlp_vision
Stars: 101 Issues: 0 Forks: 9
https://github.com/rishikksh20/MLP-Mixer-pytorch
  
hustvl/YOLOS
You Only Look at One Sequence (https://arxiv.org/abs/2106.00666)
Language: Python
#computer_vision #transformer #object_detection #vision_transformer
Stars: 128 Issues: 0 Forks: 4
https://github.com/hustvl/YOLOS
  
czczup/ViT-Adapter
Vision Transformer Adapter for Dense Predictions
#adapter #object_detection #semantic_segmentation #vision_transformer
Stars: 89 Issues: 1 Forks: 3
https://github.com/czczup/ViT-Adapter
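ViT-Adapter's actual injector/extractor modules exchange information between a plain ViT backbone and a convolutional spatial-prior branch via cross-attention; for intuition only, the generic bottleneck-adapter pattern the name alludes to (down-project, nonlinearity, up-project, residual) can be sketched in NumPy like this (all shapes and names are illustrative, not the repo's API):

```python
import numpy as np

def adapter(x, w_down, w_up):
    # Generic bottleneck adapter: project token features into a small
    # dimension, apply a nonlinearity, project back up, and add a
    # residual so the module starts out close to the identity.
    return x + np.maximum(x @ w_down, 0) @ w_up

rng = np.random.default_rng(2)
tokens = rng.normal(size=(16, 32))          # 16 tokens, width 32
w_down = rng.normal(size=(32, 8)) * 0.1     # 32 -> 8 bottleneck
w_up = rng.normal(size=(8, 32)) * 0.1
out = adapter(tokens, w_down, w_up)
print(out.shape)  # (16, 32)
```

With the up-projection initialized near zero the adapter is a no-op at the start of training, so the pretrained backbone's behavior is preserved until the new parameters learn something useful.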
  
OFA-Sys/Chinese-CLIP
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
Language: Python
#chinese #computer_vision #multi_modal_learning #nlp #pytorch #vision_and_language_pre_training
Stars: 80 Issues: 0 Forks: 7
https://github.com/OFA-Sys/Chinese-CLIP
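As in the original CLIP, retrieval reduces to cosine similarity between L2-normalized image and text embeddings with a softmax over candidates. A small NumPy sketch of that scoring step (the temperature value and all names are illustrative; Chinese-CLIP's own API differs):

```python
import numpy as np

def clip_retrieve(image_emb, text_embs, temperature=0.07):
    # Normalize both sides so the dot product is cosine similarity,
    # scale by a temperature, and softmax over the text candidates.
    img = image_emb / np.linalg.norm(image_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    logits = txt @ img / temperature
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return int(p.argmax()), p

image = np.array([1.0, 0.2, 0.0])
texts = np.array([[0.9, 0.1, 0.0],    # well-aligned caption
                  [0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
best, probs = clip_retrieve(image, texts)
print(best)  # 0
```

The small temperature sharpens the softmax, which is what makes the contrastive objective focus on the single matching pair among many candidates.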
  
NVlabs/prismer
The implementation of "Prismer: A Vision-Language Model with An Ensemble of Experts".
Language: Python
#image_captioning #language_model #multi_modal_learning #multi_task_learning #vision_and_language #vision_language_model #vqa
Stars: 479 Issues: 6 Forks: 21
https://github.com/NVlabs/prismer
  
open-mmlab/Multimodal-GPT
Multimodal-GPT
Language: Python
#flamingo #gpt #gpt_4 #llama #multimodal #transformer #vision_and_language
Stars: 244 Issues: 1 Forks: 12
https://github.com/open-mmlab/Multimodal-GPT
  
OFA-Sys/ONE-PEACE
A general representation model across vision, audio, and language modalities.
Language: Python
#audio_language #foundation_models #multimodal #representation_learning #vision_language
Stars: 185 Issues: 2 Forks: 5
https://github.com/OFA-Sys/ONE-PEACE
  
roboflow/multimodal-maestro
Effective prompting for Large Multimodal Models like GPT-4 Vision or LLaVA. 🔥
Language: Python
#cross_modal #gpt_4 #gpt_4_vision #instance_segmentation #llava #lmm #multimodality #object_detection #prompt_engineering #segment_anything #vision_language_model #visual_prompting
Stars: 367 Issues: 1 Forks: 23
https://github.com/roboflow/multimodal-maestro
  