enhuiz/vall-e
An unofficial PyTorch implementation of the audio LM VALL-E, WIP
Language: Python
#audio_lm #pytorch #text_to_speech #tts #vall_e #valle
Stars: 212 Issues: 2 Forks: 32
https://github.com/enhuiz/vall-e
lukasHoel/text2room
Text2Room generates textured 3D meshes from a given text prompt using 2D text-to-image models (ICCV 2023).
Language: Python
#3d_generation #diffusion_models #mesh_generation #text_to_image
Stars: 426 Issues: 1 Forks: 22
https://github.com/lukasHoel/text2room
OpenLMLab/MOSS
An open-source tool-augmented conversational language model from Fudan University
Language: Python
#chatgpt #deep_learning #dialogue_systems #large_language_models #natural_language_processing #text_generation
Stars: 800 Issues: 11 Forks: 49
https://github.com/OpenLMLab/MOSS
thu-ml/prolificdreamer
ProlificDreamer: High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation (NeurIPS 2023 Spotlight)
#diffusion_model #dreamfusion #nerf #prolificdreamer #stablediffusion #text_to_3d
Stars: 255 Issues: 3 Forks: 1
https://github.com/thu-ml/prolificdreamer
omerbt/TokenFlow
Official PyTorch implementation of "TokenFlow: Consistent Diffusion Features for Consistent Video Editing" (ICLR 2024)
#stable_diffusion #text_to_image #text_to_video #tokenflow #video_editing
Stars: 310 Issues: 4 Forks: 13
https://github.com/omerbt/TokenFlow
google/break-a-scene
Official implementation for "Break-A-Scene: Extracting Multiple Concepts from a Single Image" [SIGGRAPH Asia 2023]
Language: Python
#deep_learning #diffusion_models #generative_ai #multimodal #text_to_image
Stars: 164 Issues: 1 Forks: 4
https://github.com/google/break-a-scene
dreamgaussian/dreamgaussian
[ICLR 2024 Oral] Generative Gaussian Splatting for Efficient 3D Content Creation
Language: Python
#image_to_3d #text_to_3d
Stars: 307 Issues: 2 Forks: 17
https://github.com/dreamgaussian/dreamgaussian
mihirp1998/AlignProp
AlignProp uses direct reward backpropagation to align large-scale text-to-image diffusion models. The method is 25x more sample- and compute-efficient than reinforcement learning methods (PPO) for finetuning Stable Diffusion.
Language: Python
#alignment #diffusion_models #reinforcement_learning #stable_diffusion #text_to_image
Stars: 104 Issues: 4 Forks: 1
https://github.com/mihirp1998/AlignProp
hustvl/GaussianDreamer
GaussianDreamer: Fast Generation from Text to 3D Gaussians by Bridging 2D and 3D Diffusion Models (CVPR 2024)
Language: Python
#aigc #computer_vision #diffusion_models #dreamfusion #gaussian_splatting #nerf #radiance_field #text_to_3d
Stars: 134 Issues: 0 Forks: 1
https://github.com/hustvl/GaussianDreamer
netease-youdao/EmotiVoice
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
Language: Python
#ai #deep_learning #emotion #emotivoice #multi_speaker #prompt #python #pytorch #speech #speech_synthesis #style #text_to_speech #tts
Stars: 432 Issues: 3 Forks: 38
https://github.com/netease-youdao/EmotiVoice
baaivision/GeoDream
GeoDream: Disentangling 2D and Geometric Priors for High-Fidelity and Consistent 3D Generation
Language: Python
#3d #3d_aigc #3d_generation #text_to_3d
Stars: 244 Issues: 1 Forks: 4
https://github.com/baaivision/GeoDream
TianxingWu/FreeInit
[ECCV 2024] FreeInit: Bridging Initialization Gap in Video Diffusion Models
Language: Python
#aigc #text_to_video #video_diffusion_model #video_generation
Stars: 162 Issues: 4 Forks: 7
https://github.com/TianxingWu/FreeInit
YangLing0818/RPG-DiffusionMaster
[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)
Language: Python
#image_editing #large_language_models #multimodal_large_language_models #text_to_image_diffusion
Stars: 272 Issues: 5 Forks: 14
https://github.com/YangLing0818/RPG-DiffusionMaster
reqable/re-editor
Re-Editor is a powerful lightweight text and code editor widget.
Language: Dart
#code_editor #flutter #syntax_highlighting #text_editor
Stars: 315 Issues: 0 Forks: 18
https://github.com/reqable/re-editor
3DTopia/LGM
[ECCV 2024 Oral] LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation.
Language: Python
#gaussian_splatting #image_to_3d #text_to_3d
Stars: 308 Issues: 7 Forks: 15
https://github.com/3DTopia/LGM
PKU-YuanGroup/MagicTime
MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators
Language: Python
#diffusion_models #long_video_generation #metamorphic_video_generation #open_sora_plan #text_to_video #time_lapse #time_lapse_dataset #video_generation
Stars: 281 Issues: 4 Forks: 16
https://github.com/PKU-YuanGroup/MagicTime
AdityaNG/kan-gpt
The PyTorch implementation of Generative Pre-trained Transformers (GPTs) using Kolmogorov-Arnold Networks (KANs) for language modeling
Language: Python
#gpt #kanformers #kolmogorov_arnold_networks #kolmogorov_arnold_representation #llm #text_generation #transformers
Stars: 217 Issues: 2 Forks: 11
https://github.com/AdityaNG/kan-gpt
jishengpeng/WavTokenizer
SOTA discrete acoustic codec models with 40 tokens per second for audio language modeling
Language: Python
#acoustic #audio_representation #codec #dac #encodec #gpt4o #music_representation_learning #semantic #soundstream #speech_language_model #speech_representation #text_to_speech
Stars: 332 Issues: 6 Forks: 20
https://github.com/jishengpeng/WavTokenizer
lucasnewman/f5-tts-mlx
Implementation of F5-TTS in MLX
Language: Python
#diffusion_transformer #flow_matching #mlx #text_to_speech #tts
Stars: 193 Issues: 2 Forks: 17
https://github.com/lucasnewman/f5-tts-mlx
edwko/OuteTTS
Interface for OuteTTS models.
Language: Python
#gguf #llama #text_to_speech #transformers #tts
Stars: 278 Issues: 6 Forks: 13
https://github.com/edwko/OuteTTS