ali-vilab/dreamtalk
Official implementations for paper: DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models
Language: Python
#audio_visual_learning #face_animation #talking_head #video_generation
Stars: 217 Issues: 7 Forks: 20
https://github.com/ali-vilab/dreamtalk
  
Lessica/TrollRecorder
WIP: A simple audio recorder for TrollStore.
Language: Objective-C++
#audio_recorder #ios #jailbreak #trollstore #tweak
Stars: 282 Issues: 1 Forks: 10
https://github.com/Lessica/TrollRecorder
  
jishengpeng/WavTokenizer
SOTA discrete acoustic codec models with 40 tokens per second for audio language modeling
Language: Python
#acoustic #audio_representation #codec #dac #encodec #gpt4o #music_representation_learning #semantic #soundstream #speech_language_model #speech_representation #text_to_speech
Stars: 332 Issues: 6 Forks: 20
https://github.com/jishengpeng/WavTokenizer
  
antgroup/echomimic_v2
EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation
Language: Python
#audio_driven_portrait_animations #audio_driven_talking_face #human_animation #talking_face_generation #talking_head
Stars: 307 Issues: 5 Forks: 28
https://github.com/antgroup/echomimic_v2
  
Tencent/HunyuanCustom
HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation
Language: Python
#audio_driven #diffusion_models #image_to_video #image_to_video_generation #video_editing #video_generation
Stars: 360 Issues: 4 Forks: 14
https://github.com/Tencent/HunyuanCustom
  
wildminder/ComfyUI-VoxCPM
ComfyUI node for highly expressive speech and realistic zero-shot voice cloning
Language: Python
#ai_voice #audio #comfyui_node #t2s #text_to_speech #tts #voice_cloning #voice_generation
Stars: 198 Issues: 2 Forks: 21
https://github.com/wildminder/ComfyUI-VoxCPM
  
  