lucidrains/audiolm-pytorch
Implementation of AudioLM, a Language Modeling Approach to Audio Generation out of Google Research, in Pytorch
Language: Python
#artificial_intelligence #attention_mechanisms #audio_synthesis #deep_learning #transformers
Stars: 121 Issues: 1 Forks: 1
https://github.com/lucidrains/audiolm-pytorch
  
🔥2
  enhuiz/vall-e
An unofficial PyTorch implementation of the audio LM VALL-E, WIP
Language: Python
#audio_lm #pytorch #text_to_speech #tts #vall_e #valle
Stars: 212 Issues: 2 Forks: 32
https://github.com/enhuiz/vall-e
  
👍5😐1
  jafarlihi/sysm
sysm makes your system play custom sounds when any configured system or external event happens
Language: C++
#audio #linux #music #system_monitor #system_monitoring
Stars: 160 Issues: 0 Forks: 6
https://github.com/jafarlihi/sysm
  
GitHub
  
  GitHub - h2337/sysm: sysm makes your system play custom sounds when any configured system or external event happens
  sysm makes your system play custom sounds when any configured system or external event happens - h2337/sysm
👍3😐1
  archinetai/audio-ai-timeline
A timeline of the latest AI models for audio generation, starting in 2023!
#artificial_intelligence #audio_generation #machine_learning
Stars: 319 Issues: 0 Forks: 7
https://github.com/archinetai/audio-ai-timeline
  
🤔6👍5
  samim23/polymath
Convert any music library into a music production sample-library with ML
Language: Python
#audio #machine_learning #ml #music #python
Stars: 725 Issues: 1 Forks: 39
https://github.com/samim23/polymath
  
🤯7❤2🔥2
  StanGirard/quiver
Dump all your files and thoughts into your GenerativeAI brain and chat with it
Language: Python
#audio #chat #chatgpt #csv #embeddings #generativeai #obsidian #pdf #second_brain #vectorstore #whisper
Stars: 185 Issues: 6 Forks: 18
https://github.com/StanGirard/quiver
  
GitHub
  
  GitHub - QuivrHQ/quivr: Opinionated RAG for integrating GenAI in your apps 🧠 Focus on your product rather than the RAG. Easy integration…
  Opinionated RAG for integrating GenAI in your apps 🧠 Focus on your product rather than the RAG. Easy integration in existing products with customisation! Any LLM: GPT4, Groq, Llama. Any Vectorstor...
👍1
  lucidrains/soundstorm-pytorch
Implementation of SoundStorm, Efficient Parallel Audio Generation from Google Deepmind, in Pytorch
Language: Python
#artificial_intelligence #attention_mechanism #audio_generation #deep_learning #non_autoregressive #transformers
Stars: 181 Issues: 0 Forks: 6
https://github.com/lucidrains/soundstorm-pytorch
  
😁1
  OFA-Sys/ONE-PEACE
A general representation model across vision, audio, language modalities.
Language: Python
#audio_language #foundation_models #multimodal #representation_learning #vision_language
Stars: 185 Issues: 2 Forks: 5
https://github.com/OFA-Sys/ONE-PEACE
  
GitHub
  
  GitHub - OFA-Sys/ONE-PEACE: A general representation model across vision, audio, language modalities. Paper: ONE-PEACE: Exploring…
  A general representation model across vision, audio, language modalities. Paper: ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities - OFA-Sys/ONE-PEACE
  VASTDynamics/Vaporizer2
Vaporizer2 hybrid wavetable additive / subtractive VST / AU / AAX synthesizer / sampler workstation plugin
Language: C++
#aax #audio #audiounit_plugins #cpp #daw #music #plugin #sampler #synthesizer #vst #vst3 #vst3_plugin #wavetable
Stars: 186 Issues: 5 Forks: 9
https://github.com/VASTDynamics/Vaporizer2
  
🔥1
  huggingface/distil-whisper
#audio #speech_recognition #whisper
Stars: 261 Issues: 2 Forks: 9
https://github.com/huggingface/distil-whisper
  
GitHub
  
  GitHub - huggingface/distil-whisper: Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word…
  Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate. - huggingface/distil-whisper
  ZiqiaoPeng/SyncTalk
This is the official source for our paper "SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis"
#audio_driven_talking_face #talking_face #talking_face_generation #talking_head
Stars: 180 Issues: 5 Forks: 2
https://github.com/ZiqiaoPeng/SyncTalk
  
GitHub
  
  GitHub - ZiqiaoPeng/SyncTalk: [CVPR 2024] This is the official source for our paper "SyncTalk: The Devil is in the Synchronization…
  [CVPR 2024] This is the official source for our paper "SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis" - ZiqiaoPeng/SyncTalk
  TuneNN/TuneNN
A transformer-based network model for pitch detection
Language: Python
#audio #machine_learning #music #pitch_detection #pitch_estimation
Stars: 142 Issues: 0 Forks: 3
https://github.com/TuneNN/TuneNN
  
👍1
  ali-vilab/dreamtalk
Official implementations for paper: DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models
Language: Python
#audio_visual_learning #face_animation #talking_head #video_generation
Stars: 217 Issues: 7 Forks: 20
https://github.com/ali-vilab/dreamtalk
  
  Lessica/TrollRecorder
WIP: A simple audio recorder for TrollStore.
Language: Objective-C++
#audio_recorder #ios #jailbreak #trollstore #tweak
Stars: 282 Issues: 1 Forks: 10
https://github.com/Lessica/TrollRecorder
  
GitHub
  
  GitHub - Lessica/TrollRecorder: (i18n/CLI) Not the first, but the best phone call recorder with TrollStore.
  (i18n/CLI) Not the first, but the best phone call recorder with TrollStore. - Lessica/TrollRecorder
👍5
  jishengpeng/WavTokenizer
SOTA discrete acoustic codec models with 40 tokens per second for audio language modeling
Language: Python
#acoustic #audio_representation #codec #dac #encodec #gpt4o #music_representation_learning #semantic #soundstream #speech_language_model #speech_representation #text_to_speech
Stars: 332 Issues: 6 Forks: 20
https://github.com/jishengpeng/WavTokenizer
  
GitHub
  
  GitHub - jishengpeng/WavTokenizer: [ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language…
  [ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling  - GitHub - jishengpeng/WavTokenizer: [ICLR 2025] SOTA discrete acoustic codec models with 4...
  antgroup/echomimic_v2
EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation
Language: Python
#audio_driven_portrait_animations #audio_driven_talking_face #human_animation #talking_face_generation #talking_head
Stars: 307 Issues: 5 Forks: 28
https://github.com/antgroup/echomimic_v2
  
GitHub
  
  GitHub - antgroup/echomimic_v2: [CVPR 2025] EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation
  [CVPR 2025] EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation - antgroup/echomimic_v2
  Tencent/HunyuanCustom
HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation
Language: Python
#audio_driven #diffusion_models #image_to_video #image_to_video_generation #video_editing #video_generation
Stars: 360 Issues: 4 Forks: 14
https://github.com/Tencent/HunyuanCustom
  
GitHub
  
  GitHub - Tencent-Hunyuan/HunyuanCustom: HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation
  HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation - Tencent-Hunyuan/HunyuanCustom
❤1
  wildminder/ComfyUI-VoxCPM
ComfyUI node for highly expressive speech and realistic zero-shot voice cloning
Language: Python
#ai_voice #audio #comfyui_node #t2s #text_to_speech #tts #voice_cloning #voice_generation
Stars: 198 Issues: 2 Forks: 21
https://github.com/wildminder/ComfyUI-VoxCPM
  
❤2
  