myshell-ai/MeloTTS
High-quality multi-lingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese and Korean.
Language:Python
Total stars: 127
Stars trend:
26 Feb 2024
2pm ▊ +6
3pm ▎ +2
4pm +0
5pm +0
6pm +0
7pm +0
8pm ▏ +1
9pm +0
10pm ▌ +4
11pm ▋ +5
27 Feb 2024
12am █▌ +12
1am ███▌ +28
#python
#chinese, #english, #french, #japanese, #korean, #multilingual, #spanish, #texttospeech, #tts
espeak-ng/espeak-ng
eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.
Language:C
Total stars: 2997
Stars trend:
2 May 2024
12am ▏ +1
1am +0
2am ██ +16
3am █▌ +12
4am ██▋ +21
5am ███▍ +27
#c
#android, #espeak, #espeakng, #speechsynthesis, #texttospeech
6drf21e/ChatTTS_colab
🚀 One-click deployment (including offline integration package)! Based on ChatTTS, it supports timbre drawing, long audio generation and role-based reading. Simple and easy to use, no complicated installation required.
Language:Python
Total stars: 306
Stars trend:
4 Jun 2024
6pm ▍ +3
7pm ▏ +1
8pm ▏ +1
9pm ▏ +1
10pm ▏ +1
11pm █▏ +9
5 Jun 2024
12am ▉ +7
1am ██▎ +18
2am █ +8
3am █▌ +12
4am █▏ +9
5am █▏ +9
#python
#chattts, #colabnotebook, #texttospeech
k2-fsa/sherpa-onnx
Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime without Internet connection. Supports embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift
Language:C++
Total stars: 1120
Stars trend:
6 Jun 2024
4pm ▏ +1
5pm ▏ +1
6pm +0
7pm +0
8pm +0
9pm +0
10pm +0
11pm +0
7 Jun 2024
12am ▍ +3
1am ████▌ +36
2am ███▎ +26
3am ██ +16
#cplusplus
#aarch64, #android, #arm32, #asr, #cpp, #csharp, #dotnet, #ios, #linux, #macos, #mfc, #onnx, #openkylin, #raspberrypi, #riscv, #speechtotext, #texttospeech, #vits, #windows
lenML/ChatTTS-Forge
🍦 ChatTTS-Forge is a project developed around the TTS generation model ChatTTS, implementing an API Server and a Gradio-based WebUI.
Language:Python
Total stars: 249
Stars trend:
11 Jun 2024
5am ▎ +2
6am ▋ +5
7am ▍ +3
8am ▋ +5
9am ██▏ +17
10am █▌ +12
11am ▋ +5
12pm ▉ +7
1pm ▋ +5
2pm ▊ +6
3pm ▊ +6
4pm ▍ +3
#python
#agent, #chattts, #chatttsforge, #colab, #gpt, #llm, #ssml, #texttospeech, #tts
ictnlp/StreamSpeech
StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
Language:Python
Total stars: 389
Stars trend:
17 Jun 2024
9pm ▏ +1
10pm ▏ +1
11pm ▎ +2
18 Jun 2024
12am +0
1am ▋ +5
2am ▍ +3
3am █▍ +11
4am ███ +24
5am █▋ +13
6am █ +8
7am ▉ +7
#python
#allinone, #asr, #audioprocessing, #machinetranslation, #nonautoregressive, #seamless, #simultaneoustranslation, #speech, #speechenhancement, #speechprocessing, #speechrecognition, #speechsynthesis, #speechtotext, #speechtranslation, #streamingaudio, #texttoaudio, #texttospeech, #translation, #tts, #voice
DigitalPhonetics/IMS-Toucan
Multilingual and Controllable Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart.
Language:Python
Total stars: 556
Stars trend:
19 Jun 2024
8pm ▍ +3
9pm ██▌ +20
10pm ██▊ +22
11pm ▊ +6
20 Jun 2024
12am █▋ +13
1am █▋ +13
#python
#deeplearning, #pytorch, #speech, #speechprocessing, #speechsynthesis, #texttospeech, #toolkit, #tts
mezbaul-h/june
Local voice assistant combining the power of Ollama, Hugging Face Transformers, and the Coqui TTS Toolkit
Language:Python
Total stars: 137
Stars trend:
21 Jun 2024
12am ▍ +3
1am ██ +16
2am █▊ +14
3am █ +8
4am █▊ +14
5am ██▎ +18
6am ██▎ +18
7am ██▍ +19
#python
#ai, #assistantchatbots, #chatbot, #cliapp, #commandlinetool, #coquitts, #huggingface, #largelanguagemodels, #llm, #python, #speechrecognition, #speechtotext, #texttospeech, #whisper
Camb-ai/MARS5-TTS
MARS5 speech model (TTS) from CAMB.AI
Language:Python
Total stars: 738
Stars trend:
24 Jun 2024
7pm ▏ +1
8pm ██▎ +18
9pm ████▉ +39
10pm ████▌ +36
#python
#prosody, #speech, #speechsynthesis, #texttospeech, #voicecloneai, #voicecloning
rany2/edge-tts
Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
Language:Python
Total stars: 4321
Stars trend:
29 Jun 2024
12pm ▍ +3
1pm █▉ +15
2pm █▋ +13
3pm █▌ +12
4pm ██▏ +17
5pm ▊ +6
6pm ▍ +3
7pm ▍ +3
8pm ▏ +1
9pm +0
10pm ▌ +4
#python
#speechsynthesis, #texttospeech, #tts
abus-aikorea/voice-pro
Gradio WebUI for whisper, faster-whisper, whisper-timestamped. Supports YouTube Downloader, Vocal Remover, Transcription, Text-to-Speech, and Translation.
Language:Python
Total stars: 385
Stars trend:
9 Nov 2024
10pm ▏ +1
11pm ▌ +4
10 Nov 2024
12am ▎ +2
1am █▏ +9
2am ██▏ +17
3am █▎ +10
4am ▉ +7
5am ▊ +6
6am ▍ +3
7am ▌ +4
8am ▌ +4
9am █ +8
#python
#asr, #demucs, #fasterwhisper, #gradio, #speechrecognition, #speechsynthesis, #speechtotext, #stt, #subtitles, #texttospeech, #transcription, #translate, #translation, #translator, #tts, #uvr5, #webui, #whisper, #ytdlp
abus-aikorea/voice-pro
Comprehensive Gradio WebUI for audio processing, powered by Whisper engines (Whisper, Faster-Whisper, Whisper-Timestamped). Features Voice Changer, zero-shot Voice Cloning (E2, F5-TTS), YouTube downloading, vocal isolation (UVR5), Text-to-Speech (Edge-TTS), and multi-language translation. Perfect for content creators and developers.
Language:Python
Total stars: 2643
Stars trend:
21 Jan 2025
6am ██▍ +19
7am ▎ +2
8am ▌ +4
9am ▍ +3
10am +0
11am ▋ +5
12pm ▌ +4
1pm ▊ +6
2pm █▋ +13
3pm ▉ +7
4pm ▉ +7
5pm ▊ +6
#python
#audiobook, #fasterwhisper, #gradio, #podcasts, #speechrecognition, #speechsynthesis, #speechtotext, #subtitles, #texttospeech, #transcription, #translator, #tts, #voicecloning, #webui, #whisper, #ytdlp
open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
Language:Python
Total stars: 8221
Stars trend:
21 Jan 2025
10am ▍ +3
11am ▋ +5
12pm █▎ +10
1pm ▌ +4
2pm █ +8
3pm ▊ +6
4pm █▏ +9
5pm █▋ +13
6pm ▌ +4
7pm ▋ +5
8pm ▎ +2
9pm █ +8
#python
#audiogeneration, #audiosynthesis, #audioldm, #audit, #emilia, #fastspeech2, #maskgct, #musicgeneration, #naturalspeech2, #singingvoiceconversion, #speechsynthesis, #texttoaudio, #texttospeech, #valle, #vits, #vocoder, #voiceconversion
rany2/edge-tts
Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
Language:Python
Total stars: 6978
Stars trend:
23 Jan 2025
2am ▏ +1
3am ▊ +6
4am █▊ +14
5am █▉ +15
6am █▍ +11
7am ██ +16
8am █▏ +9
9am █▎ +10
#python
#speechsynthesis, #texttospeech, #tts
RVC-Boss/GPT-SoVITS
As little as 1 minute of voice data can be used to train a good TTS model! (few-shot voice cloning)
Language:Python
Total stars: 39404
Stars trend:
28 Jan 2025
12am █▌ +12
1am ▉ +7
2am ▋ +5
3am ▊ +6
4am ▋ +5
5am ▋ +5
6am ▉ +7
7am ▌ +4
8am ▉ +7
9am ▉ +7
10am █ +8
11am ▌ +4
#python
#texttospeech, #tts, #vits, #voiceclone, #voicecloneai, #voicecloning
freddyaboulton/fastrtc
The Python library for real-time communication
Language:Python
Total stars: 1312
Stars trend:
28 Feb 2025
6pm █▏ +9
7pm █▍ +11
8pm █▎ +10
9pm ▉ +7
10pm █▍ +11
11pm █▏ +9
1 Mar 2025
12am █▍ +11
1am █▍ +11
2am █▍ +11
3am █▋ +13
4am █▊ +14
5am ██ +16
#python
#artificialintelligence, #llm, #python, #realtime, #speechtotext, #texttospeech
nari-labs/dia
A TTS model capable of generating ultra-realistic dialogue in one pass.
Language:Python
Total stars: 352
Stars trend:
21 Apr 2025
3pm ▍ +3
4pm +0
5pm ██████▎ +50
6pm ██████████▎ +82
7pm ████████████████▍ +131
#python
#ai, #openweight, #texttospeech
Blaizzy/mlx-audio
A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.
Language:Python
Total stars: 1009
Stars trend:
8 May 2025
5am ▍ +3
6am ▍ +3
7am ▏ +1
8am ▍ +3
9am +0
10am ▋ +5
11am █▍ +11
12pm █▍ +11
1pm ▍ +3
2pm █▊ +14
3pm █▊ +14
4pm █ +8
#python
#applesilicon, #audioprocessing, #mlx, #multimodal, #speechrecognition, #speechsynthesis, #speechtotext, #texttospeech, #transformers
Capsize-Games/airunner
Offline inference engine for art, real-time voice conversations, LLM-powered chatbots and automated workflows
Language:Python
Total stars: 948
Stars trend:
17 May 2025
8am ▎ +2
9am ▎ +2
10am ▍ +3
11am █▏ +9
12pm █▏ +9
1pm █▏ +9
2pm █▎ +10
3pm █▏ +9
4pm ▋ +5
5pm ▋ +5
6pm █ +8
7pm ▊ +6
#python
#ai, #aiart, #art, #assetgenerator, #chatbot, #deeplearning, #desktopapp, #imagegeneration, #mistral, #multimodal, #privacy, #pygame, #pyside6, #python, #selfhosted, #speechtotext, #stablediffusion, #texttoimage, #texttospeech, #texttospeechapp
abus-aikorea/voice-pro
Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.
Language:Python
Total stars: 3929
Stars trend:
27 Jun 2025
8am ▍ +3
9am █▊ +14
10am ▊ +6
11am █▉ +15
12pm █▉ +15
1pm █▉ +15
2pm █▊ +14
3pm ▉ +7
4pm █ +8
5pm ▌ +4
#python
#audiobook, #fasterwhisper, #gradio, #karaoke, #podcasts, #speechrecognition, #speechsynthesis, #speechtotext, #subtitles, #texttospeech, #transcription, #translator, #tts, #voicecloning, #voiceconversion, #webui, #whisper, #whisperx, #ytdlp