GitHub repos – Telegram

26.3K subscribers

Welcome to GitHub repos. Here you'll find valuable information on the latest trending projects. Subscribe to stay informed and gain insights from the thriving GitHub community.

netease-youdao/EmotiVoice
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
Language: Python
#ai #deep_learning #emotion #emotivoice #multi_speaker #prompt #python #pytorch #speech #speech_synthesis #style #text_to_speech #tts
Stars: 432 Issues: 3 Forks: 38
https://github.com/netease-youdao/EmotiVoice

alesaccoia/VoiceStreamAI
Near-Realtime audio transcription using self-hosted Whisper and WebSocket in Python/JS
Language: Python
#ai #speech_recognition #speech_to_text #websocket
Stars: 139 Issues: 2 Forks: 13
https://github.com/alesaccoia/VoiceStreamAI

jishengpeng/WavTokenizer
SOTA discrete acoustic codec models with 40 tokens per second for audio language modeling
Language: Python
#acoustic #audio_representation #codec #dac #encodec #gpt4o #music_representation_learning #semantic #soundstream #speech_language_model #speech_representation #text_to_speech
Stars: 332 Issues: 6 Forks: 20
https://github.com/jishengpeng/WavTokenizer

GitHub - jishengpeng/WavTokenizer: [ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling

ictnlp/LLaMA-Omni
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
Language: Python
#large_language_models #multimodal_large_language_models #speech_interaction #speech_language_model #speech_to_speech #speech_to_text
Stars: 274 Issues: 1 Forks: 16
https://github.com/ictnlp/LLaMA-Omni

amanvirparhar/chaplin
A real-time silent speech recognition tool.
Language: Python
#auto_avsr #avsr #llm #ollama #speech_recognition #speech_to_text #vsr
Stars: 279 Issues: 2 Forks: 22
https://github.com/amanvirparhar/chaplin

FunAudioLLM/Fun-ASR
Fun-ASR is an end-to-end speech recognition large model launched by Tongyi Lab.
Language: Python
#audio #audio_language_model #audio_understanding #fun_asr #multimodal_large_language_models #pytorch #speaker_diarization #speech_recognition
Stars: 264 Issues: 4 Forks: 8
https://github.com/FunAudioLLM/Fun-ASR
