GitHub repos – Telegram

26.2K subscribers

18 photos

2 videos

11.6K links

Welcome to GitHub repos. Here you'll find valuable information on the latest trending projects. Subscribe to stay informed and gain insights from the thriving GitHub community.

huggingface/distil-whisper
#audio #speech_recognition #whisper
Stars: 261 Issues: 2 Forks: 9
https://github.com/huggingface/distil-whisper

Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.

2.19K views, 10:20

netease-youdao/EmotiVoice
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
Language: Python
#ai #deep_learning #emotion #emotivoice #multi_speaker #prompt #python #pytorch #speech #speech_synthesis #style #text_to_speech #tts
Stars: 432 Issues: 3 Forks: 38
https://github.com/netease-youdao/EmotiVoice

👍1

2.14K views, 11:21

alesaccoia/VoiceStreamAI
Near-Realtime audio transcription using self-hosted Whisper and WebSocket in Python/JS
Language: Python
#ai #speech_recognition #speech_to_text #websocket
Stars: 139 Issues: 2 Forks: 13
https://github.com/alesaccoia/VoiceStreamAI

❤3👍2

2.32K views, 23:24

jishengpeng/WavTokenizer
SOTA discrete acoustic codec models with 40 tokens per second for audio language modeling
Language: Python
#acoustic #audio_representation #codec #dac #encodec #gpt4o #music_representation_learning #semantic #soundstream #speech_language_model #speech_representation #text_to_speech
Stars: 332 Issues: 6 Forks: 20
https://github.com/jishengpeng/WavTokenizer

GitHub - jishengpeng/WavTokenizer: [ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling
2.38K views, 04:00

ictnlp/LLaMA-Omni
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
Language: Python
#large_language_models #multimodal_large_language_models #speech_interaction #speech_language_model #speech_to_speech #speech_to_text
Stars: 274 Issues: 1 Forks: 16
https://github.com/ictnlp/LLaMA-Omni

2.15K views, 16:00

amanvirparhar/chaplin
A real-time silent speech recognition tool.
Language: Python
#auto_avsr #avsr #llm #ollama #speech_recognition #speech_to_text #vsr
Stars: 279 Issues: 2 Forks: 22
https://github.com/amanvirparhar/chaplin

1.87K views, 05:00