GitHub repos

nari-labs/dia
A TTS model capable of generating ultra-realistic dialogue in one pass.
Language: Python
#ai #open_weight #text_to_speech
Stars: 2047 Issues: 8 Forks: 92
https://github.com/nari-labs/dia

GitHub repos

JAMESYJL/ShapeLLM-Omni
A Native Multimodal LLM for 3D Generation and Understanding
Language: Python
#3d_captioning #3d_editing #image_to_3d #llm #text_to_3d
Stars: 223 Issues: 3 Forks: 6
https://github.com/JAMESYJL/ShapeLLM-Omni

[NeurIPS 2025 Spotlight] A Native Multimodal LLM for 3D Generation and Understanding - JAMESYJL/ShapeLLM-Omni

GitHub repos

Tencent-Hunyuan/Hunyuan3D-2.1
From Images to High-Fidelity 3D Assets with Production-Ready PBR Material
Language: Python
#3d #3d_aigc #3d_generation #hunyuan3d #image_to_3d #shape #shape_generation #text_to_3d #texture_genertaion
Stars: 427 Issues: 13 Forks: 28
https://github.com/Tencent-Hunyuan/Hunyuan3D-2.1

GitHub repos

krea-ai/flux-krea
Official GitHub repository for FLUX.1 Krea [dev].
Language: Python
#diffusion_models #flux #machine_learning #text_to_image
Stars: 199 Issues: 3 Forks: 7
https://github.com/krea-ai/flux-krea

GitHub repos

SkyworkAI/Matrix-3D
Generate large-scale explorable 3D scenes with high-quality panorama videos from a single image or text prompt.
Language: Python
#3d_generation #3d_reconstruction #3d_scene_generation #aigc #aigc3d #genie #genie3 #graphics #image_to_3d #image_to_video #panorama_synthesis #scene_generation #text_to_3d #text_to_video #video_generation #world_models
Stars: 284 Issues: 7 Forks: 14
https://github.com/SkyworkAI/Matrix-3D

GitHub repos

superstarryeyes/lue
Terminal eBook Reader with Text-to-Speech
Language: Python
#book #cli #doc #docx #ebook #epub #modular #pdf #reader #terminal #text_to_speech #tts #txt #voice
Stars: 325 Issues: 2 Forks: 9
https://github.com/superstarryeyes/lue

Terminal eBook Reader with Audiobook-Quality Text-to-Speech — Supports EPUB, PDF, DOCX, HTML, RTF, TXT, and MD. - paulilaaso/lue

GitHub repos

High-Logic/Genie
GPT-SoVITS ONNX Inference Engine & Model Converter
Language: Python
#gpt_sovits #text_to_speech #tts #vits #voice_clone #voice_cloning
Stars: 212 Issues: 1 Forks: 10
https://github.com/High-Logic/Genie

GPT-SoVITS ONNX Inference Engine & Model Converter - High-Logic/Genie-TTS

GitHub repos

Tencent-Hunyuan/HunyuanImage-2.1
HunyuanImage-2.1: An Efficient Diffusion Model for High-Resolution (2K) Text-to-Image Generation
Language: Python
#aigc #diffusion_models #diffusion_transformer #image_generation #text_to_image
Stars: 255 Issues: 7 Forks: 16
https://github.com/Tencent-Hunyuan/HunyuanImage-2.1

GitHub repos

wildminder/ComfyUI-VoxCPM
ComfyUI node for highly expressive speech and realistic zero-shot voice cloning
Language: Python
#ai_voice #audio #comfyui_node #t2s #text_to_speech #tts #voice_cloning #voice_generation
Stars: 198 Issues: 2 Forks: 21
https://github.com/wildminder/ComfyUI-VoxCPM

GitHub repos

supertone-inc/supertonic
Lightning-fast, on-device TTS — running natively via ONNX.
Language: Swift
#cpp #csharp #go #ios #java #lightweight #nodejs #on_device #python #rust #swift #text_to_speech #tt #tts #web
Stars: 263 Issues: 5 Forks: 18
https://github.com/supertone-inc/supertonic

Lightning-Fast, On-Device, Multilingual TTS — running natively via ONNX. - supertone-inc/supertonic

GitHub repos

Tencent-Hunyuan/HunyuanVideo-1.5
HunyuanVideo-1.5: A leading lightweight video generation model
Language: Python
#image_to_video #text_to_video #video_generation
Stars: 360 Issues: 5 Forks: 17
https://github.com/Tencent-Hunyuan/HunyuanVideo-1.5

GitHub repos

nari-labs/dia2
TTS model capable of streaming conversational audio in realtime.
Language: Python
#open_weight #text_to_speech
Stars: 245 Issues: 3 Forks: 25
https://github.com/nari-labs/dia2

GitHub repos

Lulzx/zpdf
Zero-copy PDF text extraction library written in Zig. High-performance, memory-mapped parsing with SIMD acceleration.
Language: Zig
#high_performance #parser #pdf #simd #text_extraction #zero_copy #zero_dependency #zig
Stars: 315 Issues: 0 Forks: 6
https://github.com/Lulzx/zpdf

GitHub repos

PKU-YuanGroup/Helios
Helios: Real Real-Time Long Video Generation Model
Language: Python
#acceleration #diffusion #diffusion_model #diffusion_models #efficient_tuning #high__quality #image_to_video #image2video #interactive #long_context #long_video_generation #real_time #text_to_video #text2video #video_generation #video_generator #video_to_video #video2video #world_model #world_models
Stars: 712 Issues: 5 Forks: 46
https://github.com/PKU-YuanGroup/Helios

GitHub repos

RunanywhereAI/RCLI
Talk to your Mac, query your docs, no cloud required. On-device voice AI + RAG
Language: C++
#ai_assistant #apple_silicon #kitten_tts #kokoro_tts #lfm2 #llama_cpp #llm #local_ai #metal #on_device_ai #parakeet #qwen3 #rag #speech_to_text #text_to_speech #tool_calling #voice_assistant
Stars: 627 Issues: 10 Forks: 19
https://github.com/RunanywhereAI/RCLI

GitHub repos

fikrikarim/parlor
On-device, real-time multimodal AI. Have natural voice and vision conversations with an AI that runs entirely on your machine. Powered by Gemma 4 E2B and Kokoro.
Language: HTML
#apple_silicon #gemma #kokoro #litert_lm #local_llm #mlx #multimodal #on_device_ai #python #real_time #speech_recognition #text_to_speech #voice_assistant
Stars: 1183 Issues: 3 Forks: 114
https://github.com/fikrikarim/parlor

GitHub repos

earthtojake/text-to-cad
An open source harness for generating CAD models
Language: JavaScript
#agents #ai #ai_agents #cad #text_to_cad #wasm
Stars: 745 Issues: 0 Forks: 120
https://github.com/earthtojake/text-to-cad

GitHub - earthtojake/text-to-cad: A library of agent skills for CAD, CAE and CAM

GitHub repos

wuyoscar/gpt_image_2_skill
GPT Image 2 prompt gallery, image prompt library, agentic skill, and CLI for OpenAI image generation/editing
Language: Python
#agent_skills #ai_image_prompts #claude_code_skill #cli #codex_skill #gpt_image #gpt_image_2 #gpt_image_2_prompts #image_editing #image_generation #image_prompt #openai #prompt_library #prompt_templates #research_figures #text_to_image
Stars: 721 Issues: 0 Forks: 64
https://github.com/wuyoscar/gpt_image_2_skill

GPT Image 2 prompt gallery, image prompt library, agentic skill, and CLI for OpenAI image generation/editing - wuyoscar/GPT-Image2-Skill

GitHub repos

khrisat/text-humanizer
Advanced open-source AI humanizer designed to make AI-generated text undetectable by Turnitin, GPTZero, and other AI detection services.
Language: Python
#ai_detection #ai_detection_bypasser #ai_humanizer #humanizer #humanizer_ai #humanizer_de #text_humanizer
Stars: 436 Issues: 0 Forks: 54
https://github.com/khrisat/text-humanizer

GitHub repos

privatenumber/mac-ocr
macOS CLI for OCR and searchable PDFs using Apple's Vision framework.
Language: Swift
#apple_vision #cli #command_line_tool #image_to_text #macos #nodejs #ocr #pdf #searchable_pdf #swift #text_recognition #vision_framework
Stars: 367 Issues: 0 Forks: 16
https://github.com/privatenumber/mac-ocr
