THUDM/GLM-4.1V-Thinking
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning.
Language: Python
#image2text #reasoning #video_understanding #vlm
Stars: 449 Issues: 9 Forks: 8
https://github.com/THUDM/GLM-4.1V-Thinking
liuff19/LangScene-X
[ICCV 2025] LangScene-X: Reconstruct Generalizable 3D Language-Embedded Scenes with TriMap Video Diffusion
Language: Python
#3d_reconstruction #diffusion #unified_model #video_generation
Stars: 197 Issues: 1 Forks: 12
https://github.com/liuff19/LangScene-X
Wan-Video/Wan2.2
Wan: Open and Advanced Large-Scale Video Generative Models
Language: Python
#aigc #video_generation
Stars: 1285 Issues: 21 Forks: 26
https://github.com/Wan-Video/Wan2.2
SkyworkAI/Matrix-3D
Generate large-scale explorable 3D scenes with high-quality panorama videos from a single image or text prompt.
Language: Python
#3d_generation #3d_reconstruction #3d_scene_generation #aigc #aigc3d #genie #genie3 #graphics #image_to_3d #image_to_video #panorama_synthesis #scene_generation #text_to_3d #text_to_video #video_generation #world_models
Stars: 284 Issues: 7 Forks: 14
https://github.com/SkyworkAI/Matrix-3D
showlab/Code2Video
Video generation via code
Language: Python
#coding #multi_agent #video_generation
Stars: 256 Issues: 0 Forks: 31
https://github.com/showlab/Code2Video
OpenImagingLab/FlashVSR
Towards Real-Time Diffusion-Based Streaming Video Super-Resolution — An efficient one-step diffusion framework for streaming VSR with locality-constrained sparse attention and a tiny conditional decoder.
Language: Python
#diffusion_models #video_super_resolution
Stars: 218 Issues: 5 Forks: 4
https://github.com/OpenImagingLab/FlashVSR
EzioBy/Ditto
[Preprint 2025] Ditto: Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset
Language: Python
#diffusion_models #synthetic_data #video_editing
Stars: 333 Issues: 7 Forks: 28
https://github.com/EzioBy/Ditto