sniklaus/3d-ken-burns
an implementation of 3D Ken Burns Effect from a Single Image using PyTorch
Language: Python
#cuda #cupy #deep_learning #python #pytorch
Stars: 139 Issues: 2 Forks: 19
https://github.com/sniklaus/3d-ken-burns
Tencent/Forward
a library for high performance deep learning inference on NVIDIA GPUs.
Language: C++
#cuda #deep_learning #forward #gpu #inference #inference_engine #keras #neural_network #pytorch #tensorflow #tensorrt
Stars: 102 Issues: 0 Forks: 8
https://github.com/Tencent/Forward
teddykoker/torchsort
Fast, differentiable sorting and ranking in PyTorch
Language: Cuda
#cuda_kernel #pytorch #ranking #sort
Stars: 173 Issues: 1 Forks: 2
https://github.com/teddykoker/torchsort
ricosjp/monolish
monolish: MONOlithic LInear equation Solvers for Highly-parallel architecture
Language: C++
#blas #cpp14 #cpu #cuda #gpu #hpc #lapack #linear_algebra #linear_algebra_library #matrix #matrix_structures #mkl #openmp #scientific_computing #sparse_matrix
Stars: 75 Issues: 33 Forks: 5
https://github.com/ricosjp/monolish
kwea123/ngp_pl
Instant-ngp in pytorch+cuda trained with pytorch-lightning (with only a few lines of legible code)
Language: Jupyter Notebook
#3d_reconstruction #cuda #instant_ngp #nerf #pytorch #pytorch_lightning
Stars: 114 Issues: 0 Forks: 6
https://github.com/kwea123/ngp_pl
chengzeyi/stable-fast
An ultra lightweight inference performance optimization library for HuggingFace Diffusers on NVIDIA GPUs.
Language: Python
#cuda #deep_learning #deeplearning #diffusers #inference #inference_engine #performance_optimization #pytorch #stable_diffusion #triton
Stars: 134 Issues: 3 Forks: 5
https://github.com/chengzeyi/stable-fast
zhihu/ZhiLight
A highly optimized inference acceleration engine for Llama and its variants.
Language: C++
#cpm #cuda #gpt #inference_engine #llama #llm #llm_serving #minicpm #pytorch #qwen
Stars: 192 Issues: 1 Forks: 16
https://github.com/zhihu/ZhiLight
LegNeato/rust-gpu-chimera
Demo project showing a single Rust codebase running on CPU and directly on GPUs
Language: Rust
#cuda #gpu #rust #rust_cuda #rust_gpu #vulkan
Stars: 218 Issues: 1 Forks: 5
https://github.com/LegNeato/rust-gpu-chimera
yassa9/qwen600
Static suckless single batch CUDA-only qwen3-0.6B mini inference engine
Language: Cuda
#cuda #cuda_programming #gpu #llamacpp #llm #llm_inference #qwen #qwen3 #transformer
Stars: 287 Issues: 1 Forks: 17
https://github.com/yassa9/qwen600
psalias2006/gpu-hot
🔥 Real-time NVIDIA GPU dashboard
Language: JavaScript
#charts #cuda #dashboard #docker #flask #gpu #gpu_monitoring #nvidia #nvidia_docker #nvidia_gpu #nvidia_smi #python #real_time #real_time_monitoring #socker_io #system_monitoring
Stars: 320 Issues: 3 Forks: 18
https://github.com/psalias2006/gpu-hot
Zaneham/BarraCUDA
Open-source CUDA compiler targeting AMD GPUs (and more in the future!). Compiles .cu to GFX11 machine code.
Language: C
#c99 #compiler #cuda #gpu #ml
Stars: 927 Issues: 16 Forks: 27
https://github.com/Zaneham/BarraCUDA
RightNow-AI/autokernel
Autoresearch for GPU kernels. Give it any PyTorch model, go to sleep, wake up to optimized Triton kernels.
Language: Python
#autoresearch #cuda #gpu #kernel_optimization #pytorch #triton
Stars: 602 Issues: 4 Forks: 45
https://github.com/RightNow-AI/autokernel