GitHub repos

sniklaus/3d-ken-burns
an implementation of 3D Ken Burns Effect from a Single Image using PyTorch
Language: Python
#cuda #cupy #deep_learning #python #pytorch
Stars: 139 Issues: 2 Forks: 19
https://github.com/sniklaus/3d-ken-burns

Tencent/Forward
A library for high performance deep learning inference on NVIDIA GPUs.
Language: C++
#cuda #deep_learning #forward #gpu #inference #inference_engine #keras #neural_network #pytorch #tensorflow #tensorrt
Stars: 102 Issues: 0 Forks: 8
https://github.com/Tencent/Forward

teddykoker/torchsort
Fast, differentiable sorting and ranking in PyTorch
Language: Cuda
#cuda_kernel #pytorch #ranking #sort
Stars: 173 Issues: 1 Forks: 2
https://github.com/teddykoker/torchsort

ricosjp/monolish
monolish: MONOlithic LInear equation Solvers for Highly-parallel architecture
Language: C++
#blas #cpp14 #cpu #cuda #gpu #hpc #lapack #linear_algebra #linear_algebra_library #matrix #matrix_structures #mkl #openmp #scientific_computing #sparse_matrix
Stars: 75 Issues: 33 Forks: 5
https://github.com/ricosjp/monolish

kwea123/ngp_pl
Instant-ngp in pytorch+cuda trained with pytorch-lightning (high quality with high speed, with only a few lines of legible code)
Language: Jupyter Notebook
#3d_reconstruction #cuda #instant_ngp #nerf #pytorch #pytorch_lightning
Stars: 114 Issues: 0 Forks: 6
https://github.com/kwea123/ngp_pl

chengzeyi/stable-fast
An ultra lightweight inference performance optimization library for HuggingFace Diffusers on NVIDIA GPUs.
Language: Python
#cuda #deep_learning #deeplearning #diffusers #inference #inference_engine #performance_optimization #pytorch #stable_diffusion #triton
Stars: 134 Issues: 3 Forks: 5
https://github.com/chengzeyi/stable-fast

zhihu/ZhiLight
A highly optimized LLM inference acceleration engine for Llama and its variants.
Language: C++
#cpm #cuda #gpt #inference_engine #llama #llm #llm_serving #minicpm #pytorch #qwen
Stars: 192 Issues: 1 Forks: 16
https://github.com/zhihu/ZhiLight

LegNeato/rust-gpu-chimera
Demo project showing a single Rust codebase running on CPU and directly on GPUs
Language: Rust
#cuda #gpu #rust #rust_cuda #rust_gpu #vulkan
Stars: 218 Issues: 1 Forks: 5
https://github.com/LegNeato/rust-gpu-chimera

yassa9/qwen600
Static suckless single batch CUDA-only qwen3-0.6B mini inference engine
Language: Cuda
#cuda #cuda_programming #gpu #llamacpp #llm #llm_inference #qwen #qwen3 #transformer
Stars: 287 Issues: 1 Forks: 17
https://github.com/yassa9/qwen600

psalias2006/gpu-hot
🔥 Real-time NVIDIA GPU dashboard
Language: JavaScript
#charts #cuda #dashboard #docker #flask #gpu #gpu_monitoring #nvidia #nvidia_docker #nvidia_gpu #nvidia_smi #python #real_time #real_time_monitoring #socker_io #system_monitoring
Stars: 320 Issues: 3 Forks: 18
https://github.com/psalias2006/gpu-hot

Zaneham/BarraCUDA
Open-source CUDA compiler targeting AMD GPUs (and more in the future!). Compiles .cu to GFX11 machine code.
Language: C
#c99 #compiler #cuda #gpu #ml
Stars: 927 Issues: 16 Forks: 27
https://github.com/Zaneham/BarraCUDA

GitHub

GitHub - Zaneham/Booth: Open-source CUDA, Triton and HIP compiler targeting multiple GPU and CPU architectures.

RightNow-AI/autokernel
Autoresearch for GPU kernels. Give it any PyTorch model, go to sleep, wake up to optimized Triton kernels.
Language: Python
#autoresearch #cuda #gpu #kernel_optimization #pytorch #triton
Stars: 602 Issues: 4 Forks: 45
https://github.com/RightNow-AI/autokernel

AmmarkoV/SAM3DBody-cpp
Real-time 3D full-body reconstruction from a single camera, Multiperson BVH output, Pure C++ runtime, ONNX + ggml, 70-joint skeleton with hands.
Language: C
#3d_human_pose #bvh #computer_vision #cpp #cuda #ggml #motion_capture #multi_person_pose_estimation #onnx #opengl #pose_detection #pose_estimation #pose_tracking #real_time #sam_3d_body
Stars: 382 Issues: 1 Forks: 46
https://github.com/AmmarkoV/SAM3DBody-cpp

c0deJedi/nbd-vram
Use your NVIDIA GPU's VRAM as swap space on Linux. Built for laptops with soldered memory and no upgrade path. If you have an RTX card sitting there with 8GB of VRAM and you're getting swapped to SSD, this puts that VRAM to work.
Language: Shell
#cuda #gpu #laptop #linux #memory #nbd #nvidia #swap #vram
Stars: 408 Issues: 2 Forks: 11
https://github.com/c0deJedi/nbd-vram
