Stars
A collection of out-of-tree LLVM passes for teaching and learning
A domain-specific language designed to streamline the development of high-performance GPU, CPU, and accelerator kernels
Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance.
A collection of modern CMake projects, kept as simple as possible
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉

