The Model Cache Utils (MCU) (formerly Triton Kernel Development Kit (TKDK)) is a suite of tools designed to streamline and enhance the development workflow for Model Kernel developers. Whether you're optimizing cache usage, monitoring kernel performance, or distributing your builds securely, MCU has you covered. MCU supports Triton and vLLM.
Organize, index, and monitor your Model kernel caches. This tool provides detailed reports on cache usage, offering data-driven insights into compilation performance and cache effectiveness. For more information please see the MCM readme.
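To give a feel for the kind of cache-usage report described above, here is a minimal sketch. It is not MCM's implementation or command-line interface. It assumes only that Triton keeps its kernel cache under `~/.triton/cache` unless `TRITON_CACHE_DIR` is set. The helper names `default_cache_dir` and `summarize_cache` are hypothetical and chosen for illustration.

```python
import os
from pathlib import Path


def default_cache_dir() -> Path:
    """Return the Triton cache directory: TRITON_CACHE_DIR if set, else ~/.triton/cache."""
    return Path(os.environ.get("TRITON_CACHE_DIR", Path.home() / ".triton" / "cache"))


def summarize_cache(cache_dir: Path) -> list[tuple[str, int, int]]:
    """Return (entry name, file count, total bytes) per top-level entry, largest first.

    Hypothetical helper for illustration; MCM's real report format may differ.
    """
    rows: list[tuple[str, int, int]] = []
    if not cache_dir.is_dir():
        return rows  # no cache yet: nothing to report
    for entry in sorted(cache_dir.iterdir()):
        if not entry.is_dir():
            continue  # skip stray files at the top level
        files = [p for p in entry.rglob("*") if p.is_file()]
        rows.append((entry.name, len(files), sum(p.stat().st_size for p in files)))
    # Stable sort by size, descending; ties keep alphabetical order.
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows


if __name__ == "__main__":
    for name, count, size in summarize_cache(default_cache_dir()):
        print(f"{name}  files={count}  bytes={size}")
```

Running the script prints one line per cache entry, which is enough to spot the entries that dominate disk usage. For real indexing and monitoring, use MCM itself.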
MCV has been moved to its own repository: https://github.com/redhat-et/GKM
MCV (Model Cache Vault) packages Model/GPU kernel caches into OCI-compliant container images with cryptographic signing for secure cache distribution. Please refer to the new repository for the latest features and documentation.
- Clone this repository:

      git clone https://github.com/redhat-et/MCU.git
      cd MCU

- Follow the setup instructions for each tool in its respective directory.
MCU/
├── mcm/ # Model Cache Manager
└── README.md # You're here!

- Improve Triton/vLLM kernel cache management with MCM
- For packaging and sharing caches, see GKM (formerly MCV)
We welcome contributions! If you find bugs, have feature suggestions, or want to contribute code, please open an issue or submit a pull request.
Apache License Version 2.0. See LICENSE for details.
