NVIDIA cuDNN#

The NVIDIA CUDA Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks. It provides highly tuned implementations of operations arising frequently in deep neural network (DNN) applications:

  • Convolution forward and backward, including cross-correlation

  • Matrix multiplication

  • Pooling forward and backward

  • Softmax forward and backward

  • Arithmetic, mathematical, relational, and logical pointwise operations (including various flavors of forward and backward neuron activations)

  • Tensor transformation functions

  • LRN, LCN, batch normalization, instance normalization, and layer normalization forward and backward

Beyond just providing high-performance implementations of individual operations, cuDNN also supports a flexible set of multi-operation fusion patterns for further optimization. The goal is to achieve the best available performance on NVIDIA GPUs for important deep learning use cases.

In cuDNN, both single-operation and multi-operation computations are expressed as operation graphs. The following API layers are available for constructing these graphs:

  • Python frontend API

  • C++ frontend API

  • C backend API

The NVIDIA cuDNN frontend API provides a simplified programming model that is sufficient for most use cases.
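For illustration, the following is a minimal sketch of how a fused convolution-plus-ReLU computation might be expressed as a single operation graph through the Python frontend API. The entry points shown (`cudnn.pygraph`, `tensor_like`, `conv_fprop`, `relu`, `execute`) and the use of PyTorch tensors for device memory follow the patterns in the open-source cudnn-frontend samples; exact names and signatures can vary between frontend releases, so treat this as an illustrative outline rather than a verbatim recipe.

```python
# Illustrative sketch only: entry-point names follow the cudnn-frontend Python
# samples and may differ between frontend releases.
import cudnn
import torch

N, C, H, W, K, R, S = 8, 64, 56, 56, 32, 3, 3

# Device buffers for the graph inputs and output (NHWC-packed via channels_last).
x_gpu = torch.randn(N, C, H, W, device="cuda", dtype=torch.float16).to(
    memory_format=torch.channels_last)
w_gpu = torch.randn(K, C, R, S, device="cuda", dtype=torch.float16).to(
    memory_format=torch.channels_last)
y_gpu = torch.empty(N, K, H, W, device="cuda", dtype=torch.float16).to(
    memory_format=torch.channels_last)

handle = cudnn.create_handle()

# One operation graph that fuses convolution forward with a ReLU activation.
graph = cudnn.pygraph(
    handle=handle,
    io_data_type=cudnn.data_type.HALF,
    intermediate_data_type=cudnn.data_type.FLOAT,
    compute_data_type=cudnn.data_type.FLOAT,
)

X = graph.tensor_like(x_gpu)
Wt = graph.tensor_like(w_gpu)

conv_out = graph.conv_fprop(
    image=X, weight=Wt, padding=[1, 1], stride=[1, 1], dilation=[1, 1])
Y = graph.relu(input=conv_out)  # pointwise activation fused into the same graph
Y.set_output(True).set_data_type(cudnn.data_type.HALF)

# Finalize: validate the graph, query heuristics, and build an execution plan.
graph.validate()
graph.build_operation_graph()
graph.create_execution_plans([cudnn.heur_mode.A, cudnn.heur_mode.FALLBACK])
graph.check_support()
graph.build_plans()

# Execute with a variant pack mapping graph tensors to device buffers.
workspace = torch.empty(graph.get_workspace_size(), device="cuda", dtype=torch.uint8)
graph.execute({X: x_gpu, Wt: w_gpu, Y: y_gpu}, workspace)
```

The same fused pattern can be expressed through the C++ frontend or the C backend API; the Python frontend simply offers the most compact way to describe the graph and let cuDNN select an execution plan.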

Use the NVIDIA cuDNN backend API only if you need a C-only interface, or if you want to use the legacy fixed-function routines, which are not graph-based and are not exposed by the frontend API layers.

Figure: Block diagram showing the relationships between the cuDNN frontend and backend API layers and the intended audience for each layer.