NVIDIA cuDNN#

The NVIDIA CUDA Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks. It provides highly tuned implementations of operations arising frequently in deep neural network (DNN) applications:

  • Convolution forward and backward, including cross-correlation

  • Matrix multiplication

  • Pooling forward and backward

  • Softmax forward and backward

  • Arithmetic, mathematical, relational, and logical pointwise operations (including various flavors of forward and backward neuron activations)

  • Tensor transformation functions

  • LRN, LCN, batch normalization, instance normalization, and layer normalization forward and backward

Beyond just providing high-performance implementations of individual operations, cuDNN also supports a flexible set of multi-operation fusion patterns for further optimization. The goal is to achieve the best available performance on NVIDIA GPUs for important deep learning use cases.

In cuDNN, both single-operation and multi-operation computations are expressed as operation graphs. The following API layers are available for constructing these graphs:

  • Python frontend API

  • C++ frontend API

  • C backend API

The NVIDIA cuDNN frontend API provides a simplified programming model that is sufficient for most use cases.
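For illustration, the following is a minimal sketch of how a fused convolution-plus-ReLU computation might be expressed as a single operation graph through the Python frontend API. The entry points shown (`cudnn.pygraph`, `tensor_like`, `conv_fprop`, `relu`, `execute`) and the use of PyTorch tensors for device memory follow the patterns in the open-source cudnn-frontend samples; exact names and signatures can vary between frontend releases, so treat this as an illustrative outline rather than a verbatim recipe.

```python
# Illustrative sketch only: entry-point names follow the cudnn-frontend Python
# samples and may differ between frontend releases.
import cudnn
import torch

N, C, H, W, K, R, S = 8, 64, 56, 56, 32, 3, 3

# Device buffers for the graph inputs and output (NHWC-packed via channels_last).
x_gpu = torch.randn(N, C, H, W, device="cuda", dtype=torch.float16).to(
    memory_format=torch.channels_last)
w_gpu = torch.randn(K, C, R, S, device="cuda", dtype=torch.float16).to(
    memory_format=torch.channels_last)
y_gpu = torch.empty(N, K, H, W, device="cuda", dtype=torch.float16).to(
    memory_format=torch.channels_last)

handle = cudnn.create_handle()

# One operation graph that fuses convolution forward with a ReLU activation.
graph = cudnn.pygraph(
    handle=handle,
    io_data_type=cudnn.data_type.HALF,
    intermediate_data_type=cudnn.data_type.FLOAT,
    compute_data_type=cudnn.data_type.FLOAT,
)

X = graph.tensor_like(x_gpu)
Wt = graph.tensor_like(w_gpu)

conv_out = graph.conv_fprop(
    image=X, weight=Wt, padding=[1, 1], stride=[1, 1], dilation=[1, 1])
Y = graph.relu(input=conv_out)  # pointwise activation fused into the same graph
Y.set_output(True).set_data_type(cudnn.data_type.HALF)

# Finalize: validate the graph, query heuristics, and build an execution plan.
graph.validate()
graph.build_operation_graph()
graph.create_execution_plans([cudnn.heur_mode.A, cudnn.heur_mode.FALLBACK])
graph.check_support()
graph.build_plans()

# Execute with a variant pack mapping graph tensors to device buffers.
workspace = torch.empty(graph.get_workspace_size(), device="cuda", dtype=torch.uint8)
graph.execute({X: x_gpu, Wt: w_gpu, Y: y_gpu}, workspace)
```

The same fused pattern can be expressed through the C++ frontend or the C backend API; the Python frontend simply offers the most compact way to describe the graph and let cuDNN select an execution plan.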

Use the NVIDIA cuDNN backend API only if you need a C-only interface, or if you want to use the legacy fixed-function routines, which are not graph-based and are not exposed by the frontend API layers.

Figure: Block diagram showing the relationships between the cuDNN frontend and backend API layers and the intended audience for each layer.