Accelerate Computer Vision and Image Processing Using VPI 1.1 by Rodolfo Lima

VPI 1.1 introduces new features for accelerating computer vision and image processing workloads. These include new algorithms such as background subtraction, image histogram, and dense optical flow. Stereo disparity estimation now has CUDA and PVA+NVENC+VIC backend implementations with improved performance. VPI also now offers Python bindings, with an API inspired by Pillow, for easier programming. The bindings can allocate new images and arrays or wrap existing NumPy buffers, and they let the caller specify the execution backend.

ACCELERATE COMPUTER VISION AND IMAGE PROCESSING USING VPI 1.1
Rodolfo Lima, September 1st, 2021
AGENDA

Introduction to VPI
Quick introduction to VPI and what it is used for

Overview of new features in VPI 1.1
A walkthrough of the new features added in VPI 1.1

VPI programming with Python
Introduces the Developer Preview of the VPI Python bindings and how to use them effectively

INTRODUCTION TO VPI
VISION PROGRAMMING INTERFACE - VPI
NVIDIA's next-gen API for high-performance Computer Vision processing

Create efficient Computer Vision pipelines with all computing accelerators

First time exposing PVA and VIC processors for general use

Easily load balance CV workloads at the system level

Accelerate on both Tegra and PCs

Seamless interface to different HW

Relatively easy to use

Deployed with NVIDIA JetPack

[Figure: available backends per platform. Jetson AGX Xavier: GPU, NvEnc, CPU, PVA, VIC. GeForce RTX 2070 + CPU: GPU, CPU.]

COMPUTING PIPELINE EXAMPLE
Stereo Disparity Estimation

[Figure: Stereo Disparity Estimation Pipeline built with VPI. Frames from the left (L) and right (R) sides of a stereo camera each go through VIC (LDC + downscale) and CUDA (NV12->U16). The pair then goes through PVA (stereo preprocess), NvEnc (Stereo Optical Flow), and PVA (stereo postprocess + upscale), followed by CUDA (Depth Map from Disparity) and OpenGL (Display).]

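The "Depth Map from Disparity" stage of the pipeline can be illustrated with the standard pinhole stereo relation Z = f * B / d. The sketch below is plain NumPy, not VPI code; the function name, the camera parameters, and the choice to mark non-positive disparities as infinitely far are assumptions made for illustration.

```python
import numpy as np

def disparity_to_depth(disparity_px, focal_length_px, baseline_m):
    """Convert a disparity map in pixels to depth in meters: Z = f * B / d."""
    disparity = np.asarray(disparity_px, dtype=np.float32)
    depth = np.full(disparity.shape, np.inf, dtype=np.float32)
    valid = disparity > 0  # zero disparity: no match, or a point at infinity
    depth[valid] = focal_length_px * baseline_m / disparity[valid]
    return depth

# Hypothetical camera: 700 px focal length, 12 cm baseline.
disparity = np.array([[70.0, 35.0], [0.0, 7.0]], dtype=np.float32)
depth = disparity_to_depth(disparity, focal_length_px=700.0, baseline_m=0.12)
print(depth)
```

Larger disparities map to closer points, which is why the disparity range an estimator supports bounds the nearest distance it can measure.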
ALGORITHMS
Overview

Ref: VPI documentation: https://docs.nvidia.com/vpi/index.html

New algorithms available in VPI 1.1, released with JetPack 4.6

PERFORMANCE BENCHMARK

Up to 20x faster than OpenCV on GPU, 50x faster on CPU

NEW FEATURES OF VPI 1.1
NEW ALGORITHMS
Background Subtraction

Uses the Gaussian Mixture Model (GMM) technique

Works on image sequences (videos)

Implemented on CPU and CUDA backends

Optional shadow detection and background output

Accepts both grayscale and color image formats
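To give a feel for the idea, here is a deliberately simplified sketch in NumPy. It keeps a single running Gaussian per pixel rather than a mixture, handles grayscale only, and has no shadow detection, so it is not the algorithm VPI implements. The class name, learning rate, threshold, and variance floor are all assumptions.

```python
import numpy as np

class RunningGaussianBackground:
    """Per-pixel running Gaussian model (a simplification of a full GMM)."""

    def __init__(self, first_frame, learning_rate=0.05, threshold=2.5):
        self.mean = first_frame.astype(np.float32)
        self.var = np.full(first_frame.shape, 15.0 ** 2, dtype=np.float32)
        self.learning_rate = learning_rate
        self.threshold = threshold

    def apply(self, frame):
        """Return a boolean foreground mask and update the background model."""
        diff = frame.astype(np.float32) - self.mean
        # Foreground: pixel is more than `threshold` standard deviations away.
        foreground = diff ** 2 > (self.threshold ** 2) * self.var
        background = ~foreground
        # Only pixels classified as background update the model.
        self.mean[background] += self.learning_rate * diff[background]
        self.var[background] += self.learning_rate * (
            diff[background] ** 2 - self.var[background])
        np.maximum(self.var, 4.0, out=self.var)  # keep variance from collapsing
        return foreground

# Static scene for a few frames, then a bright 2x2 object appears.
scene = np.full((8, 8), 100, dtype=np.uint8)
model = RunningGaussianBackground(scene)
for _ in range(10):
    model.apply(scene)
frame = scene.copy()
frame[2:4, 2:4] = 200
mask = model.apply(frame)
print(mask.astype(np.uint8))
```

A full mixture model keeps several such Gaussians per pixel, which lets it cope with backgrounds that alternate between states, such as swaying foliage.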


NEW ALGORITHMS
Image Histogram and Equalize Histogram
Commonly used in input pre-processing

Implemented on CPU and CUDA backends

Image Histogram on CPU is 3.3x faster than OpenCV/CPU

Equalize Histogram on CUDA is 3.8x faster than OpenCV/CUDA

[Figure: input image and its equalized result]

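For readers unfamiliar with the operation, the following NumPy sketch shows the textbook form of histogram equalization for 8-bit grayscale images: build the histogram, accumulate it, and use the normalized cumulative distribution as a lookup table. It illustrates the technique only; it is not VPI's implementation, and the function name is made up.

```python
import numpy as np

def equalize_histogram(image):
    """Equalize an 8-bit grayscale image using its cumulative histogram."""
    hist = np.bincount(image.ravel(), minlength=256)
    cdf = np.cumsum(hist)
    cdf_min = cdf[np.nonzero(hist)[0][0]]  # CDF at the darkest level present
    total = image.size
    if total == cdf_min:
        return image.copy()  # single gray level: nothing to stretch
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[image]  # apply the lookup table to every pixel

# Low-contrast input: gray levels 100..131 only.
image = np.arange(100, 132, dtype=np.uint8).repeat(2).reshape(8, 8)
equalized = equalize_histogram(image)
print(image.min(), image.max(), "->", equalized.min(), equalized.max())
```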
NEW ALGORITHMS
Dense Optical Flow

Used for motion detection and object tracking

Implemented on the NVENC backend, available only on Jetson AGX Xavier devices

Operates on NV12 block-linear image sequences

Output is an S10.5 signed fixed-point 2D vector field (2S16 block-linear image)

Output resolution is 1/4th of input

1920x1080 input performance:

Low quality: 1.7 ms per frame

High quality: 3.1 ms per frame

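S10.5 denotes a 16-bit signed value with 10 integer bits and 5 fractional bits, so a raw value is converted to pixels by dividing by 2^5 = 32. The sketch below applies that conversion with NumPy; the function name and the sample values are illustrative, and how the 2S16 buffer is obtained from VPI is not covered here.

```python
import numpy as np

FRACTIONAL_BITS = 5  # the ".5" in S10.5

def decode_s10_5(raw):
    """Convert raw S10.5 fixed-point values to floating-point pixel units."""
    raw = np.asarray(raw, dtype=np.int16)
    return raw.astype(np.float32) / (1 << FRACTIONAL_BITS)

# Two motion vectors stored as (x, y) pairs of raw 16-bit values.
raw_flow = np.array([[[32, -16], [80, 0]]], dtype=np.int16)
flow = decode_s10_5(raw_flow)
print(flow)
```

With 5 fractional bits the smallest representable motion step is 1/32 of a pixel.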
NEW ALGORITHMS
Laplacian Pyramid Generator
Used for image decomposition into frequency bands

Implemented on CUDA and CPU backends

Optional output of the corresponding Gaussian pyramid representation

Inverse operation, Laplacian Reconstruction, planned for a future VPI release

[Figure: input image and its 4-level Laplacian Pyramid]

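The decomposition can be sketched in NumPy as follows. Each level stores the detail lost when the image is reduced to the next, coarser level, and the last level keeps the low-frequency residual. This is a simplified illustration, not VPI's implementation: it uses a 2x2 box average instead of a Gaussian filter, nearest-neighbor upsampling, and assumes image dimensions divisible by 2^(levels-1). All function names are made up.

```python
import numpy as np

def downsample(image):
    """Halve each dimension by averaging 2x2 blocks (box filter, not Gaussian)."""
    h, w = image.shape
    return image.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(image):
    """Double each dimension by repeating pixels."""
    return image.repeat(2, axis=0).repeat(2, axis=1)

def laplacian_pyramid(image, levels):
    """Return `levels` bands, finest first; the last band is the coarse residual."""
    current = image.astype(np.float32)
    bands = []
    for _ in range(levels - 1):
        smaller = downsample(current)
        bands.append(current - upsample(smaller))  # detail lost by downsampling
        current = smaller
    bands.append(current)
    return bands

def reconstruct(bands):
    """Invert the decomposition by adding the bands back, coarse to fine."""
    current = bands[-1]
    for band in reversed(bands[:-1]):
        current = band + upsample(current)
    return current

image = np.arange(256, dtype=np.float32).reshape(16, 16)
bands = laplacian_pyramid(image, levels=4)
print([band.shape for band in bands])
restored = reconstruct(bands)
```

The `reconstruct` helper is included only to show that the bands together retain the full image.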
STEREO DISPARITY ESTIMATION
New CUDA / PVA+NVENC+VIC backend implementations

Better output quality, less noisy

Supports up to 256 disparity levels

CUDA backend 2.2x faster than previous implementation on Jetson AGX Xavier

480x270x16bpp, max 64 disparities: 2.61 ms

Outputs confidence map

Detects invalid disparities

Bonus: stereo sample updated with color output

[Figure: left and right input images, with the resulting disparity and confidence maps]

PYTHON BINDINGS
Easier interface to VPI

VPI PYTHON PROGRAMMING
PROGRAMMING MODEL
How VPI Python looks

Supports Python 2.7 and 3.6.

Easy interoperability with NumPy and OpenCV.

Allows for quick image processing pipeline prototyping.

Pseudo-immediate mode API inspired by the Pillow library.

Efficient, multi-backend algorithm execution.

First released as a Developer Preview

The API might change slightly before it is production-ready.

Only the global processing stream can be used.

Multi-stream processing is planned for the production release.

[Code sample: edge detection with Sobel filter]

CREATING IMAGES
Allocating new images or wrapping existing ones

• Allocating a new Image


img = vpi.Image(size, format)

size: (width,height) tuple


format: vpi.Format enumeration
vpi.Format.RGB8
vpi.Format.Y8
vpi.Format.NV12

• Wrapping an existing 2D numpy array


img = vpi.asimage(buffer [,format])

buffer: numpy array, for single plane images, or


buffer: list of numpy arrays, for multi-plane images

• When wrapping a single-plane buffer, the format can be deduced from its shape and dtype.


(h,w,4) np.uint8 → vpi.Format.RGBA8
(h,w,3) np.uint8 → vpi.Format.RGB8
(h,w,2) np.float32 → vpi.Format.2F32
(h,w,1) np.int16 → vpi.Format.S16
(h,w) np.uint32 → vpi.Format.U32
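The deduction rules above amount to a lookup on channel count and dtype. The helper below mirrors the table in plain NumPy so the mapping can be checked without VPI installed. It is a hypothetical illustration, not part of the VPI API: it returns format names as strings, covers only the five rows listed, and assumes that (h,w) and (h,w,1) buffers are treated alike.

```python
import numpy as np

# (channels, dtype) -> format name, copied from the table on this slide.
_FORMAT_NAMES = {
    (4, np.dtype(np.uint8)): "RGBA8",
    (3, np.dtype(np.uint8)): "RGB8",
    (2, np.dtype(np.float32)): "2F32",
    (1, np.dtype(np.int16)): "S16",
    (1, np.dtype(np.uint32)): "U32",
}

def deduce_format_name(buffer):
    """Return the format name the table assigns to a single-plane buffer."""
    if buffer.ndim == 2:
        channels = 1
    elif buffer.ndim == 3:
        channels = buffer.shape[2]
    else:
        raise ValueError("expected a 2D or 3D array, got %d dimensions" % buffer.ndim)
    key = (channels, buffer.dtype)
    if key not in _FORMAT_NAMES:
        raise ValueError("no format listed for %d channel(s) of %s" % key)
    return _FORMAT_NAMES[key]

print(deduce_format_name(np.zeros((480, 640, 3), dtype=np.uint8)))
print(deduce_format_name(np.zeros((480, 640), dtype=np.uint32)))
```

Combinations outside the table raise an error here; in that situation the optional format argument of vpi.asimage is the place to state the format explicitly.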

CREATING ARRAYS AND PYRAMIDS
Allocating new arrays and pyramids, or wrapping existing arrays

• Allocating a new Array


arr = vpi.Array(capacity, type)

type: vpi.Type enumeration


vpi.Type.U8
vpi.Type.KEYPOINT
vpi.Type.HOMOGRAPHY_TRANSFORM_2D

Created arrays are initially empty (size == 0)

• Wrapping an existing 1D numpy array


arr = vpi.asarray(buffer [,type])

• When wrapping, type can be deduced. Examples:


(size) np.uint8 → vpi.Type.U8
(size,1) np.int16 → vpi.Type.S16
(size,2) np.float32 → vpi.Type.KEYPOINT
(size,3,3) np.float32 → vpi.Type.HOMOGRAPHY_TRANSFORM_2D

• Allocating a new Pyramid


pyr = vpi.Pyramid(size, format, levels, scale=0.5)
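To make the levels and scale arguments concrete, the sketch below lists the size of each level when every level is the previous one multiplied by scale. The helper is hypothetical and does not call VPI. Rounding to the nearest integer with a minimum of 1 pixel is an assumption, since the slides do not state how VPI rounds.

```python
def pyramid_level_sizes(size, levels, scale=0.5):
    """Return the (width, height) of each pyramid level, finest first."""
    width, height = size
    sizes = []
    for _ in range(levels):
        sizes.append((width, height))
        # Assumed rounding rule: nearest integer, never below 1 pixel.
        width = max(1, int(width * scale + 0.5))
        height = max(1, int(height * scale + 0.5))
    return sizes

print(pyramid_level_sizes((640, 480), levels=4))
# [(640, 480), (320, 240), (160, 120), (80, 60)]
```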

USING ALGORITHMS
Specify execution backend

USING ALGORITHMS
Composition

USING RESULTS
Lock memory buffers

[Code samples: single-plane images and multi-plane images]

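The code shown on the last three slides did not survive extraction. As a rough stand-in, the sketch below strings together the three steps they name: choosing a backend, composing algorithms by chaining calls, and locking the result before reading it. It follows the VPI 1.1 Python tutorial as best it can be recalled, so the names `convert`, `box_filter`, `vpi.Border.ZERO`, `rlock`, and `cpu` should be treated as assumptions to check against the documentation. The `vpi` import is guarded so the file still loads on machines without VPI.

```python
import numpy as np

try:
    import vpi  # installed with JetPack or the VPI packages, not from PyPI
except ImportError:
    vpi = None

def blur_on_cuda(frame):
    """Blur an 8-bit grayscale frame with VPI on the CUDA backend.

    Returns None when the vpi module is not installed.
    """
    if vpi is None:
        return None
    image = vpi.asimage(frame)  # wrap the existing NumPy buffer
    with vpi.Backend.CUDA:      # backend used by algorithms called in this block
        # Composition: each call returns an image, so calls can be chained.
        blurred = image.convert(vpi.Format.U8).box_filter(5, border=vpi.Border.ZERO)
    with blurred.rlock():       # lock the buffer for reading
        return np.array(blurred.cpu())  # copy out while the lock is held

frame = np.zeros((64, 64), dtype=np.uint8)
result = blur_on_cuda(frame)
print("VPI not installed" if result is None else result.shape)
```

Copying the pixels inside the lock keeps the returned array valid after the lock is released.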
MORE EXAMPLES
Excerpts from online documentation

More at: https://docs.nvidia.com/vpi/

Q&A
