
VPX6 4955 6U OpenVPX 22.4 TFLOP GPGPU Processor Card Product Sheet 1

The VPX6-4955 is a rugged 6U OpenVPX GPGPU processor card featuring dual NVIDIA Quadro Turing RTX5000E GPUs, delivering up to 22.4 TFLOPS for high-performance embedded computing applications. It includes advanced capabilities such as 6144 CUDA cores, 768 Tensor Cores, and 32 GB of GDDR6 memory, supporting high-speed data processing and AI inference. The module is designed for military and aerospace environments, offering options for air-cooled or conduction-cooled configurations and multiple video output capabilities.


VPX6-4955

6U OpenVPX™ 22.4 TFLOP GPGPU Processor Card with Dual NVIDIA® Quadro® Turing™ RTX5000E GPUs with Tensor Cores

CURTISSWRIGHTDS.COM

Overview
Providing up to 22.4 TFLOPS, this rugged VPX6-4955 GPGPU board features a chip-down design to meet the requirements of rugged military and aerospace environments. Designed and manufactured by WOLF Advanced Technology, this board provides a top tier module for intense processing and artificial intelligence (AI) in High Performance Embedded Computing (HPEC) systems. In addition to 3072 CUDA cores for parallel processing, each TU104 also features 384 Tensor Cores for dedicated AI inference and 48 ray-tracing (RT) cores for superior rendering speeds. Designed to work in conjunction with TensorRT™, CUDA, and CUDA Deep Neural Network (cuDNN), the Turing Tensor Cores add INT8 and INT4 matrix operations while continuing support for high-precision workloads.

Moving data quickly and efficiently is important with high-speed GPUs. The VPX6-4955 uses GDDR6 memory, which provides twice the bandwidth of the GDDR5 memory used in previous generations. Incorporating a PCIe® switch, this module is configurable for compatibility with various OpenVPX slot profiles.

This board includes eight DisplayPort 1.4 outputs, which support High Dynamic Range (HDR) video at resolutions of 4K at 120 Hz or 5K at 60 Hz with 10-bit color depth. The GPUs on the VPX6-4955 also feature an improved NVENC/NVDEC accelerator for HEVC (H.265) and AVC (H.264) with up to 8K encode resolution and B-frame support. The optional WOLF FGX for each GPU provides video conversions to formats not native to the NVIDIA Turing, such as SDI and analog formats.

The rugged VPX6-4955 is available in air-cooled and conduction-cooled versions, as well as with options for front and rear I/O configurations.

Key Features
• Dual NVIDIA Quadro Turing TU104 (RTX5000E) GPUs for 22 TFLOPS and 22 TIPS
• 6144 CUDA cores, 768 Tensor Cores, 96 RT cores
• 32 GB GDDR6 256-bit memory
• Max memory bandwidth each: 448 GB/s
• PCIe® Gen 3 x16 switch
• 8 independent DisplayPort++ video outputs

Applications
• ISR and EW applications requiring the highest performing GPGPU processing
• SWaP-constrained deep learning inference that can benefit from the largest number of Tensor Cores in 6U OpenVPX
• High-performance radar, SIGINT, EO/IR, sensor fusion, processing and display, and autonomous vehicles
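
The module-level totals quoted under Key Features are the per-GPU figures doubled. The short Python sketch below reproduces that arithmetic from the per-TU104 numbers given in this sheet; the dictionary and function names are illustrative only and are not part of any vendor tooling.

```python
# Per-GPU figures as quoted in this product sheet for one TU104 (RTX5000E).
PER_GPU = {
    "cuda_cores": 3072,
    "tensor_cores": 384,
    "rt_cores": 48,
    "memory_gb": 16,
    "peak_tflops": 11.2,
}

GPU_COUNT = 2  # the VPX6-4955 carries two GPUs


def module_totals(per_gpu, gpu_count):
    """Scale each per-GPU figure by the number of GPUs on the module."""
    return {name: value * gpu_count for name, value in per_gpu.items()}


totals = module_totals(PER_GPU, GPU_COUNT)
for name, value in totals.items():
    print(f"{name}: {value}")
```

Memory bandwidth is deliberately left out of the totals, because the sheet quotes it per GPU (448 GB/s each) rather than as a combined figure.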

INFO: CURTISSWRIGHTDS.COM
EMAIL: [email protected]

Figure 1: VPX6-4955 block diagram

NVIDIA Turing Streaming Multiprocessor (SM)
The NVIDIA Turing architecture provides a 50% improvement in performance per CUDA core compared to the previous Pascal™ generation. The Turing SM adds a new independent data-path that allows concurrent execution of integer and floating-point instructions. The redesigned memory path combines shared memory, texture caching, and memory load caching into one unit. These improvements translate to 2x more bandwidth and greater than 2x more capacity for the L1 cache on common workloads than the previous generation.

NVIDIA Turing Tensor Cores
Designed to speed up the tensor/matrix computation used for deep learning neural network training and inference operations, Tensor Cores first became available in the Volta GPUs, which were not available in the embedded space. Turing GPUs include an updated version of the Tensor Core design enhanced for inferencing. In addition to the original support for FP16 precision, the Turing Tensor Cores add INT8 and INT4 precision modes for workloads that tolerate quantization and do not require the higher precision.
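
To make "workloads that tolerate quantization" concrete, the Python sketch below maps a few floating-point values to INT8 and back and reports the rounding error. It uses a simple symmetric per-tensor scheme chosen here for illustration; it is not a description of how the Tensor Cores or TensorRT perform quantization, and the sample values are made up.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to the range [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map INT8 codes back to approximate floating-point values."""
    return [q * scale for q in quantized]


weights = [0.82, -0.41, 0.05, -1.27, 0.33]  # made-up example values
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Worst-case error introduced by the 8-bit representation.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print("INT8 codes:", q)
print("scale:", scale)
print("max absolute error:", max_error)
```

The error is bounded by half the scale factor, so a workload tolerates INT8 when that bound is small relative to the precision its results need.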

© 2020 Curtiss-Wright. All rights reserved. Specifications are subject to change without notice.
All trademarks are property of their respective owners | D380.0220


GDDR6 Memory
The Turing's GDDR6 memory subsystem delivers 14 Gbps signal rates while providing a 20% power efficiency improvement over the GDDR5 memory used in the previous generation Pascal GPUs. To meet the higher speed requirements, NVIDIA achieved a 40% reduction in signal crosstalk. With a 256-bit memory interface, the TU104 can achieve a maximum memory bandwidth of 448 GB/s.

Hardware Accelerated Video Encode/Decode
Featuring the latest generation encode/decode hardware acceleration engine, the VPX6-4955 adds support for HEVC (H.265) 8K encoding at 30 fps and B-frame support. This new engine achieves up to a 25% bitrate saving for HEVC and up to 15% bitrate saving for AVC (H.264) while producing real-time 8K and 4K encoding without burdening the CUDA cores.

Like previous versions of the encoding engines, NVENC supports CBR and VBR rate control and programmable intra-refresh for error resiliency. The Turing GPUs introduce new hardware functionality for high performance computing of the relative pixel motion (optical flow) between images. These algorithms effectively handle frame-to-frame intensity variations and track the true object motions much more accurately than the traditional Motion-Estimate mode of NVENC.

Specifications and Features

Processor
• NVIDIA Quadro Turing TU104 (RTX5000E)
+ 3072 CUDA cores, up to 11.2 TFLOPS
+ 384 Tensor Cores, 48 RT cores
+ 48 streaming multiprocessors
+ 16 GB GDDR6
+ Max memory bandwidth: 448 GB/s
+ Memory width: 256-bit
• PCIe switch supporting Gen 3 x16

Video Display
• 8 x independent simultaneous DisplayPort++ 1.4 supporting up to 4K @ 120 Hz or 5K @ 60 Hz with 10-bit (HDR) color depth
• Four SDI and four CVBS outputs
• NVENC/NVDEC accelerator (version 7.2) for HEVC (H.265) and AVC (H.264) hardware encode/decode with up to 8K encode resolution and B-frame support
• Front and rear I/O configurations
• Video termination provided

Power
• Configurable GPU hard cap: 100-300 W (Preliminary)

Environmental
• Rugged air-cooled or conduction-cooled
• -40°C to 85°C operating temperature
• Other environmental specifications are per WOLF Advanced
Technology

• Humiseal 1B73 conformal coating

Software Support
• NVIDIA drivers supporting Linux®
+ CUDA Toolkit 10.0, CUDA Compute version 7.5
+ OpenCL™ 1.2, OpenGL 4.6, OpenGL ES 3.2
+ Vulkan™ 1.0
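
The per-GPU figures in the Processor list can be cross-checked against each other. The Python sketch below derives the 448 GB/s bandwidth from the 14 Gbps signal rate and 256-bit width, and the CUDA cores per streaming multiprocessor from the quoted counts. The implied clock rate is not stated in this sheet; it rests on the assumption, labeled in the code, that peak TFLOPS counts one fused multiply-add (2 operations) per CUDA core per clock. All names are illustrative.

```python
# Figures quoted in this sheet for one TU104 (RTX5000E).
SIGNAL_RATE_GBPS = 14            # GDDR6 per-pin signal rate, Gbit/s
BUS_WIDTH_BITS = 256             # memory interface width
CUDA_CORES = 3072
STREAMING_MULTIPROCESSORS = 48
PEAK_TFLOPS = 11.2

# Assumption (not stated in the sheet): one fused multiply-add per CUDA core
# per clock, counted as 2 floating-point operations.
FLOPS_PER_CORE_PER_CLOCK = 2


def memory_bandwidth_gb_s(signal_rate_gbps, bus_width_bits):
    """Peak bandwidth in GB/s: per-pin rate times bus width, bits to bytes."""
    return signal_rate_gbps * bus_width_bits / 8


def implied_clock_ghz(peak_tflops, cores, flops_per_core_per_clock):
    """Clock rate the quoted peak TFLOPS would imply under the assumption above."""
    return peak_tflops * 1e12 / (cores * flops_per_core_per_clock) / 1e9


bandwidth = memory_bandwidth_gb_s(SIGNAL_RATE_GBPS, BUS_WIDTH_BITS)
cores_per_sm = CUDA_CORES // STREAMING_MULTIPROCESSORS
clock = implied_clock_ghz(PEAK_TFLOPS, CUDA_CORES, FLOPS_PER_CORE_PER_CLOCK)

print(f"memory bandwidth: {bandwidth:.0f} GB/s")
print(f"CUDA cores per SM: {cores_per_sm}")
print(f"implied clock: {clock:.2f} GHz")
```

The implied clock is a derived estimate for sanity-checking only and should not be read as a specification of the board.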


Ordering Information
TABLE 1 VPX6-4955 Ordering Information

PART NUMBER: VPX6-4955-A142-000
VARIANTS: 6U OpenVPX module with dual NVIDIA Quadro Turing TU104 (RTX5000E). Each GPU has:
› 3072 CUDA cores, up to 11.2 TFLOPS
› 384 Tensor Cores, 48 RT cores
› 16 GB GDDR6, 448 GB/s max bandwidth
› 4x DP++ display output
Air-cooled, 1.0" pitch, temperature range (-40°C to 85°C)
2 x16 PCIe Gen3 with configurable switch
Configurable power 100-300 W

PART NUMBER: VPX6-4955-C142-000
VARIANTS: 6U OpenVPX module with dual NVIDIA Quadro Turing TU104 (RTX5000E). Each GPU has:
› 3072 CUDA cores, up to 11.2 TFLOPS
› 384 Tensor Cores, 48 RT cores
› 16 GB GDDR6, 448 GB/s max bandwidth
› 4x DP++ display output
1.0" pitch, temperature range (-40°C to 85°C)
2 x16 PCIe Gen3 with configurable switch
Configurable power 100-300 W
