Blackwell

Feb 05, 2025

OpenAI Triton on NVIDIA Blackwell Boosts AI Performance and Programmability

Matrix multiplication and attention mechanisms are the computational backbone of modern AI workloads. While libraries like NVIDIA cuDNN provide highly optimized...

5 MIN READ

Feb 03, 2025

Just Released: CUTLASS 3.8

Provides support for the NVIDIA Blackwell SM100 architecture. CUTLASS is a collection of CUDA C++ templates and abstractions for implementing high-performance...

1 MIN READ

Jan 31, 2025

Just Released: NVIDIA cuDNN 9.7

Bringing support for NVIDIA Blackwell architecture across data center and GeForce products, NVIDIA cuDNN 9.7 delivers speedups of up to 84% for FP8 Flash...

1 MIN READ

Jan 31, 2025

CUDA Toolkit Now Available for NVIDIA Blackwell

The latest release of the CUDA Toolkit, version 12.8, continues to push accelerated computing performance in data sciences, AI, scientific computing, and...

9 MIN READ

Jan 30, 2025

Build Apps with Neural Rendering Using NVIDIA Nsight Developer Tools on GeForce RTX 50 Series GPUs

The next generation of NVIDIA graphics hardware has arrived. Powered by NVIDIA Blackwell, GeForce RTX 50 Series GPUs deliver groundbreaking new RTX features...

4 MIN READ

Jan 30, 2025

New AI SDKs and Tools Released for NVIDIA Blackwell GeForce RTX 50 Series GPUs

NVIDIA recently announced a new generation of PC GPUs—the GeForce RTX 50 Series—alongside new AI-powered SDKs and tools for developers. Powered by the...

6 MIN READ

Jan 09, 2025

NVIDIA Project DIGITS, A Grace Blackwell AI Supercomputer On Your Desk

Powered by the new GB10 Grace Blackwell Superchip, Project DIGITS can tackle large generative AI models of up to 200B parameters.

1 MIN READ

Nov 21, 2024

Advancing Ansys Workloads with NVIDIA Grace and NVIDIA Grace Hopper

Accelerated computing is enabling giant leaps in performance and energy efficiency compared to traditional CPU computing. Delivering these advancements requires...

10 MIN READ

Nov 13, 2024

NVIDIA Blackwell Doubles LLM Training Performance in MLPerf Training v4.1

As models grow larger and are trained on more data, they become more capable, making them more useful. To train these models quickly, more performance,...

8 MIN READ

Oct 08, 2024

Bringing AI-RAN to a Telco Near You

Inferencing for generative AI and AI agents will drive the need for AI compute infrastructure to be distributed from edge to central clouds. IDC predicts that...

14 MIN READ

Sep 26, 2024

Low Latency Inference Chapter 2: Blackwell is Coming. NVIDIA GH200 NVL32 with NVLink Switch Gives Signs of Big Leap in Time to First Token Performance

Many of the most exciting applications of large language models (LLMs), such as interactive speech bots, coding co-pilots, and search, need to begin responding...

8 MIN READ

Sep 24, 2024

NVIDIA GH200 Grace Hopper Superchip Delivers Outstanding Performance in MLPerf Inference v4.1

In the latest round of MLPerf Inference – a suite of standardized, peer-reviewed inference benchmarks – the NVIDIA platform delivered outstanding...

7 MIN READ

Aug 28, 2024

NVIDIA Blackwell Platform Sets New LLM Inference Records in MLPerf Inference v4.1

Large language model (LLM) inference is a full-stack challenge. Powerful GPUs, high-bandwidth GPU-to-GPU interconnects, efficient acceleration libraries, and a...

13 MIN READ

Jul 02, 2024

Achieving High Mixtral 8x7B Performance with NVIDIA H100 Tensor Core GPUs and NVIDIA TensorRT-LLM

As large language models (LLMs) continue to grow in size and complexity, the performance requirements for serving them quickly and cost-effectively continue to...

9 MIN READ

Jul 01, 2024

How Cutting-Edge Computer Chips are Speeding Up the AI Revolution

Featured in Nature, this post delves into how GPUs and other advanced technologies are meeting the computational challenges posed by AI.

1 MIN READ

Jun 12, 2024

Demystifying AI Inference Deployments for Trillion Parameter Large Language Models

AI is transforming every industry, addressing grand human scientific challenges such as precision drug discovery and the development of autonomous vehicles, as...

14 MIN READ