Generative AI

Jul 17, 2025
Hackathon Winners Bring Agentic AI to Life with the NVIDIA NeMo Agent Toolkit
The best way to learn a new toolkit is to build something real, and that’s exactly what developers did at the recent NVIDIA NeMo Agent Toolkit Hackathon. Over...
6 MIN READ

Jul 17, 2025
NVIDIA Canary‑Qwen‑2.5B: Open‑Source ASR/LLM for Superior Transcription and Summarization
Top‑ranked on the HuggingFace Open‑ASR leaderboard, the model is production‑ready.
1 MIN READ

Jul 17, 2025
New Learning Pathway: Deploy AI Models with NVIDIA NIM on GKE
Get hands-on with Google Kubernetes Engine (GKE) and NVIDIA NIM when you join the new Google Cloud and NVIDIA community.
1 MIN READ

Jul 17, 2025
Safeguard Agentic AI Systems with the NVIDIA Safety Recipe
As large language models (LLMs) power more agentic systems capable of performing autonomous actions, tool use, and reasoning, enterprises are drawn to their...
7 MIN READ

Jul 16, 2025
CUTLASS: Principled Abstractions for Handling Multidimensional Data Through Tensors and Spatial Microkernels
In the era of generative AI, utilizing GPUs to their maximum potential is essential to training better models and serving users at scale. Often, these models...
12 MIN READ

Jul 15, 2025
Accelerate AI Model Orchestration with NVIDIA Run:ai on AWS
When it comes to developing and deploying advanced AI models, access to scalable, efficient GPU infrastructure is critical. But managing this infrastructure...
5 MIN READ

Jul 14, 2025
Upcoming Livestream: Techniques for Building High-Performance RAG Applications
Discover leaderboard-winning RAG techniques, integration strategies, and deployment best practices.
1 MIN READ

Jul 14, 2025
Just Released: NVDIA Run:ai 2.22
NVDIA Run:ai 2.22 is now here. It brings advanced inference capabilities, smarter workload management, and more controls.
1 MIN READ

Jul 11, 2025
Improving Synthetic Data Augmentation and Human Action Recognition with SynthDa
Human action recognition is a capability in AI systems designed for safety-critical applications, such as surveillance, eldercare, and industrial monitoring....
10 MIN READ

Jul 10, 2025
From Terabytes to Turnkey: AI-Powered Climate Models Go Mainstream
In the race to understand our planet’s changing climate, speed and accuracy are everything. But today’s most widely used climate simulators often struggle:...
7 MIN READ

Jul 10, 2025
Accelerating Video Production and Customization with GliaCloud and NVIDIA Omniverse Libraries
The proliferation of generative AI video models, along with the new workflows these models have introduced, has significantly accelerated production efficiency...
4 MIN READ

Jul 09, 2025
Reinforcement Learning with NVIDIA NeMo-RL: Reproducing a DeepScaleR Recipe Using GRPO
Reinforcement learning (RL) is the backbone of interactive AI. It is fundamental for teaching agents to reason and learn from human preferences, enabling...
5 MIN READ

Jul 07, 2025
Turbocharging AI Factories with DPU-Accelerated Service Proxy for Kubernetes
As AI evolves to planning, research, and reasoning with agentic AI, workflows are becoming increasingly complex. To deploy agentic AI applications efficiently,...
6 MIN READ

Jul 07, 2025
LLM Inference Benchmarking: Performance Tuning with TensorRT-LLM
This is the third post in the large language model latency-throughput benchmarking series, which aims to instruct developers on how to benchmark LLM inference...
11 MIN READ

Jul 03, 2025
New Video: Build Self-Improving AI Agents with the NVIDIA Data Flywheel Blueprint
AI agents powered by large language models are transforming enterprise workflows, but high inference costs and latency can limit their scalability and user...
2 MIN READ

Jul 02, 2025
Optimizing FLUX.1 Kontext for Image Editing with Low-Precision Quantization
FLUX.1 Kontext, the recently released model from Black Forest Labs, is a fascinating addition to the repertoire of community image generation models. The open...
10 MIN READ