How to get VSS API for CV Pipeline

taopikhidayat27 · July 10, 2025, 1:11am

Hello, NVIDIA VSS Support team.

I am trying to find the VSS API for the CV pipeline.

(VSS API Glossary — Video Search and Summarization Agent)

According to this doc, a summarization API is available for video, but I can’t find an API for the CV pipeline.

When deploying VSS, I used the overrides.yaml file below.
overrides.txt (1.7 KB)

I successfully deployed NVIDIA VSS with the CV pipeline, but I can’t find a CV pipeline API in the docs.
Could you help with this?

Sincerely
Taopik H.

yuweiw · July 10, 2025, 1:40am

Could you please elaborate on what kind of CV pipeline api is needed? How do you plan to use these api and what feature do you want to achieve? Thanks

taopikhidayat27 · July 10, 2025, 2:13am

To clarify, my goal is to obtain actual computer vision (CV) pipeline results, such as object detection, activity recognition, or other visual analytics, from the video using NVIDIA VSS APIs. Currently, I can only get text results from the Summarization API, but I am looking for APIs that return structured CV data (such as detected objects, activities, or annotated results) in JSON or a similar format at the same time.

When I tested NVIDIA VSS, I saw detected objects overlaid on the video frames, as shown in the screenshot below:

How can I get the same results through an API? I haven’t been able to find a suitable API in the NVIDIA VSS Swagger documentation.

In summary, I am looking for an API that can provide CV pipeline data (such as detected objects on video frames) during the summarization process.

Thanks for considering. Looking forward to your response.

Sincerely
Taopik H.

yuweiw · July 10, 2025, 8:02am

Thank you for your suggestion. We will discuss this new feature.

taopikhidayat27 · July 11, 2025, 6:29pm

Hello Support Team,

I am very interested in your VSS (Video Search and Summarization Agent) project. I rented 8 * A100 GPUs and followed your documentation for setup and testing.
When deploying VSS, I used the nvila-highres model (nvila-lite-15b-highres-lita).
Overall, it works well, but I have a few issues related to video analysis.

Video summarization works effectively, but in the Q&A section, the accuracy is low. For example, after summarizing my custom traffic video, when I ask, “Are there any collisions or accidents here?” the response only mentions 1-2 accidents, even though there are many in the video.
Regarding the VSS API, I am unable to access sufficient APIs for alerts, audio summarization, and CV pipeline data. I reviewed the Swagger API after running the VSS service, and I noticed that the alerts API is only providing live stream data. I want an alerts API that can deliver summarized results similar to what I see in your NVIDIA VSS demonstrations.

You previously mentioned that the CV pipeline API would be discussed in a future update. Could you please check again the status of the alerts, audio summarization, and CV pipeline APIs?

In summary, I would like to improve the accuracy of the Q&A, and gain access to APIs for video alerts, audio summarization, and CV pipeline results.

If anything is unclear, please let me know. I appreciate your assistance with these issues.

Sincerely
Taopik H

yuweiw · July 14, 2025, 2:03am

Sure. Do you want us to provide all the APIs in the FastAPI Swagger API page or in our Python CLI client?

taopikhidayat27 · July 14, 2025, 2:14am

I am developing a custom project using the NVIDIA VSS API. However, I couldn’t find some APIs that are available in NVIDIA VSS, such as alerts, audio summarization, and CV pipeline APIs.

I am building my project against the FastAPI Swagger API.
I can access the Swagger API page after deploying VSS.

Screenshot 2025-07-14 091106

Currently, the Swagger API doesn’t include all the service APIs that VSS provides.
Therefore, I need comprehensive API documentation that covers all the functions provided by NVIDIA VSS.

Could you please provide or guide me on how to access all these APIs?

yuweiw · July 14, 2025, 2:26am

Sure. At present, the FastAPI Swagger APIs may not be fully developed yet. We will confirm this as soon as possible. If there are any updates, we will reply promptly.
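
In the meantime, one way to see exactly which endpoints a given deployment exposes is to read the OpenAPI schema that backs the Swagger page. The following is only a minimal sketch, not part of the VSS tooling. It assumes the service still serves the schema at FastAPI's default route, `/openapi.json`, and the address in the usage comment is a placeholder rather than a documented VSS address. The helper names `list_endpoints`, `fetch_schema`, and `print_endpoints` are made up for illustration.

```python
import json
from urllib.request import urlopen

# Keys in an OpenAPI "path item" that denote HTTP operations.
HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def list_endpoints(schema):
    """Return (METHOD, path, summary) tuples from an OpenAPI schema dict."""
    endpoints = []
    for path, operations in schema.get("paths", {}).items():
        for method, details in operations.items():
            # Skip non-operation keys such as "parameters" or "summary".
            if method.lower() in HTTP_METHODS:
                endpoints.append((method.upper(), path, details.get("summary", "")))
    # Sort by path, then by method, for stable output.
    return sorted(endpoints, key=lambda e: (e[1], e[0]))


def fetch_schema(base_url, timeout=10):
    """Download the schema; assumes FastAPI's default /openapi.json route."""
    with urlopen(f"{base_url}/openapi.json", timeout=timeout) as response:
        return json.load(response)


def print_endpoints(base_url):
    """Print one line per exposed endpoint."""
    for method, path, summary in list_endpoints(fetch_schema(base_url)):
        print(f"{method:7s} {path}  {summary}")


# Usage (placeholder host and port, replace with your own deployment):
# print_endpoints("http://localhost:8000")
```

If an alerts, audio, or CV pipeline route is missing from this listing, it is not exposed through the REST API of that deployment, whatever the UI shows.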

taopikhidayat27 · July 14, 2025, 2:27am

Thank you very much. Looking forward to your response.

pshin · July 17, 2025, 4:23am

Have you tested this with CV enabled? Do you also see this issue when CV pipeline is enabled?

taopikhidayat27 · July 17, 2025, 4:32am

Yes. I have tested with CV enabled, but the result is the same.

To deploy with CV enabled, I used the overrides.yaml below, following the doc.

nim-llm:
  env:
  - name: NVIDIA_VISIBLE_DEVICES
    value: "0,1,2,3"
  - name: NIM_MAX_MODEL_LEN
    value: "128000"
  resources:
    limits:
      nvidia.com/gpu: 0    # no limit


vss:
  applicationSpecs:
    vss-deployment:
      securityContext:
        fsGroup: 0
        runAsGroup: 0
        runAsUser: 0
      containers:
        vss:
          env:
          - name: VLM_MODEL_TO_USE
            value: vila-1.5 # Or "openai-compat" or "custom" or "nvila"
          - name: MODEL_PATH
            value: "ngc:nim/nvidia/vila-1.5-40b:vila-yi-34b-siglip-stage3_1003_video_v8"
          - name: NVIDIA_VISIBLE_DEVICES
            value: "5,6,7"
          - name: INSTALL_PROPRIETARY_CODECS
            value: "true"
          - name: DISABLE_CV_PIPELINE
            value: "false"
          - name: GDINO_INFERENCE_INTERVAL
            value: "1"
          - name: NUM_CV_CHUNKS_PER_GPU
            value: "2"
  resources:
    limits:
      nvidia.com/gpu: 0    # no limit



nemo-embedding:
  applicationSpecs:
    embedding-deployment:
      containers:
        embedding-container:
          env:
          - name: NVIDIA_VISIBLE_DEVICES
            value: '4'
  resources:
    limits:
      nvidia.com/gpu: 0    # no limit


nemo-rerank:
  applicationSpecs:
    ranking-deployment:
      containers:
        ranking-container:
          env:
          - name: NVIDIA_VISIBLE_DEVICES
            value: '4'
  resources:
    limits:
      nvidia.com/gpu: 0    # no limit

After summarizing, I can see the Set-of-Marks (SOM) videos, but the accuracy in Q&A is still low.
This is the problem.

yuweiw · July 17, 2025, 5:12am

We suggest that you create a separate topic for this issue and attach your video. We will give it a try on our side.

We will provide the alert-related APIs in the upcoming version first.
Regarding the CV pipeline and the Audio APIs, they will take more time to be implemented. It is still not certain when they will be released.
