How to call NVML APIs?

I am trying to write a program that calls NVML APIs to monitor the temperature and power of the GPU.

Here is the code:

#include "nvml.h"

void *sampling_func(void *arg)
{
    nvmlReturn_t rc;
    unsigned int *temp;
    nvmlReturn_t DECLDIR nvmlInit();
    nvmlUnit_t *arg2 = (nvmlUnit_t *) 0;

    while(!testComplete)
    {
        rc= nvmlReturn_t DECLDIR nvmlUnitGetTemperature(*arg2, 2, temp);
        if(rc!= NVML_SUCCESS)
            printf("NVML has some problems!\n");
    }

    nvmlReturn_t DECLDIR nvmlShutdown();
    return NULL;
}

However, when I compile it, I get this error:
Power_Recording.cu(121): error: a value of type "nvmlUnit_t" cannot be used to initialize an entity of type "nvmlReturn_t"

I couldn't find any information about how to initialize an nvmlUnit_t anywhere in the NVML API. Can someone help me with it? How should I actually call this function?

nvmlReturn_t DECLDIR nvmlUnitGetTemperature(nvmlUnit_t unit, unsigned int type, unsigned int *temp)

Thanks

From the posted code it is impossible to see which line of your program is line 121, but the following looks suspicious:

rc= nvmlReturn_t DECLDIR nvmlUnitGetTemperature(*arg2, 2, temp);

It looks like you may have inadvertently cut and pasted the entire function prototype, when what you really wanted was to call the function like so:

rc = nvmlUnitGetTemperature (*arg2, 2, temp);

If you try to call rc = nvmlUnitGetTemperature (*arg2, 2, temp); like this, the compiler will tell you that "nvmlUnitGetTemperature" is not defined. When you add "nvmlReturn_t DECLDIR" in front of the function name, it won't report that error. Line 121 is: rc= nvmlReturn_t DECLDIR nvmlUnitGetTemperature(*arg2, 2, temp);

But it says instead that the line is not valid C. You need to leave that line as

rc = nvmlUnitGetTemperature (*arg2, 2, temp);

And either find the right include file to include, or define the prototype on its own line at the start of your code. Going by the prototype you quoted, it should be

nvmlReturn_t DECLDIR nvmlUnitGetTemperature(nvmlUnit_t unit, unsigned int type, unsigned int *temp);
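To make the difference between a prototype and a call concrete, here is a minimal sketch. It does not use NVML at all: demoReturn_t, DEMO_SUCCESS and demoGetTemperature are made-up stand-ins (my assumptions, not part of any library) so that it compiles without nvml.h. The return type belongs in the declaration only; the call is just the name and the arguments.

```c
#include <stdio.h>

/* Hypothetical stand-ins so this sketch compiles without nvml.h. */
typedef int demoReturn_t;   /* plays the role of nvmlReturn_t */
#define DEMO_SUCCESS 0      /* plays the role of NVML_SUCCESS */

/* Declaration (prototype): return type, name, parameter list, semicolon.
   It sits on its own line at file scope, normally inside a header. */
demoReturn_t demoGetTemperature(unsigned int type, unsigned int *temp);

/* Definition of the stand-in. A real library ships this part precompiled. */
demoReturn_t demoGetTemperature(unsigned int type, unsigned int *temp)
{
    (void)type;   /* unused in this sketch */
    *temp = 42;   /* made-up reading */
    return DEMO_SUCCESS;
}

/* Call: function name and arguments only, no return type in front. */
unsigned int read_once(void)
{
    unsigned int temp = 0;
    demoReturn_t rc = demoGetTemperature(2, &temp);
    if (rc != DEMO_SUCCESS)
        printf("query failed\n");
    return temp;
}
```

If the compiler reports the function as undefined even with the call written this way, the declaration is not visible at that point, which usually means the header was not found on the include path.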

By the way, you will probably get a segmentation fault, because temp is an uninitialized pointer and the function writes the result through it. You should probably declare temp as
unsigned int temp;
and then pass a pointer to it (pass &temp rather than temp).
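A small sketch of that out-parameter pattern in a sampling loop, again without NVML: fake_get_temperature, its FAKE_* return codes and sample_temperature are hypothetical names I made up for illustration. The point is only that the variable lives in the caller and its address is what gets passed.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical return codes for the stand-in below. */
enum { FAKE_SUCCESS = 0, FAKE_ERROR_INVALID_ARGUMENT = 1 };

/* Hypothetical stand-in for a query that reports through an out-parameter. */
int fake_get_temperature(unsigned int *temp)
{
    if (temp == NULL)   /* nowhere to write the result */
        return FAKE_ERROR_INVALID_ARGUMENT;
    *temp = 55;         /* made-up reading */
    return FAKE_SUCCESS;
}

/* Takes `samples` readings, stores the most recent one in *last,
   and returns how many queries succeeded. */
unsigned int sample_temperature(unsigned int samples, unsigned int *last)
{
    unsigned int ok = 0;
    for (unsigned int i = 0; i < samples; i++) {
        unsigned int temp = 0;   /* real storage, owned by the caller */
        if (fake_get_temperature(&temp) == FAKE_SUCCESS) {   /* pass its address */
            *last = temp;
            ok++;
        } else {
            printf("query failed\n");
        }
    }
    return ok;
}
```

Declaring `unsigned int *temp;` and passing `temp` instead would hand the function an indeterminate address, which is undefined behavior and typically shows up as the segmentation fault mentioned above.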

You can try out this sample code as a reference for writing an application that calls the NVML API.

Also if you are more comfortable working in Perl or Python, we have bindings available:

Python bindings

Perl bindings

Sample code for these is in their docs. All of the functionality of NVML is exposed in the bindings.

example.c:

/***************************************************************************\
|*                                                                           *|
|*      Copyright 2010-2011 NVIDIA Corporation.  All rights reserved.        *|
|*                                                                           *|
|*   NOTICE TO USER:                                                         *|
|*                                                                           *|
|*   This source code is subject to NVIDIA ownership rights under U.S.       *|
|*   and international Copyright laws.  Users and possessors of this         *|
|*   source code are hereby granted a nonexclusive, royalty-free             *|
|*   license to use this code in individual and commercial software.         *|
|*                                                                           *|
|*   NVIDIA MAKES NO REPRESENTATION ABOUT THE SUITABILITY OF THIS SOURCE     *|
|*   CODE FOR ANY PURPOSE. IT IS PROVIDED "AS IS" WITHOUT EXPRESS OR         *|
|*   IMPLIED WARRANTY OF ANY KIND. NVIDIA DISCLAIMS ALL WARRANTIES WITH      *|
|*   REGARD TO THIS SOURCE CODE, INCLUDING ALL IMPLIED WARRANTIES OF         *|
|*   MERCHANTABILITY, NONINFRINGEMENT, AND FITNESS FOR A PARTICULAR          *|
|*   PURPOSE. IN NO EVENT SHALL NVIDIA BE LIABLE FOR ANY SPECIAL,            *|
|*   INDIRECT, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, OR ANY DAMAGES          *|
|*   WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN      *|
|*   AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING     *|
|*   OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOURCE      *|
|*   CODE.                                                                   *|
|*                                                                           *|
|*   U.S. Government End Users. This source code is a "commercial item"      *|
|*   as that term is defined at 48 C.F.R. 2.101 (OCT 1995), consisting       *|
|*   of "commercial computer  software" and "commercial computer software    *|
|*   documentation" as such terms are used in 48 C.F.R. 12.212 (SEPT 1995)   *|
|*   and is provided to the U.S. Government only as a commercial end item.   *|
|*   Consistent with 48 C.F.R.12.212 and 48 C.F.R. 227.7202-1 through        *|
|*   227.7202-4 (JUNE 1995), all U.S. Government End Users acquire the       *|
|*   source code with only those rights set forth herein.                    *|
|*                                                                           *|
|*   Any use of this source code in individual and commercial software must  *|
|*   include, in the user documentation and internal comments to the code,   *|
|*   the above Disclaimer and U.S. Government End Users Notice.              *|
|*                                                                           *|
|*                                                                           *|
 \***************************************************************************/

#include <stdio.h>
#include <nvml.h>

const char * convertToComputeModeString(nvmlComputeMode_t mode)
{
    switch (mode)
    {
        case NVML_COMPUTEMODE_DEFAULT:
            return "Default";
        case NVML_COMPUTEMODE_EXCLUSIVE_THREAD:
            return "Exclusive_Thread";
        case NVML_COMPUTEMODE_PROHIBITED:
            return "Prohibited";
        case NVML_COMPUTEMODE_EXCLUSIVE_PROCESS:
            return "Exclusive Process";
        default:
            return "Unknown";
    }
}

int main()
{
    nvmlReturn_t result;
    unsigned int device_count, i;

    // First initialize NVML library
    result = nvmlInit();
    if (NVML_SUCCESS != result)
    {
        printf("Failed to initialize NVML: %s\n", nvmlErrorString(result));

        printf("Press ENTER to continue...\n");
        getchar();
        return 1;
    }

    result = nvmlDeviceGetCount(&device_count);
    if (NVML_SUCCESS != result)
    {
        printf("Failed to query device count: %s\n", nvmlErrorString(result));
        goto Error;
    }
    printf("Found %d device%s\n\n", device_count, device_count != 1 ? "s" : "");

    printf("Listing devices:\n");
    for (i = 0; i < device_count; i++)
    {
        nvmlDevice_t device;
        char name[64];
        nvmlPciInfo_t pci;
        nvmlComputeMode_t compute_mode;

        // Query for device handle to perform operations on a device
        // You can also query device handle by other features like:
        // nvmlDeviceGetHandleBySerial
        // nvmlDeviceGetHandleByPciBusId
        result = nvmlDeviceGetHandleByIndex(i, &device);
        if (NVML_SUCCESS != result)
        {
            printf("Failed to get handle for device %i: %s\n", i, nvmlErrorString(result));
            goto Error;
        }

        result = nvmlDeviceGetName(device, name, sizeof(name)/sizeof(name[0]));
        if (NVML_SUCCESS != result)
        {
            printf("Failed to get name of device %i: %s\n", i, nvmlErrorString(result));
            goto Error;
        }

        // pci.busId is very useful to know which device physically you're talking to
        // Using PCI identifier you can also match nvmlDevice handle to CUDA device.
        result = nvmlDeviceGetPciInfo(device, &pci);
        if (NVML_SUCCESS != result)
        {
            printf("Failed to get pci info for device %i: %s\n", i, nvmlErrorString(result));
            goto Error;
        }

        printf("%d. %s [%s]\n", i, name, pci.busId);

        // This is a simple example on how you can modify GPU's state
        result = nvmlDeviceGetComputeMode(device, &compute_mode);
        if (NVML_ERROR_NOT_SUPPORTED == result)
            printf("\t This is not CUDA capable device\n");
        else if (NVML_SUCCESS != result)
        {
            printf("Failed to get compute mode for device %i: %s\n", i, nvmlErrorString(result));
            goto Error;
        }
        else
        {
            // try to change compute mode
            printf("\t Changing device's compute mode from '%s' to '%s'\n",
                    convertToComputeModeString(compute_mode),
                    convertToComputeModeString(NVML_COMPUTEMODE_PROHIBITED));

            result = nvmlDeviceSetComputeMode(device, NVML_COMPUTEMODE_PROHIBITED);
            if (NVML_ERROR_NO_PERMISSION == result)
                printf("\t\t Need root privileges to do that: %s\n", nvmlErrorString(result));
            else if (NVML_ERROR_NOT_SUPPORTED == result)
                printf("\t\t Compute mode prohibited not supported. You might be running on\n"
                       "\t\t windows in WDDM driver model or on non-CUDA capable GPU.\n");
            else if (NVML_SUCCESS != result)
            {
                printf("\t\t Failed to set compute mode for device %i: %s\n", i, nvmlErrorString(result));
                goto Error;
            }
            else
            {
                printf("\t Restoring device's compute mode back to '%s'\n",
                        convertToComputeModeString(compute_mode));
                result = nvmlDeviceSetComputeMode(device, compute_mode);
                if (NVML_SUCCESS != result)
                {
                    printf("\t\t Failed to restore compute mode for device %i: %s\n", i, nvmlErrorString(result));
                    goto Error;
                }
            }
        }
    }

    result = nvmlShutdown();
    if (NVML_SUCCESS != result)
        printf("Failed to shutdown NVML: %s\n", nvmlErrorString(result));

    printf("All done.\n");

    printf("Press ENTER to continue...\n");
    getchar();
    return 0;

Error:
    result = nvmlShutdown();
    if (NVML_SUCCESS != result)
        printf("Failed to shutdown NVML: %s\n", nvmlErrorString(result));

    printf("Press ENTER to continue...\n");
    getchar();
    return 1;
}

Makefile:

ARCH   := $(shell getconf LONG_BIT)

ifeq (${ARCH},32)
  NVML_LIB := ../lib/
else ifeq (${ARCH},64)
  NVML_LIB := ../lib64/
else
 $(error Unknown architecture!)
endif

CFLAGS  := -I ../inc
LDFLAGS := -lnvidia-ml -L $(NVML_LIB)

example: example.o
	$(CC) $(LDFLAGS) $< -o $@

clean:
	-@rm -f example.o
	-@rm -f example

To build:

$ make

-Robert

Thanks a lot, Robert. I will try it and report the results.