GPU memory allocation not shown in nvprof

I have a deep learning framework that uses 11 GB out of the 12 GB on a Tesla K80. To understand this, I collected an nvprof trace with --print-gpu-trace, but nvprof only shows four CUDA memcpys of 4 bytes each.
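For reference, here is a minimal sketch of how the usage could be confirmed from inside the process with cudaMemGetInfo (assuming the check runs on the same device the framework uses; this is just an illustration, not something from my app):

```
// mem_check.cu -- hypothetical helper for illustration only.
// Queries the current device's free/total memory so the ~11 GB allocation
// can be confirmed from inside the process rather than from the profiler.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    size_t free_bytes = 0, total_bytes = 0;
    cudaError_t err = cudaMemGetInfo(&free_bytes, &total_bytes);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMemGetInfo failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("GPU memory: %.2f GB used / %.2f GB total\n",
           (total_bytes - free_bytes) / 1e9, total_bytes / 1e9);
    return 0;
}
```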

What would be the best tool or option for tracing this memory allocation on the GPU?

Hi,

Could you attach the full nvprof log? It looks like your app isn't running on the right GPU. If you have multiple GPUs in your computer, you can use the environment variable CUDA_VISIBLE_DEVICES=x to specify the GPU you want to use.
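If it helps, a quick way to confirm which GPU the process actually ends up on is a small check like the sketch below (purely illustrative; note that the device index seen inside the app is relative to CUDA_VISIBLE_DEVICES):

```
// device_check.cu -- hypothetical sketch using the CUDA runtime API.
// Run e.g. as:  CUDA_VISIBLE_DEVICES=1 ./device_check
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int dev = 0;
    if (cudaGetDevice(&dev) != cudaSuccess) {
        fprintf(stderr, "cudaGetDevice failed\n");
        return 1;
    }
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceProperties failed\n");
        return 1;
    }
    // The index here is relative to CUDA_VISIBLE_DEVICES, so device 0
    // inside the process may be a different physical GPU.
    printf("Using device %d: %s (%.1f GB)\n",
           dev, prop.name, prop.totalGlobalMem / 1e9);
    return 0;
}
```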

Best Regards