# DCNv2 ONNX to TensorRT
This repo provides the code necessary to build a custom TensorRT plugin for networks containing DCNv2 layers. The setup here assumes that you have already created your ONNX model and just want to convert it to TensorRT.
> NOTE: The CUDA kernels for this plugin are slightly modified from [tensorRTIntegrate](https://github.com/dlunion/tensorRTIntegrate/blob/master/src/onnxplugin/plugins/DCNv2.cu),
> with the TensorRT plugin API code rewritten to wrap them in a standalone plugin.
## Overview
As a quick overview: once we have our container set up properly with all necessary packages and have our original ONNX model, we can follow these few commands to create the TensorRT engine using the custom plugin.
```sh
# Convert attributes of ONNX model
$ python scripts/insert_dcn_plugin.py --input=/models/original.onnx --output=/models/modified.onnx
# Build the TensorRT plugin
$ make -j$(nproc)
# Use trtexec to build and execute the TensorRT engine
$ trtexec --onnx=/models/modified.onnx --plugins=build/dcn_plugin.so --workspace=2000 --saveEngine=/models/dcnv2_trt_fp32.engine
# OR (for FP16)
$ trtexec --onnx=/models/modified.onnx --plugins=build/dcn_plugin.so --workspace=2000 --saveEngine=/models/dcnv2_trt_fp16.engine --fp16
```
The sections below give further explanations and customization options for a more detailed account of what's going on behind the scenes.
## Setup
This material was built on top of the [TensorRT NGC image](https://ngc.nvidia.com/catalog/containers/nvidia:tensorrt) and tested for functionality. TensorRT container versions 20.07 and 20.08 were used to test this plugin. We will also need to download [OSS TensorRT](https://github.com/nvidia/TensorRT) so that we can use ONNX GraphSurgeon to make some slight modifications to our ONNX model file.
### With Dockerfile
The easiest way to get started is to use the provided [Dockerfile](Dockerfile) to create the Docker image with all the dependencies pre-installed. To do that, follow these steps:
```sh
# Build the docker image
$ bash scripts/docker/build.sh
# Launch an interactive container (you will need to provide the full path to the directory containing your ONNX model, /models in this example)
$ bash scripts/docker/launch.sh /models
```
You should now be inside the container with all of the dependencies installed, and you can run the commands above to modify the ONNX model, build the TensorRT plugin, and then create/run the TensorRT engine for the model.
### Without Dockerfile
If you choose not to use the Dockerfile, it will be a little more work upfront. You will need to pull the TensorRT NGC container image manually and install some dependencies before you are able to run the commands for ONNX model conversion, plugin building, and TensorRT engine creation.
The following commands should reproduce the environment that the Dockerfile creates:
```sh
# Pull the TensorRT container image
$ docker pull nvcr.io/nvidia/tensorrt:20.08-py3
# Launch the container
$ docker run --gpus 1 \
-v <path_to_onnx_model>:/models \
--name <name_for_docker_container> \
--network host \
--rm \
-i -t \
nvcr.io/nvidia/tensorrt:20.08-py3 \
bash
```
Once inside the container, you will need to install a few things before getting started:
```sh
# Clone OSS TensorRT
$ git clone -b master https://github.com/nvidia/TensorRT TensorRT
$ cd TensorRT/tools/onnx-graphsurgeon
$ make install
$ cd -
# Install Python bindings for TensorRT
$ /opt/tensorrt/python/python_setup.sh
```
This should give you the same environment as the Dockerfile above. You should then be able to go through the process of modifying the ONNX model, building the TensorRT plugin, and creating the TensorRT engine for the model.
## ONNX Model Conversion
We will need to make a slight conversion to our ONNX model so that we are able to convert it to a TensorRT engine. The first modification (which isn't strictly necessary, but makes everything easier) is to replace the ONNX `Plugin` node with a more meaningful `DCNv2_TRT` node. At this point, this is just a placeholder since ONNX doesn't know how to interpret the DCNv2 layer anyway. To do that, we are going to use [ONNX-GraphSurgeon](https://github.com/NVIDIA/TensorRT/tree/master/tools/onnx-graphsurgeon).
```python
dcn_nodes = [node for node in graph.nodes if node.op == "Plugin"]
for node in dcn_nodes:
node.op = "DCNv2_TRT"
```
This simply renames all of the `Plugin` nodes to `DCNv2_TRT`, making them easier for our TensorRT plugin to match.
The second (arguably more important) change is to convert the layer's attributes from a single string into usable individual attributes for the TensorRT plugin. Before this conversion, our attributes have 2 fields (`info` and `name`). The `info` field is a string of the following form:
```json
{"dilation": [1, 1], "padding": [1, 1], "stride": [1, 1], "deformable_groups": 1}
```
What we actually want is to separate this string into individual attributes so they don't have to be parsed as a string by the TensorRT plugin creator (which would be much more difficult). So we modify the ONNX graph with something similar to the following:
```python
import json

# For each of the nodes renamed above, expand the JSON "info" string
# into individual attributes
for node in dcn_nodes:
    attrs = json.loads(node.attrs["info"])
    node.attrs.update(attrs)
    del node.attrs["info"]
```
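The snippets above operate on a graph that has already been loaded with ONNX-GraphSurgeon. For completeness, a minimal sketch of loading the model and re-exporting it afterwards (the paths here are just examples) might look like:
```python
import onnx
import onnx_graphsurgeon as gs

# Import the ONNX model into an ONNX-GraphSurgeon graph
graph = gs.import_onnx(onnx.load("/models/original.onnx"))

# ... apply the node renaming and attribute conversion shown above ...

# Clean up and export the modified graph back to an ONNX model
graph.cleanup().toposort()
onnx.save(gs.export_onnx(graph), "/models/modified.onnx")
```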
The [insert_dcn_plugin.py script](scripts/insert_dcn_plugin.py) provided with this repo does exactly this and only requires the user to provide the path to the input ONNX model and the name of the output model. It can be used as follows:
```sh
python scripts/insert_dcn_plugin.py --input models/<original_onnx_model>.onnx --output models/<modified_onnx_model>.onnx
```
## Plugin
Now that we have all our packages installed, we can go ahead and build the TensorRT plugin that we will use to convert the ONNX model to TensorRT. We have provided a Makefile that compiles the `.cpp` and `.cu` files and links the appropriate libraries (including `-lcudart`, `-lcublas`, `-lnvinfer`, `-lnvparsers`, etc.). To use it, simply run:
```sh
$ make -j$(nproc)
```
This will produce the necessary shared object file (i.e. `build/dcn_plugin.so`) that will be used to build the TensorRT engine.
## TensorRT Engine
To create the TensorRT engine and test it with the plugin, we will use `trtexec`. This lets us run synthetic data through the network to get an idea of its speed and also save a serialized engine that we can use later. Note that we provide a reasonably large workspace, since TensorRT (and the plugin) can use it during engine building to select the best tactics and produce the most optimized engine.
```sh
$ trtexec --onnx=<path_to_onnx_model>.onnx --plugins=build/dcn_plugin.so --workspace=2000 --saveEngine=<path_to_output_trt_engine>.engine
```
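When the saved engine is later deserialized outside of `trtexec`, the plugin library still needs to be loaded into the process first so that TensorRT can find the `DCNv2_TRT` creator. A minimal sketch using the TensorRT Python bindings (reusing the example paths from above) might look like:
```python
import ctypes
import tensorrt as trt

# Load the custom plugin library so its creator is registered with TensorRT
ctypes.CDLL("build/dcn_plugin.so")

TRT_LOGGER = trt.Logger(trt.Logger.INFO)
trt.init_libnvinfer_plugins(TRT_LOGGER, "")

# Deserialize the engine that trtexec saved
with open("/models/dcnv2_trt_fp32.engine", "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
```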
## Explanation of TensorRT Plugin Development
Now that we have created the TensorRT engine, let's dive a little deeper into how we were able to do that.
### IPluginV2DynamicExt
The first thing that we want to point out is that we are going to base our plugin on the [IPluginV2DynamicExt](https://docs.nvidia.com/deeplearning/tensorrt/api/c_api/classnvinfer1_1_1_i_plugin_v2_dynamic_ext.html) class, which gives us the ability to use a lot of the functionality that TensorRT already has built in. You can see where we built our plugin class around the IPluginV2DynamicExt class [here](DCNv2Plugin.h#L60).
The first step is to create the constructors and destructor for our TensorRT plugin (in this case, `DCNv2PluginDynamic`). You can see an example of that [here](DCNv2Plugin.h#L63-69):
```cpp
DCNv2PluginDynamic();
DCNv2PluginDynamic(const void* data, size_t length, const std::string& name);
DCNv2PluginDynamic(DCNv2Parameters param, const std::string& name);
~DCNv2PluginDynamic() override;
```
Note that here, we have two different ways that a `DCNv2PluginDynamic` can be created: either from a serialized buffer (the `data`/`length` constructor, used when deserializing an engine) or directly from a `DCNv2Parameters` struct.