New Plugin YoloNMS #3859
base: release/8.6
Signed-off-by: Levi Pereira <levi.pereira@gmail.com>
Implementation using this plugin with DeepStream
Is it compatible with TensorRT 10? TRT 10 is now compatible with all platforms, including Jetson AGX Orin. It would be interesting to use.
@asfiyab-nvidia can you check this? Super interesting for demos on Jetson AGX Orin!
Adding @samurdhikaru for new plugin review
release/10.0...levipereira:TensorRT:release/10.0
I have implemented the plugin in the release/10.0 branch, and everything is working correctly. The plugin is compatible with all versions from release/8.6 to release/10.0. I have conducted some tests on x86_64 and did not encounter any issues.
Amazing, @levipereira!
Is it possible to merge ASAP? @samurdhikaru thank you!
@levipereira Thank you very much for the contribution. @levipereira @johnnynunez To start off, note that
Beyond that, a plugin also needs to meet a general usability criterion. That is, its use case should not be niche, and its functionality should not be easily achievable with existing plugins and/or TRT built-in ops. Note that TRT's built-in INMSLayer outputs the selected indices. Does this address your use case? If not, can you motivate the use case for which the boxes and the indices would be needed at the same time? Because the former is provided by the
I have seen the implementation of IPluginV3, and a significant amount of code needs to be adjusted and tested to upgrade from IPluginV2 to IPluginV3.
Regarding the use of INMSLayer instead of the plugin: INMSLayer returns only the indices of the selected detections and must be integrated into the model through dedicated nodes. In contrast, the plugin offers an end-to-end solution for NMS with standard inputs and outputs.
INMSLayer is typically reached through the ONNX NonMaxSuppression op in onnx-tensorrt. This approach is not universally applicable, because INMSLayer/NonMaxSuppression requires an extra layer and further processing to recover the selected data. That adds complexity and makes it less suitable for generic use cases than the plugin. The plugin can be used across different models simply by formatting the model output to match the plugin's input requirements, whereas INMSLayer/NonMaxSuppression requires a bespoke implementation for each model.
The YoloNMS plugin is very similar to EfficientNMS. The only difference is that YoloNMS also returns indices. This might seem like a small change, but it is very important when integrating other plugins into the pipeline. The full NMS result is always available end to end, and the user can decide whether to use the indices, without any additional processing to find the bounding boxes, classes, scores, and number of detections. EfficientNMS should be used as a terminal layer, whereas YoloNMS can be used as an intermediate layer, a terminal layer, or both.
So, while INMSLayer/NonMaxSuppression provides the indices of the selected detections, it lacks the ease of integration and general usability of the plugin, which provides both boxes and indices across models through a standardized input-output format. INMSLayer is more flexible but not generic, whereas the plugin is more generic but less flexible.
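The "further processing" mentioned above can be illustrated with a small plain-Python sketch. It assumes the selected indices arrive as `[batch_index, class_index, box_index]` triplets, which is the layout the ONNX NonMaxSuppression op defines. The helper name `gather_detections` and the sample data are hypothetical and are not part of the plugin or of TensorRT.

```python
def gather_detections(selected_indices, boxes, scores):
    """Turn NonMaxSuppression-style index triplets into final detections.

    Assumed layouts (following the ONNX NonMaxSuppression op):
      selected_indices: list of (batch_index, class_index, box_index)
      boxes:  boxes[batch][box]          -> [x1, y1, x2, y2]
      scores: scores[batch][class][box]  -> float
    """
    detections = {}
    for batch_idx, class_idx, box_idx in selected_indices:
        # Each triplet only points at a candidate; the box and score
        # still have to be looked up in the original tensors.
        detections.setdefault(batch_idx, []).append({
            "box": boxes[batch_idx][box_idx],
            "score": scores[batch_idx][class_idx][box_idx],
            "class": class_idx,
            "index": box_idx,
        })
    return detections


# Hypothetical data: one image, three candidate boxes, two classes.
boxes = [[[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]]]
scores = [[[0.9, 0.8, 0.1], [0.05, 0.1, 0.7]]]
selected = [(0, 0, 0), (0, 1, 2)]

result = gather_detections(selected, boxes, scores)
```

With an index-only NMS, a gather step like this has to be built into each model graph. An end-to-end plugin emits the boxes, scores, and classes directly, and the indices remain available for downstream layers.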
Examples of implementation
YoloNMS Plugin Used in Conjunction with Other Plugins
In this case, det_indices is consumed by the ROIAlign_TRT plugin, while (boxes, scores, classes) are already final outputs.
Same as EfficientNMS (no change)
This is an example of an implementation equivalent to EfficientNMS, but using INMSLayer
YoloNMS Plugin
https://github.com/levipereira/TensorRT/tree/release/8.6/plugin/yoloNMSPlugin
Description
YOLO is one of the most widely used architectures for object detection, and it is continually evolving. For a long time, the EfficientNMS plugin served YOLO's Non-Maximum Suppression (NMS) needs efficiently. However, when the YOLO architecture is used for segmentation, EfficientNMS falls short because it has no detection index output. As a result, users have resorted to various plugins and workarounds to address this.
To close this gap, I modified the EfficientNMS plugin to create a YoloNMS plugin that adds an output layer returning detection indices, which can be used for various purposes. Currently, this modification is available only in the release/8.6 branch. However, it's straightforward to implement it for 8.5 and later versions.
This modification not only improves performance but also makes the plugin more versatile across use cases. The plugin can also be easily used with any version of YOLO.
The main goal is to simplify the deployment of YOLO-series models on DeepStream/Triton Server while delivering maximum performance.
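As a rough illustration of why a detection index output helps with segmentation, the sketch below uses the indices to pick per-candidate data (for example, mask coefficients) for the detections that survive NMS. The layout is an assumption made for illustration only: a fixed-size index list per image, with the first `num_dets` entries valid and the rest padding. The names `gather_by_indices`, `mask_coeffs`, and the padding value are hypothetical and do not describe the plugin's actual tensors.

```python
def gather_by_indices(det_indices, num_dets, per_candidate):
    """Select per-candidate rows for the detections kept by NMS.

    Assumptions (illustrative only):
      det_indices:   fixed-size list of candidate indices for one image
      num_dets:      number of valid entries at the front of det_indices
      per_candidate: any per-candidate data, e.g. mask coefficients
    """
    # Ignore padded entries beyond num_dets, then look up each kept row.
    return [per_candidate[i] for i in det_indices[:num_dets]]


# Hypothetical data: four candidates, each with two mask coefficients.
mask_coeffs = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
det_indices = [2, 0, -1, -1]   # -1 marks padding in this sketch
num_dets = 2

kept_coeffs = gather_by_indices(det_indices, num_dets, mask_coeffs)
```

Without the indices, there is no direct way to tell which candidate rows correspond to the final boxes, which is the gap described above for segmentation models.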
All changes can be tracked since the cloning of Efficient NMS.
release/8.6...levipereira:TensorRT:release/8.6
Example of End2End Implementation: