
[TensorRT EP] Support engine hardware compatibility #20669

Merged · 10 commits merged into main on May 29, 2024
Conversation

@yf711 (Contributor) commented May 13, 2024

Description

  • Introduce the option trt_engine_hw_compatible to support engine hardware compatibility on Ampere and later (sm80+) GPUs
    • When enabled, the nvinfer1::HardwareCompatibilityLevel::kAMPERE_PLUS flag is set while building engines
    • This option has been validated on sm80/sm86 GPUs; a generated engine can be reused across different Ampere+ architectures:
      • The client side needs to enable this option as well to reuse existing sm80+ engines
    • If a user enables this option with TensorRT < 8.6 or a GPU below sm80, a warning is logged stating that the option is not supported
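As a sketch, enabling the new option through the ONNX Runtime Python API might look like the following. The option key trt_engine_hw_compatible comes from this PR; the cache options and model path are illustrative, not prescribed by it:

```python
# TensorRT EP provider options; keys and values are passed as strings/bools.
trt_options = {
    "trt_engine_cache_enable": True,    # cache built engines for reuse
    "trt_engine_cache_path": "./trt_cache",
    "trt_engine_hw_compatible": True,   # option introduced in this PR
}

providers = [
    ("TensorrtExecutionProvider", trt_options),
    "CUDAExecutionProvider",  # fallback provider
]

# The providers list would then be passed to an inference session, e.g.:
# import onnxruntime as ort
# session = ort.InferenceSession("model.onnx", providers=providers)
```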

Engine naming:

| GPU | trt_engine_hw_compatible=false | trt_engine_hw_compatible=true |
| --- | --- | --- |
| A100 (sm80) | TensorrtExecutionProvider_TRTKernel_graph_torch-jit-export_9454133937466702238_0_0_sm80.engine | TensorrtExecutionProvider_TRTKernel_graph_torch-jit-export_9454133937466702238_0_0_sm80+.engine |
| RTX 3080 (sm86) | TensorrtExecutionProvider_TRTKernel_graph_torch-jit-export_9454133937466702238_0_0_sm86.engine | TensorrtExecutionProvider_TRTKernel_graph_torch-jit-export_9454133937466702238_0_0_sm80+.engine |
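The naming pattern above boils down to one rule: with hardware compatibility enabled, the per-architecture suffix collapses to sm80+, so every Ampere+ GPU resolves to the same cache file. A minimal sketch of that suffix logic (the helper name and signature are hypothetical, not the EP's actual code):

```python
def engine_suffix(compute_capability: int, hw_compatible: bool) -> str:
    """Return the compute-capability suffix of an engine cache name.

    Hypothetical helper illustrating the naming rule from the table above.
    """
    if hw_compatible and compute_capability >= 80:
        # All Ampere+ GPUs share a single hardware-compatible engine.
        return "sm80+"
    # Otherwise the engine is tied to the exact architecture it was built on.
    return f"sm{compute_capability}"

print(engine_suffix(86, True))   # RTX 3080 with hw compat -> sm80+
print(engine_suffix(86, False))  # RTX 3080 without        -> sm86
```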

Motivation and Context

Reference: https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#hardware-compat

@jywu-msft jywu-msft requested a review from chilo-ms May 16, 2024 05:31
@yf711 yf711 marked this pull request as ready for review May 21, 2024 23:10
@yf711 yf711 requested a review from jywu-msft May 23, 2024 20:14
@chilo-ms (Contributor) commented:

A minor issue here:
In the Embedded engine model / EPContext model case, TensorRTCacheModelHandler::ValidateEPCtxNode() checks the compute capability of the underlying GPU against the "hardware_architecture" attribute of the EPContext node, and logs a warning if they do not match.
Now that TRT EP supports Ampere+ hardware compatibility, we may need to modify that check for the case where the embedded engine cache name contains "sm80+".
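A hedged sketch of the adjusted validation the comment suggests. The function name and signature are illustrative; the real check lives in TensorRTCacheModelHandler::ValidateEPCtxNode in the TRT EP's C++ code:

```python
def is_engine_compatible(engine_arch: str, device_cc: int) -> bool:
    """Check an engine's hardware_architecture tag against the current GPU.

    engine_arch: e.g. "sm86" (exact-match engine) or "sm80+"
                 (Ampere+ hardware-compatible engine).
    device_cc:   the GPU's compute capability as an int, e.g. 86.
    Illustrative helper, not the EP's actual implementation.
    """
    if engine_arch.endswith("+"):
        # Hardware-compatible engine: any GPU at or above the base arch works.
        base = int(engine_arch[2:-1])  # "sm80+" -> 80
        return device_cc >= base
    # Exact-architecture engine: compute capability must match exactly.
    return engine_arch == f"sm{device_cc}"
```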

@yf711 yf711 merged commit d44be41 into main May 29, 2024
95 of 96 checks passed
@yf711 yf711 deleted the yifanl/trtep_hw_compat branch May 29, 2024 01:12
@jywu-msft jywu-msft added the ep:TensorRT issues related to TensorRT execution provider label May 29, 2024
Labels: ep:TensorRT (issues related to TensorRT execution provider), release:1.18.1

3 participants