
[TensorRT EP] Support engine hardware compatibility #20669

Merged · 10 commits merged into main on May 29, 2024
Conversation

@yf711 (Contributor) commented May 13, 2024

Description

  • Introduce the option trt_engine_hw_compatible to support engine hardware compatibility on Ampere and later (sm80+) GPUs
    • When enabled, the nvinfer1::HardwareCompatibilityLevel::kAMPERE_PLUS flag is set while building engines
    • This option has been validated on sm80/sm86 GPUs; a generated engine can be reused across different Ampere+ architectures:
      • The client side needs to enable this option as well to reuse existing sm80+ engines
    • If a user enables this option with TensorRT < 8.6 or a GPU below sm80, a warning is logged stating that the option is not supported
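As a sketch, enabling the new option through the ONNX Runtime Python API might look like the following. The option key trt_engine_hw_compatible comes from this PR; the cache options and model path are illustrative, not prescribed by it:

```python
# TensorRT EP provider options; keys and values are passed as strings/bools.
trt_options = {
    "trt_engine_cache_enable": True,    # cache built engines for reuse
    "trt_engine_cache_path": "./trt_cache",
    "trt_engine_hw_compatible": True,   # option introduced in this PR
}

providers = [
    ("TensorrtExecutionProvider", trt_options),
    "CUDAExecutionProvider",  # fallback provider
]

# The providers list would then be passed to an inference session, e.g.:
# import onnxruntime as ort
# session = ort.InferenceSession("model.onnx", providers=providers)
```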

Engine naming:

| GPU | trt_engine_hw_compatible=false | trt_engine_hw_compatible=true |
| --- | --- | --- |
| A100 (sm80) | TensorrtExecutionProvider_TRTKernel_graph_torch-jit-export_9454133937466702238_0_0_sm80.engine | TensorrtExecutionProvider_TRTKernel_graph_torch-jit-export_9454133937466702238_0_0_sm80+.engine |
| RTX 3080 (sm86) | TensorrtExecutionProvider_TRTKernel_graph_torch-jit-export_9454133937466702238_0_0_sm86.engine | TensorrtExecutionProvider_TRTKernel_graph_torch-jit-export_9454133937466702238_0_0_sm80+.engine |
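The naming pattern above boils down to one rule: with hardware compatibility enabled, the per-architecture suffix collapses to sm80+, so every Ampere+ GPU resolves to the same cache file. A minimal sketch of that suffix logic (the helper name and signature are hypothetical, not the EP's actual code):

```python
def engine_suffix(compute_capability: int, hw_compatible: bool) -> str:
    """Return the compute-capability suffix of an engine cache name.

    Hypothetical helper illustrating the naming rule from the table above.
    """
    if hw_compatible and compute_capability >= 80:
        # All Ampere+ GPUs share a single hardware-compatible engine.
        return "sm80+"
    # Otherwise the engine is tied to the exact architecture it was built on.
    return f"sm{compute_capability}"

print(engine_suffix(86, True))   # RTX 3080 with hw compat -> sm80+
print(engine_suffix(86, False))  # RTX 3080 without        -> sm86
```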

Motivation and Context

Reference: https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#hardware-compat

@jywu-msft jywu-msft requested a review from chilo-ms May 16, 2024 05:31
@yf711 yf711 marked this pull request as ready for review May 21, 2024 23:10
@yf711 yf711 requested a review from jywu-msft May 23, 2024 20:14
@chilo-ms (Contributor) commented:

A minor issue here:
In the Embedded engine model / EPContext model case, TensorRTCacheModelHandler::ValidateEPCtxNode() checks the compute capability of the underlying GPU against the "hardware_architecture" attribute of the EPContext node, and logs a warning if they do not match.
Now that TRT EP supports Ampere+ hardware compatibility, we may need to modify that check for the case where the embedded engine cache name contains "sm80+".
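A hedged sketch of the adjusted validation the comment suggests. The function name and signature are illustrative; the real check lives in TensorRTCacheModelHandler::ValidateEPCtxNode in the TRT EP's C++ code:

```python
def is_engine_compatible(engine_arch: str, device_cc: int) -> bool:
    """Check an engine's hardware_architecture tag against the current GPU.

    engine_arch: e.g. "sm86" (exact-match engine) or "sm80+"
                 (Ampere+ hardware-compatible engine).
    device_cc:   the GPU's compute capability as an int, e.g. 86.
    Illustrative helper, not the EP's actual implementation.
    """
    if engine_arch.endswith("+"):
        # Hardware-compatible engine: any GPU at or above the base arch works.
        base = int(engine_arch[2:-1])  # "sm80+" -> 80
        return device_cc >= base
    # Exact-architecture engine: compute capability must match exactly.
    return engine_arch == f"sm{device_cc}"
```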

@yf711 yf711 merged commit d44be41 into main May 29, 2024
95 of 96 checks passed
@yf711 yf711 deleted the yifanl/trtep_hw_compat branch May 29, 2024 01:12
@jywu-msft jywu-msft added the ep:TensorRT issues related to TensorRT execution provider label May 29, 2024
Labels: ep:TensorRT (issues related to TensorRT execution provider), release:1.18.1

3 participants