model.engine speed is slower than model.pt #12653

Open
1 task done
shaimaahamam opened this issue May 13, 2024 · 3 comments
Labels
question Further information is requested

Comments

@shaimaahamam

Search before asking

Question

When I convert the model from model.pt to model.engine, the model file size increases and the prediction step takes more time.

Additional

No response

@shaimaahamam shaimaahamam added the question Further information is requested label May 13, 2024
@glenn-jocher
Member

Hello! It sounds like you're experiencing slower performance with the .engine format compared to the .pt format. This could be due to several factors including the complexity of your model, the configuration of the TensorRT optimization, or the specific hardware you are running the model on.

To potentially improve the performance, you might want to look into the following:

  • Ensure your TensorRT version is fully compatible with your hardware.
  • Experiment with different workspace sizes in the export command to optimize memory usage, which could influence speed.
  • Check that the input size (imgsz) used during export matches the one used during inference, as mismatches can lead to inefficiencies.

Here's an example command to adjust the workspace size during export:

yolo export model=path/to/model.pt format=engine workspace=8

If adjustments to these areas do not improve the performance, it might be helpful to profile both executions to understand where the bottleneck occurs.
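
To get you started on that profiling, here's a minimal sketch using the Ultralytics Python API that times both formats back to back (the file paths and run count are placeholders; adjust imgsz to match your export settings):

import time

from ultralytics import YOLO

# Placeholder paths -- substitute your own model and test image.
PT_WEIGHTS = "path/to/model.pt"
ENGINE_WEIGHTS = "path/to/model.engine"
IMAGE = "path/to/image.jpg"
RUNS = 100  # timed inference passes per model

for weights in (PT_WEIGHTS, ENGINE_WEIGHTS):
    model = YOLO(weights)
    model.predict(IMAGE, imgsz=640, verbose=False)  # warmup pass, excluded from timing
    start = time.perf_counter()
    for _ in range(RUNS):
        model.predict(IMAGE, imgsz=640, verbose=False)
    avg_ms = (time.perf_counter() - start) / RUNS * 1000
    print(f"{weights}: {avg_ms:.2f} ms/image averaged over {RUNS} runs")

Averaging over many runs after a warmup pass matters here, since the first TensorRT inference includes one-time setup cost that would otherwise skew the comparison.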

@shaimaahamam
Author

I have an RTX 3060 Ti:
Driver version: 545.29.06
GPU memory: 8 GB
CUDA version: 12.3
OS: Ubuntu 22.04
RAM: 98 GB
CPU: x86-64 AMD Ryzen 9 5900X 12-core processor

Which Python and TensorRT versions are compatible with this setup, and what workspace and batch sizes should I use?

@glenn-jocher
Member

Hello! For your setup with an RTX 3060 Ti, here’s a quick guide to get you started:

  • Python Version: Python 3.8 or newer should work well.
  • TensorRT Version: With CUDA 12.3, you should use TensorRT 8.x. Make sure to download the version compatible with CUDA 12.3 from the NVIDIA website.
  • Workspace Size: Starting with a workspace size of 4 GB is generally a good balance. You can adjust this if needed:
    yolo export model=yolov8n.pt format=engine workspace=4
  • Batch Size: If you're not facing memory issues, you can start with a batch size of 2 or more depending on your specific use case. Remember, larger batch sizes might increase throughput but also memory usage:
    yolo export model=yolov8n.pt format=engine batch=2

Feel free to tweak these settings based on your performance and accuracy needs! 🚀
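
If you'd like to verify what's actually installed before exporting, a quick check along these lines (assuming torch and tensorrt are both importable in your environment) can confirm the Python, CUDA, and TensorRT versions in play:

import sys

import tensorrt
import torch

# Report the versions that the TensorRT export will actually use.
print(f"Python:   {sys.version.split()[0]}")
print(f"PyTorch:  {torch.__version__} (built for CUDA {torch.version.cuda})")
print(f"TensorRT: {tensorrt.__version__}")
print(f"GPU:      {torch.cuda.get_device_name(0)}")

If the CUDA version PyTorch was built against differs from your system's CUDA 12.3, that mismatch is worth ruling out first.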
