
[GPU] updates to build some selected kernels in separate batches #24499

Merged: 5 commits into openvinotoolkit:master on May 21, 2024

Conversation

e-ddykim (Contributor) commented May 14, 2024:

Details:

  • This PR updates the kernels_cache to build selected kernels in separate batches (see the sketch after this list).
    • This is a temporary workaround to resolve a performance degradation that occurs when certain kernels are built together with other kernels in the same batch.
    • Currently, the selected kernels include gemm_tiled_opt.
    • Impacted scenario: Qwen INT4 first-token latency for >1K inputs on MTL (Meteor Lake).
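
To make the idea concrete, here is a minimal sketch of the batching logic under this change. The names Batch, make_batches, and max_batch_size are illustrative assumptions, not OpenVINO identifiers; the actual implementation lives in the GPU plugin's kernels_cache and differs in detail. Kernels whose base name appears in special_kernels are placed in dedicated batches instead of being grouped with others.

#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a compilation batch in kernels_cache.
struct Batch {
    std::vector<std::string> sources;  // kernel sources compiled together
};

// Kernels listed here are built alone; per this PR the list holds gemm_tiled_opt.
static const std::vector<std::string> special_kernels = {"gemm_tiled_opt"};

std::vector<Batch> make_batches(const std::vector<std::pair<std::string, std::string>>& kernels,
                                std::size_t max_batch_size) {
    std::vector<Batch> batches;
    Batch shared;
    for (const auto& [base_kernel_name, source] : kernels) {
        bool separate = std::count(special_kernels.begin(), special_kernels.end(),
                                   base_kernel_name) > 0;
        if (separate) {
            batches.push_back(Batch{{source}});  // dedicated batch for this kernel
            continue;
        }
        shared.sources.push_back(source);
        if (shared.sources.size() >= max_batch_size) {  // flush a full shared batch
            batches.push_back(std::move(shared));
            shared = Batch{};
        }
    }
    if (!shared.sources.empty())
        batches.push_back(std::move(shared));
    return batches;
}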

Tickets:

  • GSD-8910

@e-ddykim e-ddykim added the WIP (work in progress, do not merge) labels May 14, 2024
@e-ddykim e-ddykim requested review from a team as code owners May 14, 2024 06:33
@github-actions github-actions bot added the "category: GPU" (OpenVINO GPU plugin) label May 14, 2024
return unique_kernel_name.substr(0, pos);
};

auto get_target_batch = [&]() -> batch_program& {
A reviewer (Contributor) commented:
Does the issue really happen due to multiple instances of the same kernel in the batch, or is it just related to batch size? As I remember, if the program source is too large, then IGC may produce a worse binary.

e-ddykim (Contributor Author) replied:
It's not clear what the root cause is as of now, but it does not look like it is due to program source size. In my test, the issue was gone when I commented out just one line.
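
For context, the get_target_batch helper in the diff above decides which batch a kernel lands in. A hedged sketch of that routing follows, written as a free function rather than the lambda in the diff; batch_program here is a stand-in struct with an assumed holds_special_kernel flag, not the real OpenVINO type.

#include <string>
#include <vector>

// Stand-in for the plugin's batch_program; the real type carries much more state.
struct batch_program {
    std::vector<std::string> sources;
    bool holds_special_kernel = false;  // assumption: marks a dedicated batch
};

// Returns the batch a kernel should be appended to: the last open shared batch
// normally, or a freshly created one when the kernel must be built alone.
batch_program& get_target_batch(std::vector<batch_program>& batches, bool need_separate_batch) {
    if (batches.empty() || need_separate_batch || batches.back().holds_special_kernel) {
        batches.emplace_back();
        batches.back().holds_special_kernel = need_separate_batch;
    }
    return batches.back();
}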

@e-ddykim e-ddykim changed the title from "[GPU] updates to build similar kernels in separate batches" to "[GPU] updates to build some selected kernels in separate batches" May 19, 2024
@e-ddykim e-ddykim added this to the 2024.2 milestone May 20, 2024
@e-ddykim e-ddykim removed the WIP (work in progress, do not merge) labels May 20, 2024
// check if the current kernel name is in special_kernels
auto target_base_kernel_name = get_base_kernel_name(entry_point);
if (std::count(special_kernels.begin(), special_kernels.end(), target_base_kernel_name) > 0)
return true;
A reviewer (Contributor) commented:
If the current entry has gemm_tiled_opt => it will need_separate_batch: is this the intention?
(The current behavior seems so.)
If so, why not simply check:
if (entry_point.find("gemm_tiled_opt") != std::string::npos)
=> need_separate_batch?

e-ddykim (Contributor Author) replied:
Oh! I updated it as you suggested. Thank you!
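
The adopted check reduces to a single substring test on the entry point, replacing the base-name extraction plus std::count lookup shown above. As a standalone sketch (the real code sits inside kernels_cache; need_separate_batch is the assumed helper name):

#include <string>

// Any entry point containing "gemm_tiled_opt" is built in its own batch.
bool need_separate_batch(const std::string& entry_point) {
    return entry_point.find("gemm_tiled_opt") != std::string::npos;
}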

yeonbok (Contributor) commented May 20, 2024:

I believe this will be reverted once the driver issue is resolved. Could you please add the ticket numbers to the PR?

e-ddykim (Contributor Author) replied:

> I believe this will be reverted once the driver issue is resolved. Could you please add the ticket numbers to the PR?

I added it. Thank you.

@yeonbok yeonbok added this pull request to the merge queue May 21, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks May 21, 2024
@yeonbok yeonbok added this pull request to the merge queue May 21, 2024
Merged via the queue into openvinotoolkit:master with commit 415ba28 May 21, 2024
100 checks passed
Labels: category: GPU (OpenVINO GPU plugin), Code Freeze