
Improvements to the INT8 GEMM portion of the code for Power #20595

Open · wants to merge 13 commits into main

Conversation


@ChipKerchner ChipKerchner commented May 7, 2024

These changes improve the INT8 GEMM portion of the code for Power.

There are 3 main code changes:

  1. Change a function argument to a template parameter so that operations that add/subtract zero are eliminated at compile time. Also reuse a vector that holds the mask instead of rebuilding it each time.
  2. Process 16 columns at a time in MlasGemmQuantCopyPackB8x8; this should reduce potential page faults by a factor of 4 and also be faster.
  3. Unroll MlasQgemmStoreVectorMMA and vectorize other variables.
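Change (1) can be sketched as follows. This is a minimal illustration of the template-parameter technique only; the names `AccumulateRow`, `StoreRow`, and `ZeroMode` are invented for this sketch and do not correspond to the actual MLAS sources.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical sketch: hoisting a runtime flag into a template parameter.
// With ZeroMode known at compile time, the compiler instantiates two loop
// bodies and drops the dead branch in each, so no add of zero survives.
template <bool ZeroMode>
static void AccumulateRow(int32_t* C, const int32_t* Acc, size_t N) {
    for (size_t n = 0; n < N; ++n) {
        if constexpr (ZeroMode) {
            C[n] = Acc[n];   // first pass: plain store, no accumulate
        } else {
            C[n] += Acc[n];  // later passes: accumulate into C
        }
    }
}

// The runtime dispatch happens once, outside the hot loop.
void StoreRow(int32_t* C, const int32_t* Acc, size_t N, bool zeroMode) {
    if (zeroMode) {
        AccumulateRow<true>(C, Acc, N);
    } else {
        AccumulateRow<false>(C, Acc, N);
    }
}
```

The same pattern applies to any per-element flag that is invariant across a kernel invocation: pay one branch per call instead of one per element.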

@ChipKerchner ChipKerchner requested a review from a team as a code owner May 7, 2024 13:09
@yuslepukhin
Member

Cc: @chenfucn

@ChipKerchner
Author

Questions about the lint warnings: do all lines have to be less than 120 characters? And how should statements that span multiple lines be written?

@yuslepukhin
Member

Typically, we employ clang-format. I use a visual cue in the IDE, break long lines manually, then run the formatter again.

@ChipKerchner
Author

ChipKerchner commented May 8, 2024

Which style should I be using for clang-format? microsoft?

It seems the formatter wants to change a lot of code that I did not alter.

@yuslepukhin
Member

> Which style should I be using for clang-format? microsoft?
>
> It seems the formatter wants to change a lot of code that I did not alter.

Your editor should pick this up automatically.
https://github.com/microsoft/onnxruntime/blob/main/.clang-format

@yufenglee
Member

/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows CPU CI Pipeline,Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline

@yufenglee
Member

/azp run Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-amd-gpu-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-binary-size-checks-ci-pipeline,Big Models,Android CI Pipeline

@yufenglee
Member

/azp run iOS CI Pipeline,ONNX Runtime React Native CI Pipeline


Azure Pipelines successfully started running 2 pipeline(s).


Azure Pipelines successfully started running 10 pipeline(s).


@ChipKerchner
Author

I use vim as my editor. I'm not sure it will pick up lint formatting.

@yuslepukhin
Member

> I use vim as my editor. I'm not sure it will pick up lint formatting.

Most of the failures are about extra spaces. Many editors can show non-visible characters.

Contributor

@edgchen1 edgchen1 left a comment


Can you share performance measurements to show how much improvement there is?

Three resolved review comments on onnxruntime/core/mlas/lib/power/qgemm_kernel_power10.cpp (outdated).
@ChipKerchner
Author

I'm seeing about a 2.6-4X improvement for PackB.

@yufenglee
Member

/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows CPU CI Pipeline,Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline

@yufenglee
Member

/azp run Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-amd-gpu-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-binary-size-checks-ci-pipeline,Big Models,Android CI Pipeline


Azure Pipelines successfully started running 10 pipeline(s).


@ChipKerchner changed the title from "Improvements to the GEMM portion of the code for Power" to "Improvements to the INT8 GEMM portion of the code for Power" on May 16, 2024
@ChipKerchner
Author

Can we move forward with this PR?

@yuslepukhin
Member

/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows CPU CI Pipeline,Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline

@yuslepukhin
Member

/azp run Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-amd-gpu-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-binary-size-checks-ci-pipeline,Big Models,Android CI Pipeline


Azure Pipelines successfully started running 10 pipeline(s).

