Fix triton codegen main do_bench_gpu import error #126213

adelesun · 2024-05-14T21:03:18Z

Summary:
Running a generated Triton kernel file failed with a module import error.

The cause appears to be D57215950, which renamed "do_bench" to "do_bench_gpu" in torch._inductor.runtime.runtime_utils.

The codegen, however, still emits "from triton.testing import do_bench", so the benchmark call it emits should be reverted to "do_bench".
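To illustrate the mismatch described in the summary, here is a minimal sketch (not the actual Inductor codegen) that statically checks a piece of generated source for bare function calls with no matching import or definition. The helper `unresolved_call_names` and the `BROKEN`/`FIXED` excerpts are hypothetical and written only for this example; the excerpt is assumed to resemble the benchmark entry point of a generated kernel file.

```python
import ast
import builtins


def unresolved_call_names(source: str) -> set[str]:
    """Return bare names that are called in `source` but never bound in it."""
    tree = ast.parse(source)  # parses only; nothing is imported or executed
    bound = set(dir(builtins))
    called = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                bound.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            bound.add(node.name)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            bound.add(node.id)
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            called.add(node.func.id)
    return called - bound


# Hypothetical excerpt: the import says `do_bench`, the call says `do_bench_gpu`.
BROKEN = '''
from triton.testing import do_bench

def benchmark(fn):
    return do_bench_gpu(fn, rep=40)
'''

# The same excerpt with the call reverted to match the import.
FIXED = BROKEN.replace("do_bench_gpu(", "do_bench(")

print(unresolved_call_names(BROKEN))
print(unresolved_call_names(FIXED))
```

The check is deliberately coarse (it ignores scoping and attribute calls), but it is enough to flag a renamed helper whose import was not updated alongside it.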

Test Plan:
LOGLEVEL=DEBUG TORCH_COMPILE_DEBUG=1 TORCHINDUCTOR_MAX_AUTOTUNE=0 CUDA_VISIBLE_DEVICES=5 TORCHINDUCTOR_PROFILE=1 TORCHINDUCTOR_PROFILE_OUTPUT='/home/adelesun/mts_profiling/outputs/profile_output.txt' TORCH_LOGS='+inductor,+schedule,output_code' TORCHINDUCTOR_UNIQUE_KERNEL_NAMES=1 TORCHINDUCTOR_BENCHMARK_KERNEL=1 TORCHINDUCTOR_CACHE_DIR='/home/adelesun/mts_profiling/code' TORCHINDUCTOR_ENABLED_METRIC_TABLES=kernel_metadata buck2 run mode/opt -c=python.package_style=inplace -c fbcode.enable_gpu_sections=true -c fbcode.platform=platform010 -c fbcode.nvcc_arch=v100,a100,h100 -c fbcode.split-dwarf=true caffe2/torch/fb/model_transform/experimental/benchmark:mts_gpu_benchmark -- --local-model /home/adelesun/mts_profiling/inputs/offsite_cvr_model_526372970_793.input.predictor.disagg.gpu.merge --lower-backend AOT_INDUCTOR 2>&1 | tee /home/adelesun/mts_profiling/outputs/benchmark_output.txt

bento console --kernel=aetk --file=/home/adelesun/mts_profiling/code/op/copmbxfunzmywemwmg66lnlcx4apvn2f2vsi3glgisausgfvit4g.py

The file ran successfully.

Differential Revision: D57345619

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang


pytorch-bot · 2024-05-14T21:03:20Z

This appears to be a diff that was exported from phabricator, but the PR author does not have sufficient permissions to run CI. @adelesun, please do step 2 of internal wiki to get write access so you do not need to get CI approvals in the future. If you think this is a mistake, please contact the Pytorch Dev Infra team.

linux-foundation-easycla · 2024-05-14T21:03:21Z

The committers listed above are authorized under a signed CLA.

✅ login: adelesun (d6a14a3)

pytorch-bot · 2024-05-14T21:03:23Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/126213

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki or our office hours

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (3 Unrelated Failures)

As of commit d6a14a3 with merge base 7ed67cd:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

pull / linux-focal-cuda12.1-py3.10-gcc9-sm86 / test (default, 3, 5, linux.g5.4xlarge.nvidia.gpu) (gh) (matched linux rule in flaky-rules.json)
no space left on device

BROKEN TRUNK - The following jobs failed but were also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

pull / linux-focal-cuda11.8-py3.10-gcc9 / test (distributed, 1, 3, linux.8xlarge.nvidia.gpu) (gh) (trunk failure)
distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs
pull / linux-focal-cuda11.8-py3.10-gcc9 / test (distributed, 2, 3, linux.8xlarge.nvidia.gpu) (gh) (trunk failure)
distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_hook_parity_powerSGD

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot · 2024-05-14T21:03:47Z

This pull request was exported from Phabricator. Differential Revision: D57345619

adelesun · 2024-05-14T21:49:50Z

/easycla

facebook-github-bot · 2024-05-15T22:54:26Z

@pytorchbot merge -f 'Landed internally'

(Initiating merge automatically since Phabricator Diff has merged, using force because this PR might not pass merge_rules.json but landed internally)

pytorchmergebot · 2024-05-15T22:56:03Z

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

Pull Request resolved: pytorch#126213
Approved by: https://github.com/shunting314

pytorch-bot bot added the module: inductor label May 14, 2024

facebook-github-bot added the fb-exported label May 14, 2024

adelesun requested a review from shunting314 May 14, 2024 22:12

shunting314 approved these changes May 14, 2024


pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label May 14, 2024

pytorchmergebot added the merging label May 15, 2024

pytorchmergebot added the Merged label May 15, 2024

pytorchmergebot closed this in b5432ad May 15, 2024

pytorchmergebot removed the merging label May 15, 2024
