[AMD][BACKEND] Include `BufferLoadToLocal` in `UpdateAsyncWaitCount` computations #8621

AlexAUT · 2025-11-03T11:18:45Z

#8575 moved `UpdateAsyncWaitCount` from `TTIR->TTGIR` (before converting to buffer ops) to `TTGIR->LLVM`. This means we now have to include `amdgpu.buffer_load_to_local` when counting outstanding async instructions. This change is also required for Gluon, where we emit buffer ops directly.

Ignoring them causes a performance regression because we emit overly conservative waits.
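To illustrate why undercounting leads to conservative waits, here is a toy model, not the actual Triton pass. It assumes a wait allows the async instructions of the last `num_groups` commit groups to stay in flight, and that the instruction-level wait count is the number of async instructions found in those groups. The op names `async_copy` and `commit_group` and the helper `allowed_outstanding` are hypothetical placeholders; only `amdgpu.buffer_load_to_local` comes from this PR.

```python
# Toy model only: op names other than amdgpu.buffer_load_to_local are placeholders.
BUFFER_LOAD = "amdgpu.buffer_load_to_local"
ASYNC_COPY = "async_copy"      # hypothetical stand-in for a non-buffer async copy
COMMIT = "commit_group"        # hypothetical stand-in for a commit-group marker

KINDS_WITHOUT_BUFFER = {ASYNC_COPY}
KINDS_WITH_BUFFER = {ASYNC_COPY, BUFFER_LOAD}


def allowed_outstanding(ops, num_groups, async_kinds):
    """Count async instructions issued in the last `num_groups` commit groups.

    `ops` is the linear list of ops preceding the wait. A smaller result means
    the wait lets fewer instructions stay in flight, i.e. it is more conservative.
    """
    count = 0
    commits_seen = 0
    for op in reversed(ops):
        if op == COMMIT:
            commits_seen += 1
            if commits_seen > num_groups:
                break  # everything older than this must have completed
        elif op in async_kinds:
            count += 1
    return count


# Two commit groups, each holding two buffer loads; the wait keeps one group in flight.
ops = [BUFFER_LOAD, BUFFER_LOAD, COMMIT, BUFFER_LOAD, BUFFER_LOAD, COMMIT]

ignoring_buffer_loads = allowed_outstanding(ops, 1, KINDS_WITHOUT_BUFFER)
counting_buffer_loads = allowed_outstanding(ops, 1, KINDS_WITH_BUFFER)
```

In this model, ignoring buffer loads yields a count of 0, which forces the wait to drain every outstanding load, while counting them yields 2 and lets the most recent group overlap with compute.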

AlexAUT added 2 commits November 3, 2025 10:24

Include buffer load to local in UpdateAsyncWaitCnt

f0603d9

Doc

44ac899

AlexAUT marked this pull request as ready for review November 3, 2025 16:28

AlexAUT requested review from antiagainst, ptillet and zhanglx13 as code owners November 3, 2025 16:28

antiagainst approved these changes Nov 3, 2025

antiagainst merged commit 5a6f410 into triton-lang:main Nov 3, 2025
9 checks passed
