Fix handling of unvisited operands in AxisInfoAnalysis #8723

neildhar · 2025-11-14T06:33:02Z

We currently force initialisation of operands that have not yet been
visited with setToEntryState. This means that the order in which
values are visited can change the results of the analysis.

This can be a source of bugs. For example, the lowering for
AsyncCopyGlobalToLocalOp validates that the load addresses permit
sufficient vectorisation. That check only passes if the analysis
recovers the same information it had when the async copy was created;
otherwise, we crash during lowering. I have an actual repro for this,
but it has been very difficult to minimise it enough to make it suitable
for a lit test:
https://gist.github.com/neildhar/7eea6a312afa39d1cc83dc12627c2ba3

Populating the operands in this way also means that we have to handle
control flow like ForOp and IfOp explicitly in setToEntryState,
because we may attempt to populate their results when we visit their
users.

Instead, when we encounter an operation whose operands have not yet been
visited, skip over the operation entirely. We can revisit it once the
operands have actually been visited. This avoids pessimising operands
that simply have not been reached yet, and leaves the handling of
control flow to the dataflow framework.
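
A minimal sketch of the difference between the two policies, assuming a toy lattice and not the actual AxisInfoAnalysis code. It tracks a single "divisibility" number per value and joins states with gcd; `State`, `AddOp`, `Policy`, `joinInto` and `analyze` are hypothetical names invented for illustration, and the repeated sweep stands in for the revisiting that the real dataflow framework schedules itself. Requires C++17.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>
#include <optional>
#include <vector>

// Toy stand-in for one AxisInfo field: the known divisibility of a value.
// std::nullopt means "not visited yet" (a hypothetical encoding).
using State = std::optional<int64_t>;

// Most conservative state: nothing is known about the value.
constexpr int64_t kPessimistic = 1;

// Toy op: result = lhs + rhs. Operands and result are indices into the state
// vector; the sum is divisible by the gcd of the operand divisibilities.
struct AddOp {
  int lhs, rhs, result;
};

enum class Policy { ForceEntryState, SkipUnvisited };

// Lattice join: once a state is set it can only become more conservative.
// Returns true if the slot changed.
bool joinInto(State &slot, int64_t value) {
  assert(value > 0 && "divisibility must be positive");
  int64_t joined = slot ? std::gcd(*slot, value) : value;
  if (slot && *slot == joined)
    return false;
  slot = joined;
  return true;
}

// Visits `ops` in the given order, sweeping until nothing changes.
std::vector<State> analyze(std::vector<State> states,
                           const std::vector<AddOp> &ops, Policy policy) {
  bool changed = true;
  while (changed) {
    changed = false;
    for (const AddOp &op : ops) {
      if (policy == Policy::ForceEntryState) {
        // Old behaviour: pin unvisited operands to the pessimistic state.
        for (int v : {op.lhs, op.rhs})
          if (!states[v])
            changed |= joinInto(states[v], kPessimistic);
      } else if (!states[op.lhs] || !states[op.rhs]) {
        // New behaviour: skip for now and revisit on a later sweep.
        continue;
      }
      int64_t d = std::gcd(*states[op.lhs], *states[op.rhs]);
      changed |= joinInto(states[op.result], d);
    }
  }
  return states;
}
```

With values 0 and 1 seeded at divisibility 16, `v2 = v0 + v1` and `v3 = v2 + v0`, visiting the user of `v2` before its definition under `ForceEntryState` pins `v2` to 1, and the later join with 16 cannot raise it again, so `v3` ends at 1. Under `SkipUnvisited` both visit orders end with `v3` at 16.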

ThomasRaoux · 2025-11-14T16:06:53Z

Why is it expected to be better? Is it possible to write a test where the results are better?
@Mogball, does that match what you expect?

neildhar · 2025-11-14T17:14:33Z

@ThomasRaoux Yes, although it has been very difficult to minimise into something that would be suitable for a test. The bug in the PR description is an example of this. Existing tests should already exercise the changes in this PR though, since we cannot delete the control flow handling without the fix to skip over operations with unvisited operands.

I think while we're operating on structured control flow, this probably is not any better. The framework visits all the operations in order, with the exception of the scf ops, but we special case them by initialising them with the best case. So forcing all the operands to be initialised should only affect scf ops, which we handle.

However, once the structured control flow is lowered out, the order in which things are visited becomes important. If we visit an operation before its operands are visited, we may end up pessimising the analysis, because we assign those operands the most conservative state. This is what happens in the bug in the PR description. Performing the analysis on the structured control flow in an earlier pass determines that we can use AsyncCopyGlobalToLocal. However, when we actually try to lower the async copy in LoadStoreOpToLLVM, we can no longer recover that information because the analysis produces a worse result.

Mogball

This makes sense to me and is the right way to handle this.

Mogball · 2025-11-14T18:36:29Z

lib/Analysis/AxisInfo.cpp

-      // Control flow operations are initialized with "unknown" state:
-      // the maximum possible divisibility, contiguity, and constancy.
+    } else if (isa<gpu::WarpSpecializePartitionsOp>(op)) {
+      // Initialize the arguments to gpu::WarpSpecializePartitionsOp with


The argument states of WarpSpecializeOp should be forwarded from its operands. Unfortunately it doesn't implement RegionBranchOpInterface so you might have to implement this yourself.

Yeah, I'm planning to look a little more into this, but it seemed orthogonal to this PR. With this PR, we no longer try to populate the state of every operand that hasn't been visited. However, WarpSpecializePartitionsOp is special because we walk the IR and restart the analysis on it. So the framework will call setToEntryState on the block args to it (rather than us calling it ourselves from visitOperation).

When that happens, we want to start with the maximum divisibility, contiguity and constancy, so that we don't start from an artificially restricted state.
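
To illustrate why the starting state matters, here is a small sketch under the same toy assumption that states are joined with gcd and that divisibilities are powers of two. The constants `kOptimistic` and `kPessimistic` and the helper `joinAll` are hypothetical names, not part of the real analysis.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical "maximum" divisibility: a large power of two, which acts as
// the identity for gcd over smaller power-of-two divisibilities.
constexpr int64_t kOptimistic = int64_t{1} << 62;

// Most conservative divisibility: nothing is known.
constexpr int64_t kPessimistic = 1;

// Joins every incoming divisibility into the starting state with gcd.
int64_t joinAll(int64_t start, const std::vector<int64_t> &incoming) {
  int64_t state = start;
  for (int64_t d : incoming) {
    assert(d > 0 && "divisibility must be positive");
    state = std::gcd(state, d);
  }
  return state;
}
```

Starting from `kOptimistic`, joining incoming divisibilities 16 and 64 yields 16, the best fact that holds for both. Starting from `kPessimistic`, the result is stuck at 1 regardless of what is forwarded later.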

neildhar force-pushed the fix-axisinfo-unvisited branch from 68fdec0 to a9cfa5c Compare November 14, 2025 06:38

neildhar marked this pull request as ready for review November 14, 2025 06:42

neildhar requested a review from ptillet as a code owner November 14, 2025 06:42

ThomasRaoux requested a review from Mogball November 14, 2025 17:58

Mogball reviewed Nov 14, 2025

jeffniu-openai approved these changes Nov 14, 2025

Mogball approved these changes Nov 14, 2025

ThomasRaoux merged commit e33219f into triton-lang:main Nov 14, 2025
9 checks passed

neildhar deleted the fix-axisinfo-unvisited branch November 14, 2025 22:32

peterbell10 added a commit that referenced this pull request Nov 18, 2025

Revert "Fix handling of unvisited operands in AxisInfoAnalysis (#8723)"

4e59a57

We are seeing failures in the coalesce pass caused by axis info being unset for some values.

peterbell10 added a commit that referenced this pull request Nov 18, 2025

Revert "Fix handling of unvisited operands in AxisInfoAnalysis (#8723)…

8133121

…" (#8754) We are seeing failures in the coalesce pass caused by axis info being unset for some values.

neildhar restored the fix-axisinfo-unvisited branch November 18, 2025 16:44

neildhar mentioned this pull request Nov 21, 2025

Make WarpSpecializePartitionsOp implement RegionBranchInterface #8799

Merged
