@AlexAUT AlexAUT commented Apr 29, 2025

The Membar analysis cannot deduce that two shared memory subviews do not alias when they come from the same allocation. When pipelining, we therefore get barriers between the LocalLoad and the prefetch loads (AsyncCopy/BufferLoadToLocal), because they read from and write to subviews of the same allocation.

This PR adds a filter function to the Membar analysis that avoids those barriers by ignoring the dependency of a LocalLoad if it consumes an AsyncToken from an AsyncWait. In such cases, the barrier of the AsyncWait is enough to synchronize the access correctly.
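
To illustrate the idea, here is a toy Python model of a dependency filter. It is only a sketch and not the Triton implementation: the `Op` class, `may_alias`, `needs_barrier`, and `filter_async_local_load` are hypothetical names, operations are reduced to plain strings, and aliasing is modeled conservatively by comparing allocation names only.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Op:
    """Hypothetical stand-in for an operation that touches shared memory."""
    kind: str                          # e.g. "LocalLoad", "AsyncCopyGlobalToLocal"
    allocation: str                    # allocation backing the accessed subview
    token_from: Optional[str] = None   # kind of the op that produced the consumed token


def may_alias(a: Op, b: Op) -> bool:
    # Conservative assumption: subviews of the same allocation may overlap.
    return a.allocation == b.allocation


def is_synced_local_load(op: Op) -> bool:
    # A LocalLoad that consumes a token produced by an AsyncWait.
    return op.kind == "LocalLoad" and op.token_from == "AsyncWait"


def filter_async_local_load(a: Op, b: Op) -> bool:
    """Return True if the dependency between a and b can be ignored."""
    return is_synced_local_load(a) or is_synced_local_load(b)


def needs_barrier(a: Op, b: Op,
                  filter_fn: Optional[Callable[[Op, Op], bool]] = None) -> bool:
    # The filter runs first; a filtered pair never requests a barrier.
    if filter_fn is not None and filter_fn(a, b):
        return False
    return may_alias(a, b)
```

Without the filter, a `LocalLoad` and a prefetch copy into the same allocation request a barrier in this model. With the filter, the pair is skipped as long as the `LocalLoad` consumes a token from an `AsyncWait`.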

@antiagainst antiagainst marked this pull request as ready for review April 29, 2025 17:05

AlexAUT commented Apr 30, 2025

Thank you for the review. I addressed the comments and made the filter less generic by additionally checking whether one operand is an AsyncCopyGlobalToLocal or a BufferLoadToLocal.
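
A rough Python sketch of the narrowed check, assuming "one operand" refers to one of the two operations in the pair passed to the filter. The `Op` class, `PREFETCH_KINDS`, and `can_skip_barrier` are hypothetical names for illustration and do not mirror the actual code in the PR.

```python
from dataclasses import dataclass
from typing import Optional

# Prefetch operations named in this PR, modeled as plain strings.
PREFETCH_KINDS = {"AsyncCopyGlobalToLocal", "BufferLoadToLocal"}


@dataclass(frozen=True)
class Op:
    """Hypothetical stand-in for an operation in the analyzed pair."""
    kind: str
    token_from: Optional[str] = None  # kind of the op that produced the consumed token


def can_skip_barrier(a: Op, b: Op) -> bool:
    """Return True only for a synced LocalLoad paired with a prefetch op."""
    # Check both orderings, since either op of the pair may be the LocalLoad.
    for load, other in ((a, b), (b, a)):
        if (load.kind == "LocalLoad"
                and load.token_from == "AsyncWait"
                and other.kind in PREFETCH_KINDS):
            return True
    return False
```

In this model, a synced `LocalLoad` paired with any operation outside `PREFETCH_KINDS` (for example a hypothetical `"LocalStore"`) is no longer filtered, so the regular dependency handling still applies to it.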

@antiagainst antiagainst merged commit 4891fd2 into triton-lang:main Apr 30, 2025
8 checks passed