[BACKEND] Generalise maybeDeduplicate to all layouts #8492
include/triton/Conversion/TritonGPUToLLVM/ElementwiseOpToLLVMBase.h
ThomasRaoux left a comment:
LGTM!
We had a subtle asymmetry here that was producing different PTX for the same layout. We now generalise this pass to work with any layout and we drop a few restrictions the previous pass had.
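For readers unfamiliar with the idea, here is a minimal sketch of what deduplicating redundant per-register work can look like. It is an illustration only and not the code in this PR: it assumes the layout can report, as a bitmask, which bits of the register index do not affect the value held, and the names `dedup_elementwise` and `free_mask` are hypothetical.

```python
def dedup_elementwise(values, free_mask, op):
    """Apply `op` once per group of registers known to hold the same value.

    values:    per-thread register values, indexed by register number.
    free_mask: hypothetical bitmask of register-index bits that do not
               affect the value (registers differing only in these bits
               are duplicates of each other).
    op:        the elementwise function to apply.
    """
    cache = {}  # representative register index -> computed result
    out = []
    for reg in range(len(values)):
        # Clear the free bits to get the canonical representative register.
        rep = reg & ~free_mask
        if rep not in cache:
            cache[rep] = op(values[rep])
        # Duplicate registers reuse the representative's result.
        out.append(cache[rep])
    return out


if __name__ == "__main__":
    # Registers 0/1 and 2/3 are duplicates because bit 0 is free.
    print(dedup_elementwise([1.0, 1.0, 2.0, 2.0], 0b01, lambda x: x * x))
```

Because the grouping depends only on the mask and not on a particular layout kind, the same logic applies to any layout that can supply such a mask, which is the spirit of "work with any layout" above.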
…nerically (triton-lang#8421) (triton-lang#8495) This PR relands triton-lang#8386. It depends on triton-lang#8492 to avoid regressing in some workloads.