WIP: xe: use double with post-ops with f64 to preserve accuracy by echeresh · Pull Request #4746 · uxlfoundation/oneDNN

echeresh · 2026-03-02T22:49:07Z

echeresh · 2026-03-02T22:58:42Z

make test
set test_scope=NIGHTLY
disable test_device_cpu
disable benchdnn_all
enable benchdnn_conv
enable benchdnn_pool
enable benchdnn_reorder
enable benchdnn_deconv
enable arch_gpu_xe-hpc
enable arch_gpu_xe-hpg-atsm
enable arch_gpu_xe-hpg-dg2
enable arch_gpu_xe-lp
enable arch_gpu_xe-lpg
enable arch_gpu_xe-lpg+
enable arch_gpu_xe2-hpg-bmg
enable arch_gpu_xe2-lpg
enable arch_gpu_xe3-lpg

Simonsays095

There are still many other OpenCL kernels that use float instead of POST_OP_DATA_T. I guess those will be handled separately?

Simonsays095 · 2026-03-02T23:02:17Z

src/gpu/intel/matmul/ref.cl

+        POST_OP_DATA_T dst_data;
 #if WITH_SUM
-        dst_data = convert_float(DATA_TO_REF(C[dst_off]));
+        dst_data = (POST_OP_DATA_T)DATA_TO_REF(C[dst_off]);


DATA_TO_REF here will convert to float. We either need to change DATA_TO_REF for f64 to be convert_double, or carve out an exception via macros here.

rjoursler · Mar 3, 2026

Likely DATA_TO_REF should be replaced with FLT_ACC_DATA_T, which is already used in this file. Alternatively, we can just refactor this to use the load/store API so that we don't need to go over this with a fine-toothed comb looking for invalid casts.

As an aside, we currently have 3 different floating-point accumulator types: DATA_TO_REF, POST_OP_DATA_T, and FLT_ACC_DATA_T. I am not convinced we really need all of these.

echeresh · Mar 3, 2026

Thanks for the catch; let me move the PR to WIP for now. I want to try migrating the REF family of macros to load/store. Eliminating extra macros would be nice.

rjoursler · Mar 4, 2026

Just in case this is helpful, here is a sample commit porting gemm to use the load/store interface.

echeresh added the platform:gpu-intel Codeowner: @oneapi-src/onednn-gpu-intel label Mar 2, 2026

echeresh requested review from a team as code owners March 2, 2026 22:49

github-actions bot added the component:tests Codeowner: @oneapi-src/onednn-arch label Mar 2, 2026

echeresh added 2 commits March 2, 2026 14:52

gtests: add regression test for f64 matmul with post-ops

8abe630

xe: use double with post-ops with f64 to preserve accuracy

3eee765

echeresh force-pushed the echeresh/f64 branch from 11fee11 to 3eee765 Compare March 2, 2026 22:52

Simonsays095 approved these changes Mar 2, 2026

dzarukin approved these changes Mar 2, 2026

echeresh changed the title from "xe: use double with post-ops with f64 to preserve accuracy" to "WIP: xe: use double with post-ops with f64 to preserve accuracy" Mar 3, 2026

echeresh wants to merge 2 commits into main from echeresh/f64

Follow-up review comments:
rjoursler Mar 3, 2026
echeresh Mar 3, 2026
rjoursler Mar 4, 2026

4 participants
