@derek-gerstmann
Contributor

CodeGen wasn't handling shuffles that arbitrarily mix vectors with different lane counts. It generated a cast instruction with mixed types, which produced invalid SPIR-V that the driver compiler failed to compile.

Now, we handle the extraction of each selected lane of each vector argument, and construct a new Vector from the combined composite.
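
As a rough illustration of the lane-by-lane approach described above, here is a minimal sketch on plain data rather than on SPIR-V values. The `Lanes` alias and the `shuffle_by_extraction` function are hypothetical names for illustration and are not taken from the Halide sources. The sketch assumes shuffle indices address the lanes of all arguments as if they were concatenated.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a vector value: one entry per lane.
using Lanes = std::vector<int>;

// Build the shuffle result by extracting each selected lane on its own.
// Each index addresses the lanes of all arguments as if concatenated.
Lanes shuffle_by_extraction(const std::vector<Lanes> &args,
                            const std::vector<int> &indices) {
    Lanes result;
    result.reserve(indices.size());
    for (int idx : indices) {
        assert(idx >= 0);
        // Walk the arguments until idx falls inside one of them.
        std::size_t v = 0;
        while (v < args.size() && idx >= static_cast<int>(args[v].size())) {
            idx -= static_cast<int>(args[v].size());
            v++;
        }
        assert(v < args.size());
        // Extract a single scalar lane from the selected argument.
        result.push_back(args[v][idx]);
    }
    // The collected scalars form the new vector.
    return result;
}
```

Because every lane is extracted as a scalar before the result is assembled, the arguments never need a common lane count, and no whole-vector cast between mismatched types is required.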

Fixes #8580

Contributor

@mcourteaux mcourteaux left a comment

Looks good overall. I have a few points of feedback.
Since you took my code from the GPU C codegen for this, I suggest we move it into a helper function:

namespace Internal {
std::vector<std::pair<int, int>> calculate_shuffle_vector_and_lane_indices(const Shuffle *s);
}

And clean up both CodeGen_GPU and CodeGen_Vulkan using this helper. All the sanity checks can then also go in there.

idx -= vec_lanes;
}

} else {
Contributor

If I'm not overlooking anything, it seems there is no specialized branch for simply joining (i.e., concatenating) vectors. Either put a TODO comment, make an issue, or add that branch. Falling back on the generic vector shuffle, below, seems like a performance waste.

Contributor Author

Tracking this here: #8622
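
For reference, one way the specialized concatenation branch discussed above could be detected is sketched below. This is only an assumption about what such a check might look like. The function name `is_plain_concat` and its plain-data parameters are hypothetical and do not come from the Halide sources.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Returns true when the shuffle simply joins its arguments end to end,
// i.e. the indices are exactly 0, 1, ..., total_lanes - 1.
bool is_plain_concat(const std::vector<int> &lanes_per_vector,
                     const std::vector<int> &indices) {
    assert(!lanes_per_vector.empty());
    const int total =
        std::accumulate(lanes_per_vector.begin(), lanes_per_vector.end(), 0);
    if (static_cast<int>(indices.size()) != total) {
        return false;
    }
    for (int i = 0; i < total; i++) {
        if (indices[i] != i) {
            return false;
        }
    }
    return true;
}
```

When the check succeeds, a backend could build the result directly from the whole arguments instead of extracting and reassembling individual lanes.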

@mcourteaux
Contributor

Also, is this fixable:

Error: Vulkan: Failed to create compute pipeline! vkCreateComputePipelines returned <Unknown Vulkan Result Code>

to actually give a meaningful name to the error code this produced? Khronos spec says:

On failure, this command returns

    VK_ERROR_OUT_OF_HOST_MEMORY
    VK_ERROR_OUT_OF_DEVICE_MEMORY
    VK_ERROR_INVALID_SHADER_NV

I'm guessing I get the last one? Weird that that would be NVIDIA specific.

Contributor

@mcourteaux mcourteaux left a comment

Great! Thanks :)

@derek-gerstmann
Contributor Author

> Also, is this fixable:
>
> Error: Vulkan: Failed to create compute pipeline! vkCreateComputePipelines returned <Unknown Vulkan Result Code>
>
> to actually give a meaningful name to the error code this produced? Khronos spec says:
>
> On failure, this command returns
>
>     VK_ERROR_OUT_OF_HOST_MEMORY
>     VK_ERROR_OUT_OF_DEVICE_MEMORY
>     VK_ERROR_INVALID_SHADER_NV
>
> I'm guessing I get the last one? Weird that that would be NVIDIA specific.

All of the above are handled in vk_get_error_name(). In this case the return value was -13 (or VK_ERROR_UNKNOWN).
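
To make the mapping concrete, here is a minimal sketch of a result-code lookup. The thread names `vk_get_error_name()` but does not show its implementation, so `result_code_name` and `describe_failure` below are hypothetical. The numeric values are those of the corresponding `VkResult` enumerators, copied in so the sketch builds without the Vulkan headers. Including the raw number in the message is a design assumption here, so that a code missing from the table still surfaces its value.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Map a raw VkResult value to its enumerator name (partial table only).
const char *result_code_name(int32_t code) {
    switch (code) {
    case 0:
        return "VK_SUCCESS";
    case -1:
        return "VK_ERROR_OUT_OF_HOST_MEMORY";
    case -2:
        return "VK_ERROR_OUT_OF_DEVICE_MEMORY";
    case -13:
        return "VK_ERROR_UNKNOWN";
    case -1000012000:
        return "VK_ERROR_INVALID_SHADER_NV";
    default:
        return "<Unknown Vulkan Result Code>";
    }
}

// Build a message that carries both the name and the numeric code.
std::string describe_failure(int32_t code) {
    return std::string("vkCreateComputePipelines returned ") +
           result_code_name(code) + " (" + std::to_string(code) + ")";
}
```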

However, if your Vulkan driver supports the debug extension, the runtime registers callbacks to get more information. On my machine I was seeing the following output before this PR:

NVVM compilation failed: 1
Vulkan [WARNING]: (user_context=0x7ffcd1660370, id=2, name:NVIDIA) CreatePipeline: failed to compile internal representation
Vulkan [WARNING]: (user_context=0x7ffcd1660370, id=2, name:NVIDIA) CreatePipeline: unexpected compilation failure
Vulkan [WARNING]: (user_context=0x7ffcd1660370, id=2, name:NVIDIA) CreateComputePipeline: unexpected failure compiling SPIR-V shader: 0x9f5484893ce4c6a
Error: Vulkan: Failed to create compute pipeline! vkCreateComputePipelines returned <Unknown Vulkan Result Code>
Vulkan: Failed to create compute pipeline!
Vulkan: Failed to setup compute pipeline!

@mcourteaux
Contributor

Ready to land, @derek-gerstmann ?

@derek-gerstmann
Contributor Author

Yep all good with me! @abadams any objections to us adding a helper method to the IR Shuffle node?

std::vector<std::pair<int, int>> Shuffle::calculate_vector_and_lane_indices() const;

@derek-gerstmann derek-gerstmann requested a review from abadams April 29, 2025 19:32
…ethod name.

Simplify vector/lane index method to create all indices, and return each pair.
Refactor call sites to use `auto vector_and_lane_indices = op->vector_and_lane_indices()`
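
A minimal sketch of the "create all indices, then return each pair" idea from the commit message above. The `ShuffleDesc` struct is a hypothetical stand-in for the IR node that carries only the lane count of each argument and the flat shuffle indices. The free function is an illustration under those assumptions, not the method that was merged.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for the shuffle node.
struct ShuffleDesc {
    std::vector<int> lanes_per_vector;  // lane count of each argument
    std::vector<int> indices;           // flat indices into the concatenation
};

// For each shuffle index, return the (vector index, lane index) it selects.
std::vector<std::pair<int, int>> vector_and_lane_indices(const ShuffleDesc &s) {
    // First enumerate every (vector, lane) pair in concatenation order.
    std::vector<std::pair<int, int>> all;
    for (int v = 0; v < static_cast<int>(s.lanes_per_vector.size()); v++) {
        for (int lane = 0; lane < s.lanes_per_vector[v]; lane++) {
            all.emplace_back(v, lane);
        }
    }
    // Then look up the pair for each shuffle index.
    std::vector<std::pair<int, int>> result;
    result.reserve(s.indices.size());
    for (int idx : s.indices) {
        assert(idx >= 0 && idx < static_cast<int>(all.size()));
        result.push_back(all[idx]);
    }
    return result;
}
```

Building the full table first keeps the lookup trivial and puts the range check in one place, at the cost of a temporary vector with one entry per input lane.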
@derek-gerstmann derek-gerstmann requested a review from abadams April 29, 2025 20:39

// Sanity check that the total lane count matches between the op-type and indices
internal_assert(!op->vectors.empty());
for (size_t i = 1; i < op->vectors.size(); i++) {
Member

I think shuffle nodes are allowed to combine vectors of different sizes. It looks like the removed code handles this, but this assert looks like it would fail.

Member

Handling that case looks like it was the whole point of the PR, so I must be missing something. Maybe a longer comment here would help.

Contributor Author

Yes, good catch! These asserts seem wrong or out of date ... I lifted them from CodeGen_GPU:

https://github.com/halide/Halide/blame/65b08efa78baf964387b9615b4c80e43d4cb967a/src/CodeGen_GPU_Dev.cpp#L149

And the date on those lines is from 3 years ago. So my guess is those asserts were written to match the implementation as it was ... without support to handle vectors of different lanes.

Thanks! I'll remove these lines from both CodeGen_Vulkan and CodeGen_GPU.

Contributor Author

@derek-gerstmann derek-gerstmann Apr 30, 2025

We actually aren't triggering those asserts at all. Even the test case seems to produce lowered code that has vectors of the same size for the arguments, but only uses a few lanes from the arguments to produce the shuffle. That case wasn't handled correctly before ... it triggered a type mismatch for the assignments which generated invalid code for mixing vector/scalar types of different sizes.
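
To illustrate the distinction discussed in this thread, the sketch below checks a shuffle without requiring the arguments to share a lane count. It only requires that every index falls within the total number of input lanes and that the result has one lane per index. The function name and parameters are hypothetical, and the sketch assumes all indices are non-negative. Whether these are exactly the invariants Halide guarantees at this point is not stated in the thread.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Check a shuffle description without assuming equal-sized arguments.
bool shuffle_is_well_formed(const std::vector<int> &lanes_per_vector,
                            const std::vector<int> &indices,
                            int result_lanes) {
    if (lanes_per_vector.empty()) {
        return false;
    }
    for (int lanes : lanes_per_vector) {
        if (lanes <= 0) {
            return false;
        }
    }
    // The result has exactly one lane per shuffle index.
    if (static_cast<int>(indices.size()) != result_lanes) {
        return false;
    }
    // Every index must land inside the concatenated input lanes.
    const int total =
        std::accumulate(lanes_per_vector.begin(), lanes_per_vector.end(), 0);
    for (int idx : indices) {
        if (idx < 0 || idx >= total) {
            return false;
        }
    }
    return true;
}
```

Under this check, two 4-lane arguments with indices `{0, 5}` and a 2-lane result are accepted, which matches the case described above where same-sized arguments contribute only a few lanes.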

@derek-gerstmann derek-gerstmann requested a review from abadams May 1, 2025 16:56
@derek-gerstmann
Contributor Author

Okay to merge? @abadams

@derek-gerstmann derek-gerstmann merged commit 82d3aff into main May 1, 2025
19 checks passed
Development

Successfully merging this pull request may close these issues.

Shuffle triggers vague issue on Vulkan.