@ksivaman
Member

This was mistakenly added during #41

Signed-off-by: Kirthi Shankar Sivamani <[email protected]>
@ksivaman
Member Author

/te-ci

Member

@ptrendx ptrendx left a comment

Makes sense.

@ksivaman ksivaman merged commit 67114f9 into NVIDIA:main Feb 25, 2023
ptrendx pushed a commit that referenced this pull request Mar 7, 2023
Signed-off-by: Kirthi Shankar Sivamani <[email protected]>
cyanguwa pushed a commit to cyanguwa/TransformerEngine that referenced this pull request Apr 1, 2023
Signed-off-by: Kirthi Shankar Sivamani <[email protected]>
Signed-off-by: Charlene Yang <[email protected]>
@ksivaman ksivaman deleted the bias_gelu_nvfusion_bug_fix branch July 19, 2023 01:40
zhiyu-deep pushed a commit to zhiyu-deep/TransformerEngine that referenced this pull request Sep 3, 2024
[New feature] With cudnn backend 9.2.0 and above, `Graph::check_support`
can determine support for runtime engines without invoking the nvrtc
compiler. This lets users query the support surface of cudnn without
triggering nvrtc compilation.

[New feature] The Python pip wheel now contains the necessary C++
development headers.

[New feature] Sliding window attention is now supported as an attribute
of the sdpa forward and bprop nodes. Usage:
`sdpa_attributes.set_sliding_window_length(window_length)`
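
As a rough illustration of what a sliding window restricts, the sketch
below builds a boolean attention mask in plain Python. It does not call
cudnn. `sliding_window_mask` is a hypothetical helper, and the window
bounds it uses (query `i` sees keys `j` with `i - window_length < j <= i`)
are an assumption; the cudnn documentation is the authority on the
actual convention.

```python
def sliding_window_mask(seq_len, window_length):
    """Return mask[i][j] == True where query i may attend to key j.

    Assumed convention (not confirmed by the release notes):
    i - window_length < j <= i, i.e. causal with a bounded look-back.
    """
    return [
        [i - window_length < j <= i for j in range(seq_len)]
        for i in range(seq_len)
    ]


# Print the mask for a short sequence: "x" = attended, "." = masked out.
for row in sliding_window_mask(seq_len=5, window_length=2):
    print("".join("x" if allowed else "." for allowed in row))
```

Under this assumed convention each query attends to at most
`window_length` keys, ending at its own position.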

[New feature] Bottom-right-aligned causal masking is now supported as an
attribute of the sdpa forward and bprop nodes. Usage:
`sdpa_attributes.use_causal_mask_bottom_right(true)`
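
When the query length differs from the key/value length, the alignment
of the causal diagonal matters. The plain-Python sketch below contrasts
the two alignments. It is not the cudnn API: `causal_mask` and its
`bottom_right` parameter are hypothetical names, and the bottom-right
rule used here (shift the diagonal by `s_kv - s_q`) is the commonly used
definition, assumed to match what cudnn implements.

```python
def causal_mask(s_q, s_kv, bottom_right=False):
    """Return mask[i][j] == True where query i may attend to key j.

    Top-left alignment:     j <= i
    Bottom-right alignment: j <= i + (s_kv - s_q)   (assumed definition)
    """
    offset = s_kv - s_q if bottom_right else 0
    return [[j <= i + offset for j in range(s_kv)] for i in range(s_q)]


def show(mask):
    # "x" = attended, "." = masked out.
    for row in mask:
        print("".join("x" if allowed else "." for allowed in row))


print("top-left:")
show(causal_mask(2, 4))
print("bottom-right:")
show(causal_mask(2, 4, bottom_right=True))
```

With 2 queries and 4 keys, the bottom-right variant lets the last query
see every key, which is the usual expectation when the queries are the
final tokens of a longer cached sequence.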

[New feature] SDPA bprop attributes can select a deterministic algorithm
using the `use_deterministic_algorithm` API.

[New feature] Users can now filter the candidate execution plans of a
graph by their shared memory usage in cudnn 9.2.0 and later.

[Bug fix] Fixed a runtime error that occurred when the chosen execution
plan candidate was incorrectly set in the backend. This happened when
`check_support` did not correctly filter by the workspace size.

[Bug fix] Selecting/deselecting by behavior and numerical notes has been
fixed and now works as intended.

[Debugging] A new tool for easy reproduction of a failure using the json
representation of the graph can be found [here](tools/json_reproducer).

[Samples] Restructured the cpp samples into categories for easier
navigation.

[Samples] Added a sample to showcase how different plans can be built in
parallel in separate threads.

[Compilation enhancement] Added a new macro
`CUDNN_FRONTEND_SKIP_NLOHMANN_JSON` as a compilation flag that removes
nlohmann::json as a compilation dependency. With it set, users lose
access to certain API functions such as `print`, `key`, `serialize`, and
`deserialize` that depend on the library.

[Enhancement] Serialization of resample operation is now supported.

[Enhancement] A bug report template has been added for new GitHub issues.