Changelog

0.0.4 (2024-05-01)

Features

  • PyTorch 2.3 support
  • GPU sampling kernels (top-p, top-k)
  • more GQA group sizes
  • add mma instructions for fp8 (#179) (d305798)
  • mma rowsum for fp8 (#180) (5af935c)
  • support any num_heads for get_alibi_slope (#200) (b217a6f)

Bug Fixes

  • fix python package dispatch error message (#182) (8eed01c)

0.0.3 (2024-03-08)

Features

Bug Fixes

Misc

  • add stream argument in BeginForwardFunction of TVMWrapper (#164) (fabfcb5)

Performance Improvements

  • multiply q by sm_scale in decode kernels (#144) (660c559)

0.0.2 (2024-02-17)

Bug Fixes