Decide on removal of HIP-only workaround, solved in HIP 6.4
We have the following currently in the reference GEMM implementation:
// FIXME: When using 'half' output type on an AMD GPU, the computation cannot be
// done using 'half' type. This is because the required operators, `__half&
// operator+(const __half& x)` and `__half& operator*(const __half& x)`, are
// missing. According to the HIP documentation, they should be available. A bug
// report has been submitted: https://github.com/ROCm/HIP/issues/3690 but the
// fix has not yet been upstreamed.
#if defined(__HIP_PLATFORM_AMD__)
  using ComputeType =
      typename std::conditional<std::is_same<Tin, half>::value ||
                                    std::is_same<Tout, half>::value,
                                float, Tout>::type;
#else
  using ComputeType = Tout;
#endif
The relevant operators are now available for the host as well; see https://github.com/ROCm/clr/commit/5122b8c999696f29523bf955e6347d0b06709cae.
This should be available in ROCm 6.4, but do we want to require such a recent version for a host-only reference implementation?
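One possible middle ground would be to keep the workaround but only enable it for older HIP versions, so that nothing newer than what we support today becomes a hard requirement. The following is a rough sketch, not a tested patch. It assumes that the fix is present from HIP 6.4 onwards and that the `HIP_VERSION_MAJOR` / `HIP_VERSION_MINOR` macros from `<hip/hip_version.h>` match that release. `half_stub`, `REFERENCE_GEMM_NEEDS_HALF_WORKAROUND` and `ReferenceComputeType` are hypothetical names used only for illustration; the real code would use `half` and the existing `ComputeType` alias.

```cpp
#include <type_traits>

#if defined(__HIP_PLATFORM_AMD__)
#include <hip/hip_version.h>  // provides HIP_VERSION_MAJOR / HIP_VERSION_MINOR
#endif

// Hypothetical stand-in for `half`, so that this sketch also compiles
// without the HIP or CUDA headers.
struct half_stub {};

// Assumption: the missing __half operators are available from HIP 6.4 on.
#if defined(__HIP_PLATFORM_AMD__) && \
    (HIP_VERSION_MAJOR < 6 ||        \
     (HIP_VERSION_MAJOR == 6 && HIP_VERSION_MINOR < 4))
#define REFERENCE_GEMM_NEEDS_HALF_WORKAROUND 1
#else
#define REFERENCE_GEMM_NEEDS_HALF_WORKAROUND 0
#endif

template <typename Tin, typename Tout>
struct ReferenceComputeType {
#if REFERENCE_GEMM_NEEDS_HALF_WORKAROUND
  // Older HIP: fall back to float whenever half is involved.
  using type =
      typename std::conditional<std::is_same<Tin, half_stub>::value ||
                                    std::is_same<Tout, half_stub>::value,
                                float, Tout>::type;
#else
  // CUDA, or HIP with the fix: compute directly in the output type.
  using type = Tout;
#endif
};
```

With this approach the `float` fallback stays in place for users on older ROCm releases and could be dropped once the minimum supported version is raised.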