Commit b25351c9 authored and committed by Bram Veenboer

Replace add and mul with fmadd

parent 8a3c26ec
1 merge request: !4 Add matrixMultiplyAVX2b
@@ -145,7 +145,7 @@ void matrixMultiplyAVX2b(const std::complex<float>* a,
 __m256 b_4 = _mm256_permutevar8x32_ps(b_m, b_4_ind);
 __m256 c_m = _mm256_mul_ps(a_1, b_1);
-c_m = _mm256_add_ps(_mm256_mul_ps(a_2, b_2), c_m);
+c_m = _mm256_fmadd_ps(a_2, b_2, c_m);
 c_m = _mm256_addsub_ps(c_m, _mm256_mul_ps(a_3, b_3));
 c_m = _mm256_addsub_ps(c_m, _mm256_mul_ps(a_4, b_4));
 _mm256_store_ps(c_ptr, c_m);
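
Note on the change above: `_mm256_fmadd_ps(a, b, c)` computes `a * b + c` per lane with a single rounding, whereas the replaced `_mm256_add_ps(_mm256_mul_ps(a, b), c)` rounds the product first and the sum second. The intrinsic belongs to the FMA extension, so the build typically needs it enabled (for example `-mfma` with GCC or Clang). The scalar sketch below is not part of this commit; it uses `std::fma` as a stand-in for one vector lane, and the helper names `mul_then_add` and `fused_mul_add` are hypothetical, chosen only for illustration.

```cpp
#include <cmath>

// Unfused: the product is rounded to float before the add (two roundings).
// The volatile store keeps the compiler from contracting this into an FMA.
float mul_then_add(float a, float b, float c) {
    volatile float product = a * b;
    return product + c;
}

// Fused: std::fma computes a * b + c exactly and rounds once at the end.
float fused_mul_add(float a, float b, float c) {
    return std::fma(a, b, c);
}
```

For most inputs the two functions agree, but they can differ in the low-order bits when the product is not exactly representable, so results of the kernel may not be bit-identical before and after this commit.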