Commit 36832487 authored by Mattia Mancini

Use fmaddsub

parent 8f4c79f2
1 merge request: !3 Add test for matrix
Pipeline #81790 passed
@@ -107,9 +107,8 @@ void matrixMultiplyAVX2(const std::complex<float>* a,
   __m256 a_4_inv = _mm256_mul_ps(a_4, inv);
   __m256 b_4 = _mm256_permutevar8x32_ps(b_m, b_4_ind);
-  __m256 c_m = _mm256_mul_ps(a_1, b_1);
-  c_m = _mm256_fmadd_ps(a_2, b_2, c_m);
-  c_m = _mm256_fmadd_ps(a_3_inv, b_3, c_m);
-  c_m = _mm256_fmadd_ps(a_4_inv, b_4, c_m);
+  __m256 c_p1 = _mm256_fmaddsub_ps(a_1, b_1, _mm256_mul_ps(a_3, b_3));
+  __m256 c_p2 = _mm256_fmaddsub_ps(a_2, b_2, _mm256_mul_ps(a_4, b_4));
+  __m256 c_m = _mm256_add_ps(c_p1, c_p2);
   _mm256_store_ps(c_ptr, c_m);
 }
\ No newline at end of file
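
The change relies on the lane behaviour of `_mm256_fmaddsub_ps`, which computes `a*b - c` in even lanes and `a*b + c` in odd lanes. The sketch below is a portable scalar model of that behaviour, not the project's code. It assumes that `inv` holds the alternating pattern `{-1, +1, ...}`, which is not visible in the hunk. The names `Lanes`, `old_path`, `new_path` and `paths_agree` are hypothetical helpers introduced only for illustration.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Lanes = std::array<float, 8>;  // scalar stand-in for one __m256 of floats

// Models _mm256_mul_ps: lane-wise product.
Lanes mul(const Lanes& a, const Lanes& b) {
  Lanes r{};
  for (int i = 0; i < 8; ++i) r[i] = a[i] * b[i];
  return r;
}

// Models _mm256_add_ps: lane-wise sum.
Lanes add(const Lanes& a, const Lanes& b) {
  Lanes r{};
  for (int i = 0; i < 8; ++i) r[i] = a[i] + b[i];
  return r;
}

// Models _mm256_fmadd_ps: a*b + c in every lane.
Lanes fmadd(const Lanes& a, const Lanes& b, const Lanes& c) {
  Lanes r{};
  for (int i = 0; i < 8; ++i) r[i] = std::fma(a[i], b[i], c[i]);
  return r;
}

// Models _mm256_fmaddsub_ps: a*b - c in even lanes, a*b + c in odd lanes.
Lanes fmaddsub(const Lanes& a, const Lanes& b, const Lanes& c) {
  Lanes r{};
  for (int i = 0; i < 8; ++i)
    r[i] = std::fma(a[i], b[i], (i % 2 == 0) ? -c[i] : c[i]);
  return r;
}

// Removed lines: sign flip through a mask, then a chain of fmadd.
// ASSUMPTION: inv = {-1, +1, -1, +1, ...}; its definition is outside the hunk.
Lanes old_path(const Lanes& a1, const Lanes& b1, const Lanes& a2, const Lanes& b2,
               const Lanes& a3, const Lanes& b3, const Lanes& a4, const Lanes& b4) {
  const Lanes inv{-1, 1, -1, 1, -1, 1, -1, 1};
  Lanes c = mul(a1, b1);
  c = fmadd(a2, b2, c);
  c = fmadd(mul(a3, inv), b3, c);
  c = fmadd(mul(a4, inv), b4, c);
  return c;
}

// Added lines: the alternating sign comes from fmaddsub itself.
Lanes new_path(const Lanes& a1, const Lanes& b1, const Lanes& a2, const Lanes& b2,
               const Lanes& a3, const Lanes& b3, const Lanes& a4, const Lanes& b4) {
  Lanes c_p1 = fmaddsub(a1, b1, mul(a3, b3));
  Lanes c_p2 = fmaddsub(a2, b2, mul(a4, b4));
  return add(c_p1, c_p2);
}

// Compares both paths on small integer inputs, which are exact in float.
bool paths_agree() {
  Lanes a1{}, b1{}, a2{}, b2{}, a3{}, b3{}, a4{}, b4{};
  for (int i = 0; i < 8; ++i) {
    a1[i] = float(i + 1); b1[i] = 2.0f;
    a2[i] = 3.0f;         b2[i] = float(i);
    a3[i] = float(8 - i); b3[i] = 5.0f;
    a4[i] = 7.0f;         b4[i] = float(i + 2);
  }
  return old_path(a1, b1, a2, b2, a3, b3, a4, b4) ==
         new_path(a1, b1, a2, b2, a3, b3, a4, b4);
}
```

With general floating-point inputs the two paths can differ in the last bits, because the intermediate products are rounded at different points. The real intrinsics additionally need FMA support to be enabled at compile time (for example `-mavx2 -mfma` with GCC or Clang).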