Fix reversed FFT order in W-tiling CPU code
The forward and backward FFTs in the W-tiling CPU code were swapped. This went unnoticed because the swap has no effect when the shift parameter is zero. The test now uses a non-zero shift, which exposes the error. The order of the FFTs in the W-tiling CPU code has been fixed.
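
Why a zero shift hides the swap can be illustrated with a minimal 1-D sketch. This is not the W-tiling code itself: the helper names (dft, apply_shift, max_diff), the linear phase ramp, and which of the two orders counts as "correct" are all assumptions made for illustration. The only point is that a forward/backward pair cancels exactly when nothing is applied in between, while a non-zero shift makes the two orders disagree.

```python
import cmath


def dft(x, sign):
    """Naive DFT. sign=-1: forward (unscaled); sign=+1: backward (scaled by 1/n)."""
    n = len(x)
    scale = 1.0 if sign < 0 else 1.0 / n
    return [
        scale * sum(x[j] * cmath.exp(sign * 2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def apply_shift(grid, shift, swapped=False):
    """Transform, multiply by a phase ramp, transform back.

    Hypothetical stand-in for a shift step: with swapped=True the two
    transforms are applied in the opposite order.
    """
    n = len(grid)
    first, second = (-1, +1) if swapped else (+1, -1)
    image = dft(grid, first)
    # Phase ramp; it is all ones when shift == 0.
    ramp = [cmath.exp(2j * cmath.pi * shift * k / n) for k in range(n)]
    image = [a * b for a, b in zip(image, ramp)]
    return dft(image, second)


def max_diff(a, b):
    """Largest element-wise absolute difference between two sequences."""
    return max(abs(p - q) for p, q in zip(a, b))


grid = [1.0, 2.0, 3.0, 4.0]

# Zero shift: both orders reduce to the identity, so a test cannot tell them apart.
print(max_diff(apply_shift(grid, 0), apply_shift(grid, 0, swapped=True)) < 1e-9)

# Non-zero shift: the two orders shift the data in opposite directions.
print(max_diff(apply_shift(grid, 1), apply_shift(grid, 1, swapped=True)) < 1e-9)
```

In this sketch the first comparison prints True and the second prints False, which mirrors why the adjusted test needs a non-zero shift to catch the error.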