Use constants for block size in transpose and packing kernels
For the GEMM kernels, the block sizes and related values are already constants. The packing/transpose kernels are simpler, with just one configurable parameter: the block size. It would be good to make that a constant as well.
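
As a rough illustration of the idea, here is a minimal sketch of a blocked transpose whose block size is a compile-time constant rather than a runtime argument. It is written in C++ purely for illustration; the names `kTransposeBlock` and `transpose_blocked`, the value 8, and the row-major `float` layout are all assumptions and are not taken from the existing kernels.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical constant: block edge length, fixed at compile time.
// The value 8 is an arbitrary placeholder, not a tuned setting.
constexpr std::size_t kTransposeBlock = 8;

// Blocked transpose of a row-major `rows x cols` matrix into a
// row-major `cols x rows` matrix. `Block` is a template parameter, so
// the inner loop bounds are known to the compiler for full blocks.
template <std::size_t Block = kTransposeBlock>
std::vector<float> transpose_blocked(const std::vector<float>& src,
                                     std::size_t rows, std::size_t cols) {
    static_assert(Block > 0, "block size must be positive");
    assert(src.size() == rows * cols);

    std::vector<float> dst(src.size());
    for (std::size_t r0 = 0; r0 < rows; r0 += Block) {
        // Clamp the block end so partial edge blocks are handled.
        const std::size_t r1 = std::min(r0 + Block, rows);
        for (std::size_t c0 = 0; c0 < cols; c0 += Block) {
            const std::size_t c1 = std::min(c0 + Block, cols);
            for (std::size_t r = r0; r < r1; ++r) {
                for (std::size_t c = c0; c < c1; ++c) {
                    dst[c * rows + r] = src[r * cols + c];
                }
            }
        }
    }
    return dst;
}
```

Callers would use the default with `transpose_blocked(src, rows, cols)`, or pick another size at compile time with `transpose_blocked<4>(src, rows, cols)`. A packing kernel could presumably take its block size the same way.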