AVX-512 implementation
The AVX-512 implementation requires the following extensions: F, DQ and BW.
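A runtime check for these three extensions could look like the sketch below. It assumes GCC or Clang on an x86 target and uses their `__builtin_cpu_supports` built-in. The function name `avx512_required_extensions_present` is hypothetical and not part of the implementation described here. On other compilers or architectures the sketch simply reports the extensions as unavailable.

```c
#include <stdbool.h>

/* Hypothetical helper: returns true only when the CPU reports all three
   extensions listed above (AVX-512 F, DQ and BW). */
bool avx512_required_extensions_present(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && \
    (defined(__GNUC__) || defined(__clang__))
    __builtin_cpu_init();  /* harmless if the runtime already initialised it */
    return __builtin_cpu_supports("avx512f")
        && __builtin_cpu_supports("avx512dq")
        && __builtin_cpu_supports("avx512bw");
#else
    return false;  /* assumption: treat unknown targets as unsupported */
#endif
}
```

A caller would typically run this check once at start-up and fall back to a non-AVX-512 code path when it returns false.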
Considerations:
Executing vzeroall or vzeroupper is required after processing has finished, or at least after the Encoder/Decoder instance has been closed.
Intel processors lower the clock speed once ZMM registers have been used, and the lower clock can persist even after the AVX-512 processing has completed. This may hurt all other scalar operations, even when the AVX-512 code accounts for only a small part of the work.
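The cleanup described in these considerations could be sketched as follows, assuming GCC or Clang on x86. The names `clear_upper_vector_state` and `codec_release_vector_state` are hypothetical. The latter stands in for whatever runs when an Encoder/Decoder instance is closed. The sketch uses the `_mm256_zeroupper` intrinsic, which emits vzeroupper, and guards it with a runtime AVX check so it is safe to call on any CPU.

```c
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#endif

#if (defined(__x86_64__) || defined(__i386__)) && \
    (defined(__GNUC__) || defined(__clang__))
/* Compiled for AVX regardless of the global compiler flags, so the
   intrinsic is available without building the whole file with -mavx. */
static __attribute__((target("avx"))) void clear_upper_vector_state(void)
{
    _mm256_zeroupper();  /* emits the vzeroupper instruction */
}
#endif

/* Hypothetical close hook: returns true if vzeroupper was executed. */
bool codec_release_vector_state(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && \
    (defined(__GNUC__) || defined(__clang__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx")) {
        clear_upper_vector_state();
        return true;
    }
#endif
    return false;  /* no AVX available, nothing to clear */
}
```

Calling the hook from the close path keeps the instruction in one place. Whether this alone restores the full clock speed depends on the processor, so it is best treated as a precaution and not as a guarantee.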
Edited by Grant Kim