Add hardware implementations for sqrt, ceil, floor and trunc for
x86 and x86_64. These routines, and in particular sqrt are much
faster than the BSD C language versions of these functions.
Fixed whitespace errors.
Revised x86 versions with respect to alignment.
Rebased for Android 5.0
Change-Id: I86bdb520ce5e589b0cf63778f353fbd3263c8f0e
Author: James Rose <james.rose@intel.com>
Signed-off-by: James Rose <james.rose@intel.com>