If __get_tls has the right type, a lot of confusing casting can disappear.
It was probably a mistake that __get_tls was exposed as a function for mips
and x86 (but not arm), so let's (a) ensure that the __get_tls function
always matches the macro, (b) that we have the function for arm too, and
(c) that we don't have the function for any 64-bit architecture.
Change-Id: Ie9cb989b66e2006524ad7733eb6e1a65055463be