The magic numbers that C defines are obnoxious. We had partial
definitions for these internally. Add the missing one and move them to
a public header for anyone else that may want to use them.
Bug: None
Test: None
Change-Id: Ia6b8cff4310bcccb23078c52216528db668ac966
* Rationale
The question often comes up of how to use multiple time zones in C code.
If you're single-threaded, you can just use setenv() to manipulate $TZ.
toybox does this, for example. But that's not thread-safe in two
distinct ways: firstly, getenv() is not thread-safe with respect to
modifications to the environment (and between the way putenv() is
specified and the existence of environ, it's not obvious how to fully
fix that), and secondly the _caller_ needs to ensure that no other
threads are using tzset() or any function that behaves "as if" tzset()
was called (which is neither easy to determine nor easy to ensure).
This isn't a bigger problem because most of the time the right answer
is to stop pretending that libc is at all suitable for any i18n, and
switch to icu4c instead. (The NDK icu4c headers do not include ucal_*,
so this is not a realistic option for most applications.)
But what if you're somewhere in between? Like the rust chrono library,
for example? What then?
Currently their "least worst" option is to reinvent the entire wheel and
read our tzdata files. Which isn't a great solution for anyone, for
obvious maintainability reasons.
So it's probably time we broke the catch-22 here and joined NetBSD in
offering a less broken API than standard C has for the last 40 years.
Sure, any would-be caller will have to have a separate "is this
Android?" and even "is this API level >= 35?" path, but that will fix
itself sometime in the 2030s when developers can just assume "yes, it
is", whereas if we keep putting off exposing anything, this problem
never gets solved.
(No-one's bothered to try to implement the std::chrono::time_zone
functionality in libc++ yet, but they'll face a similar problem if/when
they do.)
* Implementation
The good news is that tzcode already implements these functions, so
there's relatively little here.
I've chosen not to expose `struct state` because `struct __timezone_t`
makes for clearer error messages, given that compiler diagnostics will
show the underlying type name (`struct __timezone_t*`) rather than the
typedef name (`timezone_t`) that's used in calling code.
I've moved us over to FreeBSD's wcsftime() rather than keep the OpenBSD
one building --- I've long wanted to only have one implementation here,
and FreeBSD is already doing the "convert back and forth, calling the
non-wide function in the middle" dance that I'd hoped to get round to
doing myself someday. This should mean that our strftime() and
wcsftime() behaviors can't easily diverge in future, plus macOS/iOS are
mostly FreeBSD, so any bugs will likely be interoperable with the other
major mobile operating system, so there's something nice for everyone
there!
The FreeBSD wcsftime() implementation includes a wcsftime_l()
implementation, so that's one stub we can remove. The flip side of that
is that it uses mbsrtowcs_l() and wcsrtombs_l() which we didn't
previously have. So expose those as aliases of mbsrtowcs() and
wcsrtombs().
Bug: https://github.com/chronotope/chrono/issues/499
Test: treehugger
Change-Id: Iee1b9d763ead15eef3d2c33666b3403b68940c3c
...by inlining them.
Also fix a couple of harmless bugs in passing. I've added tests, but in
both cases I don't think it was actually possible to hit the bad behavior:
we'd hit another test and fail immediately after in an externally
indistinguishable way.
Bug: N/A
Test: readelf
Change-Id: I8466050b0bfe2b7b94c76b383cf10c1d9d28debd
We don't need these in libandroid_support, but we do need the other
parts of wchar.cpp, and they're not really related.
Test: make checkbuild
Bug: None
Change-Id: I40f3089b034abfd4873e81c0b6216a7cfd977d8d
"ls -q" (or "adb shell -tt ls") was mangling non-ASCII because mbrtowc
was returning multibyte characters as their individual bytes. This was
because toybox asks for "" rather than "C.UTF-8", and for some reason
we were interpreting that as "C" rather than "C.UTF-8".
Test: bionic tests, ls
Change-Id: Ic60e3b90cd5fe689e5489fad0d5d91062b9594ed
strtoll(3), strtoull(3), wcstoll(3), and wcstoull(3) all take an _int_
as a base, not a size_t. This is an ABI compatibility issue.
Bug: 17628622
Change-Id: I17f8eead34ce2112005899fc30162067573023ec
A mistake I made while cleaning this up the first time through.
mbstrtowcs(3) sets the src param to null if it finishes the string.
Change-Id: I6263646e25d9537043b7025fd1dd6ae195f365e2
The len parameter is a _maximum_ length. The previous code was treating
it as an exact length, causing the following typical call to fail:
mbsrtowcs(out, &in, sizeof(out), state); // sizeof(out) > strlen(in)
Change-Id: I48e474fd54ea5f122bc168a4d74bfe08704f28cc
Accidentally verified against a dirty tree. Needs the companion change to libc++ to land upstream before I can submit this.
This reverts commit e087eac404.
Change-Id: I317ecd0923114f415eaad7603002f77feffb5e3f
Since we only support the C locale, we can just forward all of these to
their non-locale equivalents for correct behavior.
Change-Id: Ib7be71b7f636309c0cc3be1096a4c1f693f04fbb
mbrtoc32 and c32rtomb get their implementations from mbrtowc and wcrtomb. The
wc functions now simply call the c32 functions.
Bug: 14646575
Change-Id: I49d4b95fed0f9d790260c996c4d0f8bfd1686324
Although glibc gets by with an 8-byte mbstate_t, OpenBSD uses 12 bytes (of
the 128 bytes it reserves!).
We can actually implement UTF-8 encoding/decoding with a 0-byte mbstate_t
which means we can make things work on LP32 too, as long as we accept the
limitation that the caller needs to present us with a complete sequence
before we'll process it.
Our behavior is fine when going from characters to bytes; we just
update the source wchar_t** to say how far through the input we got.
I'll come back and use the 4 bytes we do have to cope with byte sequences
split across multiple input buffers. The fact that we don't support
UTF-8 sequences longer than 4 bytes plus the fact that the first byte of
a UTF-8 sequence encodes the length means we shouldn't need the other
fields OpenBSD used (at the cost of some recomputation in cases where a
sequence is split across buffers).
This patch also makes the minimal changes necessary to setlocale(3) to
make us behave like glibc when an app requests UTF-8. (The difference
being that our "C" locale is the same as our "C.UTF-8" locale.)
Change-Id: Ied327a8c4643744b3611bf6bb005a9b389ba4c2f
This also gets us the C99 wcstoimax and wcstoumax, and a working fgetwc and
ungetwc, all of which are needed in the implementation.
This also brings several other files closer to upstream.
Change-Id: I23b025a8237a6dbb9aa50d2a96765ea729a85579
This replaces a partial set of non-functional functions with a complete
set of functions, all of which actually work.
This requires us to implement mbsnrtowcs and wcsnrtombs which completes
the set of what we need for libc++.
The mbsnrtowcs is basically a copy & paste of wcsnrtombs, but I'm going
to go straight to looking at using the OpenBSD UTF-8 implementation rather
than keep polishing our home-grown turd.
(This patch also opportunistically switches us over to upstream btowc,
mbrlen, and wctob, since they're all trivially expressed in terms of
other functions.)
Change-Id: I0f81443840de0f1aa73b96f0b51988976793a323
We have similar degenerate implementations for all the other isw* functions,
so it's weird to exclude just one.
Change-Id: I659b97930e68598826c4882bb59f4146870fb6a0
This is an implementation in the style of the rest: char == byte.
We might want to come back and implement UTF-8, but this is enough for ltrace.
Bug: 13747066
Change-Id: Ib2b63609c9014fdef9a8491e067467c4fc5ae3cc
Also separate out the C++ files so we can use -Werror on them. I'd
rather wait for LOCAL_CPPFLAGS to be in AOSP, but this also lets us
see which files still need to be sorted into one bucket or the other.
Change-Id: I6acc1f7c043935c70a3b089f705d218b9aaaba0a