We've had these backward all this time. The relevant quote is in a
code comment in the implementation, but the first call after
completely decoding a code point that requires a surrogate pair should
return the number of bytes decoded by the most recent call, and the
second call should return -3 (if only C had given those some named
constants that might have been more obviously wrong).
Bug: https://issuetracker.google.com/289419882
Test: Fixed the test, tests run against glibc and musl to confirm
Change-Id: Idabf01075b1cad35b604ede8d676d6f0b1dc91e6
These all appear to be either musl bugs or underspecified corners of
the C standard, so rather than verify the musl behavior I've just
disabled the body of the test for musl.
Now that this is skipped, all the uchar tests are passing for bionic,
glibc, and musl.
Bug: None
Test: None
Change-Id: Icf88ef42e9b750ab45ba76bf8112967b00e72a9f
Doesn't do anything for bionic (which is why this has gone unnoticed),
but it does change the locale for glibc and musl. After this patch,
all these tests pass on glibc. musl is still failing in
uchar.start_state, which I haven't finished investigating.
This should probably be a test fixture so it's harder to forget, but
there are a handful of tests here which don't call setlocale until
part way through the tests, and I'm not certain if that was attempting
to test some non-obvious behavior, or if that was just an accident. I
don't want to change that test behavior before understanding it
better, so this will do for now.
Bug: None
Test: ./tests/run-on-host.sh 64 --gtest_filter="uchar.*"
Test: ./tests/run-on-host.sh glibc --gtest_filter="uchar.*"
Test: ./tests/run-on-host.sh musl --gtest_filter="uchar.*"
Change-Id: Ib781a41893f021e336e67281070932f41f792318
This one forgot to set its locale, so it was passing on glibc (because
the sequence here wasn't valid in its default locale) and failing on
musl, both for the wrong reasons.
Bug: None
Test: ./tests/run-on-host.sh 64 --gtest_filter="uchar.*"
Test: ./tests/run-on-host.sh glibc --gtest_filter="uchar.*"
Test: ./tests/run-on-host.sh musl --gtest_filter="uchar.*"
Change-Id: Ic6bcd1836ba23c7010e2cde673a3beca73778021
Musl was treating 0xf0 as a valid character in its default locale. I
didn't dig into whether that was a musl bug or whether it was actually
translating whatever extended ASCII character that was into the
correct code point (tbh I don't know what the rule there is either).
Bug: None
Test: ./tests/run-on-host.sh 64 --gtest_filter="uchar.*"
Test: ./tests/run-on-host.sh glibc --gtest_filter="uchar.*"
Test: ./tests/run-on-host.sh musl --gtest_filter="uchar.*"
Change-Id: I0845fdee9a016ad67ccff3716129ff29f83a63d7
Also actually assign a bool to it. I'd originally written a #define
and apparently forgot to fix the value. I'm a bit surprised that clang
didn't complain.
Bug: None
Test: ./tests/run-on-host.sh 64 --gtest_filter="uchar.*"
Test: ./tests/run-on-host.sh glibc --gtest_filter="uchar.*"
Test: ./tests/run-on-host.sh musl --gtest_filter="uchar.*"
Change-Id: Iac46bc9c48fd70853d5c447e812e25e617281d2b
Same as the earlier fix for mbrtoc16. Other implementations support
the older RFC, bionic supports the new one.
Bug: None
Test: ./tests/run-on-host.sh glibc --gtest_filter="uchar.*"
Change-Id: I9e85a9ae53aaaa112a76665063acd2bd856b26cf
glibc and musl both interpreted the spec differently than we did
(better, imo) for the return value of a zero-length conversion. Fix
the test to handle that.
Also converted the test from ASSERT to EXPECT to reduce the number of
builds needed to find failures.
Bug: None
Test: ./tests/run-on-host.sh glibc --gtest_filter="uchar.*"
Change-Id: Id74364040ce3b0e21bacd78f70467053cc8a6058
Also split that case out into a separate test to avoid complicating
the test for the common cases.
Bug: None
Test: ./tests/run-on-host.sh glibc --gtest_filter="uchar.mbrtoc16"
Change-Id: If7e50f659ad99ee9bab8847fc7320c7bbd629c5d
Explicitly test an invalid 5-byte UTF-8 sequence with mbrtoc16(3); the
fact that we weren't testing this was shown by coverage data.
Merge the surrogate pair tests in with their fewer-byte siblings to make
it clearer to a human reader that we've covered both cases.
Clear errno to make assertions about errno more convincing.
Test: treehugger
Change-Id: I485a48cc141f3e52058e2138326f3134d41b2243
...by inlining them.
Also fix a couple of harmless bugs in passing. I've added tests, but in
both cases I don't think it was actually possible to hit the bad behavior:
we'd hit another test and fail immediately after in an externally
indistinguishable way.
Bug: N/A
Test: readelf
Change-Id: I8466050b0bfe2b7b94c76b383cf10c1d9d28debd
This way it's a lot harder for us to screw up (since we should always
be including <sys/cdefs.h> anyway).
Bug: 14659579
Change-Id: I23070fff3296b0d1c683bb5e3a6e214146327d53
Without that fix the test fails with:
"error: comparison between signed and unsigned integer expressions" on x86,
due to the fact that char is signed on x86.
Change-Id: I44462d67c15c7e9b730ad5da52eb9c05e207d34b
Signed-off-by: Alexander Ivchenko <alexander.ivchenko@intel.com>
mbrtoc32 and c32rtomb get their implementations from mbrtowc and wcrtomb. The
wc functions now simply call the c32 functions.
Bug: 14646575
Change-Id: I49d4b95fed0f9d790260c996c4d0f8bfd1686324