platform_bionic/libc/upstream-openbsd
Elliott Hughes a4959aa6f8 Reimplement the <ctype.h> is* functions.
Following on from the towlower()/towupper() changes, add benchmarks for
most of <ctype.h>, rewrite the tests to cover the entire defined range
for all of these functions, and then reimplement most of the functions.

The old table-based implementation is mostly a bad idea on modern
hardware, with only ispunct() showing a significant benefit compared to
any other way I could think of writing it, and isalnum() a marginal but
still convincingly genuine benefit.

My new benchmarks make an effort to test an example from each relevant
range of characters to avoid, say, accidentally optimizing the behavior
of `isalnum('0')` at the expense of `isalnum('z')`.

Interestingly, clang is able to generate what I believe to be the
optimal implementations from the most readable code, which is
impressive. It certainly matched or beat all my attempts to be clever!

The BSD table-based implementations made a special case of EOF despite
having a `_ctype_` table that's offset by 1 to include EOF at index 0.
I'm not sure why they didn't take advantage of that, but removing the
explicit check for EOF measurably improves the generated code on arm and
arm64, so even the two functions that still use the table benefit from
this rewrite.

Here are the benchmark results:

arm64 before:
  BM_ctype_isalnum_n                 3.73 ns         3.73 ns    183727137
  BM_ctype_isalnum_y1                3.82 ns         3.81 ns    186383058
  BM_ctype_isalnum_y2                3.73 ns         3.72 ns    187809830
  BM_ctype_isalnum_y3                3.78 ns         3.77 ns    181383055
  BM_ctype_isalpha_n                 3.75 ns         3.75 ns    189453927
  BM_ctype_isalpha_y1                3.76 ns         3.75 ns    184854043
  BM_ctype_isalpha_y2                4.32 ns         3.78 ns    186326931
  BM_ctype_isascii_n                 2.49 ns         2.48 ns    275583822
  BM_ctype_isascii_y                 2.51 ns         2.51 ns    282123915
  BM_ctype_isblank_n                 3.11 ns         3.10 ns    220472044
  BM_ctype_isblank_y1                3.20 ns         3.19 ns    226088868
  BM_ctype_isblank_y2                3.11 ns         3.11 ns    220809122
  BM_ctype_iscntrl_n                 3.79 ns         3.78 ns    188719938
  BM_ctype_iscntrl_y1                3.72 ns         3.71 ns    186209237
  BM_ctype_iscntrl_y2                3.80 ns         3.80 ns    184315749
  BM_ctype_isdigit_n                 3.76 ns         3.74 ns    188334682
  BM_ctype_isdigit_y                 3.78 ns         3.77 ns    186249335
  BM_ctype_isgraph_n                 3.99 ns         3.98 ns    177814143
  BM_ctype_isgraph_y1                3.98 ns         3.95 ns    175140090
  BM_ctype_isgraph_y2                4.01 ns         4.00 ns    178320453
  BM_ctype_isgraph_y3                3.96 ns         3.95 ns    175412814
  BM_ctype_isgraph_y4                4.01 ns         4.00 ns    175711174
  BM_ctype_islower_n                 3.75 ns         3.74 ns    188604818
  BM_ctype_islower_y                 3.79 ns         3.78 ns    154738238
  BM_ctype_isprint_n                 3.96 ns         3.95 ns    177607734
  BM_ctype_isprint_y1                3.94 ns         3.93 ns    174877244
  BM_ctype_isprint_y2                4.02 ns         4.01 ns    178206135
  BM_ctype_isprint_y3                3.94 ns         3.93 ns    175959069
  BM_ctype_isprint_y4                4.03 ns         4.02 ns    176158314
  BM_ctype_isprint_y5                3.95 ns         3.94 ns    178745462
  BM_ctype_ispunct_n                 3.78 ns         3.77 ns    184727184
  BM_ctype_ispunct_y                 3.76 ns         3.75 ns    187947503
  BM_ctype_isspace_n                 3.74 ns         3.74 ns    185300285
  BM_ctype_isspace_y1                3.77 ns         3.76 ns    187202066
  BM_ctype_isspace_y2                3.73 ns         3.73 ns    184105959
  BM_ctype_isupper_n                 3.81 ns         3.80 ns    185038761
  BM_ctype_isupper_y                 3.71 ns         3.71 ns    185885793
  BM_ctype_isxdigit_n                3.79 ns         3.79 ns    184965673
  BM_ctype_isxdigit_y1               3.76 ns         3.75 ns    188251672
  BM_ctype_isxdigit_y2               3.79 ns         3.78 ns    184187481
  BM_ctype_isxdigit_y3               3.77 ns         3.76 ns    187635540

arm64 after:
  BM_ctype_isalnum_n                 3.37 ns         3.37 ns    205613810
  BM_ctype_isalnum_y1                3.40 ns         3.39 ns    204806361
  BM_ctype_isalnum_y2                3.43 ns         3.43 ns    205066077
  BM_ctype_isalnum_y3                3.50 ns         3.50 ns    200057128
  BM_ctype_isalpha_n                 2.97 ns         2.97 ns    236084076
  BM_ctype_isalpha_y1                2.97 ns         2.97 ns    236083626
  BM_ctype_isalpha_y2                2.97 ns         2.97 ns    236084246
  BM_ctype_isascii_n                 2.55 ns         2.55 ns    272879994
  BM_ctype_isascii_y                 2.46 ns         2.45 ns    286522323
  BM_ctype_isblank_n                 3.18 ns         3.18 ns    220431175
  BM_ctype_isblank_y1                3.18 ns         3.18 ns    220345602
  BM_ctype_isblank_y2                3.18 ns         3.18 ns    220308509
  BM_ctype_iscntrl_n                 3.10 ns         3.10 ns    220344270
  BM_ctype_iscntrl_y1                3.10 ns         3.07 ns    228973615
  BM_ctype_iscntrl_y2                3.07 ns         3.07 ns    229192626
  BM_ctype_isdigit_n                 3.07 ns         3.07 ns    228925676
  BM_ctype_isdigit_y                 3.07 ns         3.07 ns    229182934
  BM_ctype_isgraph_n                 2.66 ns         2.66 ns    264268737
  BM_ctype_isgraph_y1                2.66 ns         2.66 ns    264445277
  BM_ctype_isgraph_y2                2.66 ns         2.66 ns    264327427
  BM_ctype_isgraph_y3                2.66 ns         2.66 ns    264427480
  BM_ctype_isgraph_y4                2.66 ns         2.66 ns    264155250
  BM_ctype_islower_n                 2.66 ns         2.66 ns    264421600
  BM_ctype_islower_y                 2.66 ns         2.66 ns    264341148
  BM_ctype_isprint_n                 2.66 ns         2.66 ns    264415198
  BM_ctype_isprint_y1                2.66 ns         2.66 ns    264268793
  BM_ctype_isprint_y2                2.66 ns         2.66 ns    264419205
  BM_ctype_isprint_y3                2.66 ns         2.66 ns    264205886
  BM_ctype_isprint_y4                2.66 ns         2.66 ns    264440797
  BM_ctype_isprint_y5                2.72 ns         2.72 ns    264333293
  BM_ctype_ispunct_n                 3.52 ns         3.51 ns    198956572
  BM_ctype_ispunct_y                 3.38 ns         3.38 ns    201661792
  BM_ctype_isspace_n                 3.39 ns         3.39 ns    206896620
  BM_ctype_isspace_y1                3.39 ns         3.39 ns    206569020
  BM_ctype_isspace_y2                3.39 ns         3.39 ns    206564415
  BM_ctype_isupper_n                 2.76 ns         2.75 ns    254227134
  BM_ctype_isupper_y                 2.76 ns         2.75 ns    254235314
  BM_ctype_isxdigit_n                3.60 ns         3.60 ns    194418653
  BM_ctype_isxdigit_y1               2.97 ns         2.97 ns    236082424
  BM_ctype_isxdigit_y2               3.48 ns         3.48 ns    200390011
  BM_ctype_isxdigit_y3               3.48 ns         3.48 ns    202255815

arm32 before:
  BM_ctype_isalnum_n                 4.77 ns         4.76 ns    129230464
  BM_ctype_isalnum_y1                4.88 ns         4.87 ns    147939321
  BM_ctype_isalnum_y2                4.74 ns         4.73 ns    145508054
  BM_ctype_isalnum_y3                4.81 ns         4.80 ns    144968914
  BM_ctype_isalpha_n                 4.80 ns         4.79 ns    148262579
  BM_ctype_isalpha_y1                4.74 ns         4.73 ns    145061326
  BM_ctype_isalpha_y2                4.83 ns         4.82 ns    147642546
  BM_ctype_isascii_n                 3.74 ns         3.72 ns    186711139
  BM_ctype_isascii_y                 3.79 ns         3.78 ns    183654780
  BM_ctype_isblank_n                 4.20 ns         4.19 ns    169733252
  BM_ctype_isblank_y1                4.19 ns         4.18 ns    165713363
  BM_ctype_isblank_y2                4.22 ns         4.21 ns    168776265
  BM_ctype_iscntrl_n                 4.75 ns         4.74 ns    145417484
  BM_ctype_iscntrl_y1                4.82 ns         4.81 ns    146283250
  BM_ctype_iscntrl_y2                4.79 ns         4.78 ns    148662453
  BM_ctype_isdigit_n                 4.77 ns         4.76 ns    145789210
  BM_ctype_isdigit_y                 4.84 ns         4.84 ns    146909458
  BM_ctype_isgraph_n                 4.72 ns         4.71 ns    145874663
  BM_ctype_isgraph_y1                4.86 ns         4.85 ns    142037606
  BM_ctype_isgraph_y2                4.79 ns         4.78 ns    145109612
  BM_ctype_isgraph_y3                4.75 ns         4.75 ns    144829039
  BM_ctype_isgraph_y4                4.86 ns         4.85 ns    146769899
  BM_ctype_islower_n                 4.76 ns         4.75 ns    147537637
  BM_ctype_islower_y                 4.79 ns         4.78 ns    145648017
  BM_ctype_isprint_n                 4.82 ns         4.81 ns    147154780
  BM_ctype_isprint_y1                4.76 ns         4.76 ns    145117604
  BM_ctype_isprint_y2                4.87 ns         4.86 ns    145801406
  BM_ctype_isprint_y3                4.79 ns         4.78 ns    148043446
  BM_ctype_isprint_y4                4.77 ns         4.76 ns    145157619
  BM_ctype_isprint_y5                4.91 ns         4.90 ns    147810800
  BM_ctype_ispunct_n                 4.74 ns         4.73 ns    145588611
  BM_ctype_ispunct_y                 4.82 ns         4.81 ns    144065436
  BM_ctype_isspace_n                 4.78 ns         4.77 ns    147153712
  BM_ctype_isspace_y1                4.73 ns         4.72 ns    145252863
  BM_ctype_isspace_y2                4.84 ns         4.83 ns    148615797
  BM_ctype_isupper_n                 4.75 ns         4.74 ns    148276631
  BM_ctype_isupper_y                 4.80 ns         4.79 ns    145529893
  BM_ctype_isxdigit_n                4.78 ns         4.77 ns    147271646
  BM_ctype_isxdigit_y1               4.74 ns         4.74 ns    145142209
  BM_ctype_isxdigit_y2               4.83 ns         4.82 ns    146398497
  BM_ctype_isxdigit_y3               4.78 ns         4.77 ns    147617686

arm32 after:
  BM_ctype_isalnum_n                 4.35 ns         4.35 ns    161086146
  BM_ctype_isalnum_y1                4.36 ns         4.35 ns    160961111
  BM_ctype_isalnum_y2                4.36 ns         4.36 ns    160733210
  BM_ctype_isalnum_y3                4.35 ns         4.35 ns    160897524
  BM_ctype_isalpha_n                 3.67 ns         3.67 ns    189377208
  BM_ctype_isalpha_y1                3.68 ns         3.67 ns    189438146
  BM_ctype_isalpha_y2                3.75 ns         3.69 ns    190971186
  BM_ctype_isascii_n                 3.69 ns         3.68 ns    191029191
  BM_ctype_isascii_y                 3.68 ns         3.68 ns    191011817
  BM_ctype_isblank_n                 4.09 ns         4.09 ns    171887541
  BM_ctype_isblank_y1                4.09 ns         4.09 ns    171829345
  BM_ctype_isblank_y2                4.08 ns         4.07 ns    170585590
  BM_ctype_iscntrl_n                 4.08 ns         4.07 ns    170614383
  BM_ctype_iscntrl_y1                4.13 ns         4.11 ns    171495899
  BM_ctype_iscntrl_y2                4.19 ns         4.18 ns    165255578
  BM_ctype_isdigit_n                 4.25 ns         4.24 ns    165237008
  BM_ctype_isdigit_y                 4.24 ns         4.24 ns    165256149
  BM_ctype_isgraph_n                 3.82 ns         3.81 ns    183610114
  BM_ctype_isgraph_y1                3.82 ns         3.81 ns    183614131
  BM_ctype_isgraph_y2                3.82 ns         3.81 ns    183616840
  BM_ctype_isgraph_y3                3.79 ns         3.79 ns    183620182
  BM_ctype_isgraph_y4                3.82 ns         3.81 ns    185740009
  BM_ctype_islower_n                 3.75 ns         3.74 ns    183619502
  BM_ctype_islower_y                 3.68 ns         3.68 ns    190999901
  BM_ctype_isprint_n                 3.69 ns         3.68 ns    190899544
  BM_ctype_isprint_y1                3.68 ns         3.67 ns    190192384
  BM_ctype_isprint_y2                3.67 ns         3.67 ns    189351466
  BM_ctype_isprint_y3                3.67 ns         3.67 ns    189430348
  BM_ctype_isprint_y4                3.68 ns         3.68 ns    189430161
  BM_ctype_isprint_y5                3.69 ns         3.68 ns    190962419
  BM_ctype_ispunct_n                 4.14 ns         4.14 ns    171034861
  BM_ctype_ispunct_y                 4.19 ns         4.19 ns    168308152
  BM_ctype_isspace_n                 4.50 ns         4.50 ns    156250887
  BM_ctype_isspace_y1                4.48 ns         4.48 ns    155124476
  BM_ctype_isspace_y2                4.50 ns         4.50 ns    155077504
  BM_ctype_isupper_n                 3.68 ns         3.68 ns    191020583
  BM_ctype_isupper_y                 3.68 ns         3.68 ns    191015669
  BM_ctype_isxdigit_n                4.50 ns         4.50 ns    156276745
  BM_ctype_isxdigit_y1               3.28 ns         3.27 ns    214729725
  BM_ctype_isxdigit_y2               4.48 ns         4.48 ns    155265129
  BM_ctype_isxdigit_y3               4.48 ns         4.48 ns    155216846

I've also corrected a small mistake in the documentation for isxdigit().

Test: tests and benchmarks
Change-Id: I4a77859f826c3fc8f0e327e847886882f29ec4a3
2019-10-08 12:04:09 -07:00
..
android Optimize tolower(3)/toupper(3) from <ctype.h>. 2019-09-27 14:42:39 -07:00
lib/libc Reimplement the <ctype.h> is* functions. 2019-10-08 12:04:09 -07:00
README.md Move to .md files for even trivial documentation. 2017-01-07 12:47:28 -08:00

This directory contains upstream OpenBSD source. You should not edit these files directly. Make fixes upstream and then pull down the new version of the file.

TODO: write a script to make this process automated.