Bionic comes with a set of 'clean' Linux kernel headers that can safely be
included by userland applications and libraries without fear of hideous
conflicts. for more information on why this is needed, see the "RATIONALE"
section at the end of this document.

these clean headers are automatically generated by several scripts located
in the 'bionic/libc/kernel/tools' directory, which process a set of original
and unmodified kernel headers in order to get rid of many annoying
declarations and constructs that usually result in compilation failure.

the 'clean headers' only contain type and macro definitions, with the
exception of a couple of static inline functions used for performance
reasons (e.g. optimized CPU-specific byte-swapping routines).

they can be included from C++, or when compiling code in strict ANSI mode.
they can also be included before or after any Bionic C library header.

the generation process works as follows:

  * 'external/kernel-headers/original/'
    contains a set of kernel headers as normally found in the 'include'
    directory of a normal Linux kernel source tree. note that this should
    only contain the files that are really needed by Android (use
    'find_headers.py' to find these automatically).

  * 'bionic/libc/kernel/common'
    contains the non-arch-specific clean headers and directories
    (e.g. linux, asm-generic and mtd)

  * 'bionic/libc/kernel/arch-arm/'
    contains the ARM-specific directory tree of clean headers.

  * 'bionic/libc/kernel/arch-arm/asm'
    contains the real ARM-specific headers

  * 'bionic/libc/kernel/arch-x86'
    'bionic/libc/kernel/arch-x86/asm'
    similarly contain all headers and symlinks to be used on x86

  * 'bionic/libc/kernel/tools' contains various Python and shell scripts used
    to manage and re-generate the headers

the tools you can use are:

  * tools/find_users.py
    scans a list of source files or directories and prints which ones
    include Linux headers.

  * tools/find_headers.py
    scans a list of source files or directories and recursively finds all
    the original kernel headers they need.

  * tools/clean_header.py
    prints the clean version of a given kernel header. with the -u option,
    this will also update the corresponding clean header file if its
    content has changed. you can also process more than one file with -u

  * tools/update_all.py
    automatically updates all clean headers from the content of
    'external/kernel-headers/original'. this is the script you're most likely
    to run whenever you update the original headers.


HOW TO BUILD BIONIC AND OTHER PROGRAMS WITH THE CLEAN HEADERS:
==============================================================

add bionic/libc/kernel/common and bionic/libc/kernel/arch-<yourarch> to your
C include path. that should be enough. Note that Bionic will not compile
properly if you don't.


HOW TO SUPPORT ANOTHER ARCHITECTURE:
====================================

see the content of tools/defaults.py; you will need to make a few updates
there:

  - add a new item to the 'kernel_archs' list of supported architectures

  - add a proper definition for 'kernel_known_<arch>_statics' with
    relevant definitions.

  - update 'kernel_known_statics' to map "<arch>" to
    'kernel_known_<arch>_statics'

then, add the new architecture-specific headers to
'external/kernel-headers/original/asm-<arch>'.
(please ensure that these are really needed, e.g. with tools/find_headers.py)

finally, run tools/update_all.py
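
the three edits to tools/defaults.py listed above can be sketched as follows.
this is only an illustration: the variable names come from this document, but
the architecture name "myarch", the function names and the contents of the
existing sets are placeholders, not the real contents of the file.

```python
# hypothetical excerpt in the style of tools/defaults.py;
# every value below is a placeholder chosen for illustration

# 1. list of supported architectures, with the new one appended
kernel_archs = ["arm", "x86", "myarch"]

# 2. static inline functions to keep for the new architecture
kernel_known_myarch_statics = set([
    "example_swab16",
    "example_swab32",
])

# existing per-architecture sets (real contents elided here)
kernel_known_arm_statics = set()
kernel_known_x86_statics = set()

# 3. map each architecture name to its set of known statics
kernel_known_statics = {
    "arm": kernel_known_arm_statics,
    "x86": kernel_known_x86_statics,
    "myarch": kernel_known_myarch_statics,
}
```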



HOW TO UPDATE THE HEADERS WHEN NEEDED:
======================================

IMPORTANT IMPORTANT:

  WHEN UPDATING THE HEADERS, ALWAYS CHECK THAT THE NEW CLEAN HEADERS DO
  NOT BREAK THE KERNEL <-> USER ABI, FOR EXAMPLE BY CHANGING THE SIZE
  OF A GIVEN TYPE. THIS TASK CANNOT BE EASILY AUTOMATED AT THE MOMENT

copy any updated kernel header into the corresponding location under
'external/kernel-headers/original'.

for any new kernel header you want to add, first run tools/find_headers.py to
be sure that it is really needed by the Android sources. then add it to
'external/kernel-headers/original'.

then, run tools/update_all.py to re-run the auto-cleaning
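
since the ABI check mentioned above is manual, a small helper can make the
comparison less error-prone. the sketch below is not part of the tools
directory: it assumes you mirror the old and new definition of a structure by
hand as ctypes classes (OldRecord and NewRecord are hypothetical), and it
reports any change in total size or in field offsets and sizes.

```python
import ctypes

def layout(struct_cls):
    """Return (total size, {field name: (offset, size)}) for a ctypes Structure."""
    fields = {}
    for field in struct_cls._fields_:
        name = field[0]
        descriptor = getattr(struct_cls, name)
        fields[name] = (descriptor.offset, descriptor.size)
    return ctypes.sizeof(struct_cls), fields

def abi_differences(old, new):
    """List human-readable layout differences between two structures."""
    old_size, old_fields = layout(old)
    new_size, new_fields = layout(new)
    problems = []
    if old_size != new_size:
        problems.append("size changed: %d -> %d" % (old_size, new_size))
    for name in sorted(set(old_fields) | set(new_fields)):
        before, after = old_fields.get(name), new_fields.get(name)
        if before != after:
            problems.append("field %r changed: %s -> %s" % (name, before, after))
    return problems

class OldRecord(ctypes.Structure):  # hypothetical layout before the update
    _fields_ = [("id", ctypes.c_uint16), ("flags", ctypes.c_uint16)]

class NewRecord(ctypes.Structure):  # hypothetical layout after the update
    _fields_ = [("id", ctypes.c_uint32), ("flags", ctypes.c_uint16)]

for problem in abi_differences(OldRecord, NewRecord):
    print(problem)
```

an empty list means the two hand-written layouts agree on the host running
the script. it says nothing about other architectures, so the check should be
repeated with each target's type sizes in mind.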



HOW THE CLEANUP PROCESS WORKS:
==============================

this section describes the actions performed by the cleanup program(s) when
they process the original kernel headers into clean ones:

1. Optimize well-known macros (e.g. __KERNEL__, __KERNEL_STRICT_NAMES)

    this pass gets rid of everything that is guarded by a well-known macro
    definition. this means that a block like

       #ifdef __KERNEL__
       ....
       #endif

    will be totally omitted from the output. the optimizer is smart enough to
    handle complex C-preprocessor conditional expressions appropriately.
    this means that, for example:

       #if defined(__KERNEL__) || defined(FOO)
       ...
       #endif

    will be transformed into:

       #ifdef FOO
       ...
       #endif

    see tools/defaults.py for the list of well-known macros used in this pass,
    in case you need to update it in the future.

    note that this also removes any reference to a kernel-specific
    configuration macro like CONFIG_FOO from the clean headers.
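
the simplification described in this pass can be sketched in a few lines of
Python. this is not the code of tools/clean_header.py: it is a minimal
illustration that only understands defined(), !, && and ||, and it assumes
that __KERNEL__ is the only well-known macro (the real list lives in
tools/defaults.py). all function names below are made up for the example.

```python
import re

# assumption: macros that are never defined when compiling userland code
KNOWN_UNDEFINED = {"__KERNEL__"}

_TOKEN = re.compile(r"\s*(?:defined\s*\(\s*(\w+)\s*\)|(&&|\|\||!|\(|\)))")

def tokenize(expr):
    """Split an expression into ('defined', NAME) and (operator, None) tokens."""
    tokens, pos, expr = [], 0, expr.strip()
    while pos < len(expr):
        m = _TOKEN.match(expr, pos)
        if m is None:
            raise ValueError("unsupported expression: " + expr[pos:])
        tokens.append(("defined", m.group(1)) if m.group(1) else (m.group(2), None))
        pos = m.end()
    return tokens

# each parser returns True, False, or the text of what could not be decided

def _parse_or(tokens):
    parts = [_parse_and(tokens)]
    while tokens and tokens[0][0] == "||":
        tokens.pop(0)
        parts.append(_parse_and(tokens))
    if any(p is True for p in parts):
        return True
    parts = [p for p in parts if p is not False]
    if not parts:
        return False
    return parts[0] if len(parts) == 1 else " || ".join(parts)

def _parse_and(tokens):
    parts = [_parse_unary(tokens)]
    while tokens and tokens[0][0] == "&&":
        tokens.pop(0)
        parts.append(_parse_unary(tokens))
    if any(p is False for p in parts):
        return False
    parts = [p for p in parts if p is not True]
    if not parts:
        return True
    if len(parts) == 1:
        return parts[0]
    return " && ".join("(%s)" % p if "||" in p else p for p in parts)

def _parse_unary(tokens):
    if not tokens:
        raise ValueError("unexpected end of expression")
    kind, name = tokens.pop(0)
    if kind == "!":
        value = _parse_unary(tokens)
        if isinstance(value, bool):
            return not value
        return "!" + ("(%s)" % value if " " in value else value)
    if kind == "(":
        value = _parse_or(tokens)
        if not tokens or tokens.pop(0)[0] != ")":
            raise ValueError("missing ')'")
        return value
    if kind == "defined":
        return False if name in KNOWN_UNDEFINED else "defined(%s)" % name
    raise ValueError("unexpected token: " + kind)

def simplify(expr):
    """Evaluate what is known and return True, False or a residual expression."""
    tokens = tokenize(expr)
    result = _parse_or(tokens)
    if tokens:
        raise ValueError("trailing tokens in expression")
    return result

def render_if(expr):
    """Turn a simplified expression back into a preprocessor directive."""
    value = simplify(expr)
    if isinstance(value, bool):
        return "#if 1" if value else "#if 0"
    m = re.fullmatch(r"(!?)defined\((\w+)\)", value)
    if m:
        return ("#ifndef " if m.group(1) else "#ifdef ") + m.group(2)
    return "#if " + value

print(render_if("defined(__KERNEL__) || defined(FOO)"))  # prints "#ifdef FOO"
```

a block whose condition simplifies to False would then be dropped entirely,
which is what happens to the plain #ifdef __KERNEL__ case above.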


2. remove variable and function declarations:

  this pass scans non-directive text and only keeps things that look like a
  typedef/struct/union/enum declaration. this allows us to get rid of any variable
  or function declaration that should only be used within the kernel anyway
  (and which normally *should* be guarded in a #ifdef __KERNEL__ ... #endif
  block, if the kernel writers were not so messy)

  there are however a few exceptions: it is sometimes useful to keep the
  definition of some static inline functions performing very simple
  operations. a good
  example is the optimized 32-bit byte-swap function found in
  arch-arm/asm/byteorder.h

  the list of exceptions is in tools/defaults.py in case you need to update it
  in the future.

  note that we do *not* remove macro definitions, including those macros that
  perform a call to one of these kernel-header functions, or even define other
  functions. we consider it safe since userland applications have no business
  using them anyway.
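
a rough sketch of this filtering is shown below. it is a deliberately crude
heuristic written for illustration, not the logic of the real scripts: it
assumes the input is already free of comments, it does not handle
__attribute__ annotations, and the KNOWN_STATICS whitelist contains a made-up
name (the real exceptions are listed in tools/defaults.py).

```python
import re

# hypothetical whitelist of static inline functions to preserve
KNOWN_STATICS = {"example_swab32"}

_TYPE_DECL = re.compile(r"(?:struct|union|enum)\b\s*\w*\s*(?:\{.*\}\s*)?;$", re.S)
_STATIC_INLINE = re.compile(
    r"static\s+(?:__inline__|inline)\s+[^()]*?\b(\w+)\s*\(", re.S)

def split_statements(text):
    """Split comment-free C text into top-level statements."""
    statements, current, depth = [], "", 0
    for ch in text:
        current += ch
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        # a ';' at depth 0 ends a declaration; a '}' at depth 0 ends a
        # function body when a '(' appears before the first '{'
        if depth == 0 and (
                ch == ";" or (ch == "}" and "(" in current.split("{", 1)[0])):
            statements.append(current.strip())
            current = ""
    return statements

def keep(statement):
    """Keep type declarations and whitelisted static inline functions."""
    if re.match(r"typedef\b", statement) or _TYPE_DECL.match(statement):
        return True
    m = _STATIC_INLINE.match(statement)
    return bool(m) and m.group(1) in KNOWN_STATICS

def filter_declarations(text):
    """Pass directives through untouched and filter everything else."""
    out, chunk = [], []

    def flush():
        out.extend(s for s in split_statements("\n".join(chunk)) if keep(s))
        chunk.clear()

    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            flush()
            out.append(line)
        else:
            chunk.append(line)
    flush()
    return "\n".join(out)

SAMPLE = """#define FOO 1
struct foo { int a; };
extern int kernel_counter;
int do_thing(struct foo *f);
typedef unsigned int u32_alias;
static inline int helper(int x) { return x + 1; }
static inline unsigned example_swab32(unsigned x) { return x; }
enum { A, B };
"""

print(filter_declarations(SAMPLE))
```

on the sample input, the variable, the function prototype and the
non-whitelisted inline function are dropped, while the #define line is left
alone, as described above for macro definitions.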


3. whitespace cleanup:

  the final pass removes any comments and empty lines from the final headers.
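
a minimal sketch of such a pass, assuming that comment markers never appear
inside string literals (a regular expression cannot tell the difference). the
function name is made up for the example and is not taken from the tools.

```python
import re

# matches /* ... */ comments (possibly spanning lines) and // comments
_COMMENT = re.compile(r"/\*.*?\*/|//[^\n]*", re.S)

def strip_comments_and_blank_lines(text):
    """Remove C comments, trailing whitespace and empty lines."""
    without_comments = _COMMENT.sub(" ", text)
    lines = (line.rstrip() for line in without_comments.splitlines())
    return "\n".join(line for line in lines if line.strip()) + "\n"

SAMPLE = (
    "/* header\n"
    " * comment */\n"
    "\n"
    "#define A 1  /* one */\n"
    "\n"
    "// line comment\n"
    "struct s { int x; };\n"
)

print(strip_comments_and_blank_lines(SAMPLE), end="")
```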


4. add a standard disclaimer:

  prepended to each generated header, contains a message like
  "do not edit directly - file was auto-generated by ...."


RATIONALE:
==========

OVERVIEW OF THE CURRENT KERNEL HEADER MESS:
-------------------------------------------

The original kernel headers are not easily usable from userland applications.
they contain many declarations and constructs that will result in a compilation
failure or, even worse, incorrect behaviour. for example:

- some headers try to define POSIX types (e.g. size_t, ssize_t) that can
  conflict with the corresponding definitions provided by your C library.

- some headers use constructs that cannot be compiled in ANSI C mode.

- some headers use constructs that do not compile with C++ at all.

- some headers contain invalid "legacy" definitions that exist for the benefit
  of old C libraries (e.g. glibc5) but result in incorrect behaviour if used
  directly.

  e.g. gid_t being defined in <linux/types.h> as a 16-bit type while the
  kernel uses 32-bit ids. this results in problems when getgroups() or
  setgroups() are called, since they operate on gid_t arrays.
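
the gid_t mismatch can be illustrated without calling the kernel at all. the
sketch below is only a simulation under stated assumptions: it lays out three
made-up group ids as 32-bit little-endian values, the way a kernel using
32-bit ids would fill a buffer, then reads the same bytes back as if gid_t
were a 16-bit type.

```python
import struct

# made-up group ids, for illustration only
gids = [1000, 1001, 1002]

# buffer filled with 32-bit ids (little-endian assumed)
kernel_buffer = struct.pack("<3I", *gids)

# code compiled against a 16-bit gid_t reads three 16-bit entries instead
misread = struct.unpack("<3H", kernel_buffer[:6])

print(misread)  # prints (1000, 0, 1001)
```

the second entry is the upper half of the first id, and the third id is never
seen at all.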

unfortunately, these headers are also the only source of some really extensive
constant and type definitions that are required by userland applications.
think of any library/program that needs to access ALSA, or Video4Linux, or
anything related to a specific device or Linux-specific system interface
(e.g. ioctls).

As a consequence, every Linux distribution provides a set of patched kernel
headers to be used by userland applications (which are installed in
/usr/include/linux/, /usr/include/asm/, etc.). these are manually maintained
by distribution packagers, and generated either manually or with various
scripts. these headers are also tailored to GNU LibC and cannot be reused
easily by Bionic.

for a really long period, the kernel authors have stated that they don't want
to fix the problem, even when someone proposed a patch to start cleaning the
official headers. from their point of view this is purely a library author
problem.

fortunately, enlightenment happened, and the kernel now provides a way to
install a set of "user-friendly" headers that are generated from the official
ones by stripping the __KERNEL__ protected declarations.

unfortunately, this is not enough for Bionic because the result still contains
a few broken declarations that are difficult to work around (see below for
a few details).

we plan to be able to support these kernel-generated user-land headers in the
future, but the priority on this issue is very low.


WHAT WE DO:
-----------

so we're doomed to repeat the same effort as everyone else. the big difference
here is that we want to automate as much as possible the generation of the
clean headers to easily support additional architectures in the future,
and keep current with upstream changes in the header definitions with the
least possible hassle.

of course, this is only a race to the bottom. the kernel maintainers still
feel free to randomly break the structure of their headers (e.g. moving the
location of some files) occasionally, so we'll need to keep up with that by
updating our build script/original headers as these cases happen.

what we do is keep a set of "original" kernel headers, and process them
automatically to generate a set of "clean" headers that can be used from
userland and the C library.

note that the "original" headers can be tweaked a little to avoid some subtle
issues. for example:

- when the location of various USB-related headers changes in the kernel
  source tree, we want to keep them at the same location in our generated
  headers (there is no reason to break the userland API for something
  like that).

- sometimes, we prefer to take certain things out of blocks guarded by a
  #ifdef __KERNEL__ .. #endif. for example, on recent kernels <linux/wireless.h>
  only includes <linux/if.h> when in kernel mode. we make it available to
  userland as well since some code out there assumes that this is the case.

- sometimes, the header is simply incorrect (e.g. it uses a type without
  including the header that defines it beforehand)