Until fuse-bpf is upstreamed, the value BPF_PROG_TYPE_FUSE is
dynamically defined, so use BPF_PROG_TYPE_UNSPEC as a placeholder and read
the actual value from sys/fs/fuse
Bug: 202785178
Test: fuse bpf can be enabled successfully
Change-Id: I67d3ff45768b581a6b239e235edaa6e46a2f6fe0
This also effectively forces the 'struct bpf_map_def'
definition to include the zero field, and thus to have
the uid field at the new offset.
Bug: 239807354
Test: TreeHugger
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Change-Id: I44dd740bda4cce4d8fcb04895ee7eb3d4c3300e4
1. The TEST_MAPPING file should be inside bpf folder to be
triggered on any system/bpf change.
2. It should include Connectivity module to run connectivity CTS
tests on any change.
Test: TH
Fix: 235927729
Change-Id: I00149bf8a278853efdd2fcda4bcfb56398a73687
Underscore character may cause bpf prog/map naming collision. For
example, x.o with map y_z and x_y.o with map z both result in x_y_z
prog/map name, which should be prevented during compile-time.
aosp/2147825 will prohibit underscore character in bpf source name
(source name derives the obj name). Existing bpf modules with underscore
characters in source name need to be updated accordingly.
Bug: 236706995
Test: adb root; adb shell ls -l /sys/fs/bpf/ | grep timeInState
Change-Id: Ia4eefd8b7debed2c81e194052488e15df72cab69
Underscore character may cause bpf prog/map naming collision. For
example, x.o with map y_z and x_y.o with map z both result in x_y_z
prog/map name, which should be prevented during compile-time.
aosp/2147825 will prohibit underscore character in bpf source name
(source name derives the obj name). Existing bpf modules with underscore
characters in source name need to be updated accordingly.
Bug: 236706995
Test: atest libbpf_load_test
Change-Id: I037ccfedc4d2e48688ee47f575c73998ce1c2c4b
This allows for 2 different versions of obj.o to be
shipped simultaneously (via mainline module),
with different bpfloader version limitations.
For example a obj.o for bpfloader < 0.25 and a
obj@25.o for bpfloader 0.25+. These can provide
for different implementations of maps/programs,
while still being pinned into the same ultimate
destination in /sys/fs/bpf/.../{map,prog}_obj_...
so as to not require special selection of appropriate
program/map path names in higher level code
(at least for common functionality).
When using this functionality one does have to be
careful to not end up with unintentional duplication
(ie. an obj@1.o and obj@2.o that both load on bpfloader
version X), and to make sure that the defined
bpf maps and programs with identical names
are also sufficiently identical in behaviour.
In practice it is likely that all versions of obj@ver.o
will be built from the same source code, with compilation
controlled by appropriate preprocessor conditional macros,
to hide certain parts of obj.c while building the version
for older bpfloader...
However, exactly how to use this is ultimately
left up to the future...
Multiple viable mechanisms exist:
(a) each obj@ver.o is standalone, only one should be loaded,
bpfloader min/max version annotations would be used
to guarantee this, making sure that programs/maps that
exist in multiple versions of obj.o should have matching
types and behaviours, but nothing guarantees this
(although key/value size checks will certainly help)
(b) obj.o is baseline and always loaded,
while obj@25.o is an extension with extra maps/programs
and is loaded on only newer bpfloaders,
it may have duplicate defines of shared maps
(likely via #include of some shared header file)
if some of the extra programs also need some of
the data from maps from the 'older' obj.o
(c) variously complex combinations of (a) and (b) are also possible
Bug: 218408035
Test: TreeHugger, manually with offload@1.o in p/m/C
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Change-Id: Ib60d07b18fd6617d456c2c469f8e8ed166aadffd
(this more closely matches how this is stored in the 32 char field,
and is thus simply less confusing)
Before:
I LibBpfLoader: map configuration_map selinux_context [ fs_bpf_netd_readonly] -> 5 -> 'fs_bpf_netd_readonly' (netd_readonly/)
After:
I LibBpfLoader: map configuration_map selinux_context [fs_bpf_netd_readonly ] -> 5 -> 'fs_bpf_netd_readonly' (netd_readonly/)
Test: TreeHugger
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Change-Id: I734f21b7bf562a411fcb2b21fe5bb20fb7004ec9
we currently have a way to entirely ignore a bpf .o file on older
bpfloaders, however sometimes (especially during development) we
don't want the file to be ignored (because it is required for correct
system operation) but we already know at build time that it is incompatible
due to some missing bpfloader functionality.
While in such a situation stuff will likely fail anyway (either
bpfloader will fail due to missing privs or maps/programs won't
be created and netd updatable library will fail, or something else...),
this happens *much* later and is actually much harder to debug.
This way the failure is early, and the error message is clear.
This will make it easier for us to prevent such broken setups
from ever making it out to devices, as - for a critical program
- the resulting bpfloader boot time failure will trigger a mainline
module rollback.
While developing further isolation changes to tethering apex bpf
programs I've run into cases where I've wanted to make changes
that requires bpfloader >= X to work correctly. This will help
make such changes safely in the future.
Bug: 218408035
Test: TreeHugger, manual testing
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Change-Id: Ic11b48312dfe5960b787e1786b690773a41a4d18
(this is instead of relying on the CAP_CHOWN capability it has)
The goal is to allow bpfloader to open maps/programs it creates,
so that it can reuse them. By virtue of CAP_CHOWN it can create,
pin, then give away ownership, and no longer be able to bpf_obj_get()
the pinned map or program (to reuse it at a later time).
This could be considered a partial (more targetted) workaround
for the lack of bpfloader CAP_DAC_OVERRIDE (or CAP_DAC_READ_SEARCH).
But for obvious reasons jeffv@ doesn't really want to grant that.
In some sense this doesn't actually really grant any privs on a writeable
filesystem, as CHOWN already allows stealing ownership...
However explicit membership is much easier to reason about,
and does not require playing:
- stat (to get current uid/gid/mode)
- chown (set uid to root, ie. self -- works due to CAP_CHOWN)
- chmod (grant user read if missing)
- bpf_obj_get (this now succeeds -- does not require capabilities)
- chmod (restore mode)
- chown (restore uid -- works due to CAP_CHOWN)
games in order to open pinned bpf maps/programs we'd normally be unable
to open due to unix uid/gid/mode restrictions.
Yes, I've verified the above 'magic' actually works with current privs,
provided we grant the missing 'getattr' selinux priv to allow the stat() call.
(obviously without it we can still gain access, we just can't undo things)
Currently /sys/fs/bpf maps and program ownership on a tip-of-tree T device looks like:
$ adb shell getprop ro.build.fingerprint
google/oriole/oriole:13/TP1A.220624.007/8785063:userdebug/dev-keys
$ adb shell ls -l /sys/fs/bpf/* | egrep '^-' | cut -d' ' -f3-4 | sort | uniq -c
count uid gid examples
5 root graphics platform: gpu_mem.o & gpu_work.o
5 root net_admin tethering apex T+: netd.o skfilter_..._xtbpf & schedact_ingress_account programs
10 root net_bw_acct tethering apex T+: netd.o maps
24 root network_stack tethering apex S+: offload.o & test.o
1 root root tethering apex T+: netd.o cgroupsock_inet_create program
38 root system platform & tethering apex T+: time_in_state.o, block.o, clatd.o, dscp_policy.o, netd.o cgroupskb_(e|in)gress_stats
And additionally due to the utter lack of a 'groups' line in bpfloader.rc,
the default bpfloader gid is of course 'root'.
This suggests we should use:
groups root graphics network_stack net_admin net_bw_acct system
(but only really mainline updatable stuff matters, so we could limit
this to just networking and strip out 'graphics'...)
A glance through:
system/core/libcutils/include/private/android_filesystem_config.h
Finds the following groups which might be of interest to bpfloader & mainline networking:
* root
* system
* graphics
dhcp
vpn
mdnsr
clat
dns
dns_tether
* network_stack
inet
net_raw
* net_admin
net_bw_stats
* net_bw_acct
[stars mark the one's we've already identified previously]
Networking mainline code runs in 3 processes: netd, system_server and network_stack.
Based on looking at a live oriole device, these processes have the following
uid/gid/groups/capabilities:
netd - uid:0[root] gid:0[root] + 3005[net_admin]
Cap: 00000000000074ef=cap_chown,cap_dac_override,cap_dac_read_search,cap_fowner,cap_kill,cap_setgid,cap_setuid,cap_net_bind_service,cap_net_admin,cap_net_raw,cap_ipc_lock
networkstack.process - uid:1073[network_stack] gid:1073[network_stack] + 1073[network_stack]
3002[net_bt] 3003[inet] 3004[net_raw] 3005[net_admin] 3006[net_bw_stats] 3007[net_bw_acct] 9997[everybody]
Cap: 0000000000003c00=cap_net_bind_service,cap_net_broadcast,cap_net_admin,cap_net_raw
system_server - uid:1000[system] gid:1000[system] + 1001[radio] 1002[bluetooth] 1003[graphics]
1004[input] 1005[audio] 1006[camera] 1007[log] 1008[compass] 1009[mount] 1010[wifi]
1018[usb] 1021[gps] 1023[media_rw] 1024[mtp] 1032[package_info] 1065[reserved_disk]
3001[net_bt_admin] 3002[net_bt] 3003[inet] 3005[net_admin] 3006[net_bt_stats] 3007[net_bw_acct]
3009[readproc] 3010[wakeloc] 3011[uhid] 3012[readtracefs]
Cap: 0000001806897c20=cap_kill,cap_net_bind_service,cap_net_broadcast,cap_net_admin,cap_net_raw,cap_ipc_lock,cap_sys_module,cap_sys_ptrace,cap_sys_nice,cap_sys_time,cap_sys_tty_config,cap_wake_alarm,cap_block_suspend
Since netd has DAC_OVERRIDE, it really doesn't matter from a group analysis perspective
(side note: it probably should have a lot more groups than it actually does...)
Either way, both *root & *net_admin are already in the above list.
For the network stack process the obvious groups are:
*network_stack, net_raw, *net_admin, net_bw_stats, *net_bw_acct
which means we should add:
net_raw, net_bw_stats
to the above list.
(I'm assuming 'inet' & 'everybody' are too generic groups to be of use for bpf,
and that we don't [yet] care about bluetooth (net_bt) being able to use bpf directly)
For the system server the choice is harder, but I'd tend to pick:
*system, *graphics, *net_admin, *net_bw_acct
(Again ignoring non-networking stuff, and assuming radio/bluetooth/wifi bpf
use will come at some later point in time.)
This gives us decent coverage of the 3 processes (and combinations there-of):
netd process -> group root
network stack process -> group network_stack
system server process -> group system
both network stack and system server -> group net_bw_acct
Note that due to DAC_OVERRIDE netd always has unix access no matter what,
and needs to be limited via selinux contexts instead.
Additionally 'net_admin' is used for xt_bpf iptables programs due to need
for netutils_wrappers support and it is also usable by all 3 processes.
This means we can fully explain all groups that currently show up as in use.
Adding net_raw & net_bw_stats is possibly not needed, but also won't hurt,
and might be useful in the future.
We could also argue that we should add:
dhcp, vpn, mdnsr, clat, dns, dns_tether & inet
But since none of our mainline code running processes are currently
members of those groups (besides netd due to DAC_OVERRIDE), there doesn't
seem to be much benefit (this can't be changed with mainline pushes).
I assume new stuff which would need these groups will actually only be loaded
on U+ bpfloader, which will have a less hacky solution for this problem anyway.
Note: on U+ bpfloader we should probably fix this by simply caching
all bpf map/prog filedescriptors in a path->fd hashmap, and thus
avoid the need to ever reopen anything. This is a far more invasive change,
but once done we should be able to revert this change.
For safety we'll also want to make sure we abort() if we detect cases
that cannot be safely handled by S bpfloader, an example would be
maps with uid != root in tethering location.
Bug: 218408035
Bug: 237716689
Test: TreeHugger, manual testing
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Change-Id: I742868b1a6819547fcd7a3573946a2fc479a21a5
This is to prevent platform and tethering mainline module updatable
code from being to step on each other.
Bug: 218408035
Test: TreeHugger
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Change-Id: I8f4ffafb72efb17d07aaf993892c5d395bd6876d
After we chown, we potentially are no longer the owner (if we chown'ed
to a different uid then AID_ROOT) and thus can no longer chmod.
If we chmod before we chown, this problem is eliminated.
This fixes uid != AID_ROOT for maps.
Bug: 218408035
Bug: 237716689
Test: TreeHugger
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Change-Id: Ifbbe37d8dceff245d4def950712f1847615d5856
bpf_obj_get(path) is entirely equivalent to mapRetrieveRW(path)
See implementation in frameworks/libs/net common/native/bpf_syscall_wrappers/include/BpfSyscallWrappers.h:
inline int bpfFdGet(const char* pathname, uint32_t flag) {
return bpf(BPF_OBJ_GET, {
.pathname = ptr_to_u64(pathname),
.file_flags = flag,
});
}
inline int mapRetrieve(const char* pathname, uint32_t flag) { return bpfFdGet(pathname, flag); }
inline int mapRetrieveRW(const char* pathname) { return mapRetrieve(pathname, 0); }
inline int mapRetrieveRO(const char* pathname) { return mapRetrieve(pathname, BPF_F_RDONLY); }
inline int mapRetrieveWO(const char* pathname) { return mapRetrieve(pathname, BPF_F_WRONLY); }
inline int retrieveProgram(const char* pathname) { return bpfFdGet(pathname, BPF_F_RDONLY); }
However, this requires selinux file:write access which bpfloader
currently lacks, ie. we would need:
system/sepolicy private/bpfloader.te:
-allow bpfloader bpffs_type:file { create read rename setattr };
+allow bpfloader bpffs_type:file { create read rename setattr write };
Switching from mapRetrieveRW() to mapRetrieveRO() eliminates this problem.
BpfLoader itself does not need to write to the maps, while bpf program write
access is controlled by a different bit (it is independent of the r/w access
bits of the file descriptor itself).
Verified by re-opening the maps after pinning them.
Bug: 218408035
Bug: 237716689
Test: TreeHugger
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Change-Id: Ica7ac8ee48d4a73e5f92dbf47cd441c3bfba38cf
BTF support was added to bpfloader during Android T dev cycle.
As it causes bpfloader boot time process to shell out to a new
btfloader subprocess for every bpf.o file with BTF debugging
information compiled in, I'm worried this might have unforeseen
consequences - things like crashes or boot time bpfloader cpu
regressions.
However, BTF is exceedingly useful for debugging,
and it would be a huge pity if we were forced to disable it
and keep it disabled in mainline tethering module just to support
Android T devices for the next 5+ years.
It would also be a pity if the bpf.o files in prebuilt mainline
tethering apex in dev branches (like tm-dev) did not include
BTF debug information simply due to compatibility with older
kernels or OSes, since this would require rebuilding the module
everytime BTF information could be of use.
One of the things functioning BTF enables is 'cat /sys/fs/bpf/map_*'
as root on a userdebug build. Among other things this can be
used to verify that in kernel bpf map state matches that dumped
by the mainline module's pretty dump code.
Even if there's issues wrt. BTF in Android T (or on older kernels),
we can always fix them in Android U or later (for example build
btfloader into bpfloader to avoid exec overhead, etc...).
Bug: 218408035
Bug: 230585250
Bug: 235559605
Test: TreeHugger, cuttlefish devices boots, and:
adb root && adb shell cat /sys/fs/bpf/map_time_in_state_cpu_last_pid_map
continues to show information which is available due to BTF debug provided
in /system/etc/bpf/time_in_state.o file (due to "btf: true" in Android.bp)
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Change-Id: I553e90e0414453f8f3aaca8cf05d5decc8b911a2
(cherry picked from commit 37b3d06333)
Merged-In: I553e90e0414453f8f3aaca8cf05d5decc8b911a2
This reverts commit 8e5e239dd0a7856e486a9e2d772ef3a9e38021eb.
No longer required now that selinux change has landed.
(while we're at it bump the bpfloader version to v0.19 just
to make sure we can tell these apart in the future)
Bug: 218408035
Test: TreeHugger
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Change-Id: I0b3349ef908bbfd225b8b7f83b2a4a8870c1e409
(it requires bpfloader to be granted rename priv by selinux)
Bug: 218408035
Test: TreeHugger
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Change-Id: Ic27d0e5f3c6b78db39c6742fe9e1955f65d2b1f5
This is needed to make it safe to share maps across different .o's.
Bug: 218408035
Test: TreeHugger
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Change-Id: I9e8a5893ed4f91354f6544be587b10a97d179de6
For use by:
- maps netd should have read but not write access to
(needed due to netd being root with DAC_OVERRIDE,
and thus not obeying standard unix permissions)
- programs that netd should have access to but
not netutils_wrappers (which due to being able to
run iptables, needs access to xt_bpf programs)
Bug: 218408035
Test: booted on cuttlefish
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Change-Id: I72b106692a25077ff54252fd93db81f46b52125d
combined with content of /proc/${pid}/fd/${fd}
and /proc/${pid}/fdinfo/${fd} this allows figuring
out which programs/maps a process continues to
have open
See:
adbz logcat -d | egrep '(map|prog) .* id'
adbz root
adbz_bpf_progs_and_maps
Bug: 230880517
Test: TreeHugger
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Change-Id: I66c7fa12d079257486569105368bf1eea2d3ab0c
Original change: https://android-review.googlesource.com/c/platform/system/bpf/+/2071267
Change-Id: I6ffab207294946357f3a97685d3eacc75b724e89
Ignore-AOSP-First: this is an automerge
Signed-off-by: Automerger Merge Worker <android-build-automerger-merge-worker@system.gserviceaccount.com>
(no selinux consequences to this in and of itself,
though required for follow up selinux changes)
Test: TreeHugger, manual
Bug: 218408035
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Change-Id: Id5ef7f3b119743ef26b062068756c4e0f754e694
This also fixes a permissions issue if a non-root user is set. The read
permissions should be set before the file is set as non-root to ensure
that the permissions can be set without error.
Bump the BPF loader version.
Bug: 203462310
Test: Ensure that vendor skfilter bpf programs can load
Change-Id: Ib6b9a64d8652ff464c9d4d734bb8ae351673b6ce