grant bpfloader explicit membership in some groups
(this is instead of relying on the CAP_CHOWN capability it has) The goal is to allow bpfloader to open maps/programs it creates, so that it can reuse them. By virtue of CAP_CHOWN it can create, pin, then give away ownership, and no longer be able to bpf_obj_get() the pinned map or program (to reuse it at a later time). This could be considered a partial (more targetted) workaround for the lack of bpfloader CAP_DAC_OVERRIDE (or CAP_DAC_READ_SEARCH). But for obvious reasons jeffv@ doesn't really want to grant that. In some sense this doesn't actually really grant any privs on a writeable filesystem, as CHOWN already allows stealing ownership... However explicit membership is much easier to reason about, and does not require playing: - stat (to get current uid/gid/mode) - chown (set uid to root, ie. self -- works due to CAP_CHOWN) - chmod (grant user read if missing) - bpf_obj_get (this now succeeds -- does not require capabilities) - chmod (restore mode) - chown (restore uid -- works due to CAP_CHOWN) games in order to open pinned bpf maps/programs we'd normally be unable to open due to unix uid/gid/mode restrictions. Yes, I've verified the above 'magic' actually works with current privs, provided we grant the missing 'getattr' selinux priv to allow the stat() call. (obviously without it we can still gain access, we just can't undo things) Currently /sys/fs/bpf maps and program ownership on a tip-of-tree T device looks like: $ adb shell getprop ro.build.fingerprint google/oriole/oriole:13/TP1A.220624.007/8785063:userdebug/dev-keys $ adb shell ls -l /sys/fs/bpf/* | egrep '^-' | cut -d' ' -f3-4 | sort | uniq -c count uid gid examples 5 root graphics platform: gpu_mem.o & gpu_work.o 5 root net_admin tethering apex T+: netd.o skfilter_..._xtbpf & schedact_ingress_account programs 10 root net_bw_acct tethering apex T+: netd.o maps 24 root network_stack tethering apex S+: offload.o & test.o 1 root root tethering apex T+: netd.o cgroupsock_inet_create program 38 root system platform & tethering apex T+: time_in_state.o, block.o, clatd.o, dscp_policy.o, netd.o cgroupskb_(e|in)gress_stats And additionally due to the utter lack of a 'groups' line in bpfloader.rc, the default bpfloader gid is of course 'root'. This suggests we should use: groups root graphics network_stack net_admin net_bw_acct system (but only really mainline updatable stuff matters, so we could limit this to just networking and strip out 'graphics'...) A glance through: system/core/libcutils/include/private/android_filesystem_config.h Finds the following groups which might be of interest to bpfloader & mainline networking: * root * system * graphics dhcp vpn mdnsr clat dns dns_tether * network_stack inet net_raw * net_admin net_bw_stats * net_bw_acct [stars mark the one's we've already identified previously] Networking mainline code runs in 3 processes: netd, system_server and network_stack. Based on looking at a live oriole device, these processes have the following uid/gid/groups/capabilities: netd - uid:0[root] gid:0[root] + 3005[net_admin] Cap: 00000000000074ef=cap_chown,cap_dac_override,cap_dac_read_search,cap_fowner,cap_kill,cap_setgid,cap_setuid,cap_net_bind_service,cap_net_admin,cap_net_raw,cap_ipc_lock networkstack.process - uid:1073[network_stack] gid:1073[network_stack] + 1073[network_stack] 3002[net_bt] 3003[inet] 3004[net_raw] 3005[net_admin] 3006[net_bw_stats] 3007[net_bw_acct] 9997[everybody] Cap: 0000000000003c00=cap_net_bind_service,cap_net_broadcast,cap_net_admin,cap_net_raw system_server - uid:1000[system] gid:1000[system] + 1001[radio] 1002[bluetooth] 1003[graphics] 1004[input] 1005[audio] 1006[camera] 1007[log] 1008[compass] 1009[mount] 1010[wifi] 1018[usb] 1021[gps] 1023[media_rw] 1024[mtp] 1032[package_info] 1065[reserved_disk] 3001[net_bt_admin] 3002[net_bt] 3003[inet] 3005[net_admin] 3006[net_bt_stats] 3007[net_bw_acct] 3009[readproc] 3010[wakeloc] 3011[uhid] 3012[readtracefs] Cap: 0000001806897c20=cap_kill,cap_net_bind_service,cap_net_broadcast,cap_net_admin,cap_net_raw,cap_ipc_lock,cap_sys_module,cap_sys_ptrace,cap_sys_nice,cap_sys_time,cap_sys_tty_config,cap_wake_alarm,cap_block_suspend Since netd has DAC_OVERRIDE, it really doesn't matter from a group analysis perspective (side note: it probably should have a lot more groups than it actually does...) Either way, both *root & *net_admin are already in the above list. For the network stack process the obvious groups are: *network_stack, net_raw, *net_admin, net_bw_stats, *net_bw_acct which means we should add: net_raw, net_bw_stats to the above list. (I'm assuming 'inet' & 'everybody' are too generic groups to be of use for bpf, and that we don't [yet] care about bluetooth (net_bt) being able to use bpf directly) For the system server the choice is harder, but I'd tend to pick: *system, *graphics, *net_admin, *net_bw_acct (Again ignoring non-networking stuff, and assuming radio/bluetooth/wifi bpf use will come at some later point in time.) This gives us decent coverage of the 3 processes (and combinations there-of): netd process -> group root network stack process -> group network_stack system server process -> group system both network stack and system server -> group net_bw_acct Note that due to DAC_OVERRIDE netd always has unix access no matter what, and needs to be limited via selinux contexts instead. Additionally 'net_admin' is used for xt_bpf iptables programs due to need for netutils_wrappers support and it is also usable by all 3 processes. This means we can fully explain all groups that currently show up as in use. Adding net_raw & net_bw_stats is possibly not needed, but also won't hurt, and might be useful in the future. We could also argue that we should add: dhcp, vpn, mdnsr, clat, dns, dns_tether & inet But since none of our mainline code running processes are currently members of those groups (besides netd due to DAC_OVERRIDE), there doesn't seem to be much benefit (this can't be changed with mainline pushes). I assume new stuff which would need these groups will actually only be loaded on U+ bpfloader, which will have a less hacky solution for this problem anyway. Note: on U+ bpfloader we should probably fix this by simply caching all bpf map/prog filedescriptors in a path->fd hashmap, and thus avoid the need to ever reopen anything. This is a far more invasive change, but once done we should be able to revert this change. For safety we'll also want to make sure we abort() if we detect cases that cannot be safely handled by S bpfloader, an example would be maps with uid != root in tethering location. Bug: 218408035 Bug: 237716689 Test: TreeHugger, manual testing Signed-off-by: Maciej Żenczykowski <maze@google.com> Change-Id: I742868b1a6819547fcd7a3573946a2fc479a21a5
This commit is contained in:
parent
ae58e7e4a7
commit
bbf5ee3971
2 changed files with 8 additions and 2 deletions
|
@ -27,6 +27,12 @@ on load_bpf_programs
|
|||
|
||||
service bpfloader /system/bin/bpfloader
|
||||
capabilities CHOWN SYS_ADMIN NET_ADMIN
|
||||
# The following group memberships are a workaround for lack of DAC_OVERRIDE
|
||||
# and allow us to open (among other things) files that we created and are
|
||||
# no longer root owned (due to CHOWN) but still have group read access to
|
||||
# one of the following groups. This is not perfect, but a more correct
|
||||
# solution requires significantly more effort to implement.
|
||||
group root graphics network_stack net_admin net_bw_acct net_bw_stats net_raw system
|
||||
#
|
||||
# Set RLIMIT_MEMLOCK to 1GiB for bpfloader
|
||||
#
|
||||
|
|
|
@ -30,9 +30,9 @@
|
|||
#include <sys/wait.h>
|
||||
#include <unistd.h>
|
||||
|
||||
// This is BpfLoader v0.22
|
||||
// This is BpfLoader v0.23
|
||||
#define BPFLOADER_VERSION_MAJOR 0u
|
||||
#define BPFLOADER_VERSION_MINOR 22u
|
||||
#define BPFLOADER_VERSION_MINOR 23u
|
||||
#define BPFLOADER_VERSION ((BPFLOADER_VERSION_MAJOR << 16) | BPFLOADER_VERSION_MINOR)
|
||||
|
||||
#include "bpf/BpfUtils.h"
|
||||
|
|
Loading…
Reference in a new issue