platform_bionic/docs/EINTR.md
Elliott Hughes 38be11e88c Add some documentation about EINTR.
It's a common cause of confusion, and even a brief explanation can be
quite involved, so it's worth having something we can point to (and
something that interested parties might just find via a web search).

Bug: http://b/207248554
Test: treehugger
Change-Id: I4a6d8917baf99a8f7abef05ce852a31ebe048d68
2021-12-13 10:16:57 -08:00

EINTR

The problem

If your code is blocked in a system call when a signal needs to be delivered, the kernel needs to interrupt that system call. For something like a read(2) call where some data has already been read, the call can just return with what data it has. (This is one reason why read(2) sometimes returns less data than you asked for, even though more data is available. It also explains why such short reads are relatively rare, and therefore a cause of bugs: code that never checks for them appears to work most of the time.)

But what if read(2) hasn't read any data yet? Or what if you've made some other system call, for which there is no equivalent "partial" success, such as poll(2)? In poll(2)'s case, there's either something to report (in which case the system call would already have returned), or there isn't.

The kernel's solution to this problem is to return failure (-1) and set errno to EINTR: "interrupted system call".

Can I just opt out?

Technically, yes. In practice on Android, no.

Technically, if a signal's disposition is set to ignore, the kernel doesn't even have to deliver the signal, so your code can just stay blocked in the system call it was already making. In practice, though, unless you're a small single-threaded C program that doesn't use any libraries, you can't realistically guarantee that every signal is either ignored or will kill your process. If any code has installed a signal handler, you need to cope with EINTR.

And if you're an Android app, the zygote has already installed a whole host of signal handlers before your code even starts to run. (And, no, you can't ignore them instead, because some of them are critical to how ART works. For example: Java NullPointerExceptions are optimized by trapping SIGSEGV signals so that the code generated by the JIT doesn't have to insert explicit null pointer checks.)

Why don't I see this in Java code?

You won't see this in Java because the decision was taken to hide this issue from Java programmers. Basically, all the libraries like java.io.* and java.net.* hide this from you. (The same should be true of android.* too, so it's worth filing bugs if you find any exceptions that aren't documented!)

Why doesn't libc do that too?

For most people, things would be easier if libc hid this implementation detail. But there are legitimate use cases, and automatically retrying would hide those. For example, you might want to use signals and EINTR to interrupt another thread (in fact, that's how interruption of threads doing I/O works in Java behind the scenes!). As usual, C/C++ choose the more powerful but more error-prone option.

The fix

Easy cases

In most cases, the fix is simple: wrap the system call with the TEMP_FAILURE_RETRY macro (from <unistd.h>). This is basically a while loop that retries the system call as long as the result is -1 and errno is EINTR.

So, for example:

  n = read(fd, buf, buf_size); // BAD!
  n = TEMP_FAILURE_RETRY(read(fd, buf, buf_size)); // GOOD!
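For readers who want to see what that retry loop looks like written out by hand, here is a small sketch. `read_retry` and `read_fully` are hypothetical helper names, not bionic APIs. The second helper also copes with the short reads mentioned earlier, which retrying on EINTR alone does not address.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

// Hand-written equivalent of TEMP_FAILURE_RETRY(read(fd, buf, count)).
ssize_t read_retry(int fd, void* buf, size_t count) {
  ssize_t n;
  do {
    n = read(fd, buf, count);
  } while (n == -1 && errno == EINTR);
  return n;
}

// Reads until `count` bytes have arrived, EOF is reached, or a real error
// occurs. Returns the number of bytes read, or -1 with errno set.
ssize_t read_fully(int fd, void* buf, size_t count) {
  char* p = buf;
  size_t done = 0;
  while (done < count) {
    ssize_t n = read_retry(fd, p + done, count - done);
    if (n == -1) return -1;  // A failure other than EINTR.
    if (n == 0) break;       // EOF: return what we have.
    done += (size_t)n;
  }
  return (ssize_t)done;
}
```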

close(2)

TL;DR: never wrap close(2) calls with TEMP_FAILURE_RETRY.

The case of close(2) is complicated. POSIX explicitly says that close(2) shouldn't close the file descriptor if it returns EINTR, but that's not true on Linux (and thus on Android). See the LWN article "Returning EINTR from close()" for more discussion.

Given that most Android code (and especially "all apps") is multithreaded, retrying close(2) is especially dangerous because the file descriptor might already have been reused by another thread, so the "retry" succeeds, but actually closes a different file descriptor belonging to a different thread.
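One possible shape for a close(2) wrapper that follows this advice is sketched below. `close_once` is a hypothetical helper, and treating EINTR as success is a design assumption that only makes sense on Linux and Android, where the descriptor has already been released by the time close(2) returns.

```c
#include <errno.h>
#include <unistd.h>

// Calls close(2) exactly once and never retries.
// Returns 0 if the descriptor is gone, -1 (with errno set) for other errors.
int close_once(int fd) {
  if (close(fd) == 0) return 0;
  // Assumption (Linux/Android): the descriptor has been released even though
  // close(2) reported EINTR, so a retry could hit another thread's new fd.
  if (errno == EINTR) return 0;
  return -1;
}
```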

Timeouts

System calls with timeouts are the other interesting case where "just wrap everything with TEMP_FAILURE_RETRY()" doesn't work. Because some amount of time will have elapsed, you'll want to recalculate the timeout. Otherwise each retry starts the full timeout again, so a 1 minute timeout can effectively become indefinite if, say, you're receiving signals at least once per minute. In this case you'll want to do something like adding an explicit loop around your system call, calculating the remaining timeout inside the loop, and using continue each time the system call fails with EINTR.