IO Multiplexing: The select and poll Functions
Introduction
When a TCP client handles two inputs at the same time, standard input and a TCP socket, a problem arises if the client is blocked in a call to fgets (on standard input) and the server process is killed. The server TCP correctly sends a FIN to the client TCP, but because the client process is blocked reading from standard input, it does not see the EOF until it reads from the socket (possibly much later).
We want to be notified when one or more IO conditions are ready (i.e., input is ready to be read, or the descriptor is capable of taking more output). This capability is called IO multiplexing and is provided by the select and poll functions, as well as a newer POSIX variation of the former, called pselect.
IO multiplexing is used in many scenarios in networking applications, but it is not limited to network programming: many nontrivial applications need these techniques.
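As a rough illustration of the notification described above, the sketch below uses poll to wait on two descriptors at once, so that an EOF on one of them is noticed even while the other is idle. The helper name wait_for_either and its bitmask return value are assumptions made for this example; only poll itself comes from the system.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>   /* STDIN_FILENO, for callers that watch standard input */

/* Hypothetical helper: wait until fd_a or fd_b has something to report.
 * Returns a bitmask (bit 0 = fd_a, bit 1 = fd_b), 0 on timeout, -1 on error.
 * A negative timeout_ms waits indefinitely. */
int wait_for_either(int fd_a, int fd_b, int timeout_ms)
{
    struct pollfd fds[2];
    int n, ready = 0;

    assert(fd_a >= 0 && fd_b >= 0);
    fds[0].fd = fd_a;  fds[0].events = POLLIN;  fds[0].revents = 0;
    fds[1].fd = fd_b;  fds[1].events = POLLIN;  fds[1].revents = 0;

    /* Retry if a signal interrupts the wait (the timeout restarts). */
    do {
        n = poll(fds, 2, timeout_ms);
    } while (n < 0 && errno == EINTR);
    if (n < 0)
        return -1;

    /* POLLHUP and POLLERR are reported even though only POLLIN was
     * requested; count them as ready so the caller's next read sees
     * the EOF or the error promptly. */
    if (fds[0].revents & (POLLIN | POLLHUP | POLLERR))
        ready |= 1;
    if (fds[1].revents & (POLLIN | POLLHUP | POLLERR))
        ready |= 2;
    return ready;
}
```

A client in the situation described above could call wait_for_either(STDIN_FILENO, sockfd, -1) and then read only from the descriptors whose bits are set.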
IO Models
We first examine the basic differences among the five IO models that are available to us under Unix:
- blocking IO
- nonblocking IO
- IO multiplexing (select and poll)
- signal-driven IO (SIGIO)
- asynchronous IO (the POSIX aio_ functions)
There are normally two distinct phases for an input operation:
- Waiting for data to be ready. This involves waiting for data to arrive on the network. When the packet arrives, it is copied into a buffer within the kernel.
- Copying the data from the kernel to the process. This means copying the (ready) data from the kernel’s buffer into our application buffer.
Blocking IO Model
The most prevalent model for IO is the blocking IO model. By default, all sockets are blocking. The scenario is shown in the figure below:
We use UDP for this example instead of TCP because with UDP, the concept of data being “ready” to read is simple: either an entire datagram has been received or it has not. With TCP it gets more complicated, as additional variables such as the socket’s low-water mark come into play.
We also refer to recvfrom as a system call because we are differentiating between our application and the kernel, regardless of how recvfrom is implemented (as a system call on BSD, and as a function that invokes the getmsg system call on System V). There is normally a switch from running in the application to running in the kernel, followed some time later by a return to the application.
In the figure above, the process calls recvfrom and the system call does not return until the datagram arrives and is copied into our application buffer, or an error occurs. The most common error is the system call being interrupted by a signal. We say that the process is blocked the entire time from when it calls recvfrom until it returns. When recvfrom returns successfully, our application processes the datagram.
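The blocking call described above might look like the following minimal sketch. The helper name blocking_receive is an assumption for this example; it simply retries when recvfrom is interrupted by a signal (EINTR), the common error mentioned above, and otherwise stays blocked until a datagram has been copied into the caller's buffer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: block until one datagram arrives on sockfd.
 * Returns the datagram length, or -1 on an error other than EINTR. */
ssize_t blocking_receive(int sockfd, void *buf, size_t cap)
{
    assert(buf != NULL && cap > 0);
    for (;;) {
        /* The process sleeps here through both phases: waiting for the
         * data, and the copy from the kernel into buf. */
        ssize_t n = recvfrom(sockfd, buf, cap, 0, NULL, NULL);
        if (n >= 0)
            return n;
        if (errno != EINTR)
            return -1;
        /* Interrupted by a signal before any data arrived: try again. */
    }
}
```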
Nonblocking IO Model
When we set a socket to be nonblocking, we are telling the kernel: “When an IO operation that I request cannot be completed without putting the process to sleep, do not put the process to sleep, but return an error instead.” The figure is below:
- For the first three calls to recvfrom, there is no data to return and the kernel immediately returns an error of EWOULDBLOCK.
- The fourth time we call recvfrom, a datagram is ready, it is copied into our application buffer, and recvfrom returns successfully. We then process the data.
When an application sits in a loop calling recvfrom on a nonblocking descriptor like this, it is called polling. The application is continually polling the kernel to see if some operation is ready. This is often a waste of CPU time, but this model is occasionally encountered, normally on systems dedicated to one function.
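A polling loop of this kind can be sketched as follows. The helper names set_nonblocking and polling_receive, and the max_tries bound, are assumptions for this example (a real polling loop would usually do other work between attempts rather than spin).

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: turn on O_NONBLOCK for fd. Returns 0 or -1. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Hypothetical helper: call recvfrom up to max_tries times on a
 * nonblocking socket. Returns the datagram length, or -1 with errno set
 * (EWOULDBLOCK/EAGAIN if nothing arrived). *tries counts the calls made. */
ssize_t polling_receive(int sockfd, void *buf, size_t cap,
                        int max_tries, int *tries)
{
    ssize_t n = -1;

    assert(buf != NULL && tries != NULL && max_tries > 0);
    *tries = 0;
    while (*tries < max_tries) {
        ++*tries;
        n = recvfrom(sockfd, buf, cap, 0, NULL, NULL);
        if (n >= 0)
            break;                      /* a datagram was ready */
        if (errno != EWOULDBLOCK && errno != EAGAIN && errno != EINTR)
            break;                      /* a real error, stop polling */
        /* Otherwise no data yet: the kernel returned at once, ask again. */
    }
    return n;
}
```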
IO Multiplexing Model
With IO multiplexing, we call select or poll and block in one of these two system calls, instead of blocking in the actual IO system call. The figure is a summary of the IO multiplexing model:
We block in a call to select, waiting for the datagram socket to be readable. When select returns that the socket is readable, we then call recvfrom to copy the datagram into our application buffer.
Compared with the blocking IO model:
- Disadvantage: using select requires two system calls (select and recvfrom) instead of one.
- Advantage: we can wait for more than one descriptor to be ready.
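The two system calls, and the ability to wait on several descriptors, can be sketched as below. The helper name select_receive and its which output parameter are assumptions for this example; it blocks in select with no timeout and then receives from the first readable socket.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: block until any of socks[0..nsocks-1] is readable,
 * then receive one datagram from the first readable one.
 * Returns the datagram length and stores the socket's index in *which,
 * or returns -1 on error. */
ssize_t select_receive(const int *socks, int nsocks,
                       void *buf, size_t cap, int *which)
{
    fd_set rset;
    int i, n, maxfd;

    assert(socks != NULL && nsocks > 0 && buf != NULL && which != NULL);

    for (;;) {
        /* select overwrites the set, so rebuild it before every call. */
        FD_ZERO(&rset);
        maxfd = -1;
        for (i = 0; i < nsocks; i++) {
            assert(socks[i] >= 0 && socks[i] < FD_SETSIZE);
            FD_SET(socks[i], &rset);
            if (socks[i] > maxfd)
                maxfd = socks[i];
        }
        /* System call 1: block here, not in recvfrom. */
        n = select(maxfd + 1, &rset, NULL, NULL, NULL);
        if (n > 0)
            break;
        if (n < 0 && errno != EINTR)
            return -1;
    }

    for (i = 0; i < nsocks; i++) {
        if (FD_ISSET(socks[i], &rset)) {
            *which = i;
            /* System call 2: copy the ready datagram into buf. */
            return recvfrom(socks[i], buf, cap, 0, NULL, NULL);
        }
    }
    return -1;  /* not expected: select reported a descriptor we did not set */
}
```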
Another closely related IO model is to use multithreading with blocking IO. That model very closely resembles the model described above, except that instead of using select to block on multiple file descriptors, the program uses multiple threads (one per file descriptor), and each thread is then free to call blocking system calls like recvfrom.
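A minimal sketch of the one-thread-per-descriptor variant follows, assuming POSIX threads. The struct reader type and the function read_one_per_thread are hypothetical names for this example; each thread simply blocks in its own recvfrom call. Depending on the system, the program may need to be built with the -pthread option.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical per-descriptor state: one of these per socket. */
struct reader {
    int       sockfd;    /* input: socket this thread reads from */
    pthread_t tid;       /* filled in by read_one_per_thread */
    char      buf[64];   /* output: the received datagram */
    ssize_t   nread;     /* output: the recvfrom return value */
};

/* Thread body: block in recvfrom on this thread's own socket. */
static void *reader_main(void *arg)
{
    struct reader *r = arg;
    r->nread = recvfrom(r->sockfd, r->buf, sizeof r->buf, 0, NULL, NULL);
    return NULL;
}

/* Start one thread per reader and wait until each has received one
 * datagram. Returns 0 on success or a pthread error number. Threads
 * already started are left running if a later pthread_create fails. */
int read_one_per_thread(struct reader *readers, int count)
{
    int i, err;

    assert(readers != NULL && count > 0);
    for (i = 0; i < count; i++) {
        err = pthread_create(&readers[i].tid, NULL, reader_main, &readers[i]);
        if (err != 0)
            return err;
    }
    for (i = 0; i < count; i++)
        pthread_join(readers[i].tid, NULL);
    return 0;
}
```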
Signal-Driven IO Model
The signal-driven IO model uses signals, telling the kernel to notify us with the SIGIO signal when the descriptor is ready. The figure is below:
- We first enable the socket for signal-driven IO and install a signal handler using the sigaction system call. The return from this system call is immediate and our process continues; it is not blocked.
- When the datagram is ready to be read, the SIGIO signal is generated for our process. We can either:
  - read the datagram from the signal handler by calling recvfrom and then notify the main loop that the data is ready to be processed, or
  - notify the main loop and let it read the datagram.
The advantage to this model is that we are not blocked while waiting for the datagram to arrive. The main loop can continue executing and just wait to be notified by the signal handler that either the data is ready to process or the datagram is ready to be read.
Asynchronous IO Model
Asynchronous IO is defined by the POSIX specification, and the various differences in the real-time functions that appeared in the standards that came together to form the current POSIX specification have been reconciled.
These functions work by telling the kernel to start the operation and to notify us when the entire operation (including the copy of the data from the kernel to our buffer) is complete. The main difference between this model and the signal-driven IO model is that with signal-driven IO, the kernel tells us when an IO operation can be initiated, but with asynchronous IO, the kernel tells us when an IO operation is complete. See the figure below for an example:
- We call aio_read (the POSIX asynchronous IO functions begin with aio_ or lio_) and pass the kernel the following:
  - descriptor, buffer pointer, and buffer size (the same three arguments as for read)
  - file offset (similar to lseek)
  - how to notify us when the entire operation is complete
  This system call returns immediately, and our process is not blocked while waiting for the IO to complete.
- We assume in this example that we ask the kernel to generate some signal when the operation is complete. This signal is not generated until the data has been copied into our application buffer, which is different from the signal-driven IO model.
Comparison of the IO Models
The figure below is a comparison of the five different IO models:
The main difference between the first four models is the first phase, as the second phase in the first four models is the same: the process is blocked in a call to recvfrom while the data is copied from the kernel to the caller’s buffer. Asynchronous IO, however, handles both phases and is different from the first four.
Synchronous IO versus Asynchronous IO
POSIX defines these two terms as follows:
- A synchronous IO operation causes the requesting process to be blocked until that IO operation completes.
- An asynchronous IO operation does not cause the requesting process to be blocked.
Using these definitions, the first four IO models (blocking, nonblocking, IO multiplexing, and signal-driven IO) are all synchronous because the actual IO operation (recvfrom) blocks the process. Only the asynchronous IO model matches the asynchronous IO definition.