Linux multiplexing

Linux multiplexing
Mark Veltzer
mailto: mark@veltzer.net

What is multiplexing?
● Multiplexing is doing multiple IO operations in
just one thread.
● It is also used in the more general meaning of
waiting for more than one thing to happen.
● In our case it means waiting for more than one
thing to happen at the OS level (usually IO to
complete).
● This could be IO, signal, event from another
thread, timer and more.

Multiplexing and multi-threading
● Some see multiplexing as an alternative to multi
threading.
● If one thread can do IO on many file descriptors and
if IO is slow then one thread may handle all the IO
needs of the application.
● This view has a lot of truth to it and practice has
shown that you can implement a very efficient web
server using a single thread.
● Note that this has nothing to do with content
generation or business logic. They may need threads
because they are CPU consuming.

Multiplexing and multi-threading
(cont)
● Multiplexing can also be used in concert with multi-
threading.
● Either one thread will do multiplexing and will use
a thread pool to do business logic.
● Or you could have more than one multiplexer (with
it's own thread pool, or not...)
● Because of the rising speed of network devices
you may need more than one multiplexer.
● Obviously then you need to divide connections/file
descriptors between your multiplexing threads.

What about signal handling?
● Signal handling could be seen as a sort of multiplexing.
● It enables you to do more than one thing in one thread:
both wait for a signal and do synchronous IO at the
same time.
● But signal handling is asynchronous which causes
problems for many programmers.
● While multiplexing is synchronous (albeit not easy to
implement correctly).
● And signal handling was never scalable to a large
number of concurrent tasks.

What about asynchronous IO?
● Asynchronous IO does allow you to do
something like multiplexing: do IO and not wait
for it to complete.
● But how would you know that the IO has
completed?
● You either get a signal with signal handling
problems.
● Or wait. But that is exactly what multiplexing
does.

The reactor pattern
● The reactor pattern in ACE and other frameworks is a pattern
built on top of the OS specific multiplexing API.
● You write multiple objects with methods to handle various types
of events.
● The framework then uses a single thread to wait for any of these
events using the OS API.
● When the event happens the thread wakes up and calls the
appropriate method on the appropriate object.
● And you get a nice C++ OO API so that you can easily write the
event handlers independent of each other.

File descriptors in UNIX
● File descriptors in UNIX were flexible to begin with.
● They could stand for:
● Regular file / folders
● Device driver
● Network socket
● Listening socket
● Anonymous or named pipes
● UNIX domain socket
● Virtual files, like /proc, which look like regular files
● So multiplexing on many file descriptors is already
powerful.

File descriptors in Linux
● In order to make multiplexing more flexible and
effective the concept of file descriptor was
enhanced in Linux to include things that were
never in the UNIX definition.
● In Linux a file descriptor can also be:
● Timer (timerfd)
● Signal file descriptor for synchronous signal
delivery. (signalfd)
● Event from another thread or process (eventfd)

timerfd
● API: timerfd_create(2), timerfd_settime(2),
timerfd_gettime(2)
● You create (open?!?) a timerfd by calling
timerfd_create.
● You can say what type of clock (hardware, software,
etc) your timer will be based on.
● You get a file descriptor.
● Now you can read(2) it and you block until the timer
expires.
● Or you can multiplex on it.

signalfd
● API: signalfd(2)
● You call signalfd(2) and create (open?!?) a file
descriptor.
● You can specify what signals you want this file
descriptor to catch and various flags.
● You can now read(2) this file descriptor and go
to sleep and wait for any of the prescribed
signals.

eventfd
● API: eventfd(2), eventfd_read(2), eventfd_write(2)
● Create (open?) an eventfd using eventfd(2)
● Now write or read from it.
● Writes and reads only work in 64 bit chunks.
● You can now read(2) this file descriptor and go to
sleep to wait for this event. You will wake up once
another thread does a write(2) to this file descriptor.

eventfd (cont)
● More efficient than a pipe (no circular buffer,
messages are constant size).
● Simpler than a pipe (one fd and not two, no
signals involved).
● Kernel uses this to signal user space (async io).
● You can use it for inter-thread or inter-process
communication.
● For instance using fork.

Linux multiplexing API
● Is comprised of three system calls (or system
call groups): select, poll and epoll.
● They all work on file descriptors.
● Since file descriptor in Linux can stand for all
the previously mentioned types multiplexing is
very flexible.
● You pack file descriptors into a group and then
you wait on the group.

Why 3 different APIs?
● Because they evolved through time.
● First was select, then poll and then epoll.
● Differences are: size of group of fds, efficient
handling of large groups of fds, timer support,
signal masking and more.
● Epoll is the most advanced API so we will
concentrate on that.

epoll
● API: epoll(2), epoll_create(2), epoll_create1(2),
epoll_ctl(2), epoll_pwait(2), epoll_wait(2)
● You create (open?) an epoll file descriptor using
epoll_create(2).
● Now you can add/remove file descriptors
to/from it using epoll_ctl.
● When adding a file descriptor you can say what
you are waiting on this file descriptor for (read,
write, error, etc).

epoll (cont)
● Now you can epoll_wait(2) until something
happens.
● When you wake up your array of epll_event
structs is filled with details about what
happened.
● You handle the events and go back to
epoll_wait(2).

epoll (cont)
● In epoll the list of file descriptors is held in
kernel space.
● This is seen as one of epolls main strengths
since passing a large file descriptor collection on
each call to select(2) or poll(2) was problematic.
● This is what makes epoll(2) scale to large
numbers of file descriptors (some report
hundreds of thousands).
● Some see this as the pinnacle of Linux APIs.

Multiplexing - conclusions
● Multiplexing in Linux is very powerful.
● You can do many many things in parallel in a single
thread/process.
● It scales well to large numbers of file descriptors.
● The API is not easy but that is the kernel design.
● The idea is that it will be wrapped by user level libraries,
maybe even OO.
● ACE needs to be compiled in a special way to support
epoll.

Linux multiplexing

Recomendados

Recomendados

Más contenido relacionado

La actualidad más candente

La actualidad más candente (20)

Destacado

Destacado (15)

Similar a Linux multiplexing

Similar a Linux multiplexing (20)

Más de Mark Veltzer

Más de Mark Veltzer (11)

Último

Último (20)

Linux multiplexing