Linux IO
Liran Ben Haim
liran@discoversdk.com
Rights to Copy
- Attribution-ShareAlike 2.0
- You are free
  - to copy, distribute, display, and perform the work
  - to make derivative works
  - to make commercial use of the work
- Under the following conditions
  - Attribution. You must give the original author credit.
  - Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under a license identical to this one.
- For any reuse or distribution, you must make clear to others the license terms of this work.
- Any of these conditions can be waived if you get permission from the copyright holder.
- Your fair use and other rights are in no way affected by the above.
- License text: http://creativecommons.org/licenses/by-sa/2.0/legalcode
- This kit contains work by the following authors:
  - © Copyright 2004-2009 Michael Opdenacker / Free Electrons, michael@free-electrons.com, http://www.free-electrons.com
  - © Copyright 2003-2006 Oron Peled, oron@actcom.co.il, http://www.actcom.co.il/~oron
  - © Copyright 2004–2008 Codefidence ltd., info@codefidence.com, http://www.codefidence.com
  - © Copyright 2009–2010 Bina ltd., info@bna.co.il, http://www.bna.co.il
Typical System
IO Models
- Blocking I/O
- Non-blocking I/O
- I/O multiplexing (select/poll/…)
- Signal driven I/O (SIGIO)
- Asynchronous I/O
  - aio_* functions
  - io_* functions
Blocking IO
- Default mode
- The calling thread blocks in the kernel if no data is available
- Can block forever
- To block with a timeout, use select
Non Blocking IO
- If there is no data, the call returns immediately with errno set to EAGAIN or EWOULDBLOCK
flags = fcntl(fd, F_GETFL, 0);
fcntl(fd, F_SETFL, flags | O_NONBLOCK);
- To use in your own driver: check file->f_flags & O_NONBLOCK and return -EAGAIN instead of sleeping
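As a minimal user-space sketch of the non-blocking pattern above: the helper names set_nonblocking and try_read, and the -2 "no data yet" return value, are illustrative assumptions and not part of the original slides.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Put fd into non-blocking mode; returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/*
 * Try to read without blocking.
 * Returns the byte count, 0 on end of file, -1 on a real error,
 * or -2 (hypothetical convention) when no data is available yet.
 */
static ssize_t try_read(int fd, void *buf, size_t len)
{
    ssize_t n = read(fd, buf, len);
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;
    return n;
}
```

A caller would typically treat -2 as "come back later", for example after select, poll or epoll reports the descriptor as readable.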
IO Multiplexing
- With I/O multiplexing, we call select/poll/epoll* and block in one of these system calls, instead of blocking in the actual I/O system call
- Disadvantage: using select requires at least two system calls (select and recvfrom) instead of one
- Advantage: we can wait for more than one descriptor to be ready
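Signal driven I/O
- The signal-driven I/O model tells the kernel to notify us with a signal (SIGIO by default) when the descriptor is ready
fd = open("name", O_RDONLY | O_NONBLOCK);
fcntl(fd, F_SETOWN, getpid());                    /* who receives the signal */
fcntl(fd, F_SETSIG, SIGRTMIN + 1);                /* which signal (needs _GNU_SOURCE) */
fcntl(fd, F_SETFL, fcntl(fd, F_GETFL) | O_ASYNC); /* enable signal-driven I/O */
- It's better to use RT signals (queued)
- To use in your own driver
  - if (file->f_flags & O_NONBLOCK)
  - Send signal to file->f_owner (fown_struct)
    - signum
    - pid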
Asynchronous IO
- The POSIX asynchronous I/O interface
  - Allows applications to initiate one or more I/O operations that are performed asynchronously.
  - The application can choose to be notified of completion of the I/O operation in a variety of ways:
    - Delivery of a signal
    - Instantiation of a thread
    - No notification at all
- User space asynchronous implementation
  - Calls the driver's regular read/write callbacks
Kernel Async Support
- file_operations callbacks
  - aio_read, aio_write
    - Initialize the data processing
    - Create workqueue item/completion/timer/…
  - On completion
    - aio_complete
- Kernel 3.16
  - read_iter
  - write_iter
- To use from user space call io_submit
I/O Multiplexing
- select, pselect
- poll, ppoll
- epoll
  - epoll_create, epoll_create1
  - epoll_ctl
  - epoll_wait, epoll_pwait
select(2) system call
The select( ) system call provides a mechanism for implementing synchronous
multiplexing I/O
A call to select( ) will block until the given file descriptors are ready to perform
I/O, or until an optionally specified timeout has elapsed
The watched file descriptors are broken into three sets
File descriptors listed in the readfds set are watched to see if data is
available for reading.
File descriptors listed in the writefds set are watched to see if a write
operation will complete without blocking.
File descriptors in the exceptfds set are watched to see if an exception has
occurred, or if out-of-band data is available (these states apply only to
sockets).
A given set may be NULL, in which case select( ) does not watch for that event.
On successful return, each set is modified such that it contains only the file
descriptors that are ready for I/O of the type delineated by that set
Blocking for events
You can use select(2) to block for events
int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds,
struct timeval *timeout);
nfds: the highest-numbered file descriptor in any of the sets, plus 1
fd_sets: sets of file descriptors to block for events on, one for read, one for write, one for exceptions.
timeout: NULL (block indefinitely) or a structure describing how long to block.
On return, Linux updates the timeout structure to hold the time that was left before the timeout expired.
File Descriptor Sets
fd_set is a set of file descriptors, used both to request events and to report them. It is manipulated with these macros:
void FD_CLR(int fd, fd_set *set);
Remove fd from this set.
int FD_ISSET(int fd, fd_set *set);
Is fd in the set?
void FD_SET(int fd, fd_set *set);
Add fd to this set.
void FD_ZERO(fd_set *set);
Zero the set.
Driver notification example
int fd1, fd2, maxfd;
fd_set fds_set;
int ret;
fd1 = open("/dev/drv_ctl0", O_RDWR);
fd2 = open("/dev/drv_ctl1", O_RDWR);
maxfd = (fd1 > fd2) ? fd1 : fd2;
do {
    /* select() modifies the set, so rebuild it on every retry */
    FD_ZERO(&fds_set);
    FD_SET(fd1, &fds_set);
    FD_SET(fd2, &fds_set);
    /* nfds is the highest descriptor plus 1, not the sum of the descriptors */
    ret = select(maxfd + 1, &fds_set, NULL, NULL, NULL);
} while (ret == -1 && errno == EINTR);
Open two file descriptors, fd1 and fd2.
Create an fd set that holds them.
Block for events on them.
We try again if we were interrupted by a signal.
Driver notification example 2
if (ret == -1) {
    perror("select failed");
    exit(1);
}
if (FD_ISSET(fd1, &fds_set)) {
    printf("event at FD 1... ");
    ioctl(fd1, IOCTL_CLR, 1);
    printf("clear\n");
}
If we have an error, bail out.
Check if fd1 has a pending read event.
If so, use ioctl(2) to notify the driver to clear the status.
pselect
Defined by POSIX
Does not modify the timeout parameter
Can atomically install a signal mask for the duration of the wait
Uses the timespec structure (seconds, nanoseconds)
Poll
int poll(struct pollfd *fds, nfds_t nfds, int timeout);
Unlike select(), with its inefficient three bitmask-based sets of file descriptors, poll() employs a single array of nfds pollfd structures. The timeout is given in milliseconds.
#include <poll.h>
struct pollfd {
    int fd;        /* file descriptor to watch */
    short events;  /* requested events (input) */
    short revents; /* returned events (output) */
};
ppoll: variant that also takes a signal mask, as pselect does for select
https://www.ulduzsoft.com/2014/01/select-poll-epoll-practical-difference-for-system-architects/
Poll vs. Select
poll( ) does not require that the user calculate the value of the highest-numbered file descriptor +1
poll( ) is more efficient for large-valued file descriptors. Imagine watching a single file descriptor with the value 900 via select(): the kernel would have to check each bit of each passed-in set, up to the 900th bit.
select( )’s file descriptor sets are statically sized.
With select( ), the file descriptor sets are reconstructed on return, so each subsequent call must reinitialize them. The poll( ) system call separates the input (events field) from the output (revents field), allowing the array to be reused without change.
The timeout parameter to select( ) is undefined on return. Portable code needs to reinitialize it. This is not an issue with pselect( ).
select( ) is more portable, as some Unix systems do not support poll( ).
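The separation of events and revents can be seen in a short sketch. The helper count_readable is a hypothetical name introduced here; it passes the same pollfd array to poll() on every call without rebuilding it.

```c
#include <poll.h>

/*
 * Count how many of the nfds descriptors are readable right now.
 * The events fields are inputs and are left untouched by poll();
 * only revents is written, so fds can be passed in again as-is.
 * Returns the count, 0 on timeout, or -1 on error.
 */
static int count_readable(struct pollfd *fds, nfds_t nfds, int timeout_ms)
{
    int ready = poll(fds, nfds, timeout_ms);
    if (ready <= 0)
        return ready;

    int readable = 0;
    for (nfds_t i = 0; i < nfds; i++)
        if (fds[i].revents & POLLIN)
            readable++;
    return readable;
}
```

The caller fills in fd and events once (for example events = POLLIN) and can then call count_readable repeatedly in a loop.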
Event Poll
Has state in the kernel
Waiting costs O(1) in the number of watched descriptors, instead of O(n) as with select and poll
epoll_create(2) – initializes epoll context
epoll_ctl(2) – adds/removes file descriptors from the
context
epoll_wait(2) – performs the actual event wait
Can behave as
Edge triggered
Level triggered
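The difference between the two modes can be observed with a small experiment, sketched below. The function name probe_trigger_mode is an assumption for illustration: it registers a descriptor that already has unread data and calls epoll_wait twice without reading anything.

```c
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Register fd for EPOLLIN (optionally edge-triggered) and call
 * epoll_wait() twice without reading the pending data.
 * Stores the two return values in *first and *second.
 * Returns 0 on success, -1 on error.
 */
static int probe_trigger_mode(int fd, int edge_triggered, int *first, int *second)
{
    struct epoll_event ev = {0}, out;
    int epfd = epoll_create1(0);
    if (epfd == -1)
        return -1;

    ev.events = EPOLLIN | (edge_triggered ? EPOLLET : 0);
    ev.data.fd = fd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == -1) {
        close(epfd);
        return -1;
    }

    /* A timeout of 0 makes epoll_wait() return immediately. */
    *first = epoll_wait(epfd, &out, 1, 0);
    *second = epoll_wait(epfd, &out, 1, 0);

    close(epfd);
    return (*first == -1 || *second == -1) ? -1 : 0;
}
```

In level-triggered mode both calls report the descriptor as long as data remains unread. In edge-triggered mode only the first call reports it, which is why edge-triggered code usually reads until EAGAIN before waiting again.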
Example
int epfd = epoll_create1(0); /* epoll_create() rejects a size of 0 */
int client_sock = socket(…..);
static struct epoll_event ev;
ev.events = EPOLLIN | EPOLLOUT | EPOLLERR;
ev.data.fd = client_sock;
int res = epoll_ctl(epfd, EPOLL_CTL_ADD, client_sock, &ev);
/* room for up to MAX_EVENTS ready events per epoll_wait() call */
struct epoll_event *events = malloc(MAX_EVENTS * sizeof(*events));
while (1) {
    int nfds = epoll_wait(epfd, events, MAX_EVENTS, TIMEOUT);
    for (int i = 0; i < nfds; i++) {
        int fd = events[i].data.fd;
        handle_io_on_socket(fd);
    }
}
Driver Implementation
- Implement the poll callback in file_operations
- The driver adds its wait queue to the poll_table structure by calling poll_wait, then returns a mask of the events that are currently ready
Thank You
Code examples and more
http://www.discoversdk.com/blog
