o CPU-I/O Burst Cycle
o CPU Scheduler
o Preemptive Scheduling
o First-Come First-Serve Scheduling, FCFS
o Shortest-Job-First Scheduling, SJF
o Priority Scheduling
o Round Robin Scheduling
o Multilevel Queue Scheduling
o Multilevel Feedback-Queue Scheduling
o Contention Scope
o Pthread Scheduling
o Approaches to Multiple-Processor Scheduling
o Processor Affinity
o Load Balancing
o Multicore Processors
o Virtualization and Scheduling (Optional, Omitted from 9th edition)
REAL-TIME CPU SCHEDULING
o Minimizing Latency
o Priority-Based Scheduling
o Rate-Monotonic Scheduling
o Earliest-Deadline-First Scheduling
o Proportional Share Scheduling
o POSIX Real-Time Scheduling
OPERATING SYSTEM EXAMPLES (OPTIONAL)
o Example: Linux Scheduling (was 5.6.3)
o Example: Windows XP Scheduling (was 5.6.2)
o Deterministic Modeling
o Queuing Models
Almost all programs have some alternating cycle of CPU number crunching and waiting for I/O of some kind. (Even a simple fetch from memory takes a
long time relative to CPU speeds.) In a simple system running a single process, the time spent waiting for I/O is wasted, and those CPU cycles are lost
forever. A scheduling system allows one process to use the CPU while another is waiting for I/O, thereby making full use of otherwise lost CPU cycles.
The challenge is to make the overall system as "efficient" and "fair" as possible, subject to varying and often dynamic conditions, and where "efficient"
and "fair" are somewhat subjective terms, often subject to shifting priority policies.
CPU-I/O Burst Cycle: Almost all processes alternate between two states in a continuing cycle, as shown in
Figure 6.1 below: (a) a CPU burst of performing calculations, and (b) an I/O burst, waiting for data
transfer in or out of the system. CPU bursts vary from process to process, and from program to program.
CPU Scheduler: Whenever the CPU becomes idle, it is the job of the CPU Scheduler (a.k.a. the short-term
scheduler) to select another process from the ready queue to run next. The ready queue is not necessarily
a first-in, first-out (FIFO) queue, and the next process is not necessarily chosen in arrival order. There
are several alternatives to choose from, as well as numerous adjustable parameters for each algorithm,
which are the basic subject of this entire chapter. (As we shall see when we consider the various scheduling
algorithms, a ready queue can be implemented as a FIFO queue, a priority queue, a tree, or simply an unordered linked list.
Conceptually, however, all the processes in the ready queue are lined up waiting for a chance to run on
the CPU. The records in the queues are generally process control blocks (PCBs) of the processes.)
Preemptive Scheduling: CPU scheduling decisions take place under one of four conditions:
o When a process switches from the running state to the waiting state, such as for an I/O request
or invocation of the wait() system call.
o When a process switches from the running state to the ready state, for example in response to an interrupt.
o When a process switches from the waiting state to the ready state, say at completion of I/O or a return from wait().
o When a process terminates.
For conditions 1 and 4 there is no choice: a new process must be selected. For conditions 2 and 3 there is a choice: either continue
running the current process, or select a different one. If scheduling takes place only under conditions 1 and 4, the system is said to be
non-preemptive, or cooperative. Under these conditions, once a process starts running it keeps running until it either voluntarily blocks
or finishes. Otherwise the system is said to be preemptive. Windows used non-preemptive scheduling up to Windows 3.x, and
started using preemptive scheduling with Win95. Macs used non-preemptive scheduling prior to OS X, and preemptive scheduling since then.
Note that preemptive scheduling is only possible on hardware that supports a timer interrupt.
Note that preemptive scheduling can cause problems when two processes share data, because one process may get interrupted in
the middle of updating shared data structures (Chapter 5 examined this issue in greater detail). Preemption can also be a problem if the
kernel is busy implementing a system call (e.g. updating critical kernel data structures) when the preemption occurs. Most modern
UNIXes deal with this problem by making the process wait until the system call has either completed or blocked before allowing the
preemption. Unfortunately this solution is problematic for real-time systems, as real-time response can no longer be guaranteed. Some
critical sections of code protect themselves from concurrency problems by disabling interrupts before entering the critical section and
re-enabling interrupts on exiting the section. Needless to say, this should only be done in rare situations, and only on very short pieces of
code that will finish quickly (usually just a few machine instructions).
Dispatcher: The dispatcher is the module that gives control of the CPU to the process selected by the scheduler. This function involves: (a)
switching context, (b) switching to user mode, and (c) jumping to the proper location in the newly loaded program. The dispatcher needs to be as
fast as possible, as it is run on every context switch. The time consumed by the dispatcher is known as dispatch latency.
There are several different criteria to consider when trying to select the "best" scheduling algorithm for a particular situation and environment:
o CPU utilization - Ideally the CPU would be busy 100% of the time, so as to waste 0 CPU cycles. On a real system CPU usage should
range from 40% (lightly loaded) to 90% (heavily loaded).
o Throughput - Number of processes completed per unit time. May range from 10/second to 1/hour depending on the specific processes.
o Turnaround time - Time required for a particular process to complete, from submission time to completion (wall clock time).
o Waiting time - The time processes spend in the ready queue waiting their turn to get on the CPU. The CPU-scheduling algorithm
does not affect the amount of time during which a process executes or does I/O. It affects only the amount of time that a process
spends waiting in the ready queue. Waiting time is the sum of the periods spent waiting in the ready queue. (Load average - The
average number of processes sitting in the ready queue waiting their turn to get into the CPU. Reported in 1-minute, 5-minute, and
15-minute averages.)
o Response time - The time taken in an interactive program from the issuance of a command to the commencement of a response to that command.
In general one wants to optimize the average value of a criterion (maximize CPU utilization and throughput, and minimize all the
others). However, sometimes one wants to do something different, such as to minimize the maximum response time. Sometimes it is
more desirable to minimize the variance of a criterion than its actual value, i.e. users are more accepting of a consistent, predictable
system than an inconsistent one, even if it is a little bit slower.
The following subsections will explain several common scheduling strategies, looking at only a single CPU burst (in milliseconds) each for a small number
of processes. Obviously real systems have to deal with a lot more simultaneous processes executing their CPU-I/O burst cycles.
First-Come First-Serve Scheduling, FCFS
FCFS can yield some very long average wait times, particularly if the first process to get there takes a long time. For example, consider the
following three processes:
In the first Gantt chart below, process P1 arrives first. The average waiting time for the three processes is ( 0 + 24 + 27 ) / 3 = 17.0 ms. In the second
Gantt chart below, the same three processes have an average wait time of ( 0 + 3 + 6 ) / 3 = 3.0 ms. The total run time for the three bursts is the same,
but in the second case two of the three finish much quicker, and the other process is only delayed by a short amount.
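The two averages above can be checked with a short sketch. The burst times (24, 3, and 3 ms for P1, P2, and P3) are an assumption inferred from the waiting times quoted above, since the process table is not reproduced here, and the function name fcfs_waits is hypothetical.

```python
def fcfs_waits(bursts):
    """Return each process's waiting time when bursts run in list order.

    Assumes every process arrives at time 0 and runs a single CPU burst.
    """
    waits = []
    clock = 0
    for burst in bursts:
        waits.append(clock)   # time spent in the ready queue so far
        clock += burst        # the CPU is busy until this burst ends
    return waits


# Assumed burst times in ms, inferred from the waiting times in the text.
order_a = [24, 3, 3]   # P1, P2, P3
order_b = [3, 3, 24]   # P2, P3, P1

for order in (order_a, order_b):
    waits = fcfs_waits(order)
    print(waits, sum(waits) / len(waits))
```

Only the arrival order differs between the two runs, which is why the total run time is identical while the average wait is not.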
FCFS can also block the system in a busy dynamic system in another way, known as the convoy effect. When one CPU-intensive process blocks
the CPU, a number of I/O-intensive processes can get backed up behind it, leaving the I/O devices idle. When the CPU hog finally relinquishes
the CPU, then the I/O processes pass through the CPU quickly, leaving the CPU idle while everyone queues up for I/O, and then the cycle
repeats itself when the CPU-intensive process gets back to the ready queue.
Shortest-Job-First Scheduling, SJF
The idea behind the SJF algorithm is to pick the quickest little job that needs to be
done, get it out of the way first, and then pick the next quickest job to do next.
(Technically this algorithm picks a process based on the next shortest CPU burst, not the
overall process time.) For example, the Gantt chart below is based
upon the following CPU burst times (and the assumption that all jobs
arrive at the same time). In this case, the average wait time is ( 0 + 3 + 9 + 16 ) / 4 =
7.0 ms (as opposed to 10.25 ms for FCFS for the same processes).
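A minimal sketch of non-preemptive SJF reproduces both figures. The burst times below (6, 8, 7, and 3 ms for P1 through P4) are an assumption chosen to be consistent with the averages quoted above, since the burst table is not reproduced here. The name sjf_waits is hypothetical.

```python
def sjf_waits(bursts):
    """Non-preemptive SJF for processes that all arrive at time 0.

    `bursts` maps a process name to its next CPU burst in ms.
    Returns a dict of waiting times.
    """
    waits = {}
    clock = 0
    # Shortest burst first; sorted() is stable, so ties keep input order.
    for name, burst in sorted(bursts.items(), key=lambda item: item[1]):
        waits[name] = clock
        clock += burst
    return waits


# Assumed data (ms), consistent with the averages quoted in the text.
bursts = {"P1": 6, "P2": 8, "P3": 7, "P4": 3}

sjf = sjf_waits(bursts)
sjf_average = sum(sjf.values()) / len(sjf)

# FCFS on the same data, in P1..P4 order, for comparison.
fcfs_clock, fcfs_total = 0, 0
for burst in bursts.values():      # dicts preserve insertion order
    fcfs_total += fcfs_clock
    fcfs_clock += burst
fcfs_average = fcfs_total / len(bursts)

print(sjf, sjf_average, fcfs_average)
```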
SJF depends on knowing the length of the next CPU burst in advance. For long-term batch jobs this can be estimated from the limits that
users set for their jobs when they submit them. Another option would be
to statistically measure the run time characteristics of jobs, particularly if the same tasks are run repeatedly and predictably (but
that really isn't a viable option for short-term CPU scheduling in the real world). A more
practical approach is to predict the length of the next burst, based on some historical
measurement of recent burst times for this process.
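One widely used way to form such a prediction is an exponential average of past bursts: new estimate = alpha * last burst + (1 - alpha) * previous estimate. The text above does not name a specific formula, so this choice, the weight alpha = 0.5, the initial guess of 10 ms, and the sample bursts are all assumptions for illustration.

```python
def predict_bursts(observed, alpha=0.5, initial_guess=10.0):
    """Return the prediction made before each burst, plus one for the next.

    alpha near 1 trusts the most recent burst; alpha near 0 trusts history.
    """
    predictions = [initial_guess]
    for burst in observed:
        previous = predictions[-1]
        predictions.append(alpha * burst + (1 - alpha) * previous)
    return predictions


# Hypothetical measured bursts in ms.
print(predict_bursts([6, 4, 6, 4, 13, 13, 13]))
```

With alpha = 0.5 the estimate lags behind a sudden change in behavior (the jump to 13 ms bursts) but closes half of the gap after every burst.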
SJF can be either preemptive or non-preemptive. Preemption occurs when a new process
arrives in the ready queue that has a predicted burst time shorter than the time
remaining in the process whose burst is currently on the CPU. Preemptive SJF is
sometimes referred to as shortest remaining time first scheduling. For example, the following Gantt chart is based upon the following data,
and the average wait time in this case is ( ( 5 - 3 ) + ( 10 - 1 ) + ( 17 - 2 ) )
/ 4 = 26 / 4 = 6.5 ms (as opposed to 7.75 ms for non-preemptive SJF
or 8.75 ms for FCFS).
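The preemption rule can be sketched as a small simulation that re-evaluates the ready queue every millisecond. The arrival and burst times below are assumptions chosen to match the arithmetic quoted above (the data table itself is not reproduced here), and srtf_waits is a hypothetical name.

```python
def srtf_waits(procs):
    """Preemptive SJF (shortest remaining time first) in 1 ms time steps.

    `procs` maps name -> (arrival_ms, burst_ms); both must be integers
    and every burst must be positive.
    Returns a dict of waiting times (time in the ready queue, not running).
    """
    remaining = {name: burst for name, (_, burst) in procs.items()}
    finish = {}
    clock = 0
    while remaining:
        ready = [n for n in remaining if procs[n][0] <= clock]
        if not ready:
            clock += 1            # CPU idle until the next arrival
            continue
        # Pick the least remaining time; ties go to the earlier arrival.
        current = min(ready, key=lambda n: (remaining[n], procs[n][0]))
        remaining[current] -= 1
        clock += 1
        if remaining[current] == 0:
            del remaining[current]
            finish[current] = clock
    # waiting = turnaround - burst = (finish - arrival) - burst
    return {n: finish[n] - a - b for n, (a, b) in procs.items()}


# Assumed (arrival, burst) pairs in ms.
procs = {"P1": (0, 8), "P2": (1, 4), "P3": (2, 9), "P4": (3, 5)}
waits = srtf_waits(procs)
print(waits, sum(waits.values()) / len(waits))
```

One process never waits at all in this data set, which is why the sum in the text has only three non-zero terms but is still divided by 4.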
Priority Scheduling
Priority scheduling is a more general case of SJF, in which each job is assigned a priority and the
job with the highest priority gets scheduled first. (SJF uses the inverse of the next expected
burst time as its priority: the smaller the expected burst, the higher the priority.) This book
uses low numbers for high priorities, with 0 being the highest possible priority. For example, the
following Gantt chart is based upon these process burst times and priorities, and yields an
average waiting time of 8.2 ms:
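As a rough check of that figure, here is a minimal non-preemptive priority scheduler. The burst times and priorities are assumed values chosen to reproduce the 8.2 ms average, since the table itself is not reproduced here; priority_waits is a hypothetical name, and all processes are assumed to arrive at time 0.

```python
def priority_waits(procs):
    """Non-preemptive priority scheduling, all processes arriving at time 0.

    `procs` maps name -> (burst_ms, priority); a lower number means a
    higher priority. Returns a dict of waiting times.
    """
    waits = {}
    clock = 0
    # Run in ascending priority number, i.e. highest priority first.
    for name, (burst, _) in sorted(procs.items(), key=lambda item: item[1][1]):
        waits[name] = clock
        clock += burst
    return waits


# Assumed (burst, priority) pairs.
procs = {"P1": (10, 3), "P2": (1, 1), "P3": (2, 4), "P4": (1, 5), "P5": (5, 2)}
waits = priority_waits(procs)
average = sum(waits.values()) / len(waits)
print(waits, average)
```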
Priorities can be assigned either internally or externally. Internal
priorities are assigned by the OS using criteria such as average
burst time, ratio of CPU to I/O activity, system resource use, and
other factors available to the kernel. External priorities are assigned by users, based on the importance of the job, fees paid, etc.
Priority scheduling can be either preemptive or non-preemptive.
Priority scheduling can suffer from a major problem known as indefinite blocking, or starvation, in which a low-priority task can wait forever
because there are always some other jobs around that have higher priority. One common solution to this problem is aging, in which priorities
of jobs increase the longer they wait. Under this scheme a low-priority job will eventually get its priority raised high enough that it gets run.
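Aging can be illustrated with a toy model in which jobs never finish, so that without aging the best base priority would win every step. The aging rate (one priority level per 3 steps waited), the reset-on-run rule, and the name run_with_aging are assumptions for illustration, not a description of any particular kernel.

```python
def run_with_aging(jobs, steps, age_every=3):
    """Pick one job per step by effective priority (lower number wins).

    `jobs` maps name -> base priority. A job's effective priority improves
    by 1 for every `age_every` steps it has waited; the boost is lost as
    soon as the job runs. Ties are broken by name.
    """
    waited = {name: 0 for name in jobs}
    history = []
    for _ in range(steps):
        chosen = min(jobs, key=lambda n: (jobs[n] - waited[n] // age_every, n))
        history.append(chosen)
        for name in jobs:
            waited[name] = 0 if name == chosen else waited[name] + 1
    return history


history = run_with_aging({"high": 0, "low": 4}, steps=20)
print(history)
```

In this run the low-priority job is starved at first but eventually ages past the high-priority one and gets a turn; setting age_every to a very large number turns aging off and the low-priority job never runs.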
Round Robin Scheduling
Round robin scheduling is similar to FCFS scheduling, except that CPU bursts are
assigned limits called the time quantum. When a process is given the CPU, a timer is
set for whatever value has been set for the time quantum. If the process finishes its
burst before the time quantum timer expires, then it is swapped out of the CPU just
like the normal FCFS algorithm. If the timer goes off first, then the process is
swapped out of the CPU and moved to the back end of the
ready queue. The ready queue is maintained as a circular
queue, so when all processes have had a turn, then the
scheduler gives the first process another turn, and so on. RR
scheduling can give the effect of all processes sharing the CPU equally,
although the average wait time can be longer than with other scheduling
algorithms. In the following example the average wait time is 5.66 ms.
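The mechanism can be sketched with a queue of process indices. The burst times (24, 3, and 3 ms) and the 4 ms quantum are assumptions chosen to match the average quoted above, since the example table is not reproduced here; rr_finish_times is a hypothetical name, and context switches are treated as free.

```python
from collections import deque


def rr_finish_times(bursts, quantum):
    """Round robin for processes that all arrive at time 0, in list order.

    Returns each process's completion time; context switches cost nothing.
    """
    remaining = list(bursts)
    finish = [0] * len(bursts)
    ready = deque(range(len(bursts)))
    clock = 0
    while ready:
        i = ready.popleft()
        run = min(quantum, remaining[i])
        clock += run
        remaining[i] -= run
        if remaining[i] > 0:
            ready.append(i)       # quantum expired: back of the queue
        else:
            finish[i] = clock     # burst finished within this quantum
    return finish


bursts = [24, 3, 3]               # assumed burst times in ms
finish = rr_finish_times(bursts, quantum=4)
# With arrival at time 0, waiting time = completion time - burst time.
waits = [f - b for f, b in zip(finish, bursts)]
print(waits, sum(waits) / len(waits))
```

Under these assumptions the average is 17 / 3 ms, which the text quotes as 5.66 ms.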
The performance of RR is sensitive to the time quantum selected. If the
quantum is large enough, then RR reduces to the FCFS algorithm; if it is
very small, then each process gets 1/nth of the processor time and they share
the CPU equally. BUT, a real system incurs overhead for every context
switch, and the smaller the time quantum the more context switches there
are. (See Figure 6.4 below.) Most modern systems use time quanta
between 10 and 100 milliseconds, and context switch times on the order of
10 microseconds, so the overhead is small relative to the time quantum.
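A back-of-the-envelope sketch of that overhead, under the simplifying assumption that every quantum is used in full and is followed by exactly one context switch (the function name switch_overhead is hypothetical):

```python
def switch_overhead(quantum_us, switch_us=10):
    """Fraction of CPU time spent context switching.

    Assumes every quantum is used in full and followed by one switch.
    """
    return switch_us / (quantum_us + switch_us)


for quantum_ms in (10, 100):
    fraction = switch_overhead(quantum_ms * 1000)
    print(f"{quantum_ms} ms quantum: {fraction:.4%} overhead")
```

Shrinking the quantum toward the switch time itself drives this fraction toward one half, which is the cost the paragraph above warns about.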
Turnaround time also varies with the time quantum, in a non-obvious
manner. Consider, for example, the processes shown in Figure 6.5. In
general, turnaround time is minimized if most processes finish their
next CPU burst within one time quantum. For example, with three
processes of 10 ms bursts each, the average turnaround time for a 1 ms
quantum is 29 ms, and for a 10 ms quantum it reduces to 20 ms. However, if it
is made too large, then RR just degenerates to FCFS. A rule of thumb
is that 80% of CPU bursts should be smaller than the time quantum.
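The three-process example can be reproduced with a small round-robin simulation. This is a sketch that assumes all processes arrive at time 0 and that context switches are free; rr_avg_turnaround is a hypothetical name.

```python
from collections import deque


def rr_avg_turnaround(bursts, quantum):
    """Average turnaround time under round robin, all arrivals at time 0."""
    remaining = list(bursts)
    ready = deque(range(len(bursts)))
    clock = 0
    total = 0
    while ready:
        i = ready.popleft()
        run = min(quantum, remaining[i])
        clock += run
        remaining[i] -= run
        if remaining[i] > 0:
            ready.append(i)       # not done yet: rejoin the queue
        else:
            total += clock        # arrival at 0, so turnaround == finish time
    return total / len(bursts)


for quantum in (1, 2, 5, 10):
    print(quantum, rr_avg_turnaround([10, 10, 10], quantum))
```

Trying other burst mixes shows that the relationship is not always this smooth, which is the non-obvious behavior the paragraph refers to.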
Multilevel Queue Scheduling
When processes can be readily categorized, then multiple separate
queues can be established, each implementing whatever scheduling
algorithm is most appropriate for that type of job, and/or with
different parametric adjustments. Scheduling must also be done
between queues, that is, scheduling one queue to get time relative to
other queues. Two common options are strict priority (no job in a
lower priority queue runs until all higher priority queues are empty)
and round-robin (each queue gets a time slice in turn, possibly of different sizes). Note that under this algorithm jobs cannot switch from
queue to queue: once they are assigned a queue, that is their queue until they finish.
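The strict-priority option between queues can be sketched in a few lines. The queue classes, the job names, and the use of FCFS inside every queue are assumptions for illustration; pick_next is a hypothetical name.

```python
from collections import deque


def pick_next(queues):
    """Strict-priority selection between queues.

    `queues` is a list of deques ordered from highest to lowest priority.
    Returns (queue_index, job), or None if every queue is empty.
    """
    for level, queue in enumerate(queues):
        if queue:                      # first non-empty queue wins
            return level, queue.popleft()
    return None


# Hypothetical classes, highest priority first: system, interactive, batch.
queues = [deque(), deque(["editor", "shell"]), deque(["payroll"])]

order = []
while (picked := pick_next(queues)) is not None:
    order.append(picked)
print(order)
```

A job never changes queues here; it is only ever removed from the queue it was placed in, matching the fixed assignment described above.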
Multilevel Feedback-Queue Scheduling
Multilevel feedback queue scheduling is similar to the ordinary multilevel queue scheduling described above, except jobs may be moved from
one queue to another for a variety of reasons: (A) If the characteristics of a job change between CPU-intensive and I/O-intensive, then it may
be appropriate to switch a job from one queue to another. (B) Aging can also be incorporated, so that a job that has waited for a long time can
get bumped up into a higher priority queue for a while.
Multilevel feedback queue scheduling is the most flexible, because it can be tuned for any situation. But it is also the most complex to
implement because of all the adjustable parameters. Some of the parameters which define one of these systems include: (A) The number of
queues. (B) The scheduling algorithm for each queue. (C) The methods used to upgrade or demote processes from one queue to another
(which may be different). (D) The method used to determine which queue a process enters initially.
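To make the parameters (A) through (D) concrete, here is a minimal sketch of one possible configuration. Every policy choice in it is an assumption for illustration: three queues with quanta of 8 ms, 16 ms, and run-to-completion, demotion after a fully used quantum, no promotion, and all jobs starting in the top queue. The name mlfq and the job data are hypothetical.

```python
from collections import deque


def mlfq(bursts, quanta=(8, 16, None)):
    """Tiny multilevel feedback queue; returns completion times by name.

    Assumed policy (one of many possible):
      (A) len(quanta) queues, index 0 is the highest priority
      (B) each queue is round robin with its own quantum;
          None means run to completion, i.e. FCFS
      (C) a job that uses its whole quantum without finishing is
          demoted one level; there is no promotion in this sketch
      (D) every job starts in queue 0
    All jobs arrive at time 0 and do no I/O.
    """
    queues = [deque() for _ in quanta]
    for name, burst in bursts.items():
        queues[0].append((name, burst))
    finish = {}
    clock = 0
    while any(queues):
        # Strict priority between queues: first non-empty level runs.
        level = next(i for i, q in enumerate(queues) if q)
        name, remaining = queues[level].popleft()
        quantum = quanta[level]
        run = remaining if quantum is None else min(quantum, remaining)
        clock += run
        remaining -= run
        if remaining == 0:
            finish[name] = clock
        else:
            lower = min(level + 1, len(queues) - 1)
            queues[lower].append((name, remaining))
    return finish


# Hypothetical CPU bursts in ms.
print(mlfq({"A": 5, "B": 30, "C": 12}))
```

Short jobs finish in the top queue, while long CPU-bound jobs sink to the lower queues, which is how the feedback separates interactive work from batch work without knowing burst lengths in advance.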