The document discusses Linux scheduling. It aims for O(1) scheduling by using per-CPU run queues and priority-based scheduling. Processes are assigned both a static and dynamic priority. The dynamic priority is adjusted based on factors like how interactive or I/O-bound a process is. Processes are scheduled from priority-based run queues, with higher priority processes getting longer time quanta. Real-time processes have the highest priorities. The scheduler uses priority arrays and swapping between active and expired queues to avoid starvation and provide fairness.


Scheduling in Linux
COMS W4118
Spring 2008

Scheduling Goals
• O(1) scheduling; the 2.4 scheduler iterated through
  - the run queue on each invocation
  - the task queue at each epoch
• Scale well on multiple processors
  - per-CPU run queues
  - SMP affinity
• Interactivity boost
• Fairness
• Optimize for one or two runnable processes

Basic Philosophies
• Priority is the primary scheduling mechanism
• Priority is dynamically adjusted at run time
  - Processes denied access to the CPU get increased priority
  - Processes running a long time get decreased priority
• Try to distinguish interactive processes from non-interactive ones
  - Bonus or penalty reflecting whether I/O or compute bound
• Use large quanta for important processes
  - Modify quanta based on CPU use
  - Quantum != clock tick
• Associate processes to CPUs
• Do everything in O(1) time

The Run Queue
• 140 separate queues, one for each priority level
  - Actually, two sets: active and expired
• Priorities 0-99 for real-time processes
• Priorities 100-139 for normal processes; value set via the nice() system call

Runqueue for O(1) Scheduler

[Diagram: each runqueue contains two priority arrays, active and expired, each holding one priority queue per priority level. Higher-priority queues (more I/O-bound tasks) get quanta up to 800 ms; lower-priority queues (more CPU-bound tasks) get quanta as small as 10 ms.]

Scheduler Runqueue
• A scheduler runqueue is the list of tasks that are runnable on a particular CPU
• An rq structure maintains a linked list of those tasks
• The runqueues are maintained as an array, runqueues, indexed by CPU number
• The rq keeps a reference to its idle task
  - The idle task for a CPU is never on the scheduler runqueue for that CPU (it's always the last choice)
• Access to a runqueue is serialized by acquiring and releasing rq->lock

Basic Scheduling Algorithm
• Find the highest-priority queue with a runnable process
• Find the first process on that queue
• Calculate its quantum size
• Let it run
• When its time is up, put it on the expired list
• Repeat

The Highest Priority Process
• There is a bitmap indicating which queues have processes that are ready to run
• Find the first bit that is set:
  - 140 queues fit in five 32-bit integers
  - Only a few compares to find the first one that is non-zero
  - Hardware instruction to find the first 1-bit (bsfl on Intel)
• Time depends on the number of priority levels, not the number of processes

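As a rough illustration of that lookup (not the kernel's exact sched_find_first_bit; the helper name is made up), the search is at most five word compares plus one find-first-set:

    #include <stdint.h>

    #define MAX_PRIO     140
    #define BITMAP_WORDS ((MAX_PRIO + 31) / 32)   /* 5 words of 32 bits */

    /* Return the index of the lowest set bit in a 140-bit priority
     * bitmap, or MAX_PRIO if no bit is set. The kernel uses
     * sched_find_first_bit(), which maps onto an instruction such as
     * bsfl on x86; __builtin_ctz is the portable GCC equivalent. */
    static int find_first_runnable_prio(const uint32_t bitmap[BITMAP_WORDS])
    {
        for (int w = 0; w < BITMAP_WORDS; w++) {      /* at most 5 compares */
            if (bitmap[w] != 0)
                return w * 32 + __builtin_ctz(bitmap[w]);
        }
        return MAX_PRIO;                              /* nothing runnable */
    }
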
Scheduling Components
• Static Priority
• Sleep Average
• Bonus
• Interactivity Status
• Dynamic Priority

Static Priority
• Each task has a static priority that is set based on the nice value specified by the task
  - static_prio in task_struct
• The nice value is in a range of 0 to 39, with the default value being 20; only privileged tasks can set the nice value below 20
  - (this is the user-visible nice range of -20..19 shifted by +20)
• For normal tasks, the static priority is 100 + the nice value
• Each task also has a dynamic priority that is set based on a number of factors
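
A minimal sketch of that mapping, assuming the slide's convention that the nice value has already been shifted into 0..39; the function name is illustrative:

    #define MAX_RT_PRIO 100          /* priorities 0..99 are real-time */

    /* Static priority for a normal task: 100 + shifted nice, i.e. 100..139. */
    static int static_priority(int shifted_nice)
    {
        return MAX_RT_PRIO + shifted_nice;
    }
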
Sleep Average
• Interactivity heuristic: sleep ratio
  - Mostly sleeping: I/O bound
  - Mostly running: CPU bound
• Sleep ratio approximation
  - sleep_avg in the task_struct
  - Range: 0 .. MAX_SLEEP_AVG (about 1 second, matching the sleep-time table at the end)
  - When a process wakes up (is made runnable), recalc_task_prio adds in how many ticks it was sleeping (blocked), up to the maximum value (MAX_SLEEP_AVG)
  - When a process is switched out, schedule subtracts the number of ticks that the task actually ran (without blocking)

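A simplified sketch of that bookkeeping; the type and function names are illustrative, MAX_SLEEP_AVG is assumed to be about one second's worth of ticks, and the real recalc_task_prio applies extra scaling omitted here:

    #define MAX_SLEEP_AVG 1000UL     /* assumed: ~1 second worth of ticks */

    struct task_info {
        unsigned long sleep_avg;     /* recent "sleepiness" of the task */
    };

    /* On wakeup: credit the time the task spent blocked, capped. */
    static void credit_sleep(struct task_info *p, unsigned long ticks_slept)
    {
        p->sleep_avg += ticks_slept;
        if (p->sleep_avg > MAX_SLEEP_AVG)
            p->sleep_avg = MAX_SLEEP_AVG;
    }

    /* On switch-out: charge the time the task actually ran. */
    static void charge_run(struct task_info *p, unsigned long ticks_ran)
    {
        p->sleep_avg = (p->sleep_avg > ticks_ran) ? p->sleep_avg - ticks_ran : 0;
    }
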
Bonus and Dynamic Priority
    /* We scale the actual sleep average
     * [0 .... MAX_SLEEP_AVG] into the
     * -5 ... 0 ... +5 bonus/penalty range. */
• Dynamic priority (prio in task_struct) is calculated in effective_prio from the static priority and a bonus (which in turn is derived from sleep_avg)
• Roughly speaking, the scaled bonus is a number in [-5, +5] that reflects what fraction of the time the process was sleeping recently; 0 is neutral, +5 helps, -5 hurts
• In terms of the raw 0..10 bonus derived from sleep_avg:

    DP = SP - bonus + 5
    DP = min(139, max(100, DP))

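A sketch of this calculation, assuming the raw bonus comes from a linear scaling of sleep_avg with MAX_SLEEP_AVG of about one second; effective_prio is the name the slide mentions, but this body is only an approximation of it:

    #define MAX_RT_PRIO   100
    #define MAX_PRIO      140
    #define MAX_BONUS     10
    #define MAX_SLEEP_AVG 1000UL     /* assumed, see the sleep-time table */

    /* Dynamic priority: DP = SP - bonus + 5, clamped to 100..139. */
    static int effective_priority(int static_prio, unsigned long sleep_avg)
    {
        int bonus = (int)(sleep_avg * MAX_BONUS / MAX_SLEEP_AVG);   /* 0..10 */
        int prio  = static_prio - bonus + MAX_BONUS / 2;

        if (prio < MAX_RT_PRIO)
            prio = MAX_RT_PRIO;
        if (prio > MAX_PRIO - 1)
            prio = MAX_PRIO - 1;
        return prio;
    }
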
Calculating Time Slices
• time_slice in the task_struct
• Calculate the quantum (in ms), where SP is the static priority:
  - if (SP < 120):  Quantum = (140 - SP) * 20
  - if (SP >= 120): Quantum = (140 - SP) * 5
• Higher-priority processes get longer quanta
  - Basic idea: important processes should run longer
  - As we will see, other mechanisms are used for quick interactive response

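The formula translates directly into code; a small sketch with an illustrative function name:

    /* Time slice (ms) from static priority; higher-priority
     * (lower-numbered) tasks get larger quanta. */
    static unsigned int time_slice_ms(int static_prio)
    {
        if (static_prio < 120)
            return (140 - static_prio) * 20;    /* e.g. SP 100 -> 800 ms */
        return (140 - static_prio) * 5;         /* e.g. SP 139 -> 5 ms   */
    }
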
Typical Quanta

    Priority   Static Pri   Niceness   Quantum
    Highest    100          -20        800 ms
    High       110          -10        600 ms
    Normal     120            0        100 ms
    Low        130           10         50 ms
    Lowest     139           19          5 ms

Interactive Processes
• A process is considered interactive if
    bonus - 5 >= (static priority / 4) - 28
• Low-priority processes have a hard time becoming interactive:
  - A task with the highest static priority (100) becomes interactive when its average sleep time exceeds 200 ms
  - A task with the default static priority (120) becomes interactive when its average sleep time exceeds 700 ms
  - A task with the lowest priority (139) can never become interactive
• The higher the bonus a task is getting and the higher its static priority (i.e., the lower the number), the more likely it is to be considered interactive

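The test written out (integer division matters; the function name is illustrative):

    /* A task is interactive when bonus - 5 >= static_prio / 4 - 28.
     * E.g. SP 100 needs bonus >= 2 (sleep > 200 ms); SP 139 needs
     * bonus >= 11, which is impossible. */
    static int is_interactive(int static_prio, int bonus /* 0..10 */)
    {
        return (bonus - 5) >= (static_prio / 4 - 28);
    }
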
Using Quanta
• At every timer tick (in scheduler_tick), decrement the quantum of the currently running process (time_slice)
• If the count reaches zero, the process has used up its quantum; check its interactive status:
  - If non-interactive, put it aside on the expired list
  - If interactive, put it at the end of the active list
• Exceptions: don't put it back on the active list if
  - a higher-priority process is on the expired list, or
  - an expired task has been waiting more than STARVATION_LIMIT
• If there's nothing else at that priority, it will run again immediately
  - Of course, by running so much, its bonus will go down, and so will its priority and its interactive status

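A compilable sketch of that per-tick decision. The struct and helper names are stand-ins for the kernel's real runqueue and list code (declared here but left unimplemented), and time_slice_ms is the quantum formula from the earlier slide:

    #include <stdbool.h>

    struct task { int static_prio; int time_slice; };
    struct runqueue;                     /* opaque in this sketch */

    /* Hypothetical stand-ins for the kernel's runqueue/list handling. */
    bool task_is_interactive(const struct task *p);
    bool expired_starving(const struct runqueue *rq);
    bool better_prio_on_expired(const struct runqueue *rq, const struct task *p);
    void requeue_active(struct runqueue *rq, struct task *p);
    void move_to_expired(struct runqueue *rq, struct task *p);
    unsigned int time_slice_ms(int static_prio);

    void on_timer_tick(struct task *p, struct runqueue *rq)
    {
        if (--p->time_slice > 0)
            return;                                 /* quantum not used up */

        p->time_slice = time_slice_ms(p->static_prio);  /* refill for next run */

        if (task_is_interactive(p) &&
            !expired_starving(rq) &&                /* no expired task starving */
            !better_prio_on_expired(rq, p))         /* no higher-priority expired task */
            requeue_active(rq, p);                  /* back to the end of the active list */
        else
            move_to_expired(rq, p);                 /* wait for the array switch */
    }
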
Avoiding Starvation
• The system only runs processes from the active queues, and puts them on the expired queues when they use up their quanta
• When a priority level of the active array is empty, the scheduler looks for the next-highest priority queue
• After running all of the active queues, the active and expired arrays are swapped
  - There are pointers to the two arrays; at the end of a cycle, the pointers are switched

The Priority Arrays

struct prio_array {
    unsigned int     nr_active;     /* number of runnable tasks in this array */
    unsigned long    bitmap[5];     /* one bit per priority level (140 bits) */
    struct list_head queue[140];    /* one queue per priority level */
};

struct rq {
    spinlock_t         lock;
    unsigned long      nr_running;
    struct prio_array  *active, *expired;   /* point into arrays[] */
    struct prio_array  arrays[2];
    struct task_struct *curr, *idle;
};

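A sketch of how a task is linked into a priority array, in kernel style and assuming the usual <linux/list.h> and <linux/bitops.h> primitives; the real enqueue_task also does some accounting not shown:

    static void enqueue_task(struct task_struct *p, struct prio_array *array)
    {
        list_add_tail(&p->run_list, &array->queue[p->prio]);  /* end of its priority queue */
        __set_bit(p->prio, array->bitmap);                    /* mark the level non-empty */
        array->nr_active++;
        p->array = array;                                     /* remember which array we're on */
    }
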
Swapping Arrays

struct prio_array *array = rq->active;

if (array->nr_active == 0) {
    rq->active  = rq->expired;
    rq->expired = array;
}

Why Two Arrays?
• Why is it done this way?
  - It avoids the need for traditional aging
• Why is aging bad?
  - It's O(n) at each clock tick

The Traditional Algorithm

for (pp = proc; pp < proc + NPROC; pp++) {
    if (pp->prio != MAX)            /* age every process in the table */
        pp->prio++;
    if (pp->prio > curproc->prio)   /* reschedule if anyone now outranks the current process */
        reschedule();
}

Every process is examined, quite frequently. (This code is taken almost verbatim from 6th Edition Unix, circa 1976.)

Linux is More Efficient
• Processes are touched only when they start or stop running
  - That's when we recalculate priorities, bonuses, quanta, and interactive status
• There are no loops over all processes or even over all runnable processes

Real-Time Scheduling
• Linux has soft real-time scheduling
  - No hard real-time guarantees
• All real-time processes are higher priority than any conventional process
• Processes with priorities [0, 99] are real-time
  - Saved in rt_priority in the task_struct
  - The scheduling priority of a real-time task is 99 - rt_priority
• A process can be converted to real-time via the sched_setscheduler system call

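From user space, the conversion looks like this (a small standalone example; real-time priority 50 is an arbitrary choice, and the call needs root or CAP_SYS_NICE):

    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
        struct sched_param param = { .sched_priority = 50 };

        /* pid 0 means "the calling process" */
        if (sched_setscheduler(0, SCHED_FIFO, &param) == -1) {
            perror("sched_setscheduler");
            return 1;
        }
        printf("now SCHED_FIFO, rt priority %d\n", param.sched_priority);
        return 0;
    }
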
Real-Time Policies
• First-in, first-out: SCHED_FIFO
  - Static priority
  - Process is only preempted for a higher-priority process
  - No time quanta; it runs until it blocks or yields voluntarily
  - RR within the same priority level
• Round-robin: SCHED_RR
  - As above, but with a time quantum (800 ms)
• Normal processes have the SCHED_OTHER scheduling policy

Multiprocessor Scheduling
• Each processor has a separate run queue
• Each processor only selects processes from its own queue to run
• Yes, it's possible for one processor to be idle while others have jobs waiting in their run queues
• Periodically, the queues are rebalanced: if one processor's run queue is too long, some processes are moved from it to another processor's queue

Locking Runqueues
• To rebalance, the kernel sometimes needs to move processes from one runqueue to another
  - This is actually done by special kernel threads
• Naturally, the runqueue must be locked before this happens
• The kernel always locks runqueues in order of increasing indexes
  - Why? Deadlock prevention!

Processor Affinity
• Each process has a bitmask saying what CPUs it can run on
  - Normally, of course, all CPUs are listed
  - Processes can change the mask
• The mask is inherited by child processes (and threads), thus tending to keep them on the same CPU
• Rebalancing does not override affinity

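For example, a process can pin itself to CPU 0 with sched_setaffinity (a small standalone example; any forked children will inherit the mask):

    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
        cpu_set_t mask;

        CPU_ZERO(&mask);
        CPU_SET(0, &mask);                        /* allow CPU 0 only */

        if (sched_setaffinity(0, sizeof(mask), &mask) == -1) {
            perror("sched_setaffinity");
            return 1;
        }
        printf("pinned to CPU 0\n");
        return 0;
    }
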
Load Balancing
• To keep all CPUs busy, load balancing pulls tasks from busy runqueues to idle runqueues
• If schedule finds that a runqueue has no runnable tasks (other than the idle task), it calls load_balance
• load_balance is also called via the timer
  - scheduler_tick calls rebalance_tick
  - Every tick when the system is idle
  - Every 100 ms otherwise

Load Balancing
• load_balance looks for the busiest runqueue (most runnable tasks) and takes a task that is (in order of preference):
  - inactive (likely to be cache cold)
  - high priority
• load_balance skips tasks that are:
  - likely to be cache warm (ran within the last cache_decay_ticks)
  - currently running on a CPU
  - not allowed to run on the current CPU (as indicated by the cpus_allowed bitmask in the task_struct)

Optimizations
• If next is a kernel thread, borrow the MM mappings from prev
  - User-level MMs are unused
  - Kernel-level MMs are the same for all kernel threads
• If prev == next
  - Don't context switch

Sleep Time and Bonus

    Average Sleep Time (ms)   Bonus   Time Slice Granularity
    000 to 100                0       5120
    100 to 200                1       2560
    200 to 300                2       1280
    300 to 400                3       640
    400 to 500                4       320
    500 to 600                5       160
    600 to 700                6       80
    700 to 800                7       40
    800 to 900                8       20
    900 to 999                9       10
    1 second                  10      10

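The bonus column follows from a linear scaling of the sleep average; a sketch, assuming MAX_SLEEP_AVG of 1000 ms:

    #define MAX_BONUS     10
    #define MAX_SLEEP_AVG 1000UL      /* ms; assumed, matches the table above */

    /* 0-99 ms -> 0, 100-199 ms -> 1, ..., 900-999 ms -> 9, 1000 ms -> 10 */
    static int sleep_avg_to_bonus(unsigned long sleep_avg_ms)
    {
        if (sleep_avg_ms > MAX_SLEEP_AVG)
            sleep_avg_ms = MAX_SLEEP_AVG;
        return (int)(sleep_avg_ms * MAX_BONUS / MAX_SLEEP_AVG);
    }
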
