Ksync Notes
There are two main aspects of OS synchronization design: (i) providing synchronization constructs for user-space processes and threads, and (ii) meeting the synchronization requirements of the OS software itself. In this course, we deal with the latter aspect. However, user-space synchronization is an interesting design question in its own right: what kind of kernel support (if any) is required for user-space synchronization? In Linux, system call support for user-space synchronization is provided by the futex construct (see the futex(2) man page).
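As a taste of that interface, below is a minimal sketch of a user-space lock built on futex(2). glibc exposes no futex() wrapper, so the raw system call is used; the futex_wait()/futex_wake() helpers and the simple 0 (free) / 1 (locked) protocol are illustrative, not what production mutexes actually do.

#include <linux/futex.h>
#include <stdatomic.h>
#include <sys/syscall.h>
#include <unistd.h>

static void futex_wait(atomic_int *addr, int expected_val)
{
    /* Sleep in the kernel only if *addr still equals expected_val. */
    syscall(SYS_futex, addr, FUTEX_WAIT, expected_val, NULL, NULL, 0);
}

static void futex_wake(atomic_int *addr, int nwaiters)
{
    syscall(SYS_futex, addr, FUTEX_WAKE, nwaiters, NULL, NULL, 0);
}

static void lock(atomic_int *futex_word)
{
    int expected = 0;
    /* Uncontended case: one atomic compare-exchange, no system call. */
    while (!atomic_compare_exchange_strong(futex_word, &expected, 1)) {
        futex_wait(futex_word, 1);   /* contended: sleep in the kernel */
        expected = 0;                /* a failed CAS overwrote 'expected' */
    }
}

static void unlock(atomic_int *futex_word)
{
    atomic_store(futex_word, 0);
    futex_wake(futex_word, 1);       /* wake at most one waiter */
}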
From the synchronization perspective of the kernel code itself, the following questions arise.
1. When and why is locking necessary?
2. Is it required on a uni-processor system?
3. How is kernel synchronization different from user-space synchronization?
Resources can be shared across multiple contexts: process context (system call handlers), interrupt or bottom-half contexts, and kernel threads. The synchronization requirements differ depending on which of these contexts are involved. Let us analyze the different setups and scenarios one by one.
Uni-processor synchronization
• A system with a single CPU can execute only one context at a time. So, why use locks? Consider a scenario where process P1, executing a system call S1 in kernel context, is switched out in favor of another process P2, which also executes in kernel context and accesses shared state used in the implementation of S1.
• Can we avoid locks? If we disable preemption while executing system calls (a.k.a. non-preemptible kernels), the context switch is avoided.
• What about interrupts arriving during system call execution? If there is shared state between the system call handler and an interrupt handler, should we use locks? Note that interrupt handlers cannot be switched out by process contexts. As a result, a deadlock could arise: the interrupt handler would spin forever on a lock held by the interrupted system call handler. What should be the solution? Oops ... we have to disable interrupts.
• What if shared state is accessed by two different interrupt handlers? Locking would not help ... interrupt disabling is required.
To summarize, the synchronization technique should be chosen based on which entities access the shared state. It is advisable to use the lowest possible level of serialization for better responsiveness of the system. For example, disabling interrupts implicitly disables preemption, but it can delay interrupt processing and should be avoided where possible. A sketch of how these levels map onto Linux primitives is shown below.
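The following kernel-context sketch uses the actual Linux primitives; shared_state and the two function names are illustrative.

static int shared_state;     /* illustrative state shared inside the kernel */

/* Level 1: state shared only among process contexts. */
static void touch_process_shared(void)
{
    preempt_disable();       /* no context switch while we are inside */
    shared_state++;
    preempt_enable();
}

/* Level 2: state shared with an interrupt handler on the same CPU. */
static void touch_irq_shared(void)
{
    unsigned long flags;
    local_irq_save(flags);   /* disable local interrupts (implies no preemption) */
    shared_state++;
    local_irq_restore(flags);
}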
Multi-processor synchronization
Synchronization requirements in a multi-processor system become trickier.
• With state shared between process contexts (system call handlers), can preemption disabling work? Obviously not, because another process can execute on another processor and access the shared resource concurrently. Using locks becomes inevitable.
• What about state shared between a process context and an interrupt context? Does interrupt disabling solve the problem? Why or why not? A processor can disable interrupts only on the local processor, which means the interrupt can still be serviced on other processors. So, the solution is: locking + local interrupt disabling (a sketch follows this list).
• Similarly, protecting shared state between multiple interrupt handlers requires combining interrupt disabling with locking.
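The combination looks as follows in Linux-style code; DEFINE_SPINLOCK() and spin_lock_irqsave() are the actual kernel primitives, while shared_counter and the function names are illustrative.

static DEFINE_SPINLOCK(shared_lock);
static int shared_counter;

/* Process context: take the lock and disable local interrupts, so the
 * local interrupt handler cannot deadlock against us while other CPUs
 * are kept out by the lock itself. */
void process_side_update(void)
{
    unsigned long flags;

    spin_lock_irqsave(&shared_lock, flags);
    shared_counter++;
    spin_unlock_irqrestore(&shared_lock, flags);
}

/* Interrupt handler: the lock alone keeps other CPUs out; it suffices
 * here if this handler is the only interrupt context touching the state. */
void irq_side_update(void)
{
    spin_lock(&shared_lock);
    shared_counter++;
    spin_unlock(&shared_lock);
}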
There is a basic bottleneck in accessing a synchronization construct from different processors: cache coherence overheads cause performance degradation. From the OS design point of view, minimizing lock contention across processors is always desirable (but may not always be possible). An example alternative design is per-CPU data structures, where every CPU has a local copy and the OS takes care of the consistency requirements across the different copies, as sketched below.
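A per-CPU counter sketch using the actual Linux per-CPU API ('hits' is an illustrative name):

DEFINE_PER_CPU(long, hits);

void record_hit(void)
{
    this_cpu_inc(hits);          /* update this CPU's copy; no shared lock */
}

long total_hits(void)
{
    long sum = 0;
    int cpu;

    /* Summing the copies gives a slightly stale but lock-free total. */
    for_each_possible_cpu(cpu)
        sum += per_cpu(hits, cpu);
    return sum;
}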
Synchronization techniques
There are many ways to approach the problem of meeting synchronization requirements on shared resources.
1. Every execution entity checks for the availability of an entry token in a tight loop until it acquires the token. Examples: spinlocks, rwlocks, etc.
2. As a modification of the tight-loop behavior, an execution entity may sleep if the resource is busy and be woken up later when the resource is free. Examples: mutexes, semaphores, etc.
3. What about not waiting for the ‘entry token’ at all? Examples: RCU, RLU and transactional memory.
While using locks, one should bear in mind the tradeoffs between the different types of locks. Possible tradeoffs: lock acquisition overhead vs. lock acquisition latency, lock granularity vs. code complexity, and CPU cycles wasted in busy-waiting vs. context switch overhead. Most operating systems (including Linux) use architecture support as a building block for their synchronization constructs. However, in theory it is possible to implement locks without assuming any architecture support, e.g., Peterson's algorithm (refer to standard OS textbooks: Galvin, Tanenbaum, etc.).
Spinlocks
1. An execution context “spins” on the lock till it grabs hold of it.

A wrong implementation!

lock(spinlock *sl)
{
1:  while (*sl != 0)
        ;            /* spin until the lock looks free */
2:  *sl = 1;         /* grab the lock */
}

unlock(spinlock *sl)
{
    *sl = 0;
}
Lines #1 and #2 in the lock() procedure can be interleaved across contexts, resulting in multiple entities holding the lock. One way to avoid the interleaving is to execute the two operations (checking for lock availability and setting the lock value) in an atomic manner. An atomic operation is an uninterruptible and indivisible operation. One should be careful about atomicity while writing code that can be executed concurrently by more than one execution context (i.e., a critical section). If a critical section consists of many instructions, atomicity cannot be guaranteed, because an interrupt might leave the machine in an intermediate state. Similarly, an operation that can be divided into sub-operations is non-atomic if the state resulting from a sub-operation is visible to another CPU. For example, an atomic increment of a memory location must ensure that other CPUs access the location either before the increment starts or after it completes.
A critical section consisting of multiple C instructions is therefore non-atomic. To get an atomic check-and-set, operating systems rely on atomic hardware instructions. On x86, the compare-and-exchange instruction cmpxchg destination, source has the following semantics (in pseudocode):

if eax == val(destination)
then
    zeroflag = 1
    destination = source
else
    zeroflag = 0
    eax = destination
The destination operand is compared with the eax register. If they are equal, the content of the source operand is copied into the destination and the zero flag bit of the eflags register is set. Otherwise, the content of the destination is copied into the eax register and the zero flag is cleared. Can we rewrite the spinlock code using the cmpxchg instruction? Note that cmpxchg by itself is not atomic across processors and should be prefixed with the lock prefix to ensure atomicity.
A spinlock can be easily implemented using this basic construct as shown below.
lock(spinlock *sl)
{
1:  mov ecx, 1             ; new value: lock held
    mov eax, 0             ; expected value: lock free
    lock cmpxchg [sl], ecx ; if *sl == eax then *sl = ecx, ZF = 1
    jnz 1b                 ; ZF = 0: lock was held by someone else, retry
}
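The same construct in C11 atomics, as a rough sketch; on x86, the compiler lowers the compare-exchange below to a lock cmpxchg.

#include <stdatomic.h>

typedef atomic_int spinlock;  /* 0 = free, 1 = held */

void lock(spinlock *sl)
{
    int expected = 0;
    /* Retry until *sl atomically changes from 0 (free) to 1 (held). */
    while (!atomic_compare_exchange_weak(sl, &expected, 1))
        expected = 0;         /* a failed exchange overwrites 'expected' */
}

void unlock(spinlock *sl)
{
    atomic_store(sl, 0);
}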
Read-write locks
1. Multiple readers are allowed, but only one writer.
2. Useful when a significant percentage of accesses do not modify the shared data.
3. Some form of reader count needs to be maintained, as in the sketch below.
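A minimal reader-count sketch, reusing the C11 spinlock from above. This is the classic textbook construction, not the Linux rwlock implementation, and it ignores writer starvation.

struct rwlock {
    spinlock guard;    /* protects nreaders */
    int      nreaders; /* number of active readers */
    spinlock wrlock;   /* held while readers or a writer are inside */
};

void read_lock(struct rwlock *rw)
{
    lock(&rw->guard);
    if (++rw->nreaders == 1)   /* first reader locks writers out */
        lock(&rw->wrlock);
    unlock(&rw->guard);
}

void read_unlock(struct rwlock *rw)
{
    lock(&rw->guard);
    if (--rw->nreaders == 0)   /* last reader lets writers back in */
        unlock(&rw->wrlock);
    unlock(&rw->guard);
}

void write_lock(struct rwlock *rw)   { lock(&rw->wrlock); }
void write_unlock(struct rwlock *rw) { unlock(&rw->wrlock); }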
Read-Copy-Update (RCU)
• A reader can access the shared object without holding any locks, albeit with the condition that it disables preemption and does not sleep.
• A writer creates a separate copy on which the update is performed. After the update, it atomically updates the pointer so that it points to the new copy (see the sketch after this list).
• So what happens to the old copy? When should it be freed? Only after all references taken by the readers are released.
• What is the maximum number of readers that can hold a reference to the old value? (Since readers run with preemption disabled and never sleep, at most one reader can be active per CPU.)
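A sketch of this pattern using the Linux kernel's RCU API; struct config and its field are illustrative, and allocation error handling is omitted.

struct config { int value; };              /* illustrative shared object */
static struct config __rcu *cur_config;

int read_value(void)
{
    int v;

    rcu_read_lock();                       /* no lock; disables preemption */
    v = rcu_dereference(cur_config)->value;
    rcu_read_unlock();
    return v;
}

void update_value(int new_value)
{
    struct config *newc = kmalloc(sizeof(*newc), GFP_KERNEL);
    struct config *oldc;

    newc->value = new_value;               /* update a private copy */
    oldc = rcu_dereference_protected(cur_config, 1);
    rcu_assign_pointer(cur_config, newc);  /* atomic pointer switch */
    synchronize_rcu();                     /* wait out pre-existing readers */
    kfree(oldc);                           /* old copy is now unreachable */
}

Note that concurrent writers would additionally need mutual exclusion among themselves; the sketch assumes a single writer.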
Semaphores
• Semaphores are waiting locks. When an execution context wants to acquire a lock held by another execution entity, it puts itself into a sleep-wait state.
• On a lock release, one of the entities waiting for the lock is woken up.
• In practice, the Linux kernel extensively uses the following special semaphore realizations: mutexes (binary semaphores) and read-write semaphores.
An example semaphore data structure (similar to one used in the kernel) is shown here.
struct semaphore {
    spinlock semlock;
    int count;
    struct list_head waitq;
};
For a mutex implementation, the count variable (count) is initialized to one. For a generic counting semaphore, it can be set to a value indicating the maximum number of execution contexts allowed in the critical section. A positive value of count implies a free lock, while any other value indicates that the lock is in use. There are two main operations on a semaphore: down() (lock) and up() (unlock). Note that down() and up() require atomicity, as they can be executed concurrently by multiple execution entities. A spinlock (semlock) is used to ensure atomicity of the semaphore operations. Further, a wait queue (waitq) is associated with every semaphore to account for execution entities waiting (sleeping) on the semaphore. Both down() and up() acquire semlock and simultaneously disable interrupts while manipulating the elements of the semaphore structure.
down(): If the count value is greater than zero, the semaphore is granted after decrementing count. Otherwise, the context puts itself to sleep: it adds itself to waitq, releases semlock and relinquishes the CPU. When the context is woken up (perhaps by a signal, or genuinely by a release of the semaphore), it first reacquires semlock and checks whether it was woken up by a semaphore release. If so, the semaphore is granted to this execution context; otherwise it re-enters the self-sleep mode.
up(): After acquiring semlock, the releasing context checks whether the wait queue (waitq) is empty. If it is, the count variable is simply incremented before returning from the function. Otherwise, the first process in the wait queue is woken up, and the up() routine finishes by releasing semlock. A sketch of both operations follows.
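The sketch below follows the description above and loosely mirrors the kernel's own implementation; sleep_releasing() and wake_waiter() are illustrative stand-ins for the scheduler and wait-queue machinery, while the list helpers are the actual Linux list API.

struct sem_waiter {
    struct list_head list;
    bool granted;                  /* set by up() on a direct hand-off */
};

void down(struct semaphore *sem)
{
    struct sem_waiter self = { .granted = false };
    unsigned long flags;

    spin_lock_irqsave(&sem->semlock, flags);
    if (sem->count > 0) {
        sem->count--;              /* free: grant immediately */
        spin_unlock_irqrestore(&sem->semlock, flags);
        return;
    }
    list_add_tail(&self.list, &sem->waitq);
    while (!self.granted) {
        /* Release semlock and relinquish the CPU; reacquire semlock
         * when woken up. If woken by a signal, 'granted' is still
         * false and we re-enter the sleep. */
        sleep_releasing(&sem->semlock, &flags);
    }
    spin_unlock_irqrestore(&sem->semlock, flags);
}

void up(struct semaphore *sem)
{
    unsigned long flags;

    spin_lock_irqsave(&sem->semlock, flags);
    if (list_empty(&sem->waitq)) {
        sem->count++;              /* nobody waiting: just return the count */
    } else {
        struct sem_waiter *w =
            list_first_entry(&sem->waitq, struct sem_waiter, list);
        list_del(&w->list);
        w->granted = true;         /* hand the semaphore over directly */
        wake_waiter(w);            /* illustrative: wake the sleeping task */
    }
    spin_unlock_irqrestore(&sem->semlock, flags);
}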