12. Operating Systems - Part 2

The document outlines key concepts related to IT infrastructure and operating systems, including system calls, APIs, device drivers, memory management, and failover clustering. It discusses the importance of managing hardware resources, ensuring high availability through clustering, and enhancing operating system performance and security through patching, hardening, and user account management. Additionally, it emphasizes the need for proper configuration and recovery strategies to maintain system integrity and performance.

Uploaded by samir.elsagheer
Copyright © All Rights Reserved

IT Infrastructure Architecture
Infrastructure Building Blocks and Concepts

Operating Systems – Part 2 (chapter 12)
APIs and system calls

• System calls are programming functions


 Provide a hardware-independent interface to tasks the operating system can perform
for applications

• Example:

 int read(int handle, void *buffer, int nbyte);

 translates into the system call

 READ(FILE_HANDLE, DESTINATION_DATA_POINTER, NUMBER_OF_BYTES)
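As a rough illustration of the call above, Python's os module exposes thin wrappers around the same kind of low-level calls. This is a minimal sketch and not part of the original slides: the helper name read_first_bytes and the temporary demo file are hypothetical.

```python
import os
import tempfile


def read_first_bytes(path: str, nbyte: int) -> bytes:
    """Read up to nbyte bytes from path using the low-level read call."""
    handle = os.open(path, os.O_RDONLY)  # obtain a file handle (descriptor)
    try:
        return os.read(handle, nbyte)    # counterpart of read(handle, buffer, nbyte)
    finally:
        os.close(handle)                 # always release the handle


# Hypothetical demo file, created only for this example.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello, system call")
    demo_path = tmp.name

first_five = read_first_bytes(demo_path, 5)
os.remove(demo_path)
print(first_five)
```

The application only supplies a handle and a byte count; locating the disk blocks and copying them into memory is left to the operating system.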


APIs and system calls

• The operating system takes care of:


 Looking up the file in a file allocation table
 Looking up the disk blocks on disk
 Instructing the disk controller to fetch the needed disk blocks
 Copying the disk blocks to memory
 Providing a pointer to the disk blocks in memory
APIs and system calls

• System calls are grouped and presented to application processes as Application Programming Interfaces (APIs)
• APIs describe the available system calls in an operating system and how
they can be used by programmers in their applications
• Each operating system has its own API
 UNIX and Linux use the POSIX standard
 Windows has its own API
Device drivers

• The operating system manages all hardware


• I/O devices are controlled using device drivers
 Pieces of software that interact with the device's
hardware (like an Ethernet network adapter or a SAS
disk adapter)
 They provide an Application Programming Interface
(API) to the operating system
Memory management

• The operating system:


 Allocates and de-allocates memory on behalf of applications
 Manages what happens when the amount of requested memory exceeds the physical
amount of memory

• Memory management includes:


 Cache management
 Paging
 High volume data transfers
 Memory management units (MMUs)
 Thin memory provisioning (memory overcommitting)
 Direct Memory Access (DMA)

• The operating system takes care of all of this and just provides chunks of
memory to applications
DMA

• Direct Memory Access (DMA) allows devices to access main memory directly, without CPU assistance
 Devices can transfer data to or from memory much faster than if the data had to go through the CPU
 It frees up the CPU to perform other tasks
Paging and swapping

• Paging and swapping are both memory management techniques


• Paging: a process's virtual address space is divided into blocks called pages
 When main memory is low, pages can be moved from main memory to disk, called "paging out"
 When a process needs a page that is paged out, it is read from disk and put back into main memory, a process called "paging in"

• Swapping: Entire processes are moved between main memory and disk
• Paging is usually not a problem, but swapping should be avoided as much
as possible
 Swapping makes the system extremely slow
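The page arithmetic behind paging can be sketched as follows. This is an illustrative example only: the function names are hypothetical, and the default page size is taken from Python's mmap.PAGESIZE, which reports the page size of the machine running the code.

```python
import mmap
from typing import Tuple


def pages_needed(nbytes: int, page_size: int = mmap.PAGESIZE) -> int:
    """Number of pages required to hold nbytes, rounded up."""
    if nbytes < 0:
        raise ValueError("nbytes must not be negative")
    return -(-nbytes // page_size)  # ceiling division


def page_and_offset(virtual_address: int, page_size: int = mmap.PAGESIZE) -> Tuple[int, int]:
    """Split a virtual address into (page number, offset within the page)."""
    return divmod(virtual_address, page_size)


# Assuming 4 KiB pages for the demo values below.
print(pages_needed(4097, page_size=4096))
print(page_and_offset(8200, page_size=4096))
```

Because memory is handed out and moved in whole pages, even one byte beyond a page boundary costs a full extra page.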
Shells, CLIs and GUIs

• A shell provides a user interface to the operating system


 The primary purpose of shells is to launch other programs, started by end users or
scripts

• Two types of shells:


 Command-Line Interfaces (CLIs): the user types commands at a command prompt, for example in UNIX shells (bash, sh, csh) or Windows' cmd.exe (also known as a DOS box)
 Graphical User Interfaces (GUIs): the user uses a mouse to click on icons or buttons, for example in Microsoft Windows and the X Window System (UNIX and Linux)
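The core job of a shell, launching another program and collecting its result, can be sketched in a few lines. This is a simplified illustration, not how any particular shell is implemented; the launch helper is a hypothetical name, and the child program is simply the Python interpreter itself.

```python
import subprocess
import sys
from typing import List, Tuple


def launch(command: List[str]) -> Tuple[int, str]:
    """Start a program, wait for it to finish, return (exit code, output)."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return completed.returncode, completed.stdout


# Launch a child process, as a shell would after the user types a command.
code, output = launch([sys.executable, "-c", "print('hello from a child process')"])
print(code, output.strip())
```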
Operating system configuration

• The configuration of an operating system is stored in an operating-system-specific database or in text files
• Examples:
 Windows registry
 Files in the Linux /etc directory
 AIX Object Data Manager (ODM) database

• For the most commonly used configuration parameters, user-friendly tools are provided
 These tools still edit the underlying text files or database, but that is hidden from the user
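Many text-based configuration files use simple "key = value" lines with "#" comments. The sketch below parses such a file; it assumes that generic format, and the setting names in SAMPLE are invented for illustration rather than taken from a real /etc file.

```python
from typing import Dict


def parse_config(text: str) -> Dict[str, str]:
    """Parse 'key = value' lines, ignoring blank lines and '#' comments."""
    settings: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        key, separator, value = line.partition("=")
        if not separator:
            raise ValueError("not a key = value line: " + raw_line)
        settings[key.strip()] = value.strip()
    return settings


# Hypothetical configuration content, for demonstration only.
SAMPLE = """
# example settings
hostname = node1
max_open_files = 4096   # inline comment
"""

config = parse_config(SAMPLE)
print(config)
```

A user-friendly configuration tool would read and write the same file, presenting the values in a form instead of raw text.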
Operating system availability
Failover clustering

• A failover cluster is:


 A group of independent servers (known as “nodes”) running identical operating systems
 Connected via a network
 Controlled by cluster software running on the nodes

• Every active application has a standby counterpart available on a passive node
 It sits idle until a failover is needed
 After a failover, this standby application becomes active and provides service to clients

• A failover cluster provides high availability to applications
 It manages each running application within a node as a package of application components, called a resource pool or an application package
Failover clustering

• Examples of cluster software products are:


 Parallel Sysplex (for IBM mainframes)
 HACMP (for IBM AIX UNIX)
 MC/ServiceGuard (for HP-UX UNIX)
 Windows Cluster Service (for Microsoft Windows)
 Heartbeat and Pacemaker (for Linux)
Failover clustering

• A resource pool is the single unit of failover within a cluster. It typically contains:
 Application name and identifier
 Start script for the application
 Stop script for the application
 Monitor script for the application, which continuously checks the status of the application; if the application does not work as expected, a restart or failover is initiated
 Virtual IP address at which the application can be reached
 Mount points for storage (the disks that must be available to the application)
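The contents of a resource pool and the monitor logic can be modelled roughly as below. This is a sketch under stated assumptions, not the data model of any real cluster product: the ResourcePool class, the check_once function, the restart-before-failover policy, and all demo values are hypothetical, and the scripts are represented by plain Python callables.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ResourcePool:
    name: str
    identifier: int
    start: Callable[[], None]    # stands in for the start script
    stop: Callable[[], None]     # stands in for the stop script
    monitor: Callable[[], bool]  # stands in for the monitor script
    virtual_ip: str
    mount_points: List[str] = field(default_factory=list)


def check_once(pool: ResourcePool, restart_attempts: int = 1) -> str:
    """Run the monitor once; try a local restart before asking for a failover."""
    if pool.monitor():
        return "healthy"
    for _ in range(restart_attempts):
        pool.stop()
        pool.start()
        if pool.monitor():
            return "restarted"
    return "failover"


# Demo: an application that is down but comes back after a restart.
state = {"running": False}
pool = ResourcePool(
    name="webshop",
    identifier=1,
    start=lambda: state.update(running=True),
    stop=lambda: state.update(running=False),
    monitor=lambda: state["running"],
    virtual_ip="192.0.2.10",          # documentation-range address
    mount_points=["/data/webshop"],
)
result = check_once(pool)
print(result)
```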
Failover clustering

• A cluster network typically consists of redundant physical Ethernet connections
• It carries heartbeats between all nodes in the cluster
• A heartbeat allows nodes to detect the unavailability of other nodes by regularly sending packets to each other's network interfaces
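Heartbeat-based failure detection usually comes down to a timeout check on the last packet received from each node. The sketch below shows that idea only; the function name, the timestamps, and the five-second timeout are hypothetical values chosen for the example.

```python
from typing import Dict, List


def unavailable_nodes(last_heartbeat: Dict[str, float], now: float, timeout: float) -> List[str]:
    """Return the nodes whose most recent heartbeat is older than timeout seconds."""
    return sorted(
        node for node, seen_at in last_heartbeat.items()
        if now - seen_at > timeout
    )


# node2 was last heard from 10 seconds ago, node1 only 1 second ago.
suspects = unavailable_nodes({"node1": 100.0, "node2": 91.0}, now=101.0, timeout=5.0)
print(suspects)
```

A missing heartbeat does not reveal whether the node or the network failed, which is why the network connections are made redundant.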
Failover clustering

• All nodes are able to access data on shared storage
 Each disk is mounted by only one active application at any given time
 This usage of shared storage is also called ‘shared nothing clustering’

• Distributed Lock Management (DLM) clustering:
 Each cluster node can access the same resource, for instance a disk, at the same time
 A lock mechanism coordinates access to the data to avoid corruption
Failover clustering

• In case of, for instance, a server crash or a power outage, the applications running on that server node are not brought down cleanly
 When the applications are restarted on another node in the cluster, standard crash recovery should take place
 The file system must take care of performing file system checks before mounting
 The application must perform its standard recovery on startup

• Application recovery in case of a failover is identical to an application startup following a server power failure
Failover clustering

• A spare node could be added to a cluster to handle failovers
• This is called an N+1 cluster
 N represents the number of nodes with active applications

• N+2 or N+3 can provide more redundancy
Failover clustering

• An alternative is an N-to-N cluster
• There is no spare idle node
• Each node has some spare capacity
Failover clustering

• The advantage of an N-to-N cluster is that the available hardware is always used
• All memory and CPU cycles of each node are available to the running applications
 When a failover occurs, less memory and fewer CPU cycles are available to the applications, possibly leading to some performance degradation
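Whether the spare capacity in a cluster is enough to absorb a failover can be estimated with simple arithmetic. The sketch below makes strong simplifying assumptions that are not in the slides: all nodes have the same capacity, load is expressed in one abstract unit, and the load of a failed node can be split freely across the surviving nodes. The function name and the numbers are hypothetical.

```python
from typing import List


def survives_single_failure(loads: List[float], capacity_per_node: float) -> bool:
    """True if the total load still fits after any one node is lost."""
    if len(loads) < 2:
        return False  # no other node to fail over to
    surviving_capacity = (len(loads) - 1) * capacity_per_node
    return sum(loads) <= surviving_capacity


# Three nodes at 60% each: two survivors offer 200 units for 180 units of load.
print(survives_single_failure([60, 60, 60], capacity_per_node=100))
# Three nodes at 80% each: 240 units of load no longer fit on two nodes.
print(survives_single_failure([80, 80, 80], capacity_per_node=100))
```

A cluster with a spare node corresponds to a load list with one idle entry, for example [80, 80, 0].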
Voting and quorum disks

• Most clusters contain two nodes


• In a cluster with an even number of nodes, if the nodes are disconnected from each other, the status of the other nodes is unknown to each node
• One of two situations occurs:
 Each node decides that it has lost contact with the active cluster, so both nodes stop (effectively bringing down the cluster)
 Each node decides that the other node must be down, so each node decides to be the new active node in the cluster (known as a split-brain situation)
Voting and quorum disks

• A voting mechanism determines which part of the cluster is faulty and which part of the cluster is working properly
• In a two-node cluster there is no majority possible in a voting system
• Therefore a virtual third node is used
 Usually in the form of a shared disk called a quorum disk
 Installed at a third location
Voting and quorum disks

• The quorum disk acts as one vote in the voting system
• The quorum disk is always assigned to one (and only one) node at any time
• A faulty node releases its quorum assignment automatically
• The working node gets two votes: one from itself, the other from the quorum disk
• The faulty node will stop working, because it has only one vote
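The vote counting described above can be written out as a strict-majority check. This is a minimal sketch of the idea, with hypothetical function names; real cluster software adds timeouts, fencing, and other safeguards that are not modelled here.

```python
def node_votes(owns_quorum_disk: bool) -> int:
    """A node has its own vote, plus one if the quorum disk is assigned to it."""
    return 1 + (1 if owns_quorum_disk else 0)


def has_quorum(votes_held: int, total_votes: int) -> bool:
    """A partition may keep running only with a strict majority of all votes."""
    return 2 * votes_held > total_votes


# Two nodes plus a quorum disk give three votes in total.
TOTAL_VOTES = 3
print(has_quorum(node_votes(owns_quorum_disk=True), TOTAL_VOTES))   # working node
print(has_quorum(node_votes(owns_quorum_disk=False), TOTAL_VOTES))  # faulty node
```

Without the quorum disk the total would be two votes, and neither node could ever reach a majority on its own.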
Cluster-aware applications

• Cluster-aware applications run active instances on multiple nodes


• Examples:
 Oracle RAC (Real Application Cluster)
 Microsoft SQL Server Always On Failover Cluster
 Microsoft Exchange Server

• Cluster awareness shortens switch-over times in case of a failure
 The application does not need to be started on another node before it can service clients

• Cluster-aware applications provide scalability in addition to high availability
 Client requests can be distributed among multiple cluster nodes
 Handle increased demand and traffic by adding additional nodes to the cluster
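One simple way to distribute client requests among nodes is round-robin assignment. The slides do not say which distribution method cluster-aware applications use, so the policy, the function name, and the request and node labels below are assumptions made purely for illustration.

```python
from itertools import cycle
from typing import Dict, List


def distribute(requests: List[str], nodes: List[str]) -> Dict[str, List[str]]:
    """Assign requests to nodes in turn (round-robin)."""
    assignment: Dict[str, List[str]] = {node: [] for node in nodes}
    for request, node in zip(requests, cycle(nodes)):
        assignment[node].append(request)
    return assignment


# Five requests spread over two nodes.
plan = distribute(["r1", "r2", "r3", "r4", "r5"], ["node-a", "node-b"])
print(plan)
```

Adding a third entry to the node list spreads the same requests more thinly, which is the scalability effect described above.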
Operating system performance
Operating system performance

• The performance of an operating system is dependent on:


 The performance of the underlying hardware
 The type of load generated by the applications
 The configuration of the operating system itself

• Some operating system performance can be gained by:


 Increasing memory
 Decreasing the kernel size
Increasing memory

• An operating system should have enough memory to run all applications needed at any time
• When an application needs more than the available memory, memory is freed by moving less-used memory pages to disk (paging)
 Some paging is not bad
Increasing memory
• When memory is really low, an entire application's allocated memory is moved to disk (swapping)
 Swapping ruins the performance of an operating system
 Access to data on disk is at least three orders of magnitude slower than access to data in RAM
• Swapping must be avoided at all times
 Increase memory
 Run fewer (or less demanding) applications
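The effect of that speed gap can be estimated with a weighted average of the two access times. The timings below are hypothetical round numbers chosen only to be three orders of magnitude apart (100 ns for RAM, 100 µs for disk); they are not measurements, and the function name is invented for this example.

```python
def effective_access_time(ram_ns: float, disk_ns: float, disk_fraction: float) -> float:
    """Average access time when a fraction of accesses must go to disk."""
    if not 0.0 <= disk_fraction <= 1.0:
        raise ValueError("disk_fraction must be between 0 and 1")
    return (1.0 - disk_fraction) * ram_ns + disk_fraction * disk_ns


# Assumed timings: RAM 100 ns, disk 100,000 ns.
all_in_ram = effective_access_time(100.0, 100_000.0, 0.0)
one_percent_on_disk = effective_access_time(100.0, 100_000.0, 0.01)
print(all_in_ram, one_percent_on_disk)
```

Under these assumptions, sending just 1% of memory accesses to disk makes the average access roughly eleven times slower, which is why heavy paging and swapping hurt so much.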
Increasing memory

• Increasing memory benefits the operating system's performance
• All memory not used by applications is used to cache disk blocks
 This is the main reason why the performance of operating systems usually increases when memory is added

• Operating systems use highly sophisticated algorithms to optimize disk caching
• In general, tweaking the memory management system of an operating system provides little benefit
Decreasing kernel size

• Some operating systems (like UNIX and Linux) allow tuning kernel
parameters of the operating system
 Unused features (like support for IPv6 or floppy disk drives) can be switched off, leading
to a smaller kernel size

• To create a smaller kernel, the kernel must be recompiled or re-linked


 This is a highly automated, low risk operation on most UNIX and Linux systems
 A restart of the operating system is needed after a kernel rebuild

• Not all operating systems allow rebuilding the kernel


 For instance, the Windows kernel cannot be rebuilt
Decreasing kernel size

• A smaller kernel has the following benefits:


 It simplifies the kernel, which lowers the risk of crashes and reduces the security attack surface
 The kernel must be in memory at all times (it cannot be paged or swapped out), so a smaller kernel frees up memory for applications and disk caching
 Switched-off features don't need patching to keep them up-to-date
 The operating system starts faster when the kernel is small
Operating system security
Patching

• All operating system vendors provide small software updates called patches:
 Fixing bugs or design flaws
 Closing security holes
 Small improvements
Patching

• In general, patches come in three categories:


 Regular patches: meant to fix low-priority software bugs; some regular patches fix multiple bugs at once
 Hot-fixes: repair a bug or flaw in the operating system that needs to be fixed fast, for example to close a security hole or to fix an error introduced by another patch or service pack; hot-fixes should be installed as soon as possible
 Service packs (also known as support packs or patch packs): a collection of patches and hot-fixes that are packed together and can be installed in one deployment; sometimes service packs also introduce new functionality to the operating system
Patching

• It is good practice to install all patches, hot-fixes, and service packs as soon as possible
• Test them before deploying in production
 They could introduce unwanted effects in the infrastructure

• Patches, hot-fixes, and service packs are usually provided with release notes
 They describe what changes are made to the operating system
 Read release notes before installing the patch!
 When a patch or hot-fix has no impact on a specific deployment, it can be skipped
Hardening

• Hardening is a step-by-step process of configuring an operating system to protect it against security threats
• The operating system is stripped down to support only essential services
and processes
 Unnecessary protocols and subsystems are switched off
 Unused user accounts are removed or disabled
 All new and relevant hot-fixes, patches, and service packs are applied

• Harden all operating systems in the infrastructure using a hardened operating system configuration template
 This template is used to instantiate new operating systems
 It ensures that security is optimal and consistent in all deployments
Virus scanning

• Windows, Linux and end user operating systems are vulnerable to viruses
 It is good practice to install a virus scanner

• Virus scanners can have an impact on the performance of the operating system
 The virus scanner should be configured to scan only high-risk files and directories, based on a risk analysis
 For instance, it makes no sense to protect a database table file with a virus scanner
Host-based firewalls

• Most operating systems, including Windows, Linux, and UNIX, provide a built-in host-based firewall
• A host-based firewall is a software firewall
 Part of an operating system
 Protecting an individual host from unwanted network traffic

• Host-based firewalls typically block all incoming network traffic unless a rule allows it


Host-based firewalls

• Rule sets define which type of traffic is allowed to communicate with the
operating system, based on:
 Source and destination IP address
 TCP or UDP port
 The running process sending and/or receiving the network traffic

• It is good practice to enable host-based firewalls on all machines


 Servers
 End user devices
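A rule set with a default-deny policy can be sketched as a list of allow rules that incoming traffic is matched against. This is a simplified model, not the rule syntax of any real firewall: the Rule class, the is_allowed function, and the addresses (taken from the documentation-only ranges) are hypothetical, and matching on the running process is left out.

```python
import ipaddress
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rule:
    source_network: str  # e.g. "192.0.2.0/24"
    protocol: str        # "tcp" or "udp"
    port: int            # destination port on this host


def is_allowed(rules: List[Rule], source_ip: str, protocol: str, port: int) -> bool:
    """Allow traffic only if some rule matches; everything else is blocked."""
    address = ipaddress.ip_address(source_ip)
    for rule in rules:
        if (address in ipaddress.ip_network(rule.source_network)
                and protocol.lower() == rule.protocol.lower()
                and port == rule.port):
            return True
    return False  # default deny


# Allow SSH (TCP port 22) from one management network only.
rules = [Rule("192.0.2.0/24", "tcp", 22)]
print(is_allowed(rules, "192.0.2.15", "tcp", 22))
print(is_allowed(rules, "198.51.100.7", "tcp", 22))
```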
Limiting user accounts

• Operating systems have local user accounts that can log in to the operating system
• Most operating systems also have a special super user account called "root", "supervisor", "admin", or "administrator"
 These accounts have almost unlimited power
 They should be used only to provide permissions to user accounts bound to a physical
person
 Under normal circumstances, these accounts should never be used
 It should be possible to do all work using a user-bound account with sufficient rights
Hashed passwords

• Operating systems should only store hashed passwords


 When a user logs in, her password is hashed
 The hashed password is compared to the stored hash
 If the two are equal the login succeeds

• There is no practical way to calculate or extract the original password from the hashed one
 The hashed passwords should never be disclosed
 When weak passwords are used, brute-force or dictionary attacks can be used to find the passwords
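The login check described above can be sketched with Python's standard library. The slides do not name a hash algorithm, so the choices here are assumptions: PBKDF2 with SHA-256, a random per-user salt, and 100,000 iterations. The function names are hypothetical, and this is an illustration rather than a recommended production configuration.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple


def hash_password(password: str, salt: Optional[bytes] = None,
                  iterations: int = 100_000) -> Tuple[bytes, bytes]:
    """Return (salt, hash); only these are stored, never the password itself."""
    if salt is None:
        salt = os.urandom(16)  # assumed: a fresh random salt per user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest


def verify_password(password: str, salt: bytes, stored_digest: bytes,
                    iterations: int = 100_000) -> bool:
    """Hash the supplied password the same way and compare with the stored hash."""
    _, candidate = hash_password(password, salt, iterations)
    return hmac.compare_digest(candidate, stored_digest)


salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))
print(verify_password("password123", salt, stored))
```

The salt and the deliberately slow hash make dictionary attacks more expensive, but they do not rescue a weak password.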
