Otd Yair

The document discusses performance monitoring on Linux systems. It explains why monitoring is important for system health, troubleshooting, and capacity planning. It describes tools that can be used to monitor key metrics like CPU, memory, storage, and network usage. These include built-in commands like ps and top, files in the /proc filesystem, and utilities from the sysstat project like sar, iostat, and mpstat. Additional third-party tools like Nagios and Cacti are also mentioned.

Uploaded by

doronofek
Copyright
© Attribution Non-Commercial (BY-NC)

Performance Monitoring

on Linux Systems

Yair Gany - SGI


[email protected]

June 25th, 2007


About SGI
• 25 years of innovation in High Performance Computing
• Concentrated solely on Linux systems
• Maker of the largest Linux systems in the world
– Clusters and SSI (Single System Image)
• Focusing on large memory systems


Agenda
• Why ?
• What ?
• How ?
• When ?


Why Monitor ?
• System Health
– Installation issues
– Performance Troubleshooting
• “We thought 4 CPUs would be enough …”
• “2 striped disks do not yield enough bandwidth for the scratch space”

– Problem Troubleshooting
• Application timeouts due to poor performance

There is another reason …


Why Monitor ?
Capacity planning

• Analyze current resources
• Predict resource consumption over time
• Plan !
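
As a minimal sketch of the "predict" step, assuming resource usage grows roughly linearly, a least-squares line can be fitted to past samples and extrapolated to the capacity limit. The function name, the linear-growth assumption and the sample figures are illustrative only and do not come from the original slides.

```python
def day_capacity_reached(samples, capacity):
    """Estimate when usage reaches `capacity` by fitting a straight line.

    `samples` is a list of (day, usage) pairs. Returns the day at which the
    fitted line crosses `capacity`, or None if usage is not growing.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    if sxx == 0:
        raise ValueError("samples must cover more than one day")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx                      # growth per day
    intercept = mean_y - slope * mean_x    # fitted usage at day 0
    if slope <= 0:
        return None
    return (capacity - intercept) / slope


# Hypothetical disk usage in GB, sampled once a day (made-up numbers).
usage = [(0, 50.0), (1, 52.0), (2, 54.0)]
print(day_capacity_reached(usage, capacity=100.0))  # 25.0
```

Real workloads rarely grow in a perfectly straight line, so an estimate like this is a starting point for planning rather than a forecast.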


What are we monitoring ?
• The usual suspects
– CPU (Load Average, CPU Balance)
– Memory (Application Memory Consumption, Leaks?)
– Storage (Bandwidth, IOPS)
– Network (Bandwidth, Latency)

• The unusual suspects …
– Paging
– Inter-Process Communication


A word of advice
• Virtualization changes the picture
• Linux may not be running on the “bare metal”
• If it runs on a virtual machine, metrics need to be adjusted
• Example: your system may “think” it is using 80% of a CPU, but if the virtual machine was given only 10% of the real CPU, you are using only 8% (0.80 × 0.10) of the real power of the CPU
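
The adjustment in the example above is a simple product of two fractions. A small sketch, with a hypothetical function name and the slide's own numbers:

```python
def real_cpu_percent(guest_cpu_percent, vm_share_percent):
    """Convert guest-visible CPU utilization to a share of the physical CPU.

    guest_cpu_percent: utilization reported inside the virtual machine.
    vm_share_percent:  portion of the physical CPU granted to the VM.
    """
    return guest_cpu_percent * vm_share_percent / 100.0


print(real_cpu_percent(80, 10))  # 8.0
```

This assumes the VM's share is fixed and known; with dynamic scheduling on the host the share itself changes over time.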


How do we monitor ?
• Built-in Tools
– Static commands (e.g. ps)
– Real-time displays (e.g. top)
– /proc file system
– sysstat project (sar, iostat, etc.)
• Additional tools
– Nagios


Let’s start simple
• xload
– Displays a periodically updating histogram of the system load average
– Frequently used as a site-wide meter

Another option:
• Performance monitors available with the various distros (e.g. KDE System Guard)
– Same idea with more information


ps – Process Status
• Displays a list of active processes
• Performance diagnostic tool
• Has many flags
• Common flags:
• -e (or -A) displays all processes, not just those attached to the current terminal; -a displays the processes attached to any terminal, except session leaders
• Use -f (full format) and/or -l (long format) to reveal useful information
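
Because ps prints a one-shot snapshot, its output is easy to post-process in a script. The sketch below assumes the procps version of ps found on most Linux distributions; the helper names are hypothetical, and the column selection (`-o pid=,pcpu=,pmem=,comm=`) is just one convenient choice, not something the slides prescribe.

```python
import subprocess


def parse_ps(text):
    """Parse headerless `ps -o pid=,pcpu=,pmem=,comm=` output into tuples."""
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 3)
        if len(parts) < 4:
            continue  # skip blank or malformed lines
        pid, pcpu, pmem, comm = parts
        rows.append((int(pid), float(pcpu), float(pmem), comm.strip()))
    return rows


def top_cpu(rows, n=5):
    """Return the n rows with the highest %CPU value."""
    return sorted(rows, key=lambda r: r[1], reverse=True)[:n]


def snapshot():
    """Run ps for all processes (-e) with a fixed set of columns (-o)."""
    out = subprocess.run(
        ["ps", "-e", "-o", "pid=,pcpu=,pmem=,comm="],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_ps(out)
```

On a Linux host, `for row in top_cpu(snapshot()): print(row)` lists the five processes with the highest CPU percentage as reported by ps.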


top
• Refreshed display of crucial system information, plus a list of the top CPU-consuming processes
• Useful information:
– Load average
– CPU stats
– Memory and swap
– Process list
• top is also used to identify memory hogs, not just CPU hogs!


A cool alternative to top : htop
• Why talk …
• Let’s see!

Source and binaries: http://htop.sourceforge.net


/proc filesystem
• A pseudo-filesystem used to access kernel performance metrics (as well as to update some)
• All the necessary information is there, though usually not in a user-friendly form; it is better suited to being read by programs or scripts
• For example, “ls /proc” shows an entry for every running process, as well as entries for system-level metrics
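
As a small illustration of reading /proc from a script, the sketch below parses the load averages from /proc/loadavg and counts the per-process directories (the all-digit names under /proc). The helper names are hypothetical, and the file access is skipped on systems that have no /proc.

```python
import os


def parse_loadavg(text):
    """Return the 1-, 5- and 15-minute load averages from /proc/loadavg text."""
    fields = text.split()
    return tuple(float(f) for f in fields[:3])


def pids_from_entries(entries):
    """Keep only the all-digit names, which are the per-process directories."""
    return sorted(int(name) for name in entries if name.isdigit())


if os.path.exists("/proc/loadavg"):  # only present on Linux
    with open("/proc/loadavg") as f:
        print("load averages:", parse_loadavg(f.read()))
    print("running processes:", len(pids_from_entries(os.listdir("/proc"))))
```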


/proc - example


/proc – process info


/proc – cpuinfo and meminfo
cpuinfo

meminfo
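
Both files are plain "key: value" text, so a script can turn them into numbers with very little code. The following is a sketch with hypothetical helper names; it assumes the common layout in which /proc/meminfo values are integers (most of them in kB) and /proc/cpuinfo has one "processor" line per logical CPU, as on x86 systems. Other architectures may format cpuinfo differently.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo text into {field: integer value}."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "key: value" line
        parts = rest.split()
        if parts:
            info[key.strip()] = int(parts[0])  # drop the trailing unit, if any
    return info


def count_processors(cpuinfo_text):
    """Count logical CPUs by counting 'processor' entries in /proc/cpuinfo."""
    return sum(
        1 for line in cpuinfo_text.splitlines()
        if line.split(":")[0].strip() == "processor"
    )
```

On a Linux host, `parse_meminfo(open("/proc/meminfo").read())["MemTotal"]` gives the total memory in kB.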


sysstat project
• A project to mine data from /proc and make it available for display and performance record keeping
• Attention: in many distros the sysstat rpm is NOT installed by default …
• Includes the following utilities:
– iostat – monitors system I/O devices (i.e. disks)
– mpstat – monitors CPU activity
– sar – system activity reporter
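
To illustrate what "mining data from /proc" can look like, here is a rough sketch of a CPU-utilization calculation based on two samples of the aggregate "cpu" line in /proc/stat. It is not the sysstat implementation; the function names are hypothetical, and the sketch assumes the usual counter order (user, nice, system, idle, iowait, irq, softirq, steal).

```python
def parse_cpu_line(stat_text):
    """Return the counters from the aggregate 'cpu' line of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(v) for v in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def busy_percent(before, after):
    """Percentage of time the CPUs were not idle between two samples.

    Only the first eight counters are used; guest time (later fields) is
    already included in user/nice. idle and iowait count as not busy.
    """
    deltas = [b - a for a, b in zip(before, after)][:8]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    idle = sum(deltas[3:5])  # idle + iowait
    return 100.0 * (total - idle) / total
```

In practice one would read /proc/stat, wait for an interval (for example `time.sleep(1)`), read it again, and pass both parsed lines to `busy_percent`.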


iostat and mpstat


sar
• sar has 3 roles:
– Displays metrics, either current or from a previous day’s file
– Creates daily performance files with all system metrics
– Extracts data from saved performance files for further analysis (e.g. with Excel or a database)


sar
• sar can be used interactively to examine current performance
• sar has many options for displaying a wide range of system performance metrics


sar
• To enable long-term statistics, run: chkconfig sysstat on
• On the initial start, also run: /etc/init.d/sysstat start
• sar uses cron to sample statistics periodically and store them under /var/log/sa
• By default, sar records data every 10 minutes and stores the data in daily files
• The sysadmin can change the frequency and the naming convention to override the defaults


sar – using statistics
[root@localhost init.d]# sar -B -f /var/log/sa/sa24 -H
localhost.localdomain;600;2004-02-24 14:30:00 UTC;0.57;7.60;52313;698;10596;12721
localhost.localdomain;599;2004-02-24 14:40:00 UTC;2.48;28.25;58419;976;10717;14022
localhost.localdomain;600;2004-02-24 14:50:00 UTC;4.34;9.55;61104;993;10602;14539
localhost.localdomain;600;2004-02-24 15:00:00 UTC;35.70;5.78;71026;1005;10616;16529
localhost.localdomain;600;2004-02-24 15:10:00 UTC;114.21;21.06;95643;2583;10627;21770
localhost.localdomain;600;2004-02-24 15:20:00
localhost.localdomain;600;2004-02-24 18:10:00 UTC;0.05;4.58;82698;4861;11113;19734
localhost.localdomain;600;2004-02-24 18:20:00 UTC;0.13;5.04;83208;4862;11086;19831
localhost.localdomain;600;2004-02-24 18:30:00 UTC;0.11;8.31;80067;4812;11224;19220
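
The semicolon-separated records above (host; interval in seconds; timestamp; metric values) are easy to load into a script for further analysis. The sketch below uses hypothetical function names and deliberately leaves the value columns unnamed, since their meaning depends on the sar option used (-B reports paging statistics) and on the sysstat version. In more recent sysstat releases a similar format is produced by `sadf -d`.

```python
def parse_sar_db_line(line):
    """Split one 'host;interval;timestamp;v1;v2;...' record.

    Returns (host, interval_seconds, timestamp, values); `values` is empty
    when the record carries no metrics (e.g. a truncated line).
    """
    fields = line.strip().split(";")
    if len(fields) < 3:
        raise ValueError("expected at least host;interval;timestamp")
    host, interval, timestamp = fields[:3]
    values = [float(v) for v in fields[3:] if v]
    return host, int(interval), timestamp, values


def peak_by_column(records, column):
    """Return (timestamp, value) for the largest value in one metric column."""
    candidates = [(r[3][column], r[2]) for r in records if len(r[3]) > column]
    value, timestamp = max(candidates)
    return timestamp, value
```

Applied to the records shown above, `peak_by_column(records, 0)` picks out the 15:10:00 sample, where the first metric reaches 114.21.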

Other tools
• Nagios http://www.nagios.org
– Nagios is a host and service monitor designed to inform you of network problems before your clients, end-users or managers do
• Monitoring of network services (SMTP, POP3, HTTP, NNTP, PING, etc.)
• Monitoring of host resources (processor load, disk and memory usage, running processes, log files, etc.)
• Monitoring of environmental factors such as temperature
• Simple plugin design that allows users to easily develop their own host and service checks

• Cacti http://cacti.net
– Complete network monitoring and graphing tool
– Very powerful graphing system
• Oracle Management Pack for Linux
• Open SpeedShop http://oss.sgi.com/projects/openspeedshop
– Multi-platform tool to support performance analysis of applications


Thank You !

Yair Gany - SGI


[email protected]
