Linux Performance Tools (LinuxCon NA) - Brendan Gregg
[email protected]
@brendangregg
A quick tour of many tools…
• Massive AWS EC2 Linux cloud
  – Tens of thousands of instances
  – Autoscale by ~3k each day
  – CentOS and Ubuntu
• FreeBSD for content delivery
  – Approx 33% of US Internet traffic at night
• Performance is critical
  – Customer satisfaction: >50M subscribers
  – $$$ price/performance
  – Develop tools for cloud-wide analysis; use server tools as needed
• Just launched in Europe!
Brendan Gregg
• Senior Performance Architect, Netflix
  – Linux and FreeBSD performance
  – Performance Engineering team (@coburnw)
• Recent work:
  – Linux perf-tools, using ftrace & perf_events
  – Systems Performance, Prentice Hall
• Previous work includes:
  – USE Method, flame graphs, utilization & latency heat maps, DTrace tools, ZFS L2ARC
• Twitter @brendangregg (these slides)
Agenda
• Methodologies & Tools
• Tool Types:
  – Observability
    • Basic
    • Intermediate
    • Advanced
  – Benchmarking
  – Tuning
  – Static
• Tracing
Aim: to show what can be done

Knowing that something can be done is more important than knowing how to do it.
Methodologies & Tools
• There are dozens of performance tools for Linux
  – Packages: sysstat, procps, coreutils, …
  – Commercial products
• Methodologies can provide guidance for choosing and using tools effectively
Anti-Methodologies
• The lack of a deliberate methodology…
• Street Light Anti-Method:
  – 1. Pick observability tools that are:
    • Familiar
    • Found on the Internet, or at random
  – 2. Run tools
  – 3. Look for obvious issues
• Drunk Man Anti-Method:
  – Tune things at random until the problem goes away
Methodologies
• For example, the USE Method:
  – For every resource, check:
    • Utilization
    • Saturation
    • Errors
• 5 Whys:
  – Ask “why?” 5 times
• Other methods include:
  – Workload characterization, drill-down analysis, event tracing, baseline stats, static performance tuning, …
• Start with the questions, then find the tools
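A USE-method check can often be scripted straight from kernel counters. A minimal sketch of one such check, CPU utilization from two samples of /proc/stat (the jiffy values below are made up for illustration; on a live system you would read the first line of /proc/stat twice, one second apart):

```shell
# USE-method sketch: CPU utilization from two "cpu" lines of /proc/stat.
# Sample counters are made up; substitute live reads in practice.
s0="cpu 1000 0 500 8500"    # user nice system idle, at t0
s1="cpu 1400 0 700 8900"    # the same counters, at t1
util=$(printf '%s\n%s\n' "$s0" "$s1" | awk '
    NR == 1 { b0 = $2 + $3 + $4; t0 = b0 + $5 }    # busy and total at t0
    NR == 2 { b1 = $2 + $3 + $4; t1 = b1 + $5
              printf "%.0f", 100 * (b1 - b0) / (t1 - t0) }')
echo "CPU utilization: ${util}%"    # 60% for these sample counters
```

Saturation and errors need other sources (eg, run-queue length from vmstat, error counters from /sys), but the pattern is the same: difference two counter samples.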
Command Line Tools
• Useful to study even if you never use them: GUIs and commercial products often use the same interfaces

Kernel: /proc, /sys, …
$ vmstat 1
procs -----------memory---------- ---swap-- …
 r b swpd free buff cache si so …
 9 0 0 29549320 29252 9299060 0 …
 2 0 0 29547876 29252 9299332 0 …
 4 0 0 29548124 29252 9299460 0 …
 5 0 0 29548840 29252 9299592 0 …
Tool Types

Type           Characteristic
Observability  Watch activity. Safe, usually, depending on resource overhead.
Benchmarking   Load test. Caution: production tests can cause issues due to contention.
Tuning         Change. Danger: changes could hurt performance, now or later with load.
Static         Check configuration. Should be safe.
Observability Tools

How do you measure these?
Observability Tools: Basic
• uptime
• top (or htop)
• ps
• vmstat
• iostat
• mpstat
• free
• One
way
to
print
load
averages:
$ uptime!
07:42:06 up 8:16, 1 user, load average: 2.27, 2.84, 2.91!
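The three numbers are 1-, 5-, and 15-minute exponentially damped averages, so comparing the first and last shows the trend. A small sketch, parsing the captured line above rather than live output (replace `line` with `"$(uptime)"` on a real system):

```shell
# Sketch: is load rising or falling? Compare the 1- and 15-minute averages.
line='07:42:06 up 8:16, 1 user, load average: 2.27, 2.84, 2.91'
avgs=${line#*load average: }                       # "2.27, 2.84, 2.91"
one=$(echo "$avgs" | cut -d, -f1 | tr -d ' ')
fifteen=$(echo "$avgs" | cut -d, -f3 | tr -d ' ')
trend=$(awk -v a="$one" -v b="$fifteen" \
    'BEGIN { if (a + 0 > b + 0) print "rising"; else print "falling or flat" }')
echo "1 min: $one, 15 min: $fifteen -> $trend"
```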
ps
• Custom fields:

$ ps -eo user,sz,rss,minflt,majflt,pcpu,args
USER SZ RSS MINFLT MAJFLT %CPU COMMAND
root 6085 2272 11928 24 0.0 /sbin/init
[…]
vmstat
• Virtual memory statistics and more:

$ vmstat -Sm 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r b swpd free buff cache si so bi bo in cs us sy id wa
 8 0 0 1620 149 552 0 0 1 179 77 12 25 34 0 0
 7 0 0 1598 149 552 0 0 0 0 205 186 46 13 0 0
 8 0 0 1617 149 552 0 0 0 8 210 435 39 21 0 0
 8 0 0 1589 149 552 0 0 0 0 218 219 42 17 0 0
[…]
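The `r` column (runnable plus running threads) is the CPU saturation signal: values persistently above the CPU count indicate saturation. A hedged sketch using made-up sample lines and an assumed `ncpu=8` (live use would pipe `vmstat 1` in and take the CPU count from `nproc`):

```shell
# Sketch: count vmstat samples whose run-queue length (column r)
# exceeds the CPU count. Sample lines and ncpu are illustrative.
ncpu=8
samples='9 0 0 1620 149 552 0 0 1 179 77 12 25 34 0 0
7 0 0 1598 149 552 0 0 0 0 205 186 46 13 0 0
12 0 0 1617 149 552 0 0 0 8 210 435 39 21 0 0'
saturated=$(echo "$samples" | awk -v n="$ncpu" \
    '$1 + 0 > n + 0 { c++ } END { print c + 0 }')
echo "$saturated of 3 samples have run queue > $ncpu CPUs"
```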
iostat
• Block I/O (disk) statistics; a very useful set of stats:

... \ avgqu-sz await r_await w_await svctm %util
... /     0.00   0.00    0.00    0.00  0.00  0.00
... \   126.09   8.22    8.22    0.00  0.06 86.40
... /    99.31   6.47    6.47    0.00  0.06 86.00
... \     0.00   0.00    0.00    0.00  0.00  0.00

(left columns describe the workload; right columns the resulting performance)
mpstat
• Multi-processor statistics, per-CPU:

$ mpstat -P ALL 1
[…]
08:06:43 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
08:06:44 PM all 53.45 0.00 3.77 0.00 0.00 0.39 0.13 0.00 42.26
08:06:44 PM 0 49.49 0.00 3.03 0.00 0.00 1.01 1.01 0.00 45.45
08:06:44 PM 1 51.61 0.00 4.30 0.00 0.00 2.15 0.00 0.00 41.94
08:06:44 PM 2 58.16 0.00 7.14 0.00 0.00 0.00 1.02 0.00 33.67
08:06:44 PM 3 54.55 0.00 5.05 0.00 0.00 0.00 0.00 0.00 40.40
08:06:44 PM 4 47.42 0.00 3.09 0.00 0.00 0.00 0.00 0.00 49.48
08:06:44 PM 5 65.66 0.00 3.03 0.00 0.00 0.00 0.00 0.00 31.31
08:06:44 PM 6 50.00 0.00 2.08 0.00 0.00 0.00 0.00 0.00 47.92
[…]
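One thing worth scanning this output for is imbalance: a single hot CPU often means a single-threaded bottleneck. A sketch that picks the busiest CPU (%usr + %sys) from a few of the per-CPU lines (live use would pipe `mpstat -P ALL 1 1` through the same awk):

```shell
# Sketch: find the busiest CPU from mpstat -P ALL per-CPU lines
# (fields: time AM/PM cpu %usr %nice %sys ...). Sample lines only.
samples='08:06:44 PM 0 49.49 0.00 3.03 0.00 0.00 1.01 1.01 0.00 45.45
08:06:44 PM 2 58.16 0.00 7.14 0.00 0.00 0.00 1.02 0.00 33.67
08:06:44 PM 5 65.66 0.00 3.03 0.00 0.00 0.00 0.00 0.00 31.31'
hottest=$(echo "$samples" | awk '
    { busy = $4 + $6                           # %usr + %sys
      if (busy > max) { max = busy; cpu = $3 } }
    END { printf "CPU%s %.2f%%", cpu, max }')
echo "busiest: $hottest"
```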
free
• Main memory usage (output fragment):

-/+ buffers/cache:    436   3313
Swap:                   0      0      0
strace
• Eg, -ttt: time (us) since epoch; -T: syscall time (s)
• Translates syscall args
  – Very helpful for solving system usage issues
• Currently has massive overhead (ptrace based)
  – Can slow the target by > 100x. Use extreme caution.
tcpdump
• Sniff network packets for post analysis:

$ tcpdump -i eth0 -w /tmp/out.tcpdump
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
^C7985 packets captured
8996 packets received by filter
1010 packets dropped by kernel
# tcpdump -nr /tmp/out.tcpdump | head
reading from file /tmp/out.tcpdump, link-type EN10MB (Ethernet)
20:41:05.038437 IP 10.44.107.151.22 > 10.53.237.72.46425: Flags [P.], seq 18...
20:41:05.038533 IP 10.44.107.151.22 > 10.53.237.72.46425: Flags [P.], seq 48...
20:41:05.038584 IP 10.44.107.151.22 > 10.53.237.72.46425: Flags [P.], seq 96...
[…]
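After `tcpdump -nr`, the capture is plain text, so standard tools can post-process it. A sketch counting packets per source address (the sample lines mimic the output above; a live pipeline would be `tcpdump -nr /tmp/out.tcpdump | awk …`):

```shell
# Sketch: top source address by packet count from tcpdump -nr output.
# Sample lines modeled on the capture shown on the slide.
samples='20:41:05.038437 IP 10.44.107.151.22 > 10.53.237.72.46425: Flags [P.], seq 18
20:41:05.038533 IP 10.44.107.151.22 > 10.53.237.72.46425: Flags [P.], seq 48
20:41:05.039001 IP 10.53.237.72.46425 > 10.44.107.151.22: Flags [.], ack 96'
top=$(echo "$samples" | awk '
    $2 == "IP" { count[$3]++ }                 # $3 is the source addr.port
    END { for (s in count) if (count[s] > max) { max = count[s]; src = s }
          print src, max }')
echo "top source: $top"
```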
swap status (swapon -s) output fragment:
Filename   Type       Size     Used  Priority
/dev/sda3  partition  5245212  284   -1

perf list output fragment:
rNNN (see 'perf list --help' on how to encode it) [Raw hardware event …]
mem:<addr>[:access] [Hardware breakpoint]
(Flame graph annotations: kernel TCP/IP; GC; locks; epoll; idle thread; broken Java stacks, missing frame pointer)
tiptop / MSR tools
• Reading CPU performance counters and MSRs (eg, real CPU MHz, temperature):

TIME      C0_MCYC     C0_ACYC     UTIL  RATIO  MHz
06:11:35  6428553166  7457384521  51%   116%   2900
06:11:40  6349881107  7365764152  50%   115%   2899
06:11:45  6240610655  7239046277  49%   115%   2899    ← real CPU MHz
[...]

ec2-guest# ./cputemp 1
CPU1 CPU2 CPU3 CPU4
61 61 60 59
60 61 60 60    ← CPU temperature
[...]
More Advanced Tools…
• Some others worth mentioning:

Tool       Description
ltrace     Library call tracer
ethtool    Mostly interface tuning; some stats
snmpget    SNMP network host statistics
lldptool   Can get LLDP broadcast stats
blktrace   Block I/O event tracer
/proc      Many raw kernel counters
pmu-tools  On- and off-core CPU counter tools
Advanced Tracers
• Many options on Linux:
  – perf_events, ftrace, eBPF, SystemTap, ktap, LTTng, dtrace4linux, sysdig
• Most can do static and dynamic tracing
  – Static: pre-defined events (tracepoints)
  – Dynamic: instrument any software (kprobes, uprobes). Custom metrics on-demand. Catch all.
• Many are in development.
  – I’ll summarize their state later…
Linux Observability Tools
Benchmarking Tools
• Multi:
  – UnixBench, lmbench, sysbench, perf bench
• FS/disk:
  – dd, hdparm, fio
• App/lib:
  – ab, wrk, jmeter, openssl
• Networking:
  – ping, hping3, iperf, ttcp, traceroute, mtr, pchar
Active Benchmarking
• Most benchmarks are misleading or wrong
  – You benchmark A, but actually measure B, and conclude that you measured C
• Active Benchmarking:
  1. Run the benchmark for hours
  2. While running, analyze and confirm the performance limiter using observability tools
• We just covered those tools – use them!
lmbench
• CPU, memory, and kernel micro-benchmarks
• Eg, memory latency by stride size:

$ lat_mem_rd 100m 128 > out.latencies
some R processing…

(plot shows latency steps at the L1, L2, and L3 cache sizes, then main memory)
fio
• FS or disk I/O micro-benchmarks:

$ fio --name=seqwrite --rw=write --bs=128k --size=122374m
[…]
seqwrite: (groupid=0, jobs=1): err= 0: pid=22321
  write: io=122374MB, bw=840951KB/s, iops=6569 , runt=149011msec
    clat (usec): min=41 , max=133186 , avg=148.26, stdev=1287.17
     lat (usec): min=44 , max=133188 , avg=151.11, stdev=1287.21
    bw (KB/s) : min=10746, max=1983488, per=100.18%, avg=842503.94, stdev=262774.35
  cpu : usr=2.67%, sys=43.46%, ctx=14284, majf=1, minf=24
  IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w/d: total=0/978992/0, short=0/0/0
     lat (usec): 50=0.02%, 100=98.30%, 250=1.06%, 500=0.01%, 750=0.01%
     lat (usec): 1000=0.01%
     lat (msec): 2=0.01%, 4=0.01%, 10=0.25%, 20=0.29%, 50=0.06%
     lat (msec): 100=0.01%, 250=0.01%
iosnoop
• Block I/O (disk) events, with latency:

# ./iosnoop -h
USAGE: iosnoop [-hQst] [-d device] [-i iotype] [-p PID] [-n name] [duration]
       -d device  # device string (eg, "202,1")
       -i iotype  # match type (eg, '*R*' for all reads)
       -n name    # process name to match on I/O issue
       -p PID     # PID to match on I/O issue
       -Q         # include queueing time in LATms
       -s         # include start time of I/O (s)
       -t         # include completion time of I/O (s)
       -h         # this usage message
       duration   # duration seconds, and use buffers
[…]
iolatency
• Block I/O (disk) latency distributions:

# ./iolatency
Tracing block I/O. Output every 1 seconds. Ctrl-C to end.

  >=(ms) .. <(ms) : I/O |Distribution |
       0 -> 1 : 2104 |######################################|
       1 -> 2 : 280 |###### |
       2 -> 4 : 2 |# |
       4 -> 8 : 0 | |
       8 -> 16 : 202 |#### |

  >=(ms) .. <(ms) : I/O |Distribution |
       0 -> 1 : 1144 |######################################|
       1 -> 2 : 267 |######### |
       2 -> 4 : 10 |# |
       4 -> 8 : 5 |# |
       8 -> 16 : 248 |######### |
      16 -> 32 : 601 |#################### |
      32 -> 64 : 117 |#### |
[…]
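These power-of-two buckets are easy to reproduce for any list of latencies. A sketch that folds made-up millisecond values into the same >= .. < buckets with awk:

```shell
# Sketch: fold latencies (ms, made-up values) into power-of-two buckets,
# mirroring iolatency's distribution output.
lats='0.4 0.9 1.2 3.7 9 12 0.2'
hist=$(echo "$lats" | tr ' ' '\n' | awk '
    { b = 1; while ($1 >= b) b *= 2; c[b]++ }    # b = bucket upper bound
    END { for (i = 1; i <= 16; i *= 2)
              printf "%4d -> %-4d: %d\n", i / 2, i, c[i] + 0 }')
echo "$hist"
```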
opensnoop
• Trace open() syscalls showing filenames:

# ./opensnoop -t
Tracing open()s. Ctrl-C to end.
TIMEs COMM PID FD FILE
4345768.332626 postgres 23886 0x8 /proc/self/oom_adj
4345768.333923 postgres 23886 0x5 global/pg_filenode.map
4345768.333971 postgres 23886 0x5 global/pg_internal.init
4345768.334813 postgres 23886 0x5 base/16384/PG_VERSION
4345768.334877 postgres 23886 0x5 base/16384/pg_filenode.map
4345768.334891 postgres 23886 0x5 base/16384/pg_internal.init
4345768.335821 postgres 23886 0x5 base/16384/11725
4345768.347911 svstat 24649 0x4 supervise/ok
4345768.347921 svstat 24649 0x4 supervise/status
4345768.350340 stat 24651 0x3 /etc/ld.so.cache
4345768.350372 stat 24651 0x3 /lib/x86_64-linux-gnu/libselinux…
4345768.350460 stat 24651 0x3 /lib/x86_64-linux-gnu/libc.so.6
4345768.350526 stat 24651 0x3 /lib/x86_64-linux-gnu/libdl.so.2
4345768.350981 stat 24651 0x3 /proc/filesystems
4345768.351182 stat 24651 0x3 /etc/nsswitch.conf
[…]
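Like the other tracer output, this is plain text and easy to summarize. A sketch counting opens per process name (sample lines taken from the output above; live use would be `./opensnoop -t | awk …`):

```shell
# Sketch: opens per process from opensnoop -t output
# (fields: TIMEs COMM PID FD FILE).
samples='4345768.332626 postgres 23886 0x8 /proc/self/oom_adj
4345768.333923 postgres 23886 0x5 global/pg_filenode.map
4345768.347911 svstat 24649 0x4 supervise/ok
4345768.350340 stat 24651 0x3 /etc/ld.so.cache'
counts=$(echo "$samples" \
    | awk '{ c[$2]++ } END { for (p in c) print c[p], p }' | sort -rn)
echo "$counts"
```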
funcgraph
• Trace a graph of kernel code flow:

# ./funcgraph -Htp 5363 vfs_read
Tracing "vfs_read" for PID 5363... Ctrl-C to end.
# tracer: function_graph
#
# TIME CPU DURATION FUNCTION CALLS
# | | | | | | | |
4346366.073832 | 0) | vfs_read() {
4346366.073834 | 0) | rw_verify_area() {
4346366.073834 | 0) | security_file_permission() {
4346366.073834 | 0) | apparmor_file_permission() {
4346366.073835 | 0) 0.153 us | common_file_perm();
4346366.073836 | 0) 0.947 us | }
4346366.073836 | 0) 0.066 us | __fsnotify_parent();
4346366.073836 | 0) 0.080 us | fsnotify();
4346366.073837 | 0) 2.174 us | }
4346366.073837 | 0) 2.656 us | }
4346366.073837 | 0) | tty_read() {
4346366.073837 | 0) 0.060 us | tty_paranoia_check();
[…]
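The DURATION column sums nicely for quick accounting. A sketch totaling the leaf-function times from a few captured lines (sample lines from the output above; leaf calls carry their own "N.NNN us" field):

```shell
# Sketch: total the leaf-function durations (us) from funcgraph output.
samples='4346366.073835 | 0) 0.153 us | common_file_perm();
4346366.073836 | 0) 0.066 us | __fsnotify_parent();
4346366.073836 | 0) 0.080 us | fsnotify();'
total=$(echo "$samples" | awk '$5 == "us" { t += $4 } END { printf "%.3f", t }')
echo "leaf time: ${total} us"    # 0.299 us for these samples
```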
kprobe
• Dynamically trace a kernel function call or return, with variables, and in-kernel filtering:

# ./kprobe 'p:open do_sys_open filename=+0(%si):string' 'filename ~ "*stat"'
Tracing kprobe myopen. Ctrl-C to end.
postgres-1172 [000] d... 6594028.787166: open: (do_sys_open+0x0/0x220) filename="pg_stat_tmp/pgstat.stat"
postgres-1172 [001] d... 6594028.797410: open: (do_sys_open+0x0/0x220) filename="pg_stat_tmp/pgstat.stat"
postgres-1172 [001] d... 6594028.797467: open: (do_sys_open+0x0/0x220) filename="pg_stat_tmp/pgstat.stat"
^C
Ending tracing...

• Add -s for stack traces; -p for PID filter in-kernel.
• Quickly confirm kernel behavior; eg: did a tunable take effect?
Imagine Linux with Tracing
• These tools aren’t using dtrace4linux, SystemTap, ktap, or any other add-on tracer
• These tools use existing Linux capabilities
  – No extra kernel bits, not even kernel debuginfo
  – Just Linux’s built-in ftrace profiler
  – Demoed on Linux 3.2
• Solving real issues now
ftrace
• Added by Steven Rostedt and others since 2.6.27
• Already enabled on our servers (3.2+)
  – CONFIG_FTRACE, CONFIG_FUNCTION_PROFILER, …
  – Use directly via /sys/kernel/debug/tracing
• My front-end tools to aid usage
  – https://github.com/brendangregg/perf-tools
  – Unsupported hacks: see WARNINGs
  – Also see the trace-cmd front-end, as well as perf
• lwn.net: “Ftrace: The Hidden Light Switch”
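That /sys/kernel/debug/tracing interface can be driven directly with echo and cat. A hedged sketch of the function profiler (requires root and CONFIG_FUNCTION_PROFILER; when run unprivileged it just reports what is missing rather than tracing):

```shell
# Sketch: kernel function call counts via ftrace's function profiler,
# using the tracefs files directly -- no add-on tracer required.
T=/sys/kernel/debug/tracing
if [ -w "$T/function_profile_enabled" ]; then
    echo 1 > "$T/function_profile_enabled"    # start profiling
    sleep 5
    echo 0 > "$T/function_profile_enabled"    # stop
    head "$T/trace_stat/function0"            # call counts for CPU 0
    status="profiled"
else
    status="need root (and CONFIG_FUNCTION_PROFILER) to write $T"
fi
echo "$status"
```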
My perf-tools (so far…)
Tracing Summary
• ftrace
• perf_events
• eBPF
• SystemTap
• ktap
• LTTng
• dtrace4linux
• sysdig
perf_events
• aka “perf” command
• In Linux. Add from linux-tools-common, …
• Powerful multi-tool and profiler
  – interval sampling, CPU performance counter events
  – user and kernel dynamic tracing
  – kernel line tracing and local variables (debuginfo)
  – kernel filtering, and in-kernel counts (perf stat)
• Not very programmable, yet
  – limited kernel summaries. May improve with eBPF.
perf_events Example

# perf record -e skb:consume_skb -ag
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.065 MB perf.data (~2851 samples) ]
# perf report
[...]
74.42% swapper [kernel.kallsyms] [k] consume_skb
        |
        --- consume_skb
            arp_process
            arp_rcv
            __netif_receive_skb_core
            __netif_receive_skb
            netif_receive_skb
            virtnet_poll
            net_rx_action
            __do_softirq
            irq_exit
            do_IRQ
            ret_from_intr
            default_idle
            cpu_idle
            start_secondary
[…]

This summarizes stack traces for a tracepoint. perf_events can do many things – hard to pick just one example.
eBPF
• Extended BPF: programs on tracepoints
  – High performance filtering: JIT
  – In-kernel summaries: maps
• Linux in 3.18? Enhance perf_events/ftrace/…?

# ./bitesize 1
writing bpf-5 -> /sys/kernel/debug/tracing/events/block/block_rq_complete/filter

I/O sizes:
   Kbytes     : Count
   4 -> 7     : 131
   8 -> 15    : 32
   16 -> 31   : 1
   32 -> 63   : 46
   64 -> 127  : 0
   128 -> 255 : 15
[…]

(the histogram is an in-kernel summary)
SystemTap
• Fully programmable, fully featured
• Compiles tracing programs into kernel modules
  – Needs a compiler, and takes time
• “Works great on Red Hat”
  – I keep trying on other distros and have hit trouble in the past; make sure you are on the latest version.
  – I’m liking it a bit more after finding ways to use it without kernel debuginfo (a difficult requirement in our environment). Work in progress.
• Ever be mainline?
ktap
• Sampling, static & dynamic tracing
• Lightweight, simple. Uses bytecode.
• Suited for embedded devices
• Development appears suspended after suggestions to integrate with eBPF (which itself is in development)
• ktap + eBPF would be awesome: easy, lightweight, fast. Likely?
sysdig
• sysdig: Innovative new tracer. Simple expressions:

sysdig fd.type=file and evt.failed=true
sysdig evt.type=open and fd.name contains /etc
sysdig -p"%proc.name %fd.name" "evt.type=accept and proc.name!=httpd"
(Chart: tracers plotted by ease of use vs scope & capability, annotated with stage of development — perf and ftrace (mature); stap; dtrace4L., ktap, sysdig (alpha); eBPF (brutal))
In Summary…
• Plus diagrams for benchmarking, tuning, tracing
• Try to start with the questions (methodology), to help guide your use of the tools
• I hopefully turned some unknown unknowns into known unknowns
References & Links
– Systems Performance: Enterprise and the Cloud, Prentice Hall, 2014
– http://www.brendangregg.com/linuxperf.html
– http://www.brendangregg.com/perf.html#FlameGraphs
– nicstat: http://sourceforge.net/projects/nicstat/
– tiptop: http://tiptop.gforge.inria.fr/
  • Tiptop: Hardware Performance Counters for the Masses, Erven Rohou, Inria Research Report 7789, Nov 2011.
– ftrace & perf-tools
  • https://github.com/brendangregg/perf-tools
  • http://lwn.net/Articles/608497/
– MSR tools: https://github.com/brendangregg/msr-cloud-tools
– pcstat: https://github.com/tobert/pcstat
– eBPF: http://lwn.net/Articles/603983/
– ktap: http://www.ktap.org/
– SystemTap: https://sourceware.org/systemtap/
– sysdig: http://www.sysdig.org/
– http://www.slideshare.net/brendangregg/linux-performance-analysis-and-tools
– Tux by Larry Ewing; Linux® is the registered trademark of Linus Torvalds in the U.S. and other countries.
Thanks
• Questions?
• http://slideshare.net/brendangregg
• http://www.brendangregg.com
[email protected]
• @brendangregg