IT Infrastructure Architecture
Infrastructure Building Blocks and Concepts

Performance Concepts (Chapter 5)

Introduction
* Performance is a typical hygiene factor
* Nobody notices a highly performing system
* But when a system is not performing well enough, users quickly start complaining
[Figure: infrastructure model layers (Processes / Information, Applications, Application platform, Infrastructure)]

Perceived performance
* Perceived performance refers to how quickly a system appears to perform its task
* In general, people tend to overestimate their own patience
* People tend to value predictability in performance
  — When the performance of a system is fluctuating, users remember a bad experience
  — Even if the fluctuation is relatively rare

Perceived performance
* Inform the user about how long a task will take
  — Progress bars
  — Splash screens
[Screenshot: Windows dialog copying 325 items from My Dropbox to Desktop, showing a progress bar and a "More details" option]

Performance during infrastructure design
* A solution must be designed, implemented, and supported to meet the performance requirements
  — Even under increasing load
* Calculating the performance of a system in the design phase is:
  — Extremely difficult
  — Very unreliable

Performance during infrastructure design
* Performance must be considered:
  — When the system works as expected
  — When the system is in a special state, like:
    * Failing parts
    * Maintenance state
    * Performing backup
    * Running batch jobs
* Some ways to do this are:
  — Benchmarking
  — Using vendor experience
  — Prototyping
  — User profiling

Benchmarking
* A benchmark uses a specific test program to assess the relative performance of an infrastructure component
* Benchmarks compare:
  — Performance of various subsystems
  — Across different system architectures

Benchmarking
* Some benchmarks compare the raw speed of parts of an infrastructure
  — Like the speed difference between processors or between disk drives
  — Not taking into account the typical usage of such components
  — Examples:
    * Floating Point Operations Per Second (FLOPS)
    * Million Instructions Per Second (MIPS)
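As a rough illustration of what a raw-speed benchmark measures, the sketch below times a loop of floating-point operations and derives an operations-per-second figure. It is only a minimal sketch: a pure-Python loop mostly measures interpreter overhead, and real FLOPS benchmarks (such as LINPACK) use optimized numerical kernels.

```python
# Minimal FLOPS-style micro-benchmark sketch (illustrative only).
import time

def estimate_flops(n_ops: int = 5_000_000) -> float:
    """Time a loop of floating-point multiply-adds and return operations per second."""
    x = 1.0000001
    acc = 0.0
    start = time.perf_counter()
    for _ in range(n_ops):
        acc = acc * x + 1.0        # one multiply and one add per iteration
    elapsed = time.perf_counter() - start
    return (2 * n_ops) / elapsed   # two floating-point operations per iteration

if __name__ == "__main__":
    print(f"Rough estimate: {estimate_flops():,.0f} floating-point operations per second")
```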
Vendor experience
* The best way to determine the performance of a system in the design phase is to use the experience of vendors
* They have a lot of experience running their products in various infrastructure configurations
* Vendors can provide:
  — Tools
  — Figures
  — Best practices

Prototyping
* Also known as proof of concept (PoC)
* Prototypes measure the performance of a system at an early stage
* Building prototypes:
  — Hiring equipment from suppliers
  — Using datacenter capacity at a vendor's premises
  — Using cloud computing resources
* Focus on those parts of the system that pose the highest risk, as early as possible in the design process

User profiling
* Predict the load a new software system will pose on the infrastructure before the software is actually built
* Get a good indication of the expected usage of the system
* Steps:
  — Define a number of typical user groups (personas)
  — Create a list of tasks personas will perform on the new system
  — Decompose tasks to infrastructure actions
  — Estimate the load per infrastructure action
  — Calculate the total load

User profiling - personas/tasks
Persona | Number of users | System task | Infrastructure load as a result of the system task | Frequency per persona
Data entry officer | 100 | Start application | Read 100 MB data from SAN | Once a day
Data entry officer | 100 | Start application | Transport 100 MB data to workstation | Once a day
Data entry officer | 100 | Enter new data | Transport 50 KB data from workstation to server | 40 per hour
Data entry officer | 100 | Enter new data | Store 50 KB data to SAN | 40 per hour
Data entry officer | 100 | Change existing data | Read 50 KB data from SAN | 10 per hour

User profiling - Infrastructure load
Infrastructure load | Per day | Per second
Data transport from server to workstation (KB) | 10,400,000 | 361.1
Data transport from workstation to server (KB) | 2,050,000 | 71.2
Data read from SAN (KB) | 10,400,000 | 361.1
Data written to SAN (KB) | 2,050,000 | 71.2
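A minimal sketch of this calculation is shown below. It assumes an 8-hour working day (28,800 seconds), which is consistent with the per-second figures above, and it includes only the rows visible in the personas/tasks table, so it does not reproduce the complete totals.

```python
# Minimal sketch of the user-profiling load calculation.
# Assumption: an 8-hour working day (28,800 seconds); the action list below
# covers only the rows shown in the personas/tasks table above.

WORKDAY_SECONDS = 8 * 3600

# action name: (number of users, KB per action, actions per user per day)
actions = {
    "Read 100 MB from SAN (start application)":             (100, 100_000, 1),
    "Transport 100 MB to workstation (start application)":  (100, 100_000, 1),
    "Transport 50 KB workstation to server (enter data)":   (100, 50, 40 * 8),
    "Store 50 KB to SAN (enter data)":                       (100, 50, 40 * 8),
    "Read 50 KB from SAN (change existing data)":            (100, 50, 10 * 8),
}

for name, (users, kb_per_action, per_day) in actions.items():
    daily_kb = users * kb_per_action * per_day
    print(f"{name}: {daily_kb:,} KB/day = {daily_kb / WORKDAY_SECONDS:.1f} KB/s")
```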
Performance of a running system

Managing bottlenecks
* The performance of a system is based on:
  — The performance of all its components
  — The interoperability of various components
* A component causing the system to reach some limit is referred to as the bottleneck of the system
* Every system has at least one bottleneck that limits its performance
* If the bottleneck does not negatively influence performance of the complete system under the highest expected load, it is OK

Performance testing
* Load testing - shows how a system performs under the expected load
* Stress testing - shows how a system reacts when it is under extreme load
* Endurance testing - shows how a system behaves when it is used at the expected load for a long period of time

Performance testing - Breakpoint
* Ramp up the load
  — Start with a small number of virtual users
  — Increase the number over a period of time
* The test result shows how the performance varies with the load, given as number of users versus response time
[Graph: response time versus number of simulated users]

Performance testing
* Performance testing software typically uses:
  — One or more servers to act as injectors
    * Each emulating a number of users
    * Each running a sequence of interactions
  — A test conductor
    * Coordinating tasks
    * Gathering metrics from each of the injectors
    * Collecting performance data for reporting purposes
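The sketch below illustrates the ramp-up structure of such a test in a few lines of Python. It is only a sketch: send_request() is a placeholder that simulates one user interaction, where a real load-testing tool would fire requests at the system under test from one or more injectors.

```python
# Minimal sketch of a ramp-up load test: virtual users are added in steps and
# the average response time per step is recorded.
import time
import statistics
from concurrent.futures import ThreadPoolExecutor

def send_request() -> float:
    """Placeholder for one user interaction; returns its response time in seconds."""
    start = time.perf_counter()
    time.sleep(0.01)               # stand-in for a real request to the system under test
    return time.perf_counter() - start

def run_step(virtual_users: int, requests_per_user: int = 10) -> float:
    """One load step: the virtual users fire their requests concurrently."""
    with ThreadPoolExecutor(max_workers=virtual_users) as pool:
        times = list(pool.map(lambda _: send_request(),
                              range(virtual_users * requests_per_user)))
    return statistics.mean(times)

if __name__ == "__main__":
    for users in (10, 50, 100, 200):           # ramp up the number of virtual users
        print(f"{users:>4} virtual users: average response time "
              f"{run_step(users) * 1000:.1f} ms")
```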
Performance testing
* Performance testing should be done in a production-like environment
  — Performance tests in a development environment usually lead to results that are highly unreliable
  — Even when underpowered test systems perform well enough to get good test results, the faster production system could show performance issues that did not occur in the tests
* To reduce cost:
  — Use a temporary (hired) test environment

Performance patterns

Increasing performance on upper layers
* 80% of the performance issues are due to badly behaving applications
* Application performance can benefit from:
  — Database and application tuning
  — Prioritizing tasks
  — Working from memory as much as possible (as opposed to working with data on disk)
  — Making good use of queues and schedulers
* Typically more effective than adding compute power

Disk caching
* Disks are mechanical devices that are slow by nature
* Caching can be implemented in:
  — Disks
  — Disk controllers
  — Operating system
* All unused memory in operating systems is used for disk cache
  — Over time, all memory gets filled with previously stored disk requests and prefetched disk blocks, speeding up applications
* Cache memory:
  — Stores all data recently read from disk
  — Stores some of the disk blocks following the recently read disk blocks
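As an illustration of the caching idea (recently read blocks plus a few blocks of read-ahead), here is a minimal sketch. It is not how an operating system's page cache is actually implemented; read_from_disk(), the block size, and the read-ahead depth are all placeholder assumptions.

```python
# Minimal sketch of a disk block cache with read-ahead.
from collections import OrderedDict

BLOCK_SIZE = 4096          # bytes per block (assumption)
READ_AHEAD = 2             # how many following blocks to prefetch (assumption)

def read_from_disk(block_no: int) -> bytes:
    """Stand-in for a slow physical disk read."""
    return bytes(BLOCK_SIZE)

class BlockCache:
    def __init__(self, capacity_blocks: int = 1024):
        self.capacity = capacity_blocks
        self.blocks = OrderedDict()                 # block number -> data, in LRU order

    def read(self, block_no: int) -> bytes:
        if block_no in self.blocks:                 # cache hit: serve from memory
            self.blocks.move_to_end(block_no)
            return self.blocks[block_no]
        # Cache miss: read the block and prefetch the following blocks.
        for n in range(block_no, block_no + 1 + READ_AHEAD):
            if n not in self.blocks:
                self._store(n, read_from_disk(n))
        return self.blocks[block_no]

    def _store(self, block_no: int, data: bytes) -> None:
        self.blocks[block_no] = data
        self.blocks.move_to_end(block_no)
        while len(self.blocks) > self.capacity:     # evict the least recently used block
            self.blocks.popitem(last=False)

cache = BlockCache()
cache.read(10)      # miss: reads block 10 and prefetches blocks 11 and 12
cache.read(11)      # hit: served from cache
```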
Caching

Component | Time it takes to fetch 2 MB of data
Network, 1 Gbit/s | 675
Hard disk, 15k rpm, 4 KB disk blocks | 105
Main memory DDR3 RAM | 0.2
CPU L1 cache | 0.016
Web proxies
* When users browse the internet, data can be cached in a web proxy server
  — A web proxy server is a type of cache
  — Earlier accessed data can be fetched from cache, instead of from the internet
* Benefits:
  — Users get their data faster
  — All other users are provided more bandwidth to the internet, as the data does not have to be downloaded again
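The caching idea behind a web proxy can be sketched in a few lines. This is only a sketch of the principle; a real proxy also handles HTTP cache headers, expiry, and cache size limits.

```python
# Minimal sketch of proxy-style caching: the first request for a URL is fetched
# from the internet, later requests for the same URL are served from the cache.
import urllib.request

_cache: dict[str, bytes] = {}

def fetch(url: str) -> bytes:
    if url in _cache:                                 # already downloaded once
        return _cache[url]
    with urllib.request.urlopen(url) as response:     # fetch from the internet
        body = response.read()
    _cache[url] = body                                # keep it for the next user
    return body
```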
Operational data store
* An Operational Data Store (ODS) is a read-only replica of a part of a database, for a specific use
* Frequently used information is retrieved from a small ODS database
  — The main database is used less for retrieving information
  — The performance of the main database is not degraded
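A minimal sketch of the routing idea is shown below: read queries for frequently used data go to the ODS replica, everything else goes to the main database. The two sqlite3 in-memory databases are stand-ins for a real main database and ODS, and the SELECT-based routing rule is a simplifying assumption.

```python
# Minimal sketch of query routing between a main database and an ODS replica.
import sqlite3

main_db = sqlite3.connect(":memory:")   # stand-in for the full read/write database
ods_db = sqlite3.connect(":memory:")    # stand-in for the read-only ODS replica

def execute(sql: str, params: tuple = ()):
    """Send reads to the ODS so the main database is not loaded with retrieval queries."""
    is_read = sql.lstrip().lower().startswith("select")
    target = ods_db if is_read else main_db
    return target.execute(sql, params)
```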
Front-end servers
* Front-end servers serve data to end users
  — Typically web servers
* To increase performance, store static data on the front-end servers
  — Pictures are a good candidate
  — Significantly lowers the amount of traffic to back-end systems
* In addition, a reverse proxy can be used
  — Automatically caches the most requested data
In-memory databases
* In special circumstances, entire databases can be run from memory instead of from disk
* In-memory databases are used in situations where performance is crucial
  — Real-time SCADA systems
  — High performance online transaction processing (OLTP) systems
    * As an example, in 2011 SAP AG introduced HANA, an in-memory database for SAP systems
* Special arrangements must be made to ensure data is not lost when a power failure occurs

Scalability
* Scalability indicates the ease with which a system can be modified, or components can be added, to handle increasing load
* Two ways to scale a system:
  — Vertical scaling (scale up) - adding resources to a single component
  — Horizontal scaling (scale out) - adding more components to the infrastructure

Scalability — Vertical scaling
* Adding more resources, for example:
  — Server: more memory, more CPUs
  — Network switch: adding more ports
  — Storage: replacing small disks with larger disks
* Vertical scaling is easy to do
* It quickly reaches a limit
  — The infrastructure component is "full"

Scalability — Horizontal scaling
* Adding more components to the infrastructure, for example:
  — Adding servers to a web server farm
  — Adding disk cabinets to a storage system
* In theory, horizontal scaling scales much better
  — Be aware of bottlenecks
* Doubling the number of components does not necessarily double the performance
* Horizontal scaling is the basis for cloud computing
* Applications must be aware of scaling infrastructure components

Scalability — Horizontal scaling
[Diagram: horizontal scaling of a web farm, with multiple web servers in front of application servers, database servers, and a SAN]

Load balancing
* Load balancing uses multiple servers that perform identical tasks
  — Examples:
    * Web server farm
    * Mail server farm
    * FTP (File Transfer Protocol) server farm
* A load balancer spreads the load over the available machines
  — Checks the current load on each server in the farm
  — Sends incoming requests to the least busy server

Load balancing
[Diagram: a load balancer distributing incoming requests across a web server farm]

Load balancing
* Advanced load balancers can spread the load based on:
  — The number of connections a server has
  — The measured response time of a server
* The application running on a load balanced system must be able to cope with the fact that each request can be handled by a different server
  — The load balancer should contain the states of the application
  — The load balancing mechanism can arrange that a user's session is always connected to the same server
  — If a server in the server farm goes down, its session information becomes inaccessible and sessions are lost
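A minimal sketch of these two balancing decisions (least connections and session affinity, often called sticky sessions) is shown below. The Server objects and the in-memory session map are simplifications; a real load balancer also performs health checks, response-time measurement, and failover.

```python
# Minimal sketch of a balancing decision: least connections, with optional
# session affinity ("sticky sessions").
class Server:
    def __init__(self, name: str):
        self.name = name
        self.active_connections = 0
        self.healthy = True

class LoadBalancer:
    def __init__(self, servers: list, sticky: bool = True):
        self.servers = servers
        self.sticky = sticky
        self.session_map = {}                     # session id -> pinned server

    def pick_server(self, session_id: str) -> Server:
        # Sticky sessions: keep sending a known session to the same server,
        # as long as that server is still healthy.
        if self.sticky and session_id in self.session_map:
            pinned = self.session_map[session_id]
            if pinned.healthy:
                return pinned
        # Otherwise pick the healthy server with the fewest active connections.
        candidates = [s for s in self.servers if s.healthy]
        chosen = min(candidates, key=lambda s: s.active_connections)
        self.session_map[session_id] = chosen
        return chosen

farm = [Server("web1"), Server("web2"), Server("web3")]
lb = LoadBalancer(farm)
server = lb.pick_server("user-42")
server.active_connections += 1                    # this server now handles the request
```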
Load balancing
* A load balancer increases availability
  — When a server in the server farm is unavailable, the load balancer notices this and ensures no requests are sent to the unavailable server until it is back online again
* The availability of the load balancer itself is very important
  — Load balancers are typically set up in a failover configuration
* Network load balancing:
  — Spread network load over multiple network connections
  — Most network switches support port trunking
    * Multiple Ethernet connections are combined to get a virtual Ethernet connection providing higher throughput
    * The load is balanced over the connections by the network switch
* Storage load balancing:
  — Using multiple disks to spread the load of reads and writes
  — Using multiple connections between servers and storage systems

High performance clusters
* High performance clusters provide a vast amount of computing power by combining many computer systems
* A large number of cheap off-the-shelf servers can create one large supercomputer
* Used for calculation-intensive systems
  — Weather forecasts
  — Geological research
  — Nuclear research
  — Pharmaceutical research
* TOP500.org

Grid Computing
* A computer grid is a high performance cluster that consists of systems that are spread geographically
* The limited bandwidth is the bottleneck
* Examples:
  — SETI@home
  — CERN LHC Computing Grid (140 computing centers in 35 countries)
* Broker firms exist for commercial exploitation of grids
* Security is a concern when computers in the grid are not under control

Design for use
* Performance critical applications should be designed as such
* Tips:
  — Know what the system will be used for
    * A large data warehouse needs a different infrastructure design than an online transaction processing system or a web application
    * Interactive systems are different from batch-oriented systems
  — When possible, try to spread the load of the system over the available time

Design for use
  — In some cases, special products must be used for certain systems
    * Real-time operating systems
    * In-memory databases
    * Specially designed file systems
  — Use standard implementation plans that are proven in practice
    * Follow the vendor's recommended implementation
    * Have the vendors check the design you created
  — Move rarely used data from the main systems to other systems
    * Moving old data to a large historical database can speed up a smaller sized database

Capacity management
* Capacity management guarantees high performance of a system in the long term
* To ensure performance stays within acceptable limits, performance must be monitored
* Trend analyses can be used to predict performance degradation
* Anticipate business changes (like forthcoming marketing campaigns)
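As an illustration of such a trend analysis, the sketch below fits a straight line through monitored utilization samples and estimates when a threshold will be crossed. The sample data and the 80% threshold are made up for illustration; real capacity management would use actual monitoring data and may need more than a linear fit.

```python
# Minimal sketch of a capacity-management trend analysis (requires Python 3.10+
# for statistics.linear_regression). The data points and threshold are invented.
from statistics import linear_regression

weeks = [1, 2, 3, 4, 5, 6]                           # monitoring samples: week number
utilization = [52.0, 54.5, 56.0, 59.0, 61.5, 63.0]   # average disk utilization in %

slope, intercept = linear_regression(weeks, utilization)
threshold = 80.0                                     # acceptable utilization limit
week_of_breach = (threshold - intercept) / slope

print(f"Trend: +{slope:.1f}% per week; "
      f"{threshold:.0f}% utilization expected around week {week_of_breach:.0f}")
```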