
SQL Server Hardware

By Glenn Berry

First published by Simple Talk Publishing 2011


Copyright Glenn Berry 2011

ISBN 978-1-906434-62-5

The right of Glenn Berry to be identified as the author of this work has been asserted by him in accordance with the Copyright, Designs and Patents Act 1988.

All rights reserved. No part of this publication may be reproduced, stored or introduced into a retrieval system, or transmitted, in any form, or by any means (electronic, mechanical, photocopying, recording or otherwise) without the prior written consent of the publisher. Any person who does any unauthorized act in relation to this publication may be liable to criminal prosecution and civil claims for damages.

This book is sold subject to the condition that it shall not, by way of trade or otherwise, be lent, re-sold, hired out, or otherwise circulated without the publisher’s prior consent in any form other than that in which it is published and without a similar condition including this condition being imposed on the subsequent purchaser.

Technical Review by Denny Cherry

Cover Image by Andy Martin

Edited by Tony Davis

Typeset & Designed by Matthew Tye & Gower Associates

Copy Edited by Gower Associates


Table of Contents
Introduction.....................................................................................11
Chapter 1: Processors and Associated Hardware..........................17
SQL Server Workload Types.................................................................................................. 18
Evaluating Processors............................................................................................................. 19
Cache size and the importance of the L2 and L3 caches........................................... 20
Clock speed........................................................................................................................22
Multi-core processors and hyper-threading.................................................................23
Hyper-threading................................................................................................................27
Processor Makes and Models................................................................................................30
Intel Xeon processors....................................................................................................... 31
Intel Itanium and Itanium 2........................................................................................... 42
AMD Opteron processors............................................................................................... 44
Server Motherboards: how to evaluate motherboards and chipsets..............................47
Number of sockets........................................................................................................... 48
Server chipsets.................................................................................................................. 49
BIOS.................................................................................................................................... 50
Memory requirements...................................................................................................... 51
Network Interface Cards.................................................................................................. 55
Choosing a Processor and Motherboard for Use with SQL Server.................................56
Summary...................................................................................................................................58

Chapter 2: The Storage Subsystem...............................................60


Disk I/O................................................................................................................................... 60
Drive Types...............................................................................................................................63
Magnetic disk drives........................................................................................................ 64
Solid-state drives...............................................................................................................65
Internal Storage...................................................................................................................... 68
Attached Storage.................................................................................................................... 69
Direct Attached Storage.................................................................................................. 69
Storage Area Network...................................................................................................... 70
RAID Configurations.............................................................................................................. 73
RAID 0 (disk striping with no parity)............................................................................ 74
RAID 1 (disk mirroring or duplexing)............................................................................ 74
RAID 5 (striping with parity)............................................................................................75
RAID 10 and RAID 0+1.................................................................................................... 76
RAID Controllers...............................................................................................................77
Provisioning and Configuring the Storage Subsystem......................................................79
Finding the read/write ratio............................................................................................81
How many disks?.............................................................................................................. 84
Configuration: SAN vs. DAS, RAID levels.................................................................... 84
Summary.................................................................................................................................. 86

Chapter 3: Benchmarking Tools..................................................... 87


Application Benchmarks....................................................................................................... 88
TPC-C benchmark............................................................................................................ 89
TPC-E benchmark............................................................................................................90
TPC-H benchmark........................................................................................................... 92
Analyzing benchmark test results..................................................................................93
Component Benchmarks...................................................................................................... 98
CPU and memory testing............................................................................................... 99
Disk I/O testing.............................................................................................................. 106
SQL Server-specific benchmarks and stress tools...................................................... 110
Summary..................................................................................................................................121

Chapter 4: Hardware Discovery.................................................. 123


CPU-Z tool.............................................................................................................................. 124
MSINFO32.............................................................................................................................. 132
Windows Task Manager....................................................................................................... 134
Computer Properties dialog.................................................................................................135
SQL Server version information......................................................................................... 136
Summary................................................................................................................................. 138
Chapter 5: Operating System Selection and Configuration...... 139
32-bit or 64-bit?......................................................................................................................140
Advantages of 64-bit versions of Windows for SQL Server..................................... 141
Disadvantages of 64-bit versions of Windows for SQL Server................................144
Windows Server: Versions and Editions............................................................................144
Windows 2000 Server.................................................................................................... 145
Windows Server 2003.....................................................................................................146
Windows Server 2003 R2...............................................................................................150
Windows Server 2008......................................................................................................151
Windows Server 2008 R2............................................................................................... 156
Microsoft Support Policies for Windows Server.............................................................. 161
Mainstream Support.......................................................................................................162
Extended Support............................................................................................................162
Out-of-support case study.............................................................................................164
Installing Windows Server and Service Packs..................................................................166
Configuring Windows Server.............................................................................................. 167
Windows Power Plans and CPU performance...........................................................168
Windows Instant File Initialization.............................................................................178
Lock pages in memory.................................................................................................... 181
Summary.................................................................................................................................184

Chapter 6: SQL Server Version and Edition Selection................ 185


32-bit or 64-bit SQL Server..................................................................................................186
SQL Server Versions and Editions......................................................................................186
SQL Server 2005..............................................................................................................187
SQL Server 2008 Editions..............................................................................................188
SQL Server 2008 Enterprise Edition Features...........................................................196
SQL Server 2008 R2........................................................................................................ 221
Summary................................................................................................................................. 231
Chapter 7: SQL Server Installation and Configuration...............233
Preparation for SQL Server Installation............................................................................ 233
Pre-Installation Checklist for SQL Server.........................................................................234
BIOS, firmware, and drivers..........................................................................................236
Windows OS....................................................................................................................238
SQL Server components................................................................................................239
Network...........................................................................................................................240
Accounts and privileges.................................................................................................240
Logical drives and directories........................................................................................241
Functional and performance testing........................................................................... 244
Fail-over clustering..........................................................................................................245
Installation media and Service Packs.......................................................................... 246
SQL Server 2008 R2 Installation.........................................................................................247
SQL Server 2008 R2 Service Packs and Cumulative Updates................................. 269
SQL Server 2008 R2 Slipstream installation...............................................................273
SQL Server 2008 R2 Instance Configuration Settings....................................................276
Summary................................................................................................................................ 284

Appendix A: Intel and AMD Processors and Chipsets................286


Processors.............................................................................................................................. 286
Intel Xeon 3000 sequence............................................................................................. 286
Intel Xeon E3 sequence................................................................................................. 289
Intel Xeon 5000 sequence............................................................................................. 289
Intel Xeon 6000 sequence.............................................................................................291
Intel Xeon 7000 sequence............................................................................................ 292
Intel Itanium 9000 series.............................................................................................. 295
Intel Itanium 9100 series.............................................................................................. 295
Intel Itanium 9300 series.............................................................................................. 296
AMD Opteron 1200 series............................................................................................ 296
AMD Opteron 2200 series............................................................................................ 296
AMD Opteron 1300 series............................................................................................. 297
AMD Opteron 2300 series............................................................................................ 297
AMD Opteron 2400 series............................................................................................ 298
AMD Opteron 8200 series............................................................................................ 298
AMD Opteron 8300 series............................................................................................ 298
AMD Opteron 8400 series............................................................................................299
AMD Opteron 4100 series............................................................................................299
AMD Opteron 6100 series............................................................................................299
Chipsets................................................................................................................................. 300
Intel 3000, 3010, 3200, 3210, 3400, 3420..................................................................... 300
Intel 5000P, 5000V, 5500, and 5520 chipsets..............................................................301
Intel 7300 and 7500 chipsets........................................................................................ 302

Appendix B: Installing a SQL Server 2008 R2 Cumulative Update....................................303
Obtaining a SQL Server 2008 R2 Cumulative Update............................................. 304
Installing a SQL Server 2008 R2 Cumulative Update.............................................. 308

Appendix C: Abbreviations........................................................... 315


About the Author
Glenn Berry is a Database Architect at NewsGator Technologies in Denver, Colorado.
He is a SQL Server MVP, and he has a whole collection of Microsoft certifications,
including MCITP, MCDBA, MCSE, MCSD, MCAD, and MCTS, which proves that he
likes to take tests. His expertise includes DMVs, high availability, hardware selection,
full text search, and SQL Azure. He is also an Adjunct Faculty member at University
College – University of Denver, where he has been teaching since 2000. He has completed
the Master Teacher Program at Denver University – University College. He is the author
of two chapters in the book, SQL Server MVP Deep Dives, and blogs regularly at
http://sqlserverperformance.wordpress.com. Glenn is active on Twitter, where his handle is
@GlennAlanBerry.

I want to thank my editor, Tony Davis, for his tireless efforts to help make this a much
better book than it otherwise would have been. My technical editor, Denny Cherry, did a
great job of keeping me honest from a technology and knowledge perspective. Of course,
any remaining mistakes or omissions are my responsibility.

I also want to thank my employer, NewsGator Technologies. My managers and
co-workers were very supportive as I labored to complete this book over the past year. My
good friend, Alex DeBoe, gave me valuable feedback on the early drafts of each chapter,
and helped motivate me to keep making progress on the book. My friends in the SQL
Server community, who are too numerous to mention, gave me lots of encouragement to
finish the book. Finally, I want to apologize to my two miniature Dachshunds, Ruby and
Roxy, who were ignored far too often over the past year as I worked on this book. I owe
you lots of belly rubs!

About the Technical Reviewer
Denny Cherry has over a decade of experience managing SQL Server, including some
of the largest deployments in the world. Denny's areas of technical expertise include
system architecture, performance tuning, replication and troubleshooting. Denny
currently holds several of the Microsoft Certifications related to SQL Server for versions
2000 through 2008 including the Microsoft Certified Master, as well as having been a
Microsoft MVP for several years. Denny is a long-time member of PASS and has written
numerous technical articles on SQL Server management and how SQL Server integrates
with various other technologies for SearchSQLServer.com, as well as several books
including Securing SQL Server. Denny blogs regularly at http://itke.techtarget.com/
sql-server, as well as at http://sqlexcursions.com where information about boutique
training events can be found.

Introduction
At its heart, this is (yet) another book about SQL Server Performance, but with the
following crucial difference: rather than focus on tweaking queries, adding indexes, and
all the tuning and monitoring that is necessary once a SQL Server database is deployed,
we start right at the very beginning, with the bare metal server hardware on which SQL
Server is installed.

This book provides a detailed review of current and upcoming hardware, including
processors, chipsets, memory, and storage subsystems, and offers advice on how to make
the right choice for your system and your requirements. It then moves on to consider the
performance implications of the various options and configurations for SQL Server, and
the Operating System on which it is installed, covering such issues as:

• strengths and weaknesses of the various versions and editions of Windows Server, and
their suitability for use with different versions and editions of SQL Server

• how to install, patch, and configure the operating system for use with SQL Server

• SQL Server editions and licenses

• installing and configuring SQL Server, including how to acquire and install Service
Packs, Cumulative Updates, and hot-fixes

• methods for quickly and easily upgrading to newer versions of the operating system
and SQL Server with minimal down-time.

In short, this book focuses on all of the things you need to consider and complete before
you even design or create your first SQL Server database.

Who is this book for?
The primary audience for this book is the Database Administrator, assigned the task
of the design and subsequent maintenance of the SQL Server systems that support the
day-to-day business operations of their organization.

I've often been surprised by how little some DBAs seem to know about the hardware that
underpins their SQL Server installations. In some cases, this is because the DBA has other
interests and responsibilities, or they are just not interested in low-level hardware details.
In other cases, especially at larger companies, there are bureaucratic and organizational
roadblocks that discourage many DBAs from being knowledgeable and involved in the
selection, configuration, and maintenance of their database server hardware.

Many medium to large companies have separate departments that are responsible for
hardware selection, configuration, and maintenance, and the DBA is often completely at
their mercy, with no access or control over anything besides SQL Server itself. Conversely,
in many smaller companies, the DBA's responsibilities extend beyond SQL Server, to the
hardware and operating system, whether they like it or not. Some such DBAs may often
find themselves overwhelmed, and wishing that they had a dedicated department to take
care of the low-level details so that the DBA could concentrate on SQL Server.

If you're one of the DBAs who is responsible for everything, this book will help you be
more self-sufficient, by giving you the fundamental knowledge and resources you need to
make intelligent choices about the selection, configuration, and installation of hardware,
the operating system, and SQL Server. If you're at a larger company, it will help put you
in a better and stronger position to work effectively with other team members or depart-
ments in your organization, in choosing the appropriate hardware for your system.

In either case, this book will help you ensure that your SQL Server instances can handle
gracefully the CPU, memory, and I/O workload generated by your applications, and that
the operating system and SQL Server itself are installed, patched, and configured for
maximum performance and reliability.

How is the book structured?
Chapter 1 covers hardware fundamentals from a SQL Server perspective, focusing
on processors, motherboards, chipsets, and memory. Once you know how to evaluate
hardware for use with different types of SQL Server workloads, you will be in a much
better position to choose the best hardware for your available budget.

Chapter 2 delves into the storage subsystem, including consideration of storage
configuration (DAS or SAN), drive types (magnetic or solid-state), drive sizing, RAID
configuration and more. Most high-volume SQL Server workloads ultimately run into
I/O bottlenecks that can be very expensive to alleviate. Selecting, sizing, and configuring
your storage subsystem properly will reduce the chances that you will suffer from I/O
performance problems.

Chapter 3 provides a detailed examination of the database and application benchmarking
tools, such as Geekbench and SQLIO, which will allow the DBA to verify that
the system should perform adequately for the predicted workload. Never just hope for the
best! The only way to know for sure is to test.

Chapter 4 covers a number of useful hardware investigation tools, including CPU-Z and
Task Manager, which can identify precisely what kind of hardware is being used in an
existing system, from the motherboard to the processor(s), to the memory and storage
subsystem, and how that hardware is configured.

Chapter 5 is a deep dive into the different versions and editions of the Windows Server
Operating System. Once you have acquired the hardware for your database server and
racked it, someone needs to select, install, and configure the operating system. Starting at
Windows Server 2003, it discusses some of the strengths and weaknesses of the various
versions and editions of Windows Server, and their suitability for use with different
versions and editions of SQL Server. It covers how to install, patch, and configure the
operating system for use with SQL Server. Again, depending on your organization and
policies, you may be doing this yourself, or you may have to convince someone else to do
something in a specific way for the benefit of SQL Server.

Chapter 6 is an exploration of the various SQL Server versions and editions. Once the
operating system is installed, patched, and configured, someone needs to install, patch,
and configure SQL Server itself. Before you can do this, you need to know how to choose
the version and edition of SQL Server that is most appropriate for your business require-
ments and budget. Each new version of SQL Server has added new editions that have
different capabilities, which makes this choice more complicated. Do you need to use Enter-
prise Edition, or will a lower edition serve your business needs?

Chapter 7 will cover how to properly install, patch, and configure SQL Server for
maximum performance, scalability, security and reliability. After you have acquired your
SQL Server licenses, you are finally ready to install, patch, and configure SQL Server itself
for maximum performance and reliability. Unfortunately, the setup programs for SQL
Server 2005, 2008 and 2008 R2 do not always make the best default choices for this area.
The chapter will demonstrate how to create slipstream installation media, and how to
acquire and install Service Packs, Cumulative Updates, and hot-fixes. We will also discuss
different methods for quickly and easily upgrading to newer versions of the operating
system and SQL Server with minimal down-time.

Appendix A supplements Chapter 1 and provides lower-level details on a range of recent
Intel and AMD processors and chipsets.

Appendix B supplements Chapter 7 and provides a full walk-through of the installation of
a SQL Server 2008 R2 Cumulative Update.

Appendix C contains a list of many of the abbreviations and acronyms used in this book,
with their definitions.

Code examples
Throughout this book are scripts demonstrating various ways to gather data concerning
the configuration of your hardware, operating system, and SQL Server instances. All
examples should run on all versions of SQL Server from SQL Server 2005 upwards, unless
specified otherwise.

To download all code samples presented in this book, visit the following URL:
www.simple-talk.com/RedGateBooks/GlennBerry/SQLServerHardware_Code.zip.

Other sources of hardware information


Enthusiast hardware review websites are extremely valuable resources in helping the DBA
stay abreast of new and upcoming hardware and technology. It is not unusual for both
Intel and AMD to introduce new CPUs and chipsets in the desktop and enthusiast space
before they are rolled out to the server space. The better hardware review sites provide
very deep and extensive background papers, reviews, and benchmarks of new products,
often weeks or months before the server version is available. The two sites I frequent
most regularly are:

• AnandTech (www.anandtech.com/) – a valuable hardware enthusiast website, started
by Anand Lal Shimpi in 1997, and including coverage of desktop-oriented hardware.
Over time, a specific section of AnandTech, called AnandTech IT
(http://www.anandtech.com/tag/IT), has become an extremely useful resource for
IT related reviews and benchmarks.

• Tom's Hardware (www.tomshardware.com) – a useful hardware review site that was
started in 1996 by Dr. Thomas Pabst and is available in several different languages. Not
quite as focused on server hardware as AnandTech IT.

In addition, there are several bloggers in the SQL Server community who regularly cover
hardware-related topics:

• Joe Chang (http://sqlblog.com/blogs/joe_chang/default.aspx) – a well-known
SQL Server consultant who specializes in performance and hardware evaluation.

• Linchi Shea (http://sqlblog.com/blogs/linchi_shea/default.aspx) – a SQL Server
MVP since 2002, Linchi writes frequently about server hardware, performance
and benchmarking.

• Glenn Berry (http://sqlserverperformance.wordpress.com) – yes, that's me! I write
quite often about server hardware from a SQL Server perspective.

Chapter 1: Processors and Associated Hardware
Relational databases place heavy demands on their underlying hardware. Many databases
are mission-critical resources for multiple applications, where performance bottlenecks
are immediately noticeable and often very costly to the business. Despite this, many
database administrators are not very knowledgeable about server hardware.

Part of the problem is that, when evaluating a processor for a SQL Server installation, the
DBA faces an initially intimidating array of choices and considerations including, but not
limited to:

• number of sockets

• processor CPU clock speed and cache size

• processor architecture, including number of cores, hyper-threading options

• choice of associated motherboard and chipsets

• choice of controllers, network interfaces, and so on.

In the face of such an overwhelming number of options, it's easy, and relatively
common, for less experienced DBAs to make poor choices, and/or to hamstring a poten-
tially powerful system by overlooking a crucial detail. I've had direct experience with
expensive, high-performance database servers equipped with fast multi-core processors
and abundant RAM that are performing very poorly because of insufficient disk I/O
capacity to handle the requirements of a busy SQL Server workload. I have also seen
many instances where a production database server with multiple, modern, multi-core
processors was hobbled by only having 8 GB of RAM, thereby causing extreme memory
and I/O pressure on the system and very poor performance.

This chapter will examine each of the critical aspects of evaluating and selecting a
processor and associated hardware, for your database server. It will explain the options
available in each case, offer advice regarding the most appropriate choice
of processor and chipset for SQL Server, given the nature of the workload, and discuss
other factors that will influence your choices.

SQL Server Workload Types


The type of workload that must be supported will have a huge impact on your choice of
hardware, including processors, memory, and the disk I/O subsystem, as well as on sizing
and configuration of that hardware. It will also affect your database design and indexing
decisions, as well as your maintenance strategy.

There are two primary relational workload types that SQL Server commonly has to
deal with, the first being Online Transaction Processing (OLTP) and the second being
Decision Support System / Data Warehouse (DSS/DW). OLTP workloads are charac-
terized by numerous short transactions, where the data is much more volatile than in
a DSS/DW workload. There is usually much higher write activity in an OLTP workload
than in a DSS workload and most OLTP systems generate more input/output (I/O) opera-
tions per second (IOPS) than an equivalent-sized DSS system.

A DSS or DW system usually has longer-running queries than a similar size OLTP system,
with much higher read activity than write activity, and the data is usually more static. In
such a system, it is much more important to be able to process a large amount
of data quickly, than it is to support a high number of I/O operations per second.

You really should try to determine what type of workload your server will be supporting,
as soon as possible in the system design process. You should also strive to segregate OLTP
and DSS/DW workloads onto different servers and I/O subsystems whenever you can. Of
course, this is not always possible, as some workloads are mixed in nature, with character-
istics of both types of workloads.
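
To get a rough, first-pass picture of whether an existing workload leans more toward OLTP or
DSS/DW, you can compare cumulative read and write activity using the
sys.dm_io_virtual_file_stats dynamic management function (available from SQL Server 2005
onwards). The query below is only a minimal sketch of that approach; the figures are
cumulative since the last instance restart, so interpret them with that caveat in mind.

-- Rough read/write activity per database since the last SQL Server restart
SELECT DB_NAME(vfs.database_id) AS [Database Name] ,
       SUM(vfs.num_of_reads) AS [Total Reads] ,
       SUM(vfs.num_of_writes) AS [Total Writes] ,
       SUM(vfs.num_of_bytes_read) / 1048576 AS [MB Read] ,
       SUM(vfs.num_of_bytes_written) / 1048576 AS [MB Written]
FROM   sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
GROUP BY vfs.database_id
ORDER BY [Total Reads] DESC ;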

Throughout the process of selecting, sizing and configuring the various pieces of
necessary hardware, we'll discuss, in each case, how the type of workload will affect
your choices.

Evaluating Processors
The heart of any database server is the central processing unit (CPU). The CPU executes
instructions and temporarily stores small amounts of data in its internal data caches.
The CPU provides four basic services: fetch, decode, execute, and store. The CPU carries
out each instruction of the executing program in sequence, performing the lowest level
operations as quickly as possible.

Most people, when evaluating CPUs, focus on CPU capacity, i.e. the rated clock speed,
measured in cycles/second (Gigahertz, GHz), and on cache size, in megabytes (MB).
These are certainly important factors, but don't make the common mistake of only
focusing on these properties when comparing the expected performance of processors
from different manufacturers, or different processor families. Instead, you need
to consider the overall architecture and technology used in the processors under
comparison. As you become more familiar with how to identify and characterize a
processor based on its model number (explained later in the chapter), so it will become
easier to understand the architectural and performance differences between different
models and generations of processors.

It is also very useful to consider standardized synthetic and application-level benchmark
scores when evaluating processor performance. For example, you may want to estimate
how much more CPU capacity a new six-core processor will offer compared to an old
single-core processor. This will allow you to make a more realistic assessment of whether,
for example, a proposed server consolidation effort is feasible. These techniques are
discussed in much more detail in Chapter 3, but suffice to say here that, using them, I was
recently able to successfully consolidate the workload from four older, (2007 vintage)
four-socket database servers into a single, new two-socket server, saving about $90K in
hardware costs and $350K in SQL Server license costs.

Cache size and the importance of the L2 and L3 caches
All server-class, Intel-compatible CPUs have multiple levels of cache. The Level 1 (L1)
cache has the lowest latency (i.e. the shortest delays associated with accessing the data),
but the least amount of storage space, while the Level 2 (L2) cache has higher latency,
but is significantly larger than the L1 cache. Finally, the Level 3 (L3) cache has the highest
latency, but is even larger than the L2 cache. In many cases, the L3 cache is shared among
multiple processor cores. In older processors, the L3 cache was sometimes external to the
processor itself, located on the motherboard.

Whenever a processor has to execute instructions or process data, it searches for the data
that it needs to complete the request in the following order:

1. internal registers on the CPU

2. L1 cache (which could contain instructions or data)

3. L2 cache

4. L3 cache

5. main memory (RAM) on the server

6. any cache that may exist in the disk subsystem

7. actual disk subsystem.

The further the processor has to follow this data retrieval hierarchy, depicted in
Figure 1.1, the longer it takes to satisfy the request, which is one reason why cache
sizes on processors have gotten much larger in recent years.

Figure 1.1: A typical data retrieval hierarchy. The figure shows, from fastest to slowest:
Level 1 Cache (per core, 32KB, roughly 2 ns latency), Level 2 Cache (per core, 256KB,
roughly 4 ns), Level 3 Cache (shared, 12MB, roughly 6 ns), Main Memory (64GB, roughly
50 ns), and the Disk Subsystem (terabytes, roughly 20 ms).

For example, on a newer server using a 45 nm Intel Nehalem-EP processor, you might
see an L1 cache latency of around 2 nanoseconds (ns), L2 cache latency of 4 ns, L3 cache
latency of 6 ns, and main memory latency of 50 ns. When using traditional magnetic
hard drives, going out to the disk subsystem will have an average latency measured
in milliseconds. A flash-based storage product (like a Fusion-io card) would have an
average latency of around 25 microseconds. A nanosecond is a billionth of a second; a
microsecond is a millionth of a second, while a millisecond is a thousandth of a second.
Hopefully, this makes it obvious why it is so important for system performance that the
data is located as short a distance down the chain as possible.

The performance of SQL Server, like most other relational database engines, is hugely
dependent on the size of the L2 and L3 caches. Most processor families will offer
processor models with a range of different L2 and L3 cache sizes, with the cheaper
processors having smaller caches, and I advise you, where possible, to favor processors
with larger L2 and L3 caches. Given the business importance of many SQL Server
workloads, economizing on the L2 and L3 cache sizes is not usually a good choice.

If the hardware budget limit for your database server dictates some form of compromise,
then I suggest you opt to economize on RAM in order to get the processor(s) you want.
My experience as a DBA suggests that it's often easier to get approval for additional RAM
at a later date, than it is to get approval to upgrade a processor. Most of the time, you will
be "stuck" with the original processor(s) for the life of the database server, so it makes
sense to get the one you need.

Clock speed
Gordon Moore, one of the founders of Intel, first articulated what is known as "Moore's
Law" in 1965. Moore's Law states that microprocessor performance doubles roughly every
18–24 months. For the 20–30 years up until about 2003, both Intel and AMD were able
to keep up with Moore's Law, with processor performance approximately doubling about
every eighteen months. Both manufacturers increased microprocessor performance
primarily by increasing the clock speed of the processor, so that single-threaded opera-
tions completed more quickly.

One problem encountered as clock speeds increase, however, is that the processor uses
more electrical energy and it dissipates more heat. Around 2003, both Intel and AMD
began to run into reliability problems, as processor speeds approached 4GHz (with air
cooling). In addition, each stick of memory in a server uses additional electrical power,
and requires additional cooling capacity. Modern 1U servers support two CPU sockets,
and often have 12 or 18 memory slots, so they can require a significant amount of
electrical and cooling capacity.

The ever-increasing processor and memory density of 1U rack-mounted or blade servers
meant that many datacenters began to have issues, as a large number of fully-populated
racks often exceeded the power and cooling capacity of the facility. As a result, many
datacenters got to the point where they could not always allow customers to have
fully-populated 42U racks of servers.

These factors, along with significant cost advantages due to the way in which SQL
Server is licensed, have led to the increasing popularity of multi-core processors for
server usage.

Multi-core processors and hyper-threading


The problems regarding excessive power consumption and heat generation at very high
clock speeds have led to the continued and increasing popularity of multi-core processors
which, along with multi-threaded applications, can break larger tasks into smaller slices
that can be run in parallel. This means that the tasks complete quicker, and with lower
maximum power consumption.

Sockets, cores, and SQL Server licensing


I recently followed a rather confused conversation on Twitter where someone stated that
he tended to "use the terms sockets and cores interchangeably, since everyone else does."
Back in prehistoric days (around 2001), you could probably get away with this, since all
Intel and AMD processors had but a single core, and so one socket meant one physical
processor and one physical core.

However, with the advent of multi-core processors and hyper-threading, it is simply
incorrect to equate sockets to cores. The hierarchy works like this:

• physical socket – this is the slot on a motherboard where a physical processor fits

• physical core – in a multi-core processor, each physical processor unit contains
multiple individual processing cores, enabling parallel processing of tasks

• logical core – with the hyper-threading technology, a physical core can be split into
two logical cores (logical processors), again facilitating parallel execution of tasks.

In the days of single-core processors, if you wanted multiple threads of execution, your
only option was to add physical sockets and processors. In 2002, however, Intel intro-
duced the first processor with hyper-threading (covered in more detail shortly). Within
each physical processor core, hyper-threading creates two logical processors that are
visible to the operating system.

By splitting the workload across two logical cores, hyper-threading can improve
performance by anywhere from 5–30%, depending on the application.

In 2005, AMD introduced the first dual-core processor, the Athlon 64 X2, which presented
two discrete physical processor cores to the Windows operating system, and provided
better multi-threaded performance than hyper-threading. In late 2006, Intel introduced
the first Core2 Quad, which was a processor with four physical cores (but no hyper-
threading). Since then, both AMD and Intel have been rapidly increasing the physical
core counts of their processors. AMD has a processor called the Opteron 61xx, Magny
Cours, which has 12 physical cores in a single physical processor. Intel has the Xeon 75xx,
Nehalem-EX, which has eight physical cores, plus second-generation hyper-threading.
This means that you have a total of 16 cores visible to Windows and SQL Server for each
physical processor.

The critical point to bear in mind here is that, unlike for other database products, such
as Oracle or DB2, SQL Server licensing (for "bare metal" servers) is only concerned with
physical processor sockets, not physical cores, or logical cores. This means that the
industry trend toward multiple core processors, with increasing numbers of cores, will
continue to benefit SQL Server installations, since you will get more processing power
without having to pay for additional processor licenses. Knowing this, you should always
buy processors with as many cores as possible (regardless of your workload type) in order
to maximize your CPU performance per processor license. For example, it would make
sense, in most cases, to have one quad-core processor instead of two dual-core processors.
Having additional processor cores helps increase your server workload capacity for
OLTP workloads, while it increases both your workload capacity and performance for
DSS/DW workloads.

Of course, there are some restrictions. Both SQL Server 2005 and SQL Server 2008 are
limited to 64 logical processors. In order to use more, you must be running SQL Server
2008 R2 on top of Windows Server 2008 R2, which will raise your limit to 256 logical
processors. SQL Server 2008 R2 Enterprise Edition has a license limit of eight physical
processor sockets. If you need more than eight physical processors, you will need to run
SQL Server 2008 R2 Datacenter Edition.
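
If you want to confirm how many logical processors a given instance can actually see and
use, one simple approach (a hedged sketch, requiring VIEW SERVER STATE permission) is to
count the visible online schedulers, since SQL Server creates one such scheduler for each
logical processor it is able to use.

-- Count the schedulers SQL Server is actually using
-- (one visible online scheduler per usable logical processor)
SELECT COUNT(*) AS [Visible Online Scheduler Count]
FROM   sys.dm_os_schedulers
WHERE  status = N'VISIBLE ONLINE' ;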

Nevertheless, within these restrictions, and given the rapid advancement in processor
architecture, the sky is more or less the limit in terms of available processing power. In
mid-2011, AMD will release the Bulldozer processor, which will have 16 physical cores per
physical processor, while Intel has recently released the Xeon E7, Westmere-EX family,
which has 10 physical cores, plus second-generation hyper-threading, which means that
you will have a total of 20 logical cores visible to Windows per physical processor. So, with
eight SQL Server 2008 R2 Enterprise Edition licenses, and eight Intel E7-8870 processors,
you can get up to 160 logical processors!

As a result of such advancements in processor technology, it is becoming more and
more common to utilize excess processing power to relieve the pressure on other parts
of the system.

Making use of excess CPU capacity: data and backup compression
While processor power has increased rapidly, the performance of the main memory
and the disk subsystem has not improved nearly as quickly, so there is now a very large
disparity between processor performance and the performance of other components of
the hardware system. As such, it is becoming increasingly common to use surplus CPU
capacity to perform tasks such as data compression, backup compression, and log stream
compression, in order to help relieve pressure on other parts of the system.

In particular, the idea of using excess processor capacity to compress and decompress
data as it is written to, and read from, the disk subsystem, has become much more
prevalent. SQL Server 2008 and 2008 R2 provide both data compression (Page or Row)
and SQL Server native backup compression. In SQL Server 2008, these are all
Enterprise Edition-only features. In SQL Server 2008 R2, backup compression is included
in Standard Edition. SQL Server 2008 R2 also adds Unicode data compression for
nvarchar and nchar data types, which makes data compression even more useful in
many cases. A good example of this would be if you are using Unicode data types to store
mostly Western European language characters, which Unicode data compression does an
excellent job of compressing, often being able to reduce the required storage space by up
to 50%. Data compression is explained in more detail in Chapter 7.

Both of these features can be very effective in reducing stress on your I/O subsystem,
since data is compressed and decompressed by the CPU before it is written to, or read
from, the disk subsystem. This reduces I/O activity and saves disk space, at the cost of
extra CPU pressure. In many cases, you will see a significant net performance gain from
this tradeoff from I/O to CPU, especially if you were previously I/O bound. SQL Server
data compression can also reduce memory pressure, since the compressed data stays
compressed in memory after it is read in off the disk subsystem, only being decompressed
if the data is changed while it is in the buffer pool. Keep in mind that you need to be
more selective about using data compression on individual indexes with OLTP workloads
than you do with DSS/DW workloads. This is because the CPU cost of compressing and
decompressing highly volatile data goes up quickly with more volatile tables and indexes
in an OLTP system.
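
As an illustration only, the sketch below shows one way to evaluate and then apply Page
compression to a single table on SQL Server 2008 or later (an Enterprise Edition feature);
the dbo.SalesOrderDetail table name is purely a placeholder, and you should always check
the estimated savings and the extra CPU cost against your own data and workload first.

-- Estimate the space savings from PAGE compression for a hypothetical table
EXEC sp_estimate_data_compression_savings
    @schema_name = N'dbo' ,
    @object_name = N'SalesOrderDetail' ,
    @index_id = NULL ,
    @partition_number = NULL ,
    @data_compression = N'PAGE' ;

-- If the savings look worthwhile, rebuild the indexes with PAGE compression
ALTER INDEX ALL ON dbo.SalesOrderDetail
REBUILD WITH ( DATA_COMPRESSION = PAGE ) ;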

Another new feature in SQL Server 2008 is log stream compression for database
mirroring, which is enabled by default. This feature uses the CPU to compress the log
stream on the Principal instance before it is sent over the network to the Mirror instance.
This can dramatically reduce the network bandwidth required for database mirroring,
at the cost, again, of some extra CPU activity. Generally, this also gives better overall
database mirroring performance.

Bearing the increasing importance of various forms of compression in mind, you may
want to purposely overprovision your CPU capacity, realizing that you may be devoting
some of this extra capacity to compression activity. It is not unusual to see CPU utili-
zation go up by 10–15% during a database backup that uses SQL Server native backup
compression, while heavy use of data compression can also cause increased CPU
utilization. Compared to the cost of additional I/O capacity and storage space, having
the best available CPU and using its excess capacity for compression can be a very cost-
effective solution.
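
As a simple example of this CPU-for-I/O trade, the command below (with a placeholder
database name and backup path) takes a natively compressed backup; native backup
compression requires Enterprise Edition in SQL Server 2008, or Standard Edition and above
in SQL Server 2008 R2.

-- Native backup compression: a smaller backup file and less backup I/O,
-- in exchange for some extra CPU activity during the backup
BACKUP DATABASE AdventureWorks
TO DISK = N'D:\SQLBackups\AdventureWorks.bak'
WITH COMPRESSION , INIT , STATS = 10 ;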

Hyper-threading
Another processor feature to consider (or perhaps reconsider) is Intel hyper-threading.
Hyper-threading (HT) is Intel's marketing term for its simultaneous multi-threading
architecture where each physical processor core is split into two logical cores. Note that
"simultaneous" doesn't mean that you can have two threads running simultaneously on
two logical cores; the threads run alternately, one working while the other is idle.

Unlike physical cores, which are completely separate, logical cores have to share some resources,
such as the L2 cache, but they do offer a noticeable performance benefit under some
circumstances, with some types of workloads.

Hyper-threading was originally implemented in 2002, as part of the NetBurst architecture
in the Northwood-based Pentium 4 processors, and equivalent Xeon family. Intel had
noticed that, quite often, the processor was waiting on data from main memory. Rather
than waste processor cycles during this wait time, the idea was to have a second logical
processor inside the physical core that could work on something different whenever the
first logical processor was stalled waiting on data from main memory.

Many server and workstation applications lend themselves to parallel, multi-threaded
execution. Intel hyper-threading technology enables simultaneous multi-threading
within each processor core, up to two threads per core. Hyper-threading reduces
computational latency, making better use of every clock cycle. For example, while
one thread is waiting for a result or event, another thread is executing in that core,
to maximize the work from each clock cycle. This idea works pretty well for desktop
applications. The classic example is doing an anti-virus scan in the background, while in
the foreground the user is working on a document in another application. One logical
processor handles the virus scan, while the other logical processor handles the foreground
activity. This allows the foreground application to remain more responsive to the user
while the virus scan is in progress, which is a much better situation than with a single
non-hyper-threaded processor.

Unfortunately, the initial implementation of hyper-threading on the Pentium 4 NetBurst
architecture did not work very well on many server workloads (such as SQL Server). The
L2 data cache was shared between both logical processors, which caused performance
issues as the L2 cache had to be constantly reloaded as application context switched
between each logical processor. This behavior was commonly known as cache thrashing,
which often led to an overall performance decrease for server applications such as
Microsoft Exchange and Microsoft SQL Server.

As such, it was pretty common for database administrators to disable hyper-threading for
all SQL Server workloads. However, the newer (second-generation) implementation of
hyper-threading (as used in the Intel Nehalem or Westmere based Xeon 5500, 5600, 6500,
and 7500 families) seems to work much better for many server workloads, and especially
with OLTP workloads. It is also interesting that every TPC-E benchmark submission that
I have seen on Nehalem-EP, Nehalem-EX, and Westmere-EP based platforms has had
hyper-threading enabled.

My rule-of-thumb advice for the use of hyper-threading is to enable it for workloads
with a high OLTP character, and disable it for OLAP/DW workloads. This may initially
seem counter-intuitive, so let me explain in a little more detail. For an OLAP workload,
query execution performance will generally benefit from allowing the query optimizer to
"parallelize" individual queries. This is the process where it breaks down a query, spreads
its execution across multiple CPUs and threads, and then re-assembles the result. For an
ideal OLAP workload, it would be typical to leave the Max Degree of Parallelism (MAXDOP)
setting at its default value of zero (unbounded), allowing the optimizer to spread the
query execution across as many cores as are available (see the Max degree of parallelism
section in Chapter 7 for full details).

Unfortunately, it turns out that this only works well when you have a high number of
true physical cores (such as those offered by some of the newer AMD Opteron 6100 series
Magny Cours processors). Complex, long-running queries simply do not run as well on
logical cores, so enabling HT tends to have a detrimental impact on performance.

In contrast, an OLTP workload is, or should be, characterized by a large number of short
transactions, which the optimizer will not parallelize as there would be no performance
benefit; in fact, for an ideal OLTP workload, it would be common to restrict any single
query to using one processor core, by setting MAXDOP to 1. However, this doesn't mean
that OLTP workloads won't benefit from HT! The performance of a typical, short OLTP
query is not significantly affected by running on a logical core, as opposed to a physical
core so, by enabling HT for an OLTP workload, we can benefit from "parallelization" in
the sense that more cores are available to process more separate queries in a given time.
This improves capacity and scalability, without negatively impacting the performance of
individual queries.

Of course, no real workload is perfectly OLTP, or perfectly OLAP; the only way to know
the impact of HT is to run some tests with your workload. Even if you are still using the
older, first generation implementation of hyper-threading, it is a mistake to always disable
it without first investigating its impact in your test environment, under your workload.
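
As a rough sketch of the instance-level setting discussed above (covered properly in
Chapter 7), the commands below set the Max Degree of Parallelism to 1, a common starting
point for a pure OLTP workload; a DSS/DW workload would more typically use the default of
0, or a value based on the number of physical cores, and any value you choose should be
validated by testing against your own workload.

-- Set the instance-wide Max Degree of Parallelism
-- (1 is a common starting point for OLTP; 0 lets the optimizer use all available cores)
EXEC sp_configure 'show advanced options', 1 ;
RECONFIGURE ;
EXEC sp_configure 'max degree of parallelism', 1 ;
RECONFIGURE ;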

A final, important point to remember is that Windows and SQL Server cannot tell the
difference between hyper-threaded logical processors and true dual-core or multi-core
processors. One easy way to get some hardware information, including the number of
logical cores, is via the T-SQL DMV query shown in Listing 1.1 (requires VIEW SERVER
STATE permission). It returns the logical CPU count, which is the total number of CPUs
visible to the operating system, the hyper-thread ratio (which can be a combination of
actual multiple cores and hyper-threaded cores), the number of physical CPUs in the
system and, finally, the amount of physical RAM installed in a system.

-- Hardware information from SQL Server 2005/2008/2008 R2
-- (Cannot distinguish between HT and multi-core)
SELECT cpu_count AS [Logical CPU Count] ,
       hyperthread_ratio AS [Hyperthread Ratio] ,
       cpu_count / hyperthread_ratio AS [Physical CPU Count] ,
       physical_memory_in_bytes / 1048576 AS [Physical Memory (MB)]
FROM   sys.dm_os_sys_info ;

Listing 1.1: CPU configuration.

The hyperthread_ratio column treats both multi-core and hyper-threading the same
(which they are, as far as the logical processor count goes), so it cannot tell the difference
between a quad-core processor and a dual-core processor with hyper-threading enabled.
In each case, this query would report a hyperthread_ratio of 4.

Processor Makes and Models


In this section, we'll review details of recent, current, and upcoming processors from both
Intel and AMD, with a focus on factors that are important for SQL Server workloads.
These include the following items:

• specific model number

• manufacturing process technology

• rated clock speed

• number of cores and threads

• cache sizes and types

• supported instruction sets.

This will help you evaluate the processors in your existing database servers and it will
help you when the time comes to buy a new database server. Additional reference infor-
mation for each processor series can be found in Appendix A.

Since SQL Server only runs on the Microsoft Windows operating system, we only
have to worry about Intel-compatible processors for this discussion. All modern Intel-
compatible x86 and x64 processors are Complex Instruction Set Computer (CISC)
processors, as opposed to Reduced Instruction Set Computer (RISC) processors. Intel
also has the EPIC-based Itanium and Itanium2 IA64 processors, which are popular with
some larger companies.

My overall philosophy is that, for each physical socket in a SQL Server database server,
you should buy and use the very best individual processor. This is a somewhat different
strategy from the one I use when it comes to normal laptop or workstation hardware
selection but, to summarize what we've discussed to this point, my reasoning is
outlined below.

• SQL Server is licensed by physical processor socket, so you really want to get as much
performance and capacity as you can for each processor license that you purchase.

• The incremental cost of getting the top-of-the-line processor for each socket is quite
small compared to the overall system cost (especially when you factor in SQL Server
license costs).

• You can use excess processor performance and capacity to perform data compression
or backup compression, which will relieve pressure on your storage subsystem at a
lower cost than many other solutions.

• By selecting the best available processor, you may be able to run your workload on a
two-socket machine instead of a four-socket machine. If you can do this, the savings
in SQL Server license costs can more than pay for the hardware cost of the server itself,
effectively making your two-socket server free from a hardware-cost perspective.

Intel is currently pretty dominant in the server arena, both in terms of performance and
market share. AMD is Intel's main competitor here, but they have been struggling to
keep up with Intel in terms of performance since about 2007. However, AMD does offer
processors that are very competitive from a cost perspective.

Intel Xeon processors


In the x86/x64 server space, Intel has various incarnations of the Xeon family. The first
generation of the Xeon (which replaced the Pentium Pro), released back in 1998, was
based on the Pentium II processor.

The main current branches of the family are:

• Xeon 3000 series – for single-socket motherboards

• Xeon 5000 series – for dual-socket motherboards

• Xeon 6000 series – for dual- or quad-socket motherboards

• Xeon 7000 series – for multi-socket motherboards.

The Intel Itanium family, discussed later in the chapter, uses the 9000 numerical
sequence for its processor numbers.

It's much easier to understand some of the differences between the models if you
know how to decode the model numbers, as shown in Figure 1.2, that Intel uses for
their processors.

Figure 1.2: 2010 Intel Xeon Processor numbering.

Both Intel Xeon and Intel Itanium processor numbers are categorized in four-digit
numerical sequences, and may have an alpha prefix to indicate electrical power usage and
performance. The alpha prefixes are as follows:

• Xxxxx meaning Performance

• Exxxx meaning Mainstream

• Lxxxx meaning Power-Optimized.

So, for example, a Xeon X7460 is a high-end Performance processor for multi-processor
systems, an Intel Xeon E5540 is a Mainstream dual-processor, while an Intel Xeon L5530
is a Power-Optimized dual-processor. The final three digits denote the generation and
performance of the processor; for example, an X7460 processor would be newer and
probably more capable than an X7350 processor. Higher numbers for the last three digits
of the model number mean a newer generation in the family, i.e. 460 is a newer gener-
ation than 350, in this example.

Got all that? Good because, just to confuse everyone, Intel has now (as of April 2011)
rolled out a new processor numbering scheme! All very recent and future processors
will be identified by a Product Line (E3, E5, or E7), followed by a dash, then a four-digit
numerical sequence that denotes the Product Family, an optional alpha suffix (L, used
only to denote low-power models), and then a version number (v2, v3, and so on), as
shown in Figure 1.3.

Figure 1.3: 2011 Intel Xeon Processor numbering.

The four-digit sequence describing the product family breaks down as follows. The first
number after the dash denotes the CPU "wayness," which means the maximum number
of CPUs allowed in a node (which is generally an entire machine). System vendors
can combine multiple nodes with special interconnects to build a server with 16 or 32
processors, for example. The second number denotes the Socket Type of the CPU, which
can be 2, 4, 6, or 8. This refers to the physical and electrical socket design into which the
processor plugs. The third and further digits are the Processor Stock Keeping Unit (SKU).
Higher numbers for the SKU mean higher-performance parts (so a 70 SKU would be a
higher-performance model than a 20 SKU).

Finally, an "L" suffix, if present, will denote a Low-Power model (i.e. a model optimized to
reduce electrical power usage). First-generation releases of a given processor will have no
version number, but subsequent generations of the same processor will be denoted by v2,
v3, and so on. Intel has already released the first E3 and E7 processors, while the E5
product line will be released later in 2011.

In my opinion, for SQL Server usage you should always choose the Performance
models with the X model prefix (or higher SKU numbers, in the new naming system).
The additional cost of an X series Xeon processor, compared to an E series, is minimal
compared to the overall hardware and SQL Server license cost of a database server
system.

You should also avoid the power-optimized L series, since they can reduce processor
performance by 20–30% while only saving 20–30 watts of power per processor, which is
pretty insignificant compared to the overall electrical power usage of a typical database
server.

Disabling power management features

On mission-critical database servers, it is extremely important to disable power management features in
the system BIOS, or set them to OS Control. This is discussed in much more detail in Chapter 5.

Figures 1.4 and 1.5 display the CPU information for a couple of different processors, using
the CPU-Z tool (see Chapter 4 for more detail). Figure 1.4 shows the information for an
Intel Xeon X5550 processor.

Figure 1.4: Intel Xeon X5550 information displayed in CPU-Z.

The newer Xeon X5550 is a much more capable processor than the older Xeon
E5440, shown in Figure 1.5. Even though the former runs at a slightly slower clock
speed (2.67 GHz vs. 2.83 GHz), it uses a newer micro-architecture and has nearly
double the performance of the E5440 on most processor and memory performance
component benchmarks.

Figure 1.5: Intel Xeon E5440 information displayed in CPU-Z.

Intel Tick-Tock release strategy


Since 2006, Intel has adopted a Tick-Tock strategy for developing and releasing new
processor models. Every two years, they introduce a new processor family, incorporating
a new microarchitecture; this is the Tock release. One year after the Tock release, they
introduce a new processor family that uses the same microarchitecture as the previous
year's Tock release, but using a smaller manufacturing process technology and usually
incorporating other improvements such as larger cache sizes or improved memory
controllers. This is the Tick release.

This Tick-Tock release strategy benefits the DBA in a number of ways. It offers better
predictability regarding when major (Tock) and minor (Tick) releases will be available.
This helps the DBA plan upgrades.

Tick releases are usually socket-compatible with the previous year's Tock release, which
makes it easier for the system manufacturer to make the latest Tick release processor
available in existing server models quickly, without completely redesigning the system. In
most cases, only a BIOS update is required to allow an existing system to use a newer Tick
release processor. This makes it easier for the DBA to maintain servers that are using the
same model number (such as a Dell PowerEdge R710 server), since the server model will
have a longer manufacturing lifespan.

As a DBA, you need to know where a particular processor falls in Intel's processor family
tree if you want to be able to meaningfully compare the relative performance of two
different processors. Historically, processor performance has nearly doubled with each
new Tock release, while performance usually goes up by 20–25% with a Tick release.

Some of the recent Tick-Tock releases are shown in Figure 1.6.

Figure 1.6: Intel's Tick-Tock release strategy.

The manufacturing process technology refers to the size of the individual circuits and
transistors on the chip. The Intel 4004 (released in 1971) series used a 10-micron process;
the smallest feature on the processor was 10 millionths of a meter across. By contrast, the
Intel Xeon Westmere 56xx series (released in 2010) uses a 32 nm process. For comparison,
a nanometer is one billionth of a meter, so 10-microns would be 10,000 nanometers! This
ever-shrinking manufacturing process is important for two main reasons:

• increased performance and lower power usage – even at the speed of light, distance
matters, so having smaller components that are closer together on a processor means
better performance and lower power usage

• lower manufacturing costs – since you can produce more processors from a standard
silicon wafer; this helps make more powerful and more power-efficient processors
available at a lower cost, which is beneficial to everyone, but especially to the database
administrator.

The first Tock release was the Intel Core microarchitecture, which was introduced as
the Woodcrest (for servers) in 2006, with a 65 nm process technology. This was followed
up by a shrink to 45 nm process technology in the dual-core Wolfdale and quad-core
Harpertown processors in late 2007, both of which were Tick releases.

The next Tock release was the Intel Nehalem microarchitecture, which used a 45 nm
process technology, introduced in late 2008. In 2010, Intel released a Tick release, code-
named Westmere, that shrank to 32 nm process technology in the server space. In 2011,
the Sandy Bridge Tock release debuted with the E3-1200 series.

These Tick and Tock releases, plus planned releases, are summarized in Figure 1.7.

Type   Year    Process   Model Families                   Code Name
Tock   2006    65nm      3000, 3200, 5100, 5300, 7300     Core 2 Woodcrest, Clovertown
Tick   2007    45nm      3100, 3300, 5200, 5400, 7400     Core 2 Wolfdale, Harpertown
Tock   2008    45nm      3400, 3500, 5500, 6500, 7500     Nehalem-EP, Nehalem-EX (2010)
Tick   2010    32nm      3600, 5600, E7-8800/4800         Westmere-EP, Westmere-EX (2011)
Tock   2011    32nm      E3-1200, E5, E8?                 Sandy Bridge-EP, Sandy Bridge-EX
Tick   2012?   22nm                                       Ivy Bridge
Tock   2013?   22nm                                       Haswell

Figure 1.7: Intel's Tick-Tock milestones.

Future Intel Xeon releases


The next Tick release after the Xeon 7500 series was the ten-core Xeon
E7-8800/4800/2800 series (Westmere-EX), which became available in April 2011. It has
a process shrink to 32 nm, a larger L3 cache, an improved memory controller, and other
minor improvements, and it supports 32 GB DDR3 DIMMs, giving you the ability to have
2 TB of RAM in a four-socket server (if you have deep enough pockets).

The next Tock release after the Xeon 5600 series will be the eight-core, Xeon E5, Sandy
Bridge-EP series, meant for two-socket servers. It is scheduled for production during
the second half of 2011. It will also have hyper-threading, and an improved version of
Turbo Boost that will be more aggressive about increasing the clock speed of additional
cores in the physical processor based on the overall system temperature. This will help
single-threaded application performance (such as OLTP database workloads).

The subsequent Ivy Bridge Tick release will be a 22 nm process technology die shrink of
Sandy Bridge with some other minor improvements. A Tock release called Haswell is due
sometime in 2013 and Intel is planning to get to 15 nm process technology by late 2013,
11 nm process technology by 2015, and 8 nm process technology by 2017.

Current recommended Intel Xeon processors


Complete details of the current Intel Xeon family can be found at http://ark.intel.com/ProductCollection.aspx?familyID=594, while Appendix A of this book provides further
details of each of the processor series.

Generally speaking, any Intel processor based on the Nehalem microarchitecture or
later can be considered a modern processor and will give very competitive performance.
Older models that relied on the SMP memory architecture and a shared front-side
bus (see the Memory Architecture section, later in this chapter) offer considerably lower
performance, as is borne out by TPC-E benchmark tests results (see Chapter 3).

My current recommendations for Intel processors are below.

One-socket server (OLTP)


Xeon E3-1280 (32 nm Sandy Bridge)

• 3.50 GHz, 8 MB L3 Cache, 5.0 GT/s Intel QPI

• Four cores, Turbo Boost 2.0 (3.90 GHz), hyper-threading

• Two memory channels.

One-socket server (DW/DSS)


Xeon W3690 (32 nm Westmere)

• 3.46 GHz, 12 MB L3 Cache, 6.40 GT/s Intel QPI

• Six cores, Turbo Boost (3.73 GHz), hyper-threading

• Three memory channels.

Two-socket server (OLTP)


Xeon X5690 (32 nm Westmere-EP)

• 3.46 GHz, 12 MB L3 Cache, 6.40 GT/s Intel QPI

• Six cores, Turbo Boost (3.73 GHz), hyper-threading

• Three memory channels.

Two-socket server (DW/DSS)


Xeon E7-2870 (32 nm Westmere-EX)

• 2.40 GHz, 30 MB L3 Cache, 6.40 GT/s Intel QPI

• Ten cores, Turbo Boost 2.0 (2.8 GHz), hyper-threading

• Four memory channels.

Four-socket server (any workload type)


Xeon E7-4870 (32 nm Westmere-EX)

• 2.40 GHz, 30 MB L3 Cache, 6.40 GT/s Intel QPI

• Ten cores, Turbo Boost 2.0 (2.8 GHz), hyper-threading

• Four memory channels.

Eight-socket server (any workload type)


Xeon E7-8870 (32 nm Westmere-EX)

• 2.40 GHz, 30 MB L3 Cache, 6.40 GT/s Intel QPI

• Ten cores, Turbo Boost 2.0 (2.8 GHz), hyper-threading

• Four memory channels.

Intel Itanium and Itanium 2


In the IA64 arena, Intel has the Itanium family. The Intel Itanium architecture is based
on the Explicitly Parallel Instruction Computing (EPIC) model, which was specifically
designed to enable very high Instruction Level Parallelism (ILP). The processor provides a
wide and short pipeline (eight instructions wide and six instructions deep). The original
Merced Itanium was released in 2001, but was quickly replaced by the first McKinley
Itanium2 in 2002. The original Itanium used a 180 nm manufacturing process, and
its performance was simply not competitive with contemporary x86 processors. Since
then, subsequent Itanium2 processors have moved from a 180 nm process all the way to
a 65 nm process, while clock speeds have gone from 900 MHz to 1.73 GHz, and L3 cache
sizes have grown from 1.5 MB to 30 MB. Even so, the Itanium2 family remains a relatively
specialized, low-volume part for Intel.

The Intel Itanium 9000 series is Intel's "big iron": specialized, high-end server processors,
designed for maximum performance and scalability, for RISC replacement usage with
2- to 512-processor server motherboards. RISC is a rival processor technology to the
more widespread CISC processor technology. RISC became popular with the Sun SPARC
workstation line of computers.

I have heard one SQL Server hardware expert describe Itanium as a big semi-trailer truck
compared to the Xeon, which is more like a Formula 1 race car, and certainly they are
intended for use in cases where extremely high RAM and I/O capacity is required, along
with the highest levels of Reliability, Availability and Serviceability (RAS). Itanium-based
machines have many more expansion slots for RAID controllers or host bus adapters, and
many more memory slots, which allow them to have the I/O capacity to handle the very
largest SQL Server workloads.

Further details about the recent generations of the 9000 sequence are listed in Appendix
A. Of future releases, the Intel Itanium Poulson will use the 32 nm manufacturing process
and will have eight cores. It is scheduled for release in 2012. This will presumably be the
Itanium 9400 series. The Intel Itanium Kittridge is the follow-on to the Poulson, due to
be released in 2014. Intel has not yet released any specific details about Kittridge and, in
light of the fact that Microsoft announced in April 2010 that Windows Server 2008
R2, Visual Studio 2010, and SQL Server 2008 R2 will be the last releases of those products
that will support the Itanium architecture, it remains to be seen whether Kittridge will
actually be released.

Given that SQL Server Denali will not support the Itanium processor family, it is
difficult to recommend the use of an Itanium processor for any new SQL Server installa-
tions. Mainstream support for Windows Server 2008 R2 for Itanium-based systems will
end on July 9, 2013, while extended support will end on July 10, 2018.

AMD Opteron processors


In the x86/x64 space, AMD has various versions of the Opteron family. When assessing
AMD processors, it is very helpful, once again, to understand what the model numbers
mean. Recent AMD Opteron processors are identified by a four-digit model number in
the format ZYXX, where Z indicates the product series:

• 1000 Series = 1-socket servers

• 2000 Series = up to 2-socket servers and workstations

• 4000 Series = up to 2-socket servers

• 6000 Series = high-performance 2 and 4-socket servers

• 8000 Series = up to 8-socket servers and workstations.

The Y digit differentiates products within a series. For example:

• Z2XX = dual-core

• Z3XX = quad-core

• Z4XX = six-core

• First-generation AMD Opteron 6000 series processors are denoted by 61XX.

The XX digits indicate a change in product features within the series (for example, in the
8200 series of dual-core processors, we have models 8214, 8216, 8218, and so on), and are
not a measure of performance. It is also possible to have a two-digit product suffix after
the XX model number, as below.

• No suffix – indicates a standard power AMD Opteron processor.

• SE – performance optimized, high-powered.

• HE – low power.

• EE – lowest power AMD Opteron processor.

For example, an Opteron 6176 SE would be a 6000 series, twelve-core, performance
optimized processor; an Opteron 8439 SE would be an 8000 series, six-core, performance
optimized processor, while an Opteron 2419 EE would be a 2000 series, six-core, energy
efficient processor. For mission-critical database servers, I would recommend that
you select the SE suffix processor, if it is available for your server model. The reason
that it is not always available in every server model is due to its higher electrical power
requirements.

Recent Opteron AMD releases, plus planned releases, are summarized in Figure 1.8.
Since 2010, the Magny Cours processor has been AMD's best-performing model.

Year     Process   Model Families                   Code Name
2006     90nm      1200, 2200, 8200                 Santa Ana, Santa Rosa
2007-8   65nm      1300, 2300, 8300                 Budapest, Barcelona
2009     45nm      1300, 2300, 2400, 8300, 8400     Suzuka, Shanghai, Istanbul
2010     45nm      4100, 6100                       Lisbon, Magny Cours
2011     32nm      ??                               Interlagos, Valencia

Figure 1.8: AMD Opteron milestones.

Future Opteron AMD releases


In 2011, AMD will have the Interlagos and Valencia platforms, which will both use the
next-generation Bulldozer-based processors. The Bulldozer will have technology similar
to Intel hyper-threading, where integer operations will be able to run in separate logical
cores (from the operating system's perspective). The Interlagos will be a Socket G34
platform that will support 12- and 16-core 32 nm processors. The Valencia will be a
Socket C32 platform that will use 6- and 8-core 32 nm processors. Interlagos and Valencia
will first begin production in Q2 2011, and AMD is scheduled to launch them in Q3 2011.

In each module, the two integer cores (used only for integer, as opposed to floating point,
operations) will share a 2 MB L2 cache, and there will be an 8 MB L3 cache shared over the
entire physical processor (16 MB on the 16-core Interlagos processor).

AMD is going to include AMD Turbo CORE technology in the Bulldozer processor.
Turbo CORE will boost clock speeds by up to 500 MHz, even with all cores fully utilized.
The current Intel implementation of Turbo Boost technology increases the clock speed of
a few cores only when the others are idle; the upcoming AMD Turbo CORE, by contrast,
will deliver a full core boost, meaning that all 16 threads can have their clock speed
increased by up to 500 MHz for short periods of time for most workloads, limited by the
total power draw of the processor, not the ambient temperature of the system itself.

AMD has also announced greater memory throughput for the newly redesigned memory
controller. AMD claims about a 50% increase in memory throughput with the new
Bulldozer integrated memory controller. About 30% of that increase is from enhance-
ments to the basic design of the memory controller, while the other 20% is from support
of higher speed memory.

Current recommended AMD Opteron processors


Complete details of the current AMD Opteron family can be found at www.amd.com/us/products/server/processors/Pages/server-processors.aspx, and Appendix A
of this book provides further details of each of the processor series. However, my current
recommendations are as follows.

One-socket or budget two-socket server


• Opteron 4184 (45 nm Lisbon), six cores

• 2.8 GHz, 6 MB L3 cache, 6.4 GT/s.

Two-socket server
• Opteron 6180 SE (45 nm Magny Cours), twelve cores

• 2.5 GHz, 12 MB L3 Cache, 6.4 GT/s.

Four-socket server
• Opteron 6180 SE (45 nm Magny Cours), twelve cores

• 2.5 GHz, 12 MB L3 Cache, 6.4 GT/s.

Server Motherboards: how to evaluate motherboards and chipsets
For any given processor, a server vendor may make available a number of different
motherboards, and associated chipsets that support it. When you are evaluating and
selecting a motherboard, you will want to consider such factors as:

• number of sockets

• supported chipsets, including number and type of expansion slots and reliability,
availability, and serviceability (RAS) features

• memory requirements, including number of memory slots and supported memory architecture

• type and number of other integrated components, such as Network Interface Cards (NICs).

Number of sockets
Until recently, it was standard practice to choose a four-socket system for most database
server applications, simply because two-socket systems usually did not have enough
processor cores, memory sockets or expansion slots to support a normal database server
workload.

This prescription has changed dramatically since late 2008. Prior to that time, a typical
two-socket system was limited to 4–8 processor cores and 4–8 memory slots. Now, we see
two-socket systems with up to 24 processor cores, and 18–24 memory slots.

The number and type of expansion slots has also improved in two-socket systems. Finally,
with the widespread acceptance of 2.5" SAS (Serial Attached SCSI) drives (see Chapter 2),
it is possible to have many more internal drives in a 1U or 2U system than you could have
with 3.5" drives. All of these factors allow you to build a very powerful system, while only
requiring two SQL Server processor licenses.

The technology of four-socket Intel systems (in terms of processors and chipsets) usually
lags about one year behind two-socket server systems. For example, the processor
technology available in the two-socket 45 nm Intel Xeon 5500 systems, launched in early
2009, was not available in four-socket systems until early 2010. By mid-2010, the latest
two-socket Intel systems were based on 32 nm Intel Westmere-EP architecture processors
and the latest 4-socket systems were still based on the older 45 nm Intel Nehalem-EX
architecture processors. The 4-socket-capable 32 nm Intel Xeon E7-4800 Westmere-EX
was not introduced until April 5, 2011.

Another factor to consider in this discussion is hardware scaling. You might assume
that a 4-socket system will have twice as much overall load capacity or scalability as
the equivalent 2-socket system, but this assumption would be incorrect. Depending on
the exact processor and chipsets involved, rather than the theoretical maximum of 2.0
you typically see a hardware scaling factor of anywhere from 1.5 to 1.8 as you double the
number of sockets.

This, taken with the fact that the latest 2-socket Intel processors use newer technology
and have faster clock speeds than the latest 4-socket Intel processors, means that you
may be better off with two 2-socket servers instead of one 4-socket server, depending
on your overall workload and your application architecture. You will often see better
single-threaded performance with a brand new 2-socket system compared to a brand
new 4-socket system, especially with OLTP workloads. Single-threaded performance is
especially relevant to OLTP workloads, where queries are usually relatively simple and
low in average elapsed time, normally running on a single processor core.

In general, the latest 2-socket systems from both Intel and AMD are extremely capable,
and should have more than enough capacity to handle all but the very largest database
server workloads.

Server chipsets
When you are evaluating server models from a particular server vendor (such as a Dell
PowerEdge R710), you should always find out which chipset is being used in that server.
Intel processors are designed to use Intel chipsets. For each processor sequence (3000,
5000, 6000, 7000, or 9000), several different chipsets will be compatible with a given
processor. Intel usually has at least two different chipsets available for any processor, with
each chipset offering different levels of functionality and performance.

What you want to focus on as a SQL Server DBA are the number and types of expansion
slots and the number of memory slots supported by the chipset. This affects how many
RAID controllers or Host Bus Adapters you can install in a server, which ultimately limits
your total available I/O capacity and performance.

Also important are RAS features. The goal is to have a system that is always available,
never corrupts data and delivers consistent performance, while allowing maintenance
during normal operation. Examples of RAS features include hot-swappable
components for memory, processors, fans, and power supplies. You also want to see
redundant components that eliminate common single points of failure, such as a power
supply, RAID controller, or a network interface card (NIC).

Details of most recent and current Intel chipsets are available in Appendix A. Based on the
above advice, I recommend you choose a server that uses either the Intel 3420, 5520, or
7500 series chipsets for one-, two-, and four-socket servers respectively.

BIOS
The Basic Input Output System (BIOS) software, built into the motherboard, contains
the first code run by a server when it is turned on. It is used to identify and configure
hardware components, and it is stored in upgradeable firmware. Your hardware system
vendor (such as Dell, HP, or IBM) will periodically release updated BIOS firmware for
your server model to correct problems that were identified and corrected after your server
was initially manufactured. Other items in your server (such as your system backplane
and RAID controllers) will also have upgradeable firmware that needs to be periodically
updated. Upgradeable firmware is increasingly common in everyday consumer electronic
devices such as Blu-ray players, AV Receivers, Smart Phones, and Kindle Readers.

Memory requirements
The basic rule of thumb for SQL Server is that you can never have too much RAM. Server
RAM is relatively inexpensive, especially compared to SQL Server licenses, and having as
much RAM as possible is extremely beneficial for SQL Server performance in many ways.

• It will help reduce read I/O pressure, since it increases the likelihood that SQL Server
will find in the buffer pool any data requested by queries.

• It can reduce write I/O pressure, since it will enable SQL Server to reduce the
frequency of checkpoint operations.

• It is cheap insurance against I/O sizing miscalculations or spikes in your I/O workload.
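
One rough way to judge whether an existing server already has enough RAM for its workload is to watch the Page Life Expectancy counter, which shows how long data pages are surviving in the buffer pool; a persistently low value suggests buffer pool pressure. The query below is a simple sketch using the standard performance counter DMV.

-- Page Life Expectancy: how long (in seconds) pages survive in the buffer pool
SELECT object_name , counter_name , cntr_value AS [Page Life Expectancy (sec)]
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Page life expectancy'
      AND object_name LIKE '%Buffer Manager%' ;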

In order to take the best advantage of your available RAM, you should make sure that you
are running a 64-bit version of SQL Server, which requires a 64-bit version of Windows
Server (Windows Server 2008 R2 is only available in a 64-bit version). This will allow SQL
Server to fully utilize all installed RAM, rather than being restricted to 2 GB (out of the
4 GB of virtual address space to which each 32-bit process has access). To be clear, 32-bit
SQL Server is limited to using 2 GB of RAM, unless you use the /PAE and/or the /3GB
switches (see Chapter 5 for full details).

However, keep in mind that SQL Server 2008 R2 Standard Edition has a memory limit
of 64 GB, so there is little point in buying more memory than that if you will be using
that edition.
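
Before deciding how much RAM to order, it is also worth confirming exactly which edition and build of SQL Server you are running, since that determines the memory ceiling; the SERVERPROPERTY function reports the edition (including whether it is a 64-bit build) and the version.

-- Confirm the SQL Server edition (includes 32-bit/64-bit) and version
SELECT SERVERPROPERTY('Edition') AS [Edition] ,
       SERVERPROPERTY('ProductVersion') AS [Product Version] ,
       SERVERPROPERTY('ProductLevel') AS [Product Level] ;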

Memory types
Any relatively new database server should be using DIMMs (Dual In-line Memory
Modules) with one of the modern DRAM (Dynamic Random Access Memory) implemen-
tations, specifically either DDR2 SDRAM or DDR3 SDRAM memory.

DDR2 SDRAM is a double-data-rate, synchronous, dynamic, random access memory
interface. It replaces the original DDR SDRAM specification (and the two are not
compatible) and offers higher bus speed with lower power consumption.

DDR3 SDRAM is a double-data-rate type three, synchronous, dynamic, random access
memory interface. It supersedes DDR2, offers more bandwidth, and is required by most of
the latest processors.

ECC, or registered, memory (RDIMM) is required by most server chipsets. It is pretty
likely that you will need DDR3 ECC SDRAM for nearly any new server that you buy in
2010/2011.

DIMMs can contain multiple ranks, where a rank is "two or more independent sets
of DRAM chips connected to the same address and data buses." Memory performance
can vary considerably, depending on exactly how the memory is configured in terms
of the number of memory modules and the number of ranks per module (single-rank,
dual-rank, and so on). This is discussed in more detail in Chapter 3, on benchmarking.

An older memory type that you may run into with some older servers is Fully Buffered
Dual In-line Memory Modules (FB-DIMM), which is memory that uses an Advanced
Memory Buffer (AMB) between the memory controller and the memory module. Unlike
the parallel bus architecture of traditional DRAM, an FB-DIMM has a serial interface
between the memory controller and the AMB. Compared to conventional, registered
DIMMs, the AMB dissipates extra heat and requires extra power. FB-DIMMs are also
more expensive than conventional DIMMs, and they have not been widely adopted in
the industry.

Memory slots
One very important consideration with regard to the server motherboard and its
suitability for use in a SQL Server installation, is the total number of memory slots. This,
along with the limitations of the memory controller in the chipset or the CPU itself,
ultimately determines how much RAM you can install in a database server. Obviously, the
more slots the better. For example, many server vendors currently offer 1U, two-socket
servers that have a total of twelve memory slots, while they also offer nearly identical
2U servers that have a total of eighteen memory slots. In most cases, I would prefer the
server that had eighteen memory slots, since that would allow me to ultimately have
more memory in the database server. This assumes that the memory controller (whether
in the chipset or the integrated memory controller in the processor) will support enough
memory to fill all of those available memory slots.

A related consideration is memory density, by which I mean the memory capacity of each
memory stick. One tactic that hardware vendors used quite often with standard configu-
rations in the past was to fill all of the available memory slots with relatively low capacity
sticks of memory. This is less expensive initially, but leads to additional cost and waste
when it is time to add additional RAM to the server, since you will have a number of
smaller capacity DIMMs that must be replaced, and which you may have no use for after
the upgrade. You can avoid this situation by specifying larger memory stick sizes when
you pick the server components.

Be careful to choose the most cost-effective size; at the time of writing (April 2011), 8 GB
sticks of RAM represented the sweet spot in the Price-Performance curve, because of the
prohibitive cost of 16 GB sticks of RAM. Violating this sweet spot rule might cause you to
spend far more on memory for the server than the rest of the server combined. However,
once 32 GB sticks of RAM are available in 2011, the price-performance sweet spot will shift
pretty quickly towards 16 GB sticks of RAM.

Many older processors and chipsets will not be able to use that capacity offered by the
forthcoming 32 GB sticks. One known exception will be the 32 nm Intel E7 (Westmere-
EX) processor series that is meant for use in 4-socket, or larger, systems. This opens up
the possibility, sometime in 2011, of a 4-socket system with 4 processors x 16 DIMMS
per processor x 32 GB per DIMM = 2 TB of RAM. You'll have to have very deep pockets
though; 32 GB SDRAM DIMMs will be very, very expensive when they are first available.

Memory architecture: SMP versus NUMA


Another important consideration, when evaluating a server motherboard, is whether it
uses Symmetric Multiprocessing (SMP) or Non-Uniform Memory Architecture (NUMA).

SMP refers to a multiprocessor computer hardware architecture where two or more
identical processors are connected to a single shared main memory, and are controlled
by a single operating system. Many earlier Intel multiprocessor systems use SMP archi-
tecture. In the case of multi-core processors, the SMP architecture applies to the logical
cores, treating them as separate processors.

The bottleneck in SMP scalability is the bandwidth of the front-side bus, connecting
the various processors with the memory and the disk arrays; as the speed and number
of processors increase, the competition between CPUs for access to memory creates bus
contention and limits the ability of a system to scale. Consequently, system throughput
does not grow linearly with the number of processors; for example, doubling the number
of processors in an SMP computer does not double its performance or capacity. This
makes it harder to design high-performance SMP-based systems with more than four
processor sockets. All Intel x64 processors that were released before the Nehalem family
use SMP architecture, since they rely on a front-side bus.

A newer alternative, especially useful for systems with more than four processor sockets,
is NUMA, which dedicates different memory banks, called NUMA nodes, to different
processors. Nodes are connected to each other via an external bus that transports cross-
node data traffic. In NUMA, processors may access local memory quickly but remote
memory access is much more expensive and slower. NUMA can dramatically improve
memory throughput, as long as the data is localized to specific processes.

The NUMA node arrangement addresses the front-side bus contention and, therefore,
scalability issue by limiting the number of CPUs competing for access to memory;
processors will access memory within their own node faster than the memory of other
NUMA nodes. You can get much closer to linear scaling with NUMA architecture.

Having NUMA architecture is more important with systems that have four or more
processor sockets, and not as important with two-socket systems.

AMD server systems have supported NUMA architecture for several years, while Intel
server systems have only supported NUMA since the introduction of the Nehalem micro-
architecture (Xeon 5500) in 2008. One new development for SQL Server 2008 R2 is the
concept of NUMA Groups, which are logical collections of up to 64 logical processors.
This makes it easier to make NUMA configuration changes for SQL Server.
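
If you are curious whether your current hardware is presenting NUMA nodes to SQL Server, the node-level DMV returns one row per node (plus a hidden node reserved for the dedicated administrator connection, which is filtered out below). This is offered as a quick informational check only.

-- List the NUMA nodes visible to SQL Server, excluding the DAC node
SELECT node_id ,
       node_state_desc ,
       memory_node_id ,
       online_scheduler_count
FROM sys.dm_os_nodes
WHERE node_state_desc <> 'ONLINE DAC' ;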

Network Interface Cards


A Gigabit Ethernet NIC is seldom the bottleneck with most OLTP workloads. A common
exception is if you are doing backups over the network, and the destination for the
backup files has lots of I/O capacity. In this situation, you may be able to completely
saturate a single Gigabit Ethernet NIC. Most recent-vintage server motherboards have
at least two, and sometimes four, Gigabit Ethernet NICs embedded on the motherboard.
This allows you to combine your available NICs using NIC Teaming, or to segregate your
network traffic using multiple IP addresses for your various NICs.

Fast Ethernet was introduced in 1995, and has a nominal bandwidth of 100 Mb/sec. Any
database server that has Fast Ethernet NICs embedded on the motherboard will be quite
ancient, (at least 8–10 years old), and I would really not recommend you use it if you
have any other option available. You should also make sure that you don't have any Fast
Ethernet switches that are used by your database servers, since they will restrict your
bandwidth to that level. Gigabit Ethernet (which is rated at 1 Gb/sec) has completely
replaced Fast Ethernet in servers (and workstations) built in the last four to five years.
10-Gigabit Ethernet, although still quite expensive, is starting to be used more often in
mission-critical servers. It is rated at 10 Gb/sec, and is especially useful for iSCSI SAN
applications (see the Storage Area Networks (SAN) section of Chapter 2 for more details).
The next standard on the horizon is 40 Gb/sec.

Choosing a Processor and Motherboard for Use with SQL Server
As discussed previously, I'm a strong proponent of getting the very best processor that
your budget will allow, over-provisioning if possible, and using spare CPU capacity (and
memory) to remove load from the disk I/O sub-system. High-performance CPUs are
much more affordable than additional I/O capacity, so it makes financial sense to get the
best CPU available for a given server model.

When provisioning CPU, remember that your server not only has to cope smoothly
with your normal workload, but also deal with inevitable spikes in CPU usage, for
example, during:

• performing database backups

• index rebuilds and reorganization; the actual process of rebuilding or reorganizing an index is CPU-intensive

• periods of concurrent, CPU-intensive queries, especially Full Text Search queries.
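
One crude but useful way to check whether an existing server has CPU headroom for these spikes is to look at signal waits, the time that runnable tasks spend waiting for a free processor after their other waits are over. The query below is only a rough sketch of that idea; a consistently high percentage suggests CPU pressure.

-- Signal waits as a percentage of total waits since the last restart
SELECT CAST(100.0 * SUM(signal_wait_time_ms) / SUM(wait_time_ms)
            AS DECIMAL(5, 2)) AS [Signal Waits %]
FROM sys.dm_os_wait_stats ;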

Layered on top of my general advice to get the best-in-class processor, with as many cores
as possible, in order to maximize CPU capacity per SQL Server license, comes consid-
eration of the nature of the workload.

OLTP workloads, characterized by a high number of short-duration transactions, tend
to benefit least from parallelization. If I knew that I was going to have a primarily OLTP
workload, I would lean very heavily towards getting a two-socket system that used Intel
Xeon X5690 Westmere-EP six-core processors, because of their excellent single-threaded
performance. The Intel Xeon X5690 currently has the absolute best single-threaded
performance of any server-class processor. Once the upcoming Intel Sandy Bridge-EP
processors become available, later in 2011, they should have even better single-threaded
performance than the Xeon X5690.

Of course, this assumes that the two-socket system that you choose has enough memory
capacity and expansion slot capacity to support your entire OLTP workload. The
number of expansion slots ultimately limits how many RAID controllers or Host Bus
Adaptors (HBAs) you can use in a system (see Chapter 2). If you are convinced that a single
two-socket system cannot handle your workload, my advice would be to partition, or
shard, the workload in such a way that you could use multiple two-socket systems.

There are several benefits to this strategy. Firstly, even if you could use identical Intel
processors in two-socket and four-socket systems (which you cannot do right now, since
Intel uses different processors for two- and four-socket systems), CPU performance
and load capacity do not scale at 100% when you move from two sockets to four sockets.
In other words, a four-socket system does not have twice as much CPU capacity as a
two-socket system. Secondly, two-socket Intel systems are usually one release ahead of
their four-socket counterparts. The current example is the 32 nm 3.46 GHz Intel Xeon
X5690 for two-socket systems vs. the 45 nm 2.26 GHz Xeon X7560 for four-socket (and
above) systems. The Xeon X5690 is much faster than the Xeon X7560 for single-threaded
OLTP performance. Finally, going through the architectural and engineering work
required to partition or shard your database is a good long-term strategy, since it will
allow you to scale out at the database level, whether it is with on-premises SQL Server or
with SQL Azure.

Another factor to consider is the number of Intel-based systems that have been submitted
and approved for the TPC-E OLTP benchmark, compared to how few AMD-based
systems have been submitted and accepted (see Chapter 3 for further details on bench-
marking). As of January 2011, there are 37 Intel-based TPC-E results compared to four
AMD-based TPC-E results. I don't think this is an accident. Lest you think I am simply an
Intel cheerleader, I am actually rooting for AMD to become more competitive for OLTP
workloads with the upcoming Bulldozer family of CPUs. If AMD cannot compete with
Intel for raw performance, I am afraid that Intel will get lazy and slow down their pace of
innovation, which is bad for the DBA community.

By contrast, if I were managing a primarily DSS/DW workload, characterized by large,
long-running queries, I would tend to favor using the AMD Opteron 6180 SE Magny
Cours 12-core processors, or the AMD Bulldozer processors (when they are available later
in 2011) because they have high core counts and they tend to perform quite well with
multi-threaded applications.

If you use a modern Intel processor (such as a Xeon X5690 or Xeon X7560) with a DSS/
DW workload, you should strongly consider disabling hyper-threading, since long-
running queries often do not perform as well on hyper-threaded cores in a processor.

Summary
In this chapter, we have discussed some of the basics of processors and related hardware,
from a database server perspective.

The latest processors – Nehalem and later from Intel, and Magny Cours and later from
AMD – will allow you to run many SQL Server workloads on a much less expensive
two-socket database server instead of a more traditional four-socket database server.
This can save you an enormous amount of money in both hardware costs and SQL
Server license costs. I strongly advocate getting the best available processor for a given
server model, since the price delta compared to a less expensive processor is quite small
compared to the overall hardware cost (not to mention the SQL Server license costs).
Having extra processor capacity will allow you to use SQL Server features like backup
compression and data compression, which can dramatically reduce I/O pressure, usually
at a much lower cost than adding I/O capacity.

Maximizing the amount of installed RAM in a database server is another relatively
inexpensive tactic to reduce pressure on your I/O subsystem and improve overall database
performance. You need to make sure that you select the size of individual sticks of RAM
that gives you the best price/performance ratio in order to make sure you don't spend an
inordinate amount of money on RAM.

Ultimately, having appropriate and modern hardware can save you lots of money in SQL
Server licensing costs, and can help you avoid future performance and scalability issues
with your database servers. As you begin to understand some of the differences between
different types of hardware, and how to evaluate hardware for use with different types
of SQL Server workloads, you will be in a much better position to actually select appro-
priate hardware for SQL Server yourself, or to make an intelligent argument for proper
hardware with another part of your organization. In the next chapter, we move on to
discuss the storage subsystem, the correct provisioning of which is also critical for SQL
Server performance.

Chapter 2: The Storage Subsystem
There are many factors to consider when choosing an appropriate processor and
associated chipsets, and there are just as many considerations when sizing and config-
uring the storage subsystem. It is very easy to hamstring an otherwise powerful system
with poor storage choices. Important factors discussed in this chapter include:

• disk seek time and rotational latency limitations

• type of disk drive used:

• traditional magnetic drive – SATA, SCSI, SAS, and so on

• solid-state drives (SSDs)

• storage array type: Storage Area Network (SAN) vs. Direct Attached Storage (DAS)

• Redundant Array of Independent Disks (RAID) configuration of your disks.

Having reviewed each component of the disk subsystem, we'll discuss how the size
and nature of the workload will influence the way in which the subsystem is provisioned
and configured.

Disk I/O
RAM capacity has increased constantly over the years and its cost has decreased enough
to allow us to be lavish in its use for SQL Server, to help minimize disk I/O. Also, CPU
speed has increased to the point where many systems have substantial spare capacity that
can often be used to implement data compression and backup compression, again, to help
reduce I/O pressure. The common factor here is helping to reduce disk I/O. While disk
capacity has improved greatly, disk speed has not, and this poses a serious problem; most
large, busy OLTP systems end up running into I/O bottlenecks.

The main factor limiting how quickly data is returned from a single traditional magnetic
disk is the overall disk latency, which breaks down into seek time and rotational latency.

• Seek time is the time it takes the head to physically move across the disk to find the
data. This is a limiting factor in the number of I/O operations per second (IOPS)
that a single disk, and therefore your system, can support.

• Rotational latency is the time it takes for the disk to spin to read the data off the disk.
This is a limiting factor in the amount of data a single disk can read per second (usually
measured in MB/s), in other words the I/O throughput of that disk.

Typically, you will have multiple magnetic disks working together in some level
of RAID to increase both performance and redundancy. Having more disk spindles
(i.e. more physical disks) in a RAID array increases both throughput performance and
IOPS performance.

However, a complicating factor here is the performance limitations of your RAID
controllers, for direct attached storage, or Host Bus Adaptors (HBAs), for a storage area
network. The throughput of such controllers, usually measured in gigabits per second,
e.g. 3 Gbps, will dictate the upper limit for how much data can be written to or read from
a disk per second. This can have a huge effect on your overall IOPS and disk throughput
capacity for each logical drive that is presented to your host server in Windows.
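
The latency and throughput concepts described above can be observed directly from within SQL Server; the virtual file stats DMV accumulates I/O counts and stall times per database file, and dividing stall time by the number of operations gives a rough average latency per file. This is a sketch of that calculation, not a tuning prescription.

-- Approximate average read and write latency (ms) per database file since startup
SELECT DB_NAME(vfs.database_id) AS [Database] ,
       mf.physical_name ,
       vfs.io_stall_read_ms / NULLIF(vfs.num_of_reads, 0) AS [Avg Read Latency (ms)] ,
       vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS [Avg Write Latency (ms)]
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
     JOIN sys.master_files AS mf
       ON vfs.database_id = mf.database_id
          AND vfs.file_id = mf.file_id ;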

The relative importance of each of these factors depends on the type of workload being
supported; OLTP or DSS/DW. This, in turn, will determine how you provision the disk
storage subsystem.

As discussed in Chapter 1, OLTP workloads are characterized by a high number of short
transactions, where the data tends to be rather volatile (modified frequently). There is
usually much higher write activity in an OLTP workload than in a DSS workload. As such,
most OLTP systems generate more IOPS than an equivalent-sized DSS system.

61
Chapter 2: The Storage Subsystem

Furthermore, in most OLTP databases, the read/write activity is largely random, meaning
that each transaction will likely require data from a different part of the disk. All of
this means that, in most OLTP applications, the hard disks will spend most of their
time seeking data, and so the seek time of the disk is a crucial bottleneck for an OLTP
workload. The seek time for any given disk is determined by how far away from the
required data the disk heads are at the time of the read/write request.

A DSS or DW system is usually characterized by longer-running queries than a
similar-size OLTP system. The data in a DSS system is usually more static, with much higher read
activity than write activity. The disk activity with a DSS workload also tends to be more
sequential and less random than with an OLTP workload. Therefore, for a DSS type of
workload, sequential I/O throughput is usually more important than IOPS performance.
Adding more disks will increase your sequential throughput until you run into the
throughput limitations of your RAID controller or HBA. This is especially true when a
DSS/DW system is being loaded with data, and when certain types of complex, long-
running queries are executed.

Generally speaking, while OLTP systems are characterized by lots of fast disks, to
maximize IOPS to overcome disk latency issues with high numbers of random reads and
writes, DW/DSS systems require lots of I/O channels, in order to handle peak sequential
throughput demands. An I/O channel is an individual RAID controller or an individual
HBA; either of these gives you a dedicated, separate path to either a DAS array or a SAN.
The more I/O channels you have, the better.

With all of this general advice in mind, let's now consider each of the major hardware
and architectural choices that must be made when provisioning the storage subsystem,
including the type of disks used, the type of storage array, and the RAID configuration of
the disks that make up the array.

Drive Types
Database servers have traditionally used magnetic hard drive storage. Seek times for
traditional magnetic hard disks have not improved appreciably in recent years, and are
unlikely to improve much in the future, since they are electro-mechanical in nature.
Typical seek times for modern hard drives are in the 5–10 ms range.

The rotational latency for magnetic hard disks is directly related to the rotation speed of
the drive. The current upper limit for rotational speed is 15,000 rpm, and this limit has
not changed in many years. Typical rotational latency times for 15,000 rpm drives are in
the 3–4 ms range.

This disk latency limitation led to the proliferation of vast SAN-based (or DAS-based)
storage arrays, allowing data to be striped across numerous magnetic disks, and leading
to greatly enhanced I/O throughput. However, in trying to fix the latency problem, SANs
have become costly, complex, and sometimes fault-prone. These SANs are generally
shared by many databases, which adds even more complexity and often results in a
disappointing performance, for the cost.

Newer solid-state storage technology has the potential to displace traditional magnetic
drives and even SANs altogether, and allow for much simpler storage systems. The seek
times for SSDs and other flash-based storage are much, much lower than for traditional
magnetic hard disks, since there are no electro-mechanical moving parts to wait on. With
an SSD, there is no delay for an electro-mechanical drive head to move to the correct
portion of the disk to start reading or writing data. With an SSD, there is no delay waiting
for the spinning disk to rotate past the drive head to start reading or writing data, and the
latency involved in reading data off an SSD is much lower than it is for magnetic drives,
especially for random reads and writes. SSD drives also have the additional advantage
of lower electrical power usage, especially compared to large numbers of traditional
magnetic hard drives.

Magnetic disk drives


Disks are categorized according to the type of interface they use. Two of the oldest types
of interface, which you still occasionally see in older workstations, are Integrated Drive
Electronics (IDE) or Parallel Advanced Technology Attachment (PATA) drives. Of course,
it is not unusual for old, "retired" workstations, with PATA disk controllers, to be pressed
into service as development or test database servers. However, I want to stress that you
should not be using PATA drives for any serious database server use.

PATA and IDE drives are limited to two drives per controller, one of which is the Master
and the other is the Slave. Each drive needed a jumper setting to designate whether it
was acting as a Master or a Slave drive. PATA 133 was limited
to a transfer speed of 133 MB/sec, although virtually no PATA drives could sustain that
level of throughput.

Starting in 2003, Serial Advanced Technology Attachment (SATA) drives began replacing
PATA drives in workstations and entry-level servers. They offer throughput capacities
of 1.5, 3, or 6 Gbps (also commonly known as SATA 1.0, SATA 2.0, and SATA 3.0), along
with hot-swap capability. Most magnetic SATA drives have a 7,200 rpm rotational speed,
although a few can reach 10,000 rpm. Magnetic SATA drives are often used for low-cost
backup purposes in servers, since their performance and reliability typically do not match
that of enterprise-level SAS drives.

Both traditional magnetic drives and newer SSDs can use the SATA interface. With an
SSD, it is much more important to make sure you are using a 6 Gbps SATA port, since the
latest generation SSDs can completely saturate an older 3 Gbps SATA port.

External SATA (eSATA) drives are also available. They require a special eSATA port, along
with an eSATA interface to the drive itself. An eSATA external drive will have much better
data transfer throughput than the more common external drives that use the much
slower USB 2.0 interface. The new USB 3.0 interface is actually faster than eSATA, but
your throughput will be limited by the throughput limit of the external drive itself, not
the interface.

Small Computer System Interface (SCSI) drives have been popular in server applications
since the mid 1980s. SCSI drives were much more expensive than PATA drives, but offered
better performance and reliability. The original parallel SCSI interface is now being
rapidly replaced by the newer Serial Attached SCSI (SAS) interface. Most enterprise-level
database servers will use either parallel SCSI or SAS internal drives, depending on their
age. Any new or recent-vintage database server will probably have SAS internal drives
instead of SCSI internal drives.

Server-class magnetic hard drives have rotation speeds ranging from 7,200 rpm (for SATA)
to either 10,000 rpm or 15,000 rpm (for SCSI and SAS). Higher rotation speeds reduce
data access time by reducing the rotational latency. Drives with higher rotation speed
are more expensive, and often have lower capacity sizes compared to slower rotation
speed drives. Over the last several years, disk buffer cache sizes have grown from 2 MB
all the way to 64 MB. Larger disk buffers usually improve the performance of individual
magnetic hard drives, but often are not as important when the drive is used by a RAID
array or is part of a SAN, since the RAID controller or SAN will have its own, much larger,
cache that is used to cache data from multiple drives in the array.
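
As a rough illustration of rotational latency, the average delay is half of one full platter
rotation: a 7,200 rpm drive completes a rotation in about 8.3 ms, for an average rotational
latency of roughly 4.2 ms, while a 10,000 rpm drive averages about 3 ms and a 15,000 rpm
drive about 2 ms.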

Solid-state drives
Solid-State Drives (SSD), or Enterprise Flash Disks (EFD), are different from traditional
magnetic drives in that they have no spinning platter, drive actuator, or any other moving
parts. Instead, they use flash memory, along with a storage processor, controller, and
some cache memory, to store information.

The lack of moving parts eliminates the rotational latency and seek-time delay that is
inherent in a traditional magnetic hard drive. Depending on the type of flash memory,
and the technology and implementation of the controller, SSDs can offer dramatically
better performance compared to even the fastest enterprise-class magnetic hard drives.
This performance does come at a much higher cost per gigabyte, and it is still somewhat
unusual for database servers, whether using direct attached storage or SANs, to rely exclusively on SSD
storage, but this will change as SSD costs continue to decrease.

SSDs perform particularly well for random access reads and writes, and for sequential
access reads. Some earlier SSDs do not perform as well for sequential access writes, and
they also have had issues where write performance declines over time, particularly as the
drive fills up. Newer SSD drives, with better controllers and improved firmware, have
mitigated these earlier problems.

There are two main types of flash memory currently used in SSDs: Single Level Cell (SLC)
and Multi Level Cell (MLC). Enterprise-level SSDs almost always use SLC flash memory
since MLC flash memory does not perform as well and is not as durable as the more
expensive SLC flash memory.

Fusion-IO drives
Fusion-IO is a company that makes several interesting, SSD-like products that are getting
a lot of visibility in the SQL Server community. The term, SSD-like, refers to Fusion-IO
cards that use flash memory, just like SSDs do, but are connected to your server through a
PCI-E slot, instead of a SAS or SATA controller.

The Fusion-IO products are relatively expensive, but offer extremely high performance.
Their three current server-related products are the ioDrive, ioDrive Duo and the new
ioDrive Octal. All three of these products are PCI-E cards, with anywhere from 80 GB
to 5.12 TB of SLC or MLC flash on a single card. Using a PCI-E expansion slot gives one
of these cards much more bandwidth than a traditional SATA or SAS connection. The
typical way to use Fusion-IO cards is to have at least two of the cards, and then to use
software RAID in Windows to get additional redundancy. This way, you avoid having a
pretty important single point of failure in the card itself and the PCI-E slot it was using
(but you incur the accompanying increase in hardware expenditure).

Fusion-IO drives offer excellent read and write performance, albeit at a relatively high
hardware cost. As long as you have enough space, it is possible and feasible to locate all of
your database components on Fusion-IO drives, and get extremely good I/O performance,
without the need for a SAN. One big advantage of using Fusion-IO, instead of a tradi-
tional SAN, is the reduced electrical power usage and reduced cooling requirements,
which are big issues in many datacenters.

Since Fusion-IO drives are housed in internal PCI-E slots in a database server, you cannot
use them with traditional Windows fail-over clustering (which requires shared external
storage for the cluster), but you can use them with database mirroring or the upcoming
AlwaysOn technology in SQL Server Denali, which allows you to create a Windows
Cluster with no shared storage.

SSDs and SQL Server


I'm often asked which components of a SQL Server database should be moved to SSD
storage as they become more affordable. Unfortunately, the answer is that it depends on
your workload, and on where (if anywhere) you are experiencing I/O bottlenecks in your
system (data files, TempDB files, or transaction log file).
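
One way to find out where those I/O bottlenecks are is to look at the average I/O stall
(latency) per database file, using the sys.dm_io_virtual_file_stats DMV. The query below is
a minimal sketch of that approach, and reports latency for every file across all databases
on the instance.

-- Average I/O stall (latency) in milliseconds, per database file
SELECT DB_NAME(vfs.database_id) AS [Database Name] ,
       mf.physical_name ,
       vfs.io_stall_read_ms / NULLIF(vfs.num_of_reads, 0) AS [Avg Read Stall ms] ,
       vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS [Avg Write Stall ms]
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
     INNER JOIN sys.master_files AS mf
          ON vfs.database_id = mf.database_id
          AND vfs.file_id = mf.file_id
ORDER BY [Avg Write Stall ms] DESC ;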

Depending on your database size and your budget, it may make sense to move the entire
database to solid-state storage, especially with a heavy OLTP workload. For example, if
your database(s) are relatively small, and your budget is relatively large, it may be feasible
to have your data files, your log files, and your TempDB files all running on SSD storage.

If your database is very large, and your hardware budget is relatively small, you may have
to be more selective about which components can be moved to SSD storage. For example,
it may make sense to move your TempDB files to SSD storage if your TempDB is experi-
encing I/O bottlenecks. Another possibility would be to move some of your most heavily
accessed tables and indexes to a new data file, in a separate file group, that would be
located on your SSD storage.
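
As a simplified sketch of that last approach (all database, file, path, and index names here
are hypothetical), you could add a new filegroup backed by a file on the SSD volume, and
then rebuild a heavily accessed index onto it:

-- Add a filegroup with a data file on an SSD volume (S: is assumed here)
ALTER DATABASE MyDatabase ADD FILEGROUP SSDData ;

ALTER DATABASE MyDatabase
ADD FILE ( NAME = N'MyDatabase_SSD1' ,
           FILENAME = N'S:\SQLData\MyDatabase_SSD1.ndf' ,
           SIZE = 50GB , FILEGROWTH = 5GB )
TO FILEGROUP SSDData ;

-- Rebuild an existing index onto the new filegroup
CREATE NONCLUSTERED INDEX IX_OrderDetail_ProductID
ON dbo.OrderDetail (ProductID)
WITH ( DROP_EXISTING = ON )
ON SSDData ;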

Internal Storage
All blade and rack-mounted database servers have some internal drive bays. Blade servers
usually have two to four internal drive bays, while rack servers have higher numbers
of drive bays, depending on their vertical size. For example, a 2U server will have more
internal drive bays than an equivalent 1U server (from the same manufacturer and model
line). For standalone SQL Server instances, it is common to use at least two 15 K drives in
RAID 1 for the operating system and SQL Server binaries. This provides a very basic level
of redundancy for the operating system and the SQL Server binaries, meaning that the
loss of a single internal drive will not bring down the entire database server.

Modern servers often use 2.5" drives, in place of the 3.5" drives that were common a few
years ago. This allows more physical drives to fit in the same size chassis, and it reduces
the electrical and cooling requirements. The latest 2.5" drives also tend to out-perform
older 3.5" drives. Despite these improvements, however, for all but the very lightest database
workloads, you simply won't have enough internal drive bays to completely support
your I/O requirements.

Ignoring this reality is a very common mistake that I see made by many DBAs and
companies. They buy a new, high-performance database server with fast multi-core
processors and lots of RAM and then try to run an OLTP workload on six internal drives.
This is like a body-builder who only works his upper body, but completely neglects
his legs, ending up completely out of balance, and ultimately not very strong. Most
production SQL Server workloads will require much more I/O capacity than is obtainable
from the available internal drives. In order to provide sufficient storage capacity, and
acceptable I/O performance, additional redundant storage is required, and there are
several ways to provide it.

Attached Storage
The two most common forms of storage array used for SQL Server installations are DAS
and the SAN.

Direct Attached Storage


One option is to use Direct Attached Storage (DAS), which is also sometimes called
"locally attached storage." DAS drives are directly attached to a server with an eSATA,
SAS, or SCSI cable. Typically, you have an external enclosure, containing anywhere from
8 to 24 drives, attached to a RAID controller in a single database server. Since DAS enclo-
sures are relatively affordable compared to a Storage Area Network, it is becoming more
common to use DAS storage, with multiple enclosures and multiple RAID controllers, to
achieve very high throughput numbers for DW and Reporting workloads.

However, with relative simplicity and low cost, you do give up some flexibility. It is
relatively hard to add capacity and change RAID levels when using DAS, compared
to SAN.

The diagram in Figure 2.1 shows a somewhat simplified view of a server that is using DAS.

You have a server with one or more PCI-e RAID controller cards that are connected (via a
SAS or SCSI cable) to one or more external storage enclosures that usually have between
14 and 24 SAS or SCSI hard drives. The RAID controller(s) in the server are used to create
and manage any RAID arrays that you decide to create and present to Windows as logical
drives (that show up with a drive letter in Windows Explorer). This lets you build a
storage subsystem with very good performance, relatively inexpensively.

Figure 2.1: Direct Attached Storage.

Storage Area Network


If you have a bigger budget, the next level of storage is a Storage Area Network (SAN).
A SAN is a dedicated network that has multiple hard drives (anywhere from dozens
to hundreds of drives) with multiple storage processors, caches, and other redundant
components.

With the additional expense of the SAN, you do get a lot more flexibility. Multiple
database servers can share a single, large SAN (as long as you don't exceed the overall
capacity of the SAN), and most SANs offer features that are not available with DAS, such
as SAN snapshots. There are two main types of SANs available today: Fiber Channel
and iSCSI.

A Fiber Channel SAN has multiple components, including large numbers of magnetic
hard drives or solid-state drives, a storage controller, and an enclosure to hold the drives
and controller. Some SAN vendors are starting to use what they call tiered storage, where
they have some SSDs, some fast 15,000 rpm Fiber Channel drives, and some slower 7,200
rpm SATA drives in a single SAN. This allows you to prioritize your storage, based on
the required performance. For example, you could have your SQL Server transaction log
files on SSD storage, your SQL Server data files on Fiber Channel storage, and your SQL
Server backup files on slower SATA storage.

Multiple fiber channel switches, and Host Bus Adapters (HBAs) connect the whole infra-
structure together in what is referred to as a fabric. Each component in the fabric has a
bandwidth capacity, which is typically 1, 2, 4 or 8 Gbits/sec. When evaluating a SAN, be
aware of the entire SAN path (HBA, switches, caches, storage processor, disks, and so
on), since a lower bandwidth component (such as a switch) mixed in with higher capacity
components will restrict the effective bandwidth that is available to the entire fabric.

An iSCSI SAN is similar to a Fiber Channel SAN except that it uses a TCP/IP network,
connected with standard Ethernet network cabling and components, instead of fiber
optics. The supported Ethernet wire speeds that can be used for iSCSI include 100 Mb,
1 Gb, and 10 Gb/sec. Since iSCSI SANs can use standard Ethernet components, they are
usually much less expensive than Fiber Channel SANs. Early iSCSI SANs did not perform
as well as contemporary Fiber Channel SANs, but that gap has closed in recent years.

One good option for an iSCSI SAN is to use a TCP Offload Engine, also known as a
TOE card, instead of a full iSCSI HBA. A TOE offloads the TCP/IP operations for that
card from the main CPU, which can improve overall performance (for a slightly higher
hardware cost).

Regardless of which type of SAN you evaluate or use, it is very important to consider
multi-path I/O (MPIO) issues. Basically, this means designing and implementing a SAN to
eliminate any single point of failure. For example, you would start with at least two HBAs
(preferably with multiple channels), connected to multiple switches, which are connected
to multiple ports on the SAN enclosure. This gives you redundancy and potentially better
performance (at a greater cost).

If you want to see what a real-life SAN looks like, Figure 2.2 shows a 3PAR S400 SAN
with (216) 146 GB 10,000 rpm Fiber Channel drives and (24) 500 GB 7,200 rpm SATA
drives in a single, 42U rack enclosure. This SAN cost roughly $500,000 when it was
purchased in 2006.

Figure 2.2: NewsGator's 3PAR S400 SAN.

RAID Configurations
Redundant array of independent disks (RAID) is a technology that allows the use of
multiple hard drives, combined in various ways, to improve redundancy, availability and
performance, depending on the RAID level used. When a RAID array is presented to a
host in Windows, it is called a logical drive.

Using RAID, the data is distributed across multiple disks in order to:

• overcome the I/O bottleneck of a single disk, as described previously

• get protection from data loss through the redundant storage of data on multiple disks

• avoid any one hard drive being a single point of failure

• manage multiple drives more effectively.

Regardless of whether you are using traditional magnetic hard-drive storage or newer
solid-state storage technology, most database servers will employ RAID technology. RAID
improves redundancy, improves performance, and makes it possible to have larger logical
drives. RAID is used for both OLTP and DW workloads. Having more spindles in a RAID
array helps both IOPS and throughput, although ultimately throughput can be limited by
a RAID controller or HBA.

Please note that, while RAID does provide redundancy in your data storage, it is not a
substitute for an effective backup strategy or a High Availability / Disaster Recovery
(HA/DR) strategy. Regardless of what level of RAID you use in your storage subsystem,
you still need to run SQL Server full and log backups as necessary to meet your recovery
point objectives (RPO) and recovery time objectives (RTO).
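
As a minimal illustration (the database name and backup paths are hypothetical), a basic
full plus log backup routine might look like this, with the destination ideally on a separate
logical drive from the data and log files:

-- Full database backup, plus a transaction log backup
-- COMPRESSION requires SQL Server 2008 Enterprise, or 2008 R2 Standard and above
BACKUP DATABASE SalesDB
TO DISK = N'B:\SQLBackups\SalesDB_Full.bak'
WITH CHECKSUM, COMPRESSION ;

BACKUP LOG SalesDB
TO DISK = N'B:\SQLBackups\SalesDB_Log.trn'
WITH CHECKSUM, COMPRESSION ;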

There are a number of commercially available RAID configurations, which we'll review
over the coming sections, and each has associated costs and benefits. When considering
which level of RAID to use for different SQL Server components, you have to carefully
consider your workload characteristics, keeping in mind your hardware budget. If cost is
no object, I am going to want RAID 10 for everything, i.e. data files, log file, and TempDB.
If my data is relatively static, I may be able to use RAID 5 for my data files.

During the discussion, I will assume that you have a basic knowledge of how RAID works,
and of the basic concepts of striping, mirroring, and parity.

RAID 0 (disk striping with no parity)


RAID 0 simply stripes data across multiple physical disks. This allows reads and writes to
happen simultaneously across all of the striped disks, so offering improved read and write
performance compared to a single disk. However, it actually provides no redundancy
whatsoever. If any disk in a RAID 0 array fails, the array is offline and all the data in the
array is lost. This is actually more likely to happen than if you only have a single disk,
since the probability of failure for any single disk goes up as you add more disks. There is
no disk space loss for storing parity data (since there is no parity data with RAID 0), but I
don't recommend that you use RAID 0 for database use, unless you enjoy updating your
resumé. RAID 0 is often used by serious computer gaming enthusiasts to reduce the time
it takes to load portions of their favorite games. They do not keep any important data on
their gaming rigs, so they are not that concerned about losing one of their drives.

RAID 1 (disk mirroring or duplexing)


You need at least two physical disks for RAID 1. Your data is mirrored between the two
disks, i.e. the data on one disk is an exact mirror of that on the other disk. This provides
redundancy, since you can lose one side of the mirror without the array going offline
and without any data loss, but at the cost of losing 50% of your space to the mirroring
overhead. RAID 1 can improve read performance, but can hurt write performance in some
cases, since the data has to be written twice.

On a database server, it is very common to install the Windows Server operating system
on two (at least) of the internal drives, configured in a RAID 1 array, and using an
embedded internal RAID controller on the motherboard. In the case of a non-clustered
database server, it is also common to install the SQL Server binaries on the same
two-drive RAID 1 array as the operating system. This provides basic redundancy for both
the operating system and the SQL Server binaries. If one of the drives in the RAID 1 array
fails, you will not have any data loss or down-time. You will need to replace the failed
drive and rebuild the mirror, but this is a pretty painless operation, especially compared
to reinstalling the operating system and SQL Server!

RAID 5 (striping with parity)


RAID 5 is probably the most commonly-used RAID level, both for general file server
systems and for SQL Server. RAID 5 requires at least three physical disks. The data, and
calculated parity information, is striped across the physical disks by the RAID controller.
This provides redundancy because, if one of the disks goes down, then the missing data
from that disk can be reconstructed from the parity information on the other disks.
Also, rather than losing 50% of your storage in order to achieve redundancy, as for disk
mirroring, you only lose 1/N of your disk space (where N equals the number of disks in
the RAID 5 array) for storing the parity information. For example, if you had six disks in a
RAID 5 array, you would lose 1⁄6 of your space for the parity information.

However, you will notice a very significant decrease in performance while you are missing
a disk in a RAID 5 array, since the RAID controller has to work pretty hard to reconstruct
the missing data. Furthermore, if you lose a second drive in your RAID 5 array, the array
will go offline, and all of the data will be lost. As such, if you lose one drive, you need to
make sure to replace the failed drive as soon as possible. RAID 6 stores more parity infor-
mation than RAID 5, at the cost of an additional disk devoted to parity information, so
you can survive losing a second disk in a RAID 6 array.

Finally, there is a write performance penalty with RAID 5, since there is overhead to write
the data, and then to calculate and write the parity information. As such, RAID 5 is usually
not a good choice for transaction log drives, where we need very high write performance.
I would also not want to use RAID 5 for data files where I am changing more than 10% of
the data each day. One good candidate for RAID 5 is your SQL Server backup files. You
can still get pretty good backup performance with RAID 5 volumes, especially if you use
backup compression.
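
The reason for the write penalty is that each small write to a RAID 5 array typically requires
four physical I/O operations: the old data block and the old parity block are read, and then
the new data and the recalculated parity are written. By contrast, a write to a RAID 1 or
RAID 10 array requires only two physical writes, one to each side of the mirror.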

RAID 10 and RAID 0+1


When you need the best possible write performance, you should consider either RAID
0+1 or, preferably, RAID 10. These two RAID levels both involve mirroring (so there is a
50% mirroring overhead) and striping, but they differ in the details of how it is done in
each case.

In RAID 10 (striped set of mirrors), the data is first mirrored and then striped. In this
configuration, it is possible to survive the loss of multiple drives in the array (one from
each side of the mirror), while still leaving the system operational. Since RAID 10 is more
fault tolerant than RAID 0+1, it is preferred for database usage.

In RAID 0+1 (mirrored pair of stripes), the data is first striped, and then mirrored. This
configuration cannot survive the loss of a drive from each side of the mirror, since losing a
single drive takes its entire striped set offline.

RAID 10 and RAID 0+1 offer the highest read/write performance, but incur a roughly
100% storage cost penalty, which is why they are sometimes called rich man's RAID.
These RAID levels are most often used for OLTP workloads, for both data files and trans-
action log files. As a SQL Server database professional, you should always try to use RAID
10 if you have the hardware and budget to support it. On the other hand, if your data is
less volatile, you may be able to get perfectly acceptable performance using RAID 5 for
your data files. By "less volatile," I mean if less than 10% of your data changes per day, then
you may still get acceptable performance from RAID 5 for your data file(s).

RAID Controllers
There are two common types of hardware RAID controllers used in database servers. The
first is an integrated hardware RAID controller, embedded on the server motherboard.
This type of RAID controller is usually used to control internal drives in the server. The
second is a hardware RAID controller on a PCI-E expansion card that slots into one of
the available (and compatible) PCI-E expansion slots in your database server. This is most
often used to control one or more DAS enclosures, which are full of SAS, SATA, or SCSI
hard drives.

It is also possible to use the software RAID capabilities built into the Windows Server
operating system, but I don't recommend this for production database use with tradi-
tional magnetic drives, since it places extra overhead on the operating system, is less
flexible, has no dedicated cache, and increases the load on the processors and memory in
a server. For both internal drives and direct attached storage, dedicated hardware RAID
controllers are much preferable to software RAID. One exception to this guidance is if
you are going to use multiple Fusion-IO drives in a single database server, in which case it
is acceptable, and common, to use software RAID.

Hardware-based RAID uses a dedicated RAID controller to manage the physical disks
that are part of any RAID arrays that have been created. A server-class hardware RAID
controller will have a dedicated, specialized processor that is used to calculate parity
information; this will perform much better than using one of your CPUs for that purpose.
Besides, your CPUs have more important work to do, so it is much better to offload that
work to a dedicated RAID controller.

A server-class hardware RAID controller will also have a dedicated memory cache, usually
around 512 MB in size. The cache in a RAID controller can be used for either reads or
writes, or split between the two purposes. This cache stores data temporarily, so that
whatever wrote that data to the cache can return to another task without having to wait
for the write to complete on the actual physical disk(s).

Especially for database server use, it is extremely important that this cache is backed
up by a battery, in case the server ever crashes or loses power before the contents of the
RAID controller cache are actually written to disk. Most RAID controllers allow you to
control how the cache is configured, in terms of whether it is used for reads or writes
or a combination of the two. Whenever possible, you should disable the read cache (or
reduce it to a much smaller size) for OLTP workloads, as they will make little or no use
of it. By reducing the read cache you can devote more space, or often the entire cache,
to write activity, which will greatly improve write performance. You can also usually
control whether the cache is acting as a write-back cache or a write-through cache. In a
write-through cache, every write to the cache causes a synchronous write to the backing
store, which is safer, but reduces the write performance of the cache. A write-back cache
improves write performance, because a write to the high-speed cache is faster than to
the actual disk(s). As enough of the data in the write-back cache becomes "dirty," it will
eventually have to actually be written to the disk subsystem. The fact that data that has
been marked as committed by the database is still just in the write-back cache is why it is
so critical to have a battery backing the cache.

For both performance and redundancy reasons, you should always try to use multiple
HBAs or RAID controllers whenever possible. While most direct attached storage enclo-
sures will allow you to daisy-chain multiple enclosures on a single RAID controller, I
would avoid this configuration if possible, since the RAID controller will be a single point
of failure, and possibly a performance bottleneck as you approach the throughput limit of
the controller. Instead, I would want to have one RAID controller per DAS array (subject
to the number of PCI-E slots you have available in your server). This gives you both better
redundancy and better performance. Having multiple RAID controllers allows you to take
advantage of the dedicated cache in each RAID controller, and helps ensure that you are
not limited by the throughput capacity of the single RAID controller or the expansion slot
that it is using.

Provisioning and Configuring the Storage Subsystem
Having discussed each of the basic components of the storage system, it's time to review
the factors that will determine the choices you make when provisioning and configuring
the storage subsystem.

The number, size, speed, and configuration of the disks that comprise your storage array
will be heavily dependent on the size and nature of the workload. Every time that data
required by an application or query is not found in the buffer cache, it will need to be read
from the data files on disk, causing read I/O. Every time data is modified by an appli-
cation, the transaction details are written to the transaction log file, and then the data
itself is written to the data files on disk, causing write I/O in each case.

In addition to the general read and write I/O generated by applications that access SQL
Server, additional I/O load will be created by other system and maintenance activities.

• Transaction log backups – create both read and write I/O pressure. The active
portion of the transaction log file is read, and then the transaction log backup file
must be written.

• Index maintenance, including index reorganizations and index rebuilds – creates read
I/O pressure as the index is read off the I/O subsystem, which then causes memory
pressure as the index data goes into the SQL Server Buffer Pool. There is CPU pressure
as the index is reorganized or rebuilt, and then write I/O pressure as the index is
written back out to the I/O subsystem.

• Full text catalog and indexes for Full Text Search – the work of crawling the base
table(s) to create and maintain these structures and then writing the changes to the
Full Text index(es) creates both read and write I/O pressure.

• Database checkpoint operations – the write activity to the data files occurs during
database checkpoint operations. The frequency of checkpoints is influenced by the
recovery interval setting and the amount of RAM installed in the system.

• Use of High Availability / Disaster Recovery (HA/DR) – features like Log Shipping
or Database Mirroring will cause additional read activity against your transaction
log, since the transaction log must be read before the activity can be sent to the Log
Shipping destination(s) or to the database mirror. Using Transactional Replication will
also cause more read activity against your transaction log on your Publisher database.

The number of disks that make up your storage array, their specifications in terms of size,
speed and so on, and the physical configuration of these drives in the storage array, will be
determined by the size of the I/O load that your system needs to support, both in terms
of IOPS and I/O throughput, as well as in the nature of that load, in terms of the read I/O
and write I/O activity that it generates.

A workload that is primarily OLTP in nature will generate a high number of I/O opera-
tions and a high percentage of write activity; it is not that unusual to actually have more
writes than reads in a heavy OLTP system. This will cause heavy write (and read) I/O
pressure on the logical drive(s) that house your data files and, particularly, heavy write
pressure on the logical drive where your transaction log is located, since every write must
go to the transaction log first. The drives that house these files must be sized, spec'd and
configured appropriately, to handle this pressure.

Furthermore, almost all of the other factors listed previously that cause additional I/O
pressure are more prominent for OLTP systems. High write activity, caused by
frequent data modifications, leads to more regular transaction log backups, index mainte-
nance, more frequent database checkpoints, and so on.

Backup and data compression

Using data and backup compression can reduce the I/O cost and duration of SQL Server backups at the
cost of some additional CPU pressure – see Chapter 1 for further discussion.

A DSS or DW system usually has longer-running queries than a similar size OLTP system.
The data in a DSS system is usually more static, with much higher read activity than write
activity. The less volatile data means less frequent data and transaction log backups (you
might even be able to use read-only file groups to avoid having to regularly back up some
file groups), less frequent index maintenance, and so on, all of which contributes to a lower
I/O load in terms of IOPS. It does not necessarily mean lower I/O throughput, however,
since the complex, long-running aggregate queries that characterize a DW/DSS workload
will often read a lot of data, and the data load operations will write a lot of data. All of this
means that, for a DSS/DW type of workload, I/O throughput is usually more important than
IOPS performance.

Finding the read/write ratio


One way of determining the size and nature of your workload is to retrieve the read/write
ratio for your database files. The higher the proportion of writes, the more OLTP-like is
your workload.

The DMV query shown in Listing 2.1 can be run on an existing system to help charac-
terize the I/O workload for the current database. This query will show the read/write
percentage, by file, for the current database, both in the number of reads and writes, and
in the number of bytes read and written.

-- I/O Statistics by file for the current database
SELECT DB_NAME(DB_ID()) AS [Database Name] ,
       [file_id] ,
       num_of_reads ,
       num_of_writes ,
       num_of_bytes_read ,
       num_of_bytes_written ,
       CAST(100. * num_of_reads / ( num_of_reads + num_of_writes )
            AS DECIMAL(10,1)) AS [# Reads Pct] ,
       CAST(100. * num_of_writes / ( num_of_reads + num_of_writes )
            AS DECIMAL(10,1)) AS [# Write Pct] ,
       CAST(100. * num_of_bytes_read / ( num_of_bytes_read
            + num_of_bytes_written )
            AS DECIMAL(10,1)) AS [Read Bytes Pct] ,
       CAST(100. * num_of_bytes_written / ( num_of_bytes_read
            + num_of_bytes_written )
            AS DECIMAL(10,1)) AS [Written Bytes Pct]
FROM sys.dm_io_virtual_file_stats(DB_ID(), NULL) ;

Listing 2.1: Finding the read/write ratio, by file, for a given database.

Three more DMV queries, shown in Listing 2.2, can help characterize the workload
on an existing system, from a read/write perspective, for cached stored procedures.
These queries can help give you a better idea of the total read and write I/O activity, the
execution count, and the cached time for those stored procedures.

-- Top Cached SPs By Total Logical Writes (SQL 2008 and 2008 R2)
-- This represents write I/O pressure
SELECT p.name AS [SP Name] ,
       qs.total_logical_writes AS [TotalLogicalWrites] ,
       qs.total_logical_reads AS [TotalLogicalReads] ,
       qs.execution_count , qs.cached_time
FROM sys.procedures AS p
     INNER JOIN sys.dm_exec_procedure_stats AS qs
          ON p.[object_id] = qs.[object_id]
WHERE qs.database_id = DB_ID()
      AND qs.total_logical_writes > 0
ORDER BY qs.total_logical_writes DESC ;

-- Top Cached SPs By Total Physical Reads (SQL 2008 and 2008 R2)
-- This represents read I/O pressure
SELECT p.name AS [SP Name] ,
       qs.total_physical_reads AS [TotalPhysicalReads] ,
       qs.total_logical_reads AS [TotalLogicalReads] ,
       qs.total_physical_reads/qs.execution_count AS [AvgPhysicalReads] ,
       qs.execution_count , qs.cached_time
FROM sys.procedures AS p
     INNER JOIN sys.dm_exec_procedure_stats AS qs
          ON p.[object_id] = qs.[object_id]
WHERE qs.database_id = DB_ID()
      AND qs.total_physical_reads > 0
ORDER BY qs.total_physical_reads DESC, qs.total_logical_reads DESC ;

-- Top Cached SPs By Total Logical Reads (SQL 2008 and 2008 R2)
-- This represents read memory pressure
SELECT p.name AS [SP Name] ,
       qs.total_logical_reads AS [TotalLogicalReads] ,
       qs.total_logical_writes AS [TotalLogicalWrites] ,
       qs.execution_count , qs.cached_time
FROM sys.procedures AS p
     INNER JOIN sys.dm_exec_procedure_stats AS qs
          ON p.[object_id] = qs.[object_id]
WHERE qs.database_id = DB_ID()
      AND qs.total_logical_reads > 0
ORDER BY qs.total_logical_reads DESC ;

Listing 2.2: The read/write ratio for cached stored procedures.

As discussed, a workload with a high percentage of writes will place more stress on the
drive array where the transaction log files for your user databases are located, since all
data modifications are written to the transaction log. The more volatile the data, the
more write I/O pressure you will see on your transaction log file. A workload with a high
percentage of writes will also put more I/O pressure on your SQL Server data file(s). It is
common practice, with large volatile databases, to have multiple data files spread across
multiple logical drives to get both higher throughput and better IOPS performance.
Unfortunately, you cannot increase I/O performance for your transaction log by adding
additional files, since the log file is written to sequentially.

The relative read/write ratio will also affect how you configure the cache in your RAID
controllers. For OLTP workloads, write cache is much more important than read cache,
while read cache is more useful for DSS/DW workloads. In fact, it is a common best
practice to devote the entire RAID controller cache to writes for OLTP workloads.

How many disks?


One common mistake that you should avoid in selecting storage components is to only
consider space requirements when looking at sizing and capacity requirements. For
example, if you had a size requirement of 1 TB for a drive array that was meant to hold
a SQL Server data file, you could satisfy the size requirement by using three 500 GB
drives in a RAID 5 configuration. Unfortunately, for the reasons discussed relating to
disk latency, the performance of that array would be quite low. A better solution from a
performance perspective would be to use either 8x146 GB drives in RAID 5, or 15x73 GB
drives in RAID 5 to satisfy the space requirement with many more spindles. You should
always try to maximize your spindle count instead of just using the minimum number
of drives to meet a size requirement with the level of RAID you are going to use. So,
after all of this discussion, how many disks do you actually need to achieve acceptable
performance?

Here is one formula, which Australian SQL Server MVP Rod Colledge has written about, for
estimating the number of disks required for a given workload and RAID level:

n = (%R + f(%W))(tps) / 125

or, written out more explicitly:

Required # Disks = (Reads/sec + (Writes/sec * RAID adjuster)) / Disk IOPS

Here, %R and %W are the read and write proportions of the workload, tps is the I/O
(transaction) rate, f is the RAID write adjuster (roughly 2 for RAID 1 and RAID 10, or 4 for
RAID 5, since each logical write becomes multiple physical I/Os), and 125 is an assumed
per-disk IOPS capacity, sitting between the roughly 100 IOPS of a 10,000 rpm drive and the
roughly 150 IOPS of a 15,000 rpm drive.

It is important to consider both IOPS, to calculate the number of disks needed, and the
I/O type, to ensure the I/O bus is capable of handling the peak I/O sequential throughput.
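
As a purely illustrative example of the formula, consider a hypothetical OLTP workload of
2,000 I/Os per second with a 60/40 read/write split on a RAID 10 array (write adjuster of
roughly 2): n = (0.6 + 2(0.4)) x 2,000 / 125 = 22.4, so you would need at least 23 drives
(rounded up to 24 to keep the mirrored pairs even), regardless of how little space the data
actually occupies.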

Configuration: SAN vs. DAS, RAID levels


For OLTP systems, the seek time and rotational latency limitations for a single disk,
discussed at the start of this chapter, have led to the proliferation of large SAN-based
storage arrays, allowing data to be segmented across numerous disks, in various RAID
configurations. Many larger SANs have a huge number of drive spindles and so this archi-
tecture is able to support a very high random I/O rate. Use of a SAN for the I/O subsystem
in OLTP workloads makes it relatively easy (but expensive) to support dozens to hundreds
of disk spindles for a single database server.

The general guideline is that you will get roughly 100 IOPS from a single 10,000 rpm
magnetic drive and about 150 IOPS from a single 15,000 rpm drive. For example, if you
had a SAN with two hundred 15,000 rpm drives, that would give the entire SAN a raw
IOPS capacity of 30,000 IOPS. If the HBAs in your database server were older, 4 Gbps
models, your sequential throughput would still be limited to roughly 400 MB/second for
each HBA channel.

If you don't have the budget or in-house expertise for a large SAN, it is still possible to get
very good IOPS performance with other storage techniques, such as using multiple DAS
enclosures with multiple RAID controllers along with multiple SQL Server file groups
and data files. This allows you to spread the I/O workload among multiple logical drives
that each represent a dedicated DAS enclosure. You can also use SSDs or Fusion-IO cards
to get extremely high IOPS performance without using a SAN, assuming you have the
hardware budget available to do that.

When you are using non-SAN storage (such as DAS enclosures) it is very important to
explicitly segregate your disk activity by logical drive. This means doing things like having
one or more dedicated logical drives for SQL Server data files, a dedicated logical drive for
the log file (for each user database, if possible), a dedicated logical drive for your TempDB
data and log files, and one or more dedicated logical drives for your SQL Server backup
files. Of course, your choices and flexibility are ultimately limited by the number of drives
that you have available, which is limited by the number of DAS enclosures you have, and
the number of drives in each enclosure.
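
A bare-bones sketch of this kind of file segregation (all names, sizes, and drive letters are
hypothetical) might look something like this:

-- Data on D:, log on L:; TempDB and backups would live on their own drives
CREATE DATABASE SalesDB
ON PRIMARY
    ( NAME = N'SalesDB_Data' ,
      FILENAME = N'D:\SQLData\SalesDB_Data.mdf' ,
      SIZE = 100GB , FILEGROWTH = 10GB )
LOG ON
    ( NAME = N'SalesDB_Log' ,
      FILENAME = N'L:\SQLLogs\SalesDB_Log.ldf' ,
      SIZE = 20GB , FILEGROWTH = 2GB ) ;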

However, for DW/DSS systems, a SAN storage array may not be the best choice. Here,
I/O throughput is the most important factor, and the throughput of a SAN array can be
limited to the throughput capacity of a switch or individual HBA. As such, it is becoming
more common for DW/DSS systems to use multiple DAS devices, each on a dedicated
RAID controller, to get high levels of throughput at a relatively low cost.

If you have the available budget, I would prefer to use RAID 10 for all of your various SQL
Server files, including data files, log files, TempDB, and backup files. If you do have budget
constraints, I would consider using RAID 5 for your database backup files, and RAID 5 for
your data files (if they are relatively static). Depending on your workload characteristics
and how you use TempDB, you might be able to use RAID 5 for TempDB files. I would fight
as hard as possible to avoid using RAID 5 for transaction log files, since RAID 5 does not
perform nearly as well for writes.

Summary
Having an appropriate storage subsystem is critical for SQL Server performance. Most
high-volume SQL Server workloads ultimately run into I/O bottlenecks that can be very
expensive to alleviate. Selecting, sizing, and configuring your storage subsystem properly
will reduce the chances that you will suffer from I/O performance problems.

In addition, using powerful multi-core processors and large quantities of RAM, as
discussed in Chapter 1, provides relatively cheap extra protection from expensive I/O
bottlenecks. Having more RAM reduces read I/O pressure, since more data can fit into
the SQL Server buffer cache, and it can also reduce write I/O pressure by reducing the
frequency of checkpoints. Using modern, multi-core processors can give you the excess
processor capacity that can allow you to use various compression features, such as data
compression and backup compression, which can also reduce I/O pressure in exchange
for additional CPU utilization.

Ultimately, however, the only way to know that your chosen hardware, including the
processor, disk subsystem and so on, is capable of handling the expected workload is to
perform benchmark tests, as described in the next chapter.

Chapter 3: Benchmarking Tools
Having read Chapters 1 and 2, you hopefully have a better understanding of server
hardware from a SQL Server perspective, and of how your hardware choices can be
affected by the types of SQL Server workload that must be supported.

However, as well as understanding the factors that will influence hardware selection
and provisioning, it is vitally important to measure and validate the performance of
the various major hardware components (such as the processor(s), memory, or disk
subsystem) as well as SQL Server itself, in order to verify that performance targets will be
met.

One way to evaluate and compare hardware performance, and to make sizing and
capacity estimates, is to use benchmark test results. There are many different kinds of
benchmarks in use today, but this chapter will focus on two main types, which are:

• application benchmarks – use one or more real applications (such as Microsoft SQL
Server) to measure the actual performance, throughput, and response time of an entire
system while running the application

• component benchmarks – focus on one or more components in a computer system,
usually using a synthetic workload that is designed to measure the absolute perfor-
mance of that part of the system.

We will discuss some of the industry standard database application benchmarks (such
as TPC-C, TPC-E, and TPC-H), including how they work and how they can be useful in
helping you evaluate and properly size database server hardware.

We'll then move on to the component benchmarks that you can carry out yourself and
that will help you to compare the relative performance of different components of the
system in a focused manner, without actually using SQL Server. For example, tools such
as HD Tune Pro, CrystalDiskMark, and SQLIO allow us to measure and validate SQL
Server storage performance, before we even install SQL Server.

Likewise, a component benchmark tool, such as Geekbench (from Primate Labs,
www.primatelabs.ca/geekbench/), can be very useful for estimating and comparing
CPU and memory performance between an existing server and a proposed new server,
allowing the DBA to make much more accurate sizing and capacity estimates. In my
opinion, it is always preferable to make use of component benchmarks to actually
measure performance instead of just guessing about the relative performance of different
components across different systems.

Application Benchmarks
The Transaction Processing Performance Council (TPC) is a non-profit organization,
founded in 1988, which aims to define transaction processing and database benchmarks,
and to disseminate objective, verifiable, TPC performance data to the industry.

TPC benchmarks are used widely in evaluating the performance of database systems. TPC
benchmark results are listed by Performance, Price/Performance, and Watts/Performance.
The TPC organization has very strict rules about how results must be submitted and
audited, including very detailed disclosure rules. TPC results are published on the TPC
website, at www.tpc.org/.

Both hardware and software vendors have a tendency to use good TPC benchmark
results as marketing tools, which is fine by me, but which leads some people to treat with
skepticism, or even completely disregard, the value of TPC benchmarks. One argument
I have heard is that the database vendors (such as Microsoft, IBM, and Oracle) are so
familiar with the various queries that are used in the TPC benchmarks that they modify
their query optimizer logic to make their products artificially perform better on the
benchmarks. I tend to discount this argument, since I believe all of the major database
vendors have more integrity than that, and I believe it would be extremely difficult and
counterproductive to make those types of code modifications to the query optimizer.

Another argument I hear against the TPC benchmarks is that they don't represent a
realistic workload and that the types of systems that are built for TPC benchmarks are
extremely expensive; that they are not representative of a typical system that would
actually be purchased by a real customer, and so have little real-world value and should
be ignored.

I think this attitude is a mistake; as long as you know how to interpret the results, and
realize their purpose and limitations, I believe there is real value to be had in comparing
the TPC benchmarks for various systems. There is a rigorous set of rules in place for
how TPC benchmark testing is conducted and submitted to TPC for final auditing and
approval. Results that are listed on the TPC website must include an Executive Summary,
a Full Disclosure Report (FDR), and a full set of Supporting Files, all of which make very
interesting reading for the true database and hardware geek.

Furthermore, taken alongside the results from component benchmarks and your own
common sense and experience, you can apply and extrapolate the results from formal
TPC submissions to your own, probably smaller-scale, systems. There are three current
benchmarks used by TPC, the TPC-C, TPC-E, and TPC-H benchmarks.

TPC-C benchmark
The TPC Benchmark C (TPC-C) benchmark is an old OLTP benchmark, originally
released in 1992, which simulates the OLTP workload of a wholesale supplier. The
business model of the wholesaler is organized into Warehouses, Districts, and Customers.
The TPC-C data model is very simple, with only nine tables and four data types, and is
hierarchical in nature – districts are subsets of warehouses, while customers are subsets
of districts.

There are five different transaction types in the TPC-C benchmark. The TPC-C data
model does not enforce referential integrity, and the data in the database is mostly
random strings of gibberish (for columns like customer names). A frequent criticism of
the TPC-C benchmark is that it does not require fault-tolerant storage media, which
is not especially realistic for a database benchmark. The TPC-C benchmark also has an
unrealistic dependency on disk I/O, meaning that vendors would often configure systems
with an extremely high number of disk spindles in a quest to get the best absolute TPC-C
benchmark score. There is some validity to these criticisms, in my opinion, so I tend to
ignore the TPC-C benchmark and focus on the much newer TPC-E benchmark.

TPC-E benchmark
The TPC Benchmark E (TPC-E) is an OLTP performance benchmark that was introduced
in February 2007. TPC-E is not a replacement for the older TPC-C benchmark, but a
completely new OLTP benchmark. It is an OLTP, database-centric workload that is meant
to reduce the cost and complexity of running the benchmark compared to the older
TPC-C benchmark. It simulates the OLTP workload of a brokerage firm that interacts
with customers using synchronous transactions and with a financial market using
asynchronous transactions.

The business model of the brokerage firm is organized by Customers, Accounts, and
Securities. The data model for TPC-E is significantly more complex, but more realistic
than TPC-C, with 33 tables and many different data types. The data model for the TPC-E
database does enforce referential integrity, unlike the older TPC-C data model. Some
of the differences in the data model for the TPC-C and TPC-E databases are shown in
Figure 3.1.

Characteristic             TPC-C    TPC-E
Tables                     9        33
Columns                    92       188
Primary Keys               8        33
Foreign Keys               9        50
Tables w/Foreign Keys      7        27
Check Constraints          0        22
Referential Integrity      No       Yes

Figure 3.1: TPC-C and TPC-E database schema summary.

The TPC-E database is populated with pseudo-real data, including customer names from
the year 2000 US Census, and company listings from the NYSE and NASDAQ. Having
realistic data introduces data skew, and makes the data compressible. Unlike TPC-C, the
storage media for TPC-E must be fault tolerant (which means no RAID 0 arrays). Overall,
the TPC-E benchmark is designed to have reduced I/O requirements compared to the
old TPC-C benchmark, which makes it both less expensive and more realistic, since
the sponsoring vendors will not feel as much pressure to equip their test systems with
disproportionately large disk subsystems in order to get the best test results. The TPC-E
benchmark is also more CPU-intensive than the old TPC-C benchmark.

The TPC-E implementation is broken down into a Driver and a System Under Test (SUT),
separated by a mandatory network. The Driver represents the various client devices
that would use an N-tier client-server system, abstracted into a load generation system.
The SUT has multiple Application servers (Tier A) that communicate with the database
server and its associated storage subsystem (Tier B). TPC provides a transaction harness
component that runs in Tier A, while the test sponsor provides the other components in
the SUT.

The performance metric for TPC-E is transactions per second, tpsE. The actual tpsE
score represents the average number of Trade Result transactions executed within one
second. To be fully compliant with the TPC-E standard, all references to tpsE results
must include the tpsE rate, the associated price per tpsE, and the availability date of the
priced configuration.

It seems interesting that, as of early 2011, Microsoft is the only database vendor that has
submitted any TPC-E results, even though the TPC-E benchmark has been available since
early 2007. Whatever the reasons why other database vendors haven't posted results,
there are certainly many results posted for SQL Server, which makes it a very useful
benchmark when assessing SQL Server hardware.

TPC-H benchmark
The TPC Benchmark H (TPC-H) is a benchmark for Decision Support Systems (DSS).
It consists of a suite of business oriented, ad hoc queries and concurrent data modifi-
cations. The queries, and the data populating the database, have been chosen to have
broad industry-wide relevance. This benchmark illustrates decision support systems that
examine large volumes of data, execute queries with a high degree of complexity, and give
answers to critical business questions.

The performance metric reported by TPC-H is called the TPC-H Composite Query-per-
Hour Performance Metric (QphH@Size), and reflects multiple aspects of the capability
of the system to process queries. These aspects include the selected database size against
which the queries are executed, the query processing power when queries are submitted
by a single stream and the query throughput when queries are submitted by multiple
concurrent users. The TPC-H Price/Performance metric is expressed as $/QphH@Size.

TPC-H results are grouped by database size, with database size groups of 100 GB, 300 GB,
1,000 GB, 3,000 GB, 10,000 GB, and 30,000 GB. You should not compare TPC-H results
across database sizes, which means that the TPC-H score for a 100 GB database should
not be compared with the TPC-H score for a 1,000 GB database.

Analyzing benchmark test results


Now that you have a better idea about the purpose and composition of the three current
TPC benchmarks, it is time to talk about how to analyze and interpret the results. First,
you need to focus on the benchmark that is closest to your type of workload. If you
have an OLTP workload, that would be TPC-C or TPC-E, while if you have a DSS/DW
workload you would focus on TPC-H. The TPC-E benchmark is newer and more realistic
than the old TPC-C benchmark, but there are fewer submitted results for TPC-E than
for TPC-C.

When analyzing the submitted results, I like to view All Results, sorted by performance,
and look for systems that are similar to the one I am considering. Since TPC-E results
go back to 2007, it is likely that you can find a system that has the same number of
processors, of the same processor generation and family, as your candidate system.

TPC-E benchmark analysis sample


As an example, let's consider that you are upgrading an existing OLTP system, and want
to understand what sort of performance and scalability improvement you might expect
for your hardware investment.

Among the submitted TPC-E benchmark scores, you see one from August 24, 2007 for a
Dell PowerEdge 6850 system with (4) dual-core, 3.4 GHz Xeon 7140 processors and 64
GB of RAM, with 184 disk spindles. This system was running x64 SQL Server 2005 Enter-
prise Edition SP2 on top of x64 Windows Server 2003 Enterprise Edition SP1. The initial
database size was 856 GB. This system is the closest match to the existing system that you
are looking to consolidate or replace, and its tpsE score was 220.

You also see a more recent submission, March 30, 2009, for a Dell PowerEdge T610
system, with (2) quad-core 2.93 GHz Xeon X5570 processors and 96 GB of RAM, with 396
disk spindles. The T610 was running on x64 SQL Server 2008 Enterprise Edition on top
of x64 Windows Server 2008 Enterprise Edition. The initial database size on the T610
system was 2,977 GB. Its tpsE score was 766.

At first glance, the newer, two-socket system offers about 3.5 times the performance of
the older, four-socket system. Looking deeper, you notice the newer system is running a
newer version of both SQL Server and the operating system, and it has 50% more RAM
and has slightly over twice the number of disk spindles, while the initial database size is
about 3.5 times as large.

There are many competing factors at play here. The newer version of SQL Server and of
the operating system would give about a 10–15% advantage with OLTP workloads, on the
same hardware. This 10–15% performance advantage comes primarily from improvements
in the SQL Server query optimizer, better memory management in SQLOS, and low-level
kernel and network stack improvements in Windows Server 2008.

The newer system has more RAM and more disk spindles, which are required to drive the
system hard enough to max out the processors during the test. Having more RAM and
more disk spindles in the newer system is somewhat counterbalanced by having a much
larger initial database size, which places more stress on the memory and I/O subsystem.

The newer system would have much better single-threaded performance, which is very
important for OLTP workloads. Given all of this information, I would feel very confident
that I could replace my existing, four-socket system with the newer, two-socket system
and have lots of scalability headroom to spare.

If two benchmarked systems were not exact matches to my existing and prospective
systems, I would try to use the results of component benchmarks to help adjust for
the differences. For example, I might use the results of the Geekbench component
benchmark to help determine by how much to adjust a TPC-E score (for my sizing calcu-
lations) to allow for differences in the processors and memory types between two systems.

This is a relevant adjustment technique because the TPC-E benchmark is primarily CPU
limited, assuming you have enough I/O capacity to drive the workload to capacity, which
is a pretty safe assumption for any TPC submitted score, due to the expense and time
required to submit an official TPC-E result.
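
For example (with purely illustrative numbers), if the benchmarked system scored 15,000 on
Geekbench and your candidate system scores 12,000, you might scale the published tpsE
figure down by the same 12,000/15,000 ratio, i.e. by roughly 20%, as a rough starting point
for your sizing calculations.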

The overall idea here is to use all of the available information from the TPC-E benchmark
submissions and component benchmark results, along with your own judgment and
common sense, to get the most accurate impression of the performance and scalability
differences between two systems.

TPC-E benchmark analysis by CPU type


As an experiment, I imported the current official TPC-E results spreadsheet (available
at http://tpc.org/information/results_spreadsheet.asp) into a SQL Server database,
so that I could easily query and analyze the results. I did a little data cleansing so that
the data would always use the same terms for CPU model, SQL Server version, and so
on. I also added a few columns that are not in the original official results spreadsheet,
such as the MemorySize, SpindleCount, and OriginalDatabaseSize, and added
that data to each row in the table, using the information from the Executive Summary
for each submitted TPC-E result. Luckily, there were only 41 TPC-E results at the time
this was written.
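
As a rough sketch of the kind of table involved (the table name, column names, and data types below are my assumptions for illustration, not the exact schema I used):

-- Minimal sketch of a table to hold the imported TPC-E results;
-- names and data types are assumptions for illustration only.
CREATE TABLE dbo.TPCEResults
    (ResultID int IDENTITY(1,1) PRIMARY KEY,
     SystemName nvarchar(100) NOT NULL,    -- server vendor and model
     CPUType nvarchar(100) NOT NULL,       -- cleansed processor model name
     SocketCount int NOT NULL,
     CoreCount int NOT NULL,
     ThreadCount int NOT NULL,
     TpsE decimal(9,2) NOT NULL,
     -- Columns added from each result's Executive Summary
     MemorySize int NULL,                  -- RAM, in GB
     SpindleCount int NULL,
     OriginalDatabaseSize int NULL);       -- initial database size, in GB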

Having done all this, I was ready to write a few queries to see if anything interesting might reveal itself from the actual raw data. Since SQL Server is licensed by physical socket when you buy a processor license, I ranked the results by TpsE per Socket, simply dividing the TpsE score by the number of sockets. This provides a rough guide as to which processor gives you the most "bang for the buck" on the TPC-E benchmark, assuming that the rest of the system was properly optimized.
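
A query along the following lines (a sketch against the hypothetical table above, not the exact query I ran) is enough to produce that ranking:

-- Rank the imported results by TpsE per socket, since SQL Server processor
-- licenses are purchased per physical socket.
SELECT CPUType, SocketCount, CoreCount, ThreadCount, TpsE,
       TpsE / SocketCount AS [TpsE per Socket]
FROM dbo.TPCEResults
ORDER BY [TpsE per Socket] DESC;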

The abridged results (repeat tests for the same processor model, with the same number
of cores and threads, are not included) for the top tier of processors are shown in Figure
3.2. At the top of the list, we have a system using the Intel Xeon X5680 (Westmere-EP) processor. After that come four- and eight-socket systems using the Intel Xeon X7560 (Nehalem-EX) processor. Notice that the TpsE performance scales very well, i.e., the eight-socket scores are pretty close to double the four-socket scores for that processor, which is indicative of the effectiveness of the NUMA memory architecture used in these newer processors.

Next in line is a two-socket AMD Opteron 6176 SE (Magny Cours) system, which does about 10% better than the Intel Xeon X5570 system that follows it, a system that is about a year older. After that, we have a mix of Intel Nehalem- and AMD Magny Cours-based systems.

CPU Type               Sockets  Cores  Threads  TpsE       TpsE per socket
Intel Xeon X5680          2       12      24    1,110.1         555.05
Intel Xeon X7560          4       32      64    2,022.64        505.66
Intel Xeon X7560          8       64     128    3,800           475
AMD Opteron 6176 SE       2       24      24      887.38        443.69
Intel Xeon X5570          2        8      16      800           400
Intel Xeon X7560          8       64     128    3,141.76        392.72
Intel Xeon X5570          2        8      16      766.47        383.23
AMD Opteron 6174          4       48      48    1,464.12        366.03
AMD Opteron 6176 SE       4       48      48    1,400.14        350.035

Figure 3.2: The top tier of TPC-E results, by CPU type.

The first interesting point to note, on examining the second tier of processor benchmark
scores, shown in Figure 3.3 (again, abridged), is just how much of a drop we see in the
TpsE per Socket score; the four-socket Intel Xeon X7460 (Dunnington), which was no
slouch in its day, shows a drop of nearly 50% compared to the lowest system in the top tier, and a drop of almost 65% compared to the system using its four-socket Nehalem-EX counterpart, the Intel Xeon X7560. This shows what a huge improvement the Intel Nehalem is, compared to the older Intel Dunnington.

You can also see that the newer, two-socket Intel systems (X5680 and X5570) do much better than the older, four-socket X7460 systems, in both absolute and per-socket terms. Finally, notice the poor performance of the listed 16-socket Intel Xeon X7460 systems, showing the weakness of the old shared front-side bus architecture in older Intel Xeon processors, compared to the newer NUMA architecture (see Chapter 1 for more details). As you add more and more processors to a front-side bus machine, you get increased memory contention, and your scaling factor goes down.

CPU Type               Sockets  Cores  Threads  TpsE       TpsE per socket
Intel Xeon X7460          4       24      24      721.4         180.35
AMD Opteron 8384          4       16      16      635.43        158.85
Intel Xeon X5460          2        8       8      317.45        158.72
Intel Xeon X7460          8       48      48    1,165.56        145.69
Intel Xeon X5355          1        4       4      144.88        144.88
Intel Xeon X5460          2        8       8      268           134
Intel Xeon X7460         16       96      96    2,012.77        125.79
Intel Xeon X7350          4       16      16      492.34        123.08
Intel Xeon X7460         12       64      64    1,400           116.66
Intel Xeon X7350          8       32      32      804           100.5
Intel Xeon X7460         16       64      64    1,568.22         98.01
Intel Xeon 5160           2        4       4      169.59         84.79
Intel Xeon X7350         16       64      64    1,250            78.12
Intel Xeon 7140           4        8      16      220            55
Intel Xeon 7140          16       32      64      660.85         41.30
Intel Itanium 9150N      32       64      64    1,126.49         35.20

Figure 3.3: The second tier of TPC-E results, by CPU type.

The main point to take away from this simple analysis is that any system with processors
older than the Intel Nehalem or the AMD Magny Cours (Intel 55xx, Intel 75xx or AMD
61xx) will be pretty severely handicapped compared to a system with a modern processor.
This is especially evident with older Intel four-socket systems that use Xeon 74xx or
older processors, which are easily eclipsed by two-socket systems with Intel Xeon X5570
or X5690 processors. The Xeon 74xx series had six cores per physical processor (but no hyper-threading), so a typical four-socket machine would have 24 total physical cores
available. The newer Xeon 55xx series has four cores per physical processor (plus hyper-
threading), so you can have up to 16 total logical cores available in a two-socket machine.
The Xeon 56xx series has up to six cores per physical processor (plus hyper-threading), so
you can have up to 24 total logical cores available in a two-socket machine.
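
As an aside, if you want to confirm how many sockets and logical processors SQL Server actually sees on one of your own systems, a quick query against the sys.dm_os_sys_info DMV (a sketch; this DMV is available in SQL Server 2005 and later) will tell you:

-- Processor counts as SQL Server sees them on the current system;
-- hyperthread_ratio is the number of logical processors per physical socket.
SELECT cpu_count AS [Logical CPU Count],
       hyperthread_ratio AS [Hyperthread Ratio],
       cpu_count / hyperthread_ratio AS [Physical CPU Count]
FROM sys.dm_os_sys_info;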

Component Benchmarks
Component benchmarks are micro-benchmarks that purposely do not simulate an
actual application workload, but instead generate a synthetic workload that is designed
to heavily stress one component of a system. Rather than measure the performance of
the entire system, including SQL Server, component benchmarks allow us to assess the performance characteristics of a specific component, or group of related components, such as the processor(s), memory, disk subsystem, and so on.

Performing a component benchmark can be very useful when, for example:

• assessing the suitability of the disk subsystem to cope with the predicted I/O load

• assessing the true performance benefit of an investment in more expensive components (such as processors)

• measuring and documenting the performance of a component (such as the I/O subsystem) before and after making a change

• validating the effects of configuration changes.

I think it is important to remember that component-level benchmarks should be used, primarily, to compare the relative performance of different components in different systems. Try not to be overly concerned with the absolute performance results for a single-component benchmark.

Over the coming sections, we will take a closer look at a few useful component bench-
marks for testing various aspects of the hardware system for a SQL Server installation,
including processor power, memory, and I/O capacity.

CPU and memory testing


There is an age-old, and still prevalent, tradition among many DBAs of attempting to solve SQL Server-related performance issues by throwing more hardware at the problem. Even
though the cost of CPU and memory has come down in recent years, it can still represent
a substantial investment, and it's vital that you understand, from an engineering
perspective, exactly what performance benefit this investment can be expected to deliver.

In this section, we'll briefly review two very useful benchmark tools for CPU sizing,
capacity or consolidation planning: SPEC benchmarks and Geekbench.

SPEC benchmarks
The Standard Performance Evaluation Corporation (SPEC) is a non-profit corporation,
the purpose of which is to "establish, maintain and endorse a standardized set of relevant
benchmarks that can be applied to the newest generation of high-performance computers."

SPEC develops benchmark suites to test the performance of a range of different systems,
including workstations, web servers, mail servers, network file systems, and so on. SPEC
also reviews and publishes the results submitted by their member organizations.

Most relevant to the DBA are the SPEC CPU benchmarks, the current one being SPEC CPU2006, which is a widely used and useful benchmark for measuring and comparing CPU performance. It has two separate benchmark suites for measuring compute-intensive performance. The first is CINT2006, which measures integer performance, and the second is CFP2006, which measures floating-point performance. You can buy these
tools (from the SPEC website www.spec.org/order.html) and run benchmarks on your
own systems, or you can analyze published results on the SPEC website. SPEC CPU2006
benchmark results are published at www.spec.org/cgi-bin/osgresults?conf=cpu2006,
and you can search the results by hardware vendor, server model number, or particular
processor model number.

Geekbench
Geekbench is a cross-platform, synthetic benchmark tool from Primate Labs. It provides a
comprehensive set of benchmarks designed to quickly and accurately measure processor
and memory performance. There are 32-bit and 64-bit versions of Geekbench, but in trial
mode you can only use the 32-bit version, which is available from www.primatelabs.ca/
geekbench/.
