Best Practices for
SQL Server on VMware
Dean Richards
Senior DBA, Confio Software
Who Am I?
20+ Years in SQL Server & Oracle
• DBA and Developer
• Specialize in Performance Tuning
• Actively Implementing SQL Server on VMware
Product Architect for Confio Software
• [email protected]
• Makers of Ignite8 Response Time Analysis Tools
• Ignite for SQL Server on VMware
Agenda
Why Virtualize
Terms and Concepts
Best Practices
• Memory
• CPU
• Network
• Storage
• Monitoring
Summary
Why Virtualize
Too much physical horsepower
• Most resources are drastically underutilized
• Many are running at <10% CPU
• Confio Before Virtualization - Pictures
• Confio After Virtualization - Pictures
Confio Lab – 40 Small Boxes
CPU on “Busy” Boxes
Confio New “Datacenter”
Still Have Room for Growth
Why Virtualize
Easier to manage fewer physical boxes
• Manage physical resources on 2, 4 or 8 physical
machines vs. 50-100 small boxes
• vMotion enables automatic resource balancing
Cheaper
• More bang for the buck with bigger machines
• Increased power efficiency
• Less floor space
SQL Server on VMware
Microsoft only officially supports Hyper-V
• Has a joint support partnership with VMware
• http://support.microsoft.com/kb/897615
• http://support.microsoft.com/kb/944987
Many SQL Server instances will match native
performance
• http://www.vmware.com/files/pdf/perf_vsphere_sql_scalability.pdf **
• Fully saturated instances - 2-15% overhead
• But, new hardware may be 10-30% faster
Deploying SQL Server on VMware is very similar
to using physical servers
• Monitoring the whole stack will require some changes
Scalability Highlights
- From http://www.vmware.com/files/pdf/perf_vsphere_sql_scalability.pdf
Hyper-V
Formerly known as Windows Server Virtualization
• Current version is Hyper-V Server 2008 R2
• Includes live migration (similar to vMotion)
• Expanded hardware support
• Supports Windows Server 2008, 2003, 2000, 7, Vista, XP
• Now handles 8 processors (like VMware)
Also Supports non-Windows Guest O/S
• SUSE Enterprise Linux 10, 11
• Red Hat Enterprise 5.2 and above
Market share seems to be catching up
• Confio has hundreds of customers on VMware
• None are using Hyper-V for SQL Server
Check it out - make an informed decision
VMware Architecture
Picture courtesy of VMware.com
VMware Clusters
Picture courtesy of VMware.com
Terms and Concepts
ESX and ESXi – the hypervisor and foundation
for VMware products
Physical Host – underlying hardware where
ESX is installed
Virtual Machine (VM) – container inside host
that looks like a physical machine
vCenter Server – centralized management
vSphere Client – GUI
Concepts - Cluster
Cluster – several physical hosts linked together
vMotion – live migration of VM from one host to
another – no loss of connectivity
Distributed Resource Scheduler (DRS) – can
automatically make sure hosts in a cluster have a
balanced workload – uses vMotion
High Availability (HA) – automated restart of VMs
after host failure – several minutes of downtime
Fault Tolerance (FT) – a mirrored copy of a VM on
another host – takes over with no downtime
Consolidated Backup (VCB) – integrates with several
third-party tools to back up a snapshot of the VM
Best Practices - Configuration
Install VMware ESX 4.x
• Alleviates much of the overhead seen in ESX 3
• Use recent hardware
Do not install unnecessary options in O/S
• No graphics, office productivity, instant messaging,
video programs, COM ports, DVD, etc.
• Be careful with automatic updates
Install VMware Tools in VM
One SQL Server per VM – more flexibility
Use vSphere Cloning Technologies
• Clone to Template – create a template from VM
Best Practices - CPU
Concepts
• Reservation – amount of CPU guaranteed
• Limit – limits the amount of CPU
• Shares – sets priority for this VM
SQL Server is typically not a CPU bound application
• Use only the vCPUs required – know your workload
• If not known, start with 1 or 2 and increase later
• vSphere attempts to co-schedule CPUs
• If you have 4 vCPU, 4 Physical CPUs need to be
available to start processing
Check with VMware docs for Hyperthreading
• Enabled on Intel Xeon 5500 processors
Best Practices - CPU
Disable Power Savings Mode on Host
Thanks to Paul Randal for mentioning CPU-Z
Rated Speed
Actual Speed
Best Practices - Memory
Concepts
• Reservation – guarantees amount of RAM
• Limit – limits amount of RAM
• Shares – priority of getting RAM
• Ballooning – unused memory that was given back for
use on other VMs
• Swapping – memory (could be active) given back
forcibly for use on other VMs
• Shared Memory – exact same data from each VM will
only be stored once
– SQL Server binaries
– Not the buffer cache, plan cache, etc
Best Practices - Memory
Set SQL Server Min / Max based on reservation
• Min should be < reservation
• Set Max less than Limit if one is used
• Leave room for O/S and other things
Be careful about overcommitting in production
• Can be more aggressive in dev/test/stage
Use hardware assisted memory virtualization
• AMD – Rapid Virtualization Indexing (RVI)
• Intel – Extended Page Tables (EPT)
• Databases have a large amount of page table activity and
will benefit from this
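The Min / Max guidance above can be sketched as a small calculation. This is only an illustration: the function name, the 2048 MB O/S headroom, and the 90% factor used to keep Min below the reservation are assumptions, not values from the slides.

```python
def suggest_sql_memory_mb(configured_mb, reservation_mb, limit_mb=None,
                          os_headroom_mb=2048):
    """Suggest (min_server_memory_mb, max_server_memory_mb) for a VM.

    Hypothetical helper: the headroom and the 0.9 factor are
    illustrative assumptions, not recommendations from the slides.
    """
    # Max: stay below the VM limit if one is used, otherwise below the
    # configured VM memory, leaving room for the O/S and other things.
    ceiling = configured_mb if limit_mb is None else min(configured_mb, limit_mb)
    max_mb = ceiling - os_headroom_mb
    if max_mb <= 0:
        raise ValueError("O/S headroom exceeds the memory available to the VM")
    # Min: strictly below the reservation, and never above Max.
    min_mb = min(int(reservation_mb * 0.9), max_mb)
    return min_mb, max_mb


# A 16 GB VM with an 8 GB reservation and a 12 GB limit.
print(suggest_sql_memory_mb(16384, 8192, limit_mb=12288))
```

The resulting values would then be applied as the SQL Server min and max server memory settings.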
Large Memory Pages
Use Large Memory Pages in SQL Server if needed
• http://www.vmware.com/files/pdf/large_pg_performance.pdf
• Shows how to configure it in Guest O/S and ESX Server
Larger page size for memory
• Windows default is 4KB, can go up to 2MB
• Faster memory performance – pages are not subject to
swapping/replacement – far fewer page table lookups
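To make the page size difference concrete, the arithmetic below counts how many pages are needed to map the same amount of memory at 4KB and at 2MB. The 8 GB buffer pool size is an assumed example, not a figure from the slides.

```python
KB, MB, GB = 1024, 1024 ** 2, 1024 ** 3


def page_count(memory_bytes, page_bytes):
    """Number of pages needed to map memory_bytes (ceiling division)."""
    return -(-memory_bytes // page_bytes)


buffer_pool = 8 * GB  # assumed example size
small_pages = page_count(buffer_pool, 4 * KB)
large_pages = page_count(buffer_pool, 2 * MB)

print(small_pages)                 # 2097152
print(large_pages)                 # 4096
print(small_pages // large_pages)  # 512
```

Each 2MB page replaces 512 4KB pages, so far fewer translations are needed to cover the same memory.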
Best Practices - Network
VMware can handle > 30 Gbps
SQL Server does not use much network bandwidth
• Typically well below 100 MB / sec
Make sure all components in network stack can
handle the same load
• If you have 1 Gb NICs, ensure cables, switches, and
other components are up to that level too
Use VMXNET paravirtualized network adapter
• Installed into guest O/S capable of 1Gbps
• Minimizes overhead between VM and Host
• Requires VMware Tools
• Supports jumbo frames
Best Practices - Storage
Datastore – access point to storage
Storage issues are usually related to configuration
and not capabilities of ESX
Ensure network is configured properly for SAN/NFS
Create dedicated datastores for intensive instances
• More flexibility
• Bad SAN planning cannot be fixed by datastores
• Isolate data and log activity
Best Practices - Storage
Use VMFS for single instance databases
• Like NTFS for Windows
• Raw Device Mapping (RDM) does not perform better
• RDM may be better in MS cluster environment
Make sure VMFS is properly aligned
• File system misalignment creates issues with I/O-intensive
applications such as databases
• vCenter will automatically align
• Work closely with manufacturers to manually align
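An alignment check is simple modular arithmetic on the partition starting offset. The sketch below assumes a 64 KB boundary purely as an example; the correct boundary depends on the storage array, so confirm it with the manufacturer.

```python
def is_aligned(start_offset_bytes, boundary_bytes=64 * 1024):
    """True if the partition starts on a multiple of the boundary.

    The 64 KB default is an assumption for illustration only.
    """
    return start_offset_bytes % boundary_bytes == 0


# Legacy default: partition starts at sector 63 (63 * 512 bytes).
print(is_aligned(63 * 512))     # False
# Partition starting at sector 128 (64 KB).
print(is_aligned(128 * 512))    # True
# Partition starting at 1 MB.
print(is_aligned(1024 * 1024))  # True
```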
Monitoring - vSphere
Get access to vSphere client
• Need a user account
• http://<vcenter machine> - provides download link
vSphere – Host Summary
vSphere – Host Performance
vSphere – VM Summary
vSphere – VM Performance
Monitoring - CPU
Primary Metric – VM Ready Time
Secondary Metrics – VM CPU Utilization, Host CPU
Utilization
Rules
• If VM Ready Time > 10-20%
– If Host CPU Utilization is high => Need more CPU resources on Host
– If Host CPU Utilization ok => VM is limited, give more CPU resources
• If VM CPU Utilization high (sustained over 80%)
– May not be a problem now if no ready time
– Could be a problem soon for this VM
• If Host CPU Utilization high (sustained over 80%)
– May not be a problem now if no ready time on any VM
– Could be a problem soon for all VMs on this host
– Balance VM resources better
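The CPU rules above can be sketched as a small decision function. The function name and message strings are hypothetical; the defaults use the low end of the 10-20% ready time range and the 80% utilization figure from the rules.

```python
def diagnose_cpu(vm_ready_pct, vm_cpu_pct, host_cpu_pct,
                 ready_threshold=10.0, busy_threshold=80.0):
    """Apply the CPU rules; all inputs are sustained percentages."""
    findings = []
    # Primary metric: VM ready time.
    if vm_ready_pct > ready_threshold:
        if host_cpu_pct > busy_threshold:
            findings.append("Host needs more CPU resources")
        else:
            findings.append("VM is limited: give it more CPU resources")
    # Secondary metrics: early warnings even when ready time is low.
    if vm_cpu_pct > busy_threshold:
        findings.append("VM CPU high: could be a problem soon for this VM")
    if host_cpu_pct > busy_threshold:
        findings.append("Host CPU high: balance VM resources better")
    return findings


print(diagnose_cpu(vm_ready_pct=15, vm_cpu_pct=50, host_cpu_pct=40))
```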
Monitoring - Memory
Primary Metric – Swapping
Secondary Metrics – Ballooning, VM Memory Utilization,
Host Memory Utilization
Rules
• If Any Swapping is occurring
– Host needs more memory because it cannot satisfy current demands
– Lessen demands for memory – lower reservations where possible
• Excessive Ballooning
– May be ok for now, but could be a pending issue
• VM Memory Utilization High
– May not be a problem now unless Guest O/S swapping is occurring
– If VM is limited, may want to increase memory this VM can get
• If Host Memory Utilization High
– May not be a problem now if no swapping
– Could be a problem soon for all VMs on this host
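The memory rules above follow the same pattern and can be sketched as below. The function name, the 1024 MB ballooning threshold, and the 80% utilization threshold are assumptions chosen for illustration; the slides do not quantify "excessive" or "high".

```python
def diagnose_memory(swap_kbps, balloon_mb, vm_mem_pct, host_mem_pct,
                    balloon_threshold_mb=1024, busy_threshold=80.0):
    """Apply the memory rules; thresholds are illustrative assumptions."""
    findings = []
    # Primary metric: any swapping at all is a problem.
    if swap_kbps > 0:
        findings.append("Swapping: host cannot satisfy memory demand")
    # Secondary metrics.
    if balloon_mb > balloon_threshold_mb:
        findings.append("Excessive ballooning: possible pending issue")
    if vm_mem_pct > busy_threshold:
        findings.append("VM memory high: check for guest O/S swapping")
    if host_mem_pct > busy_threshold:
        findings.append("Host memory high: could affect all VMs soon")
    return findings


print(diagnose_memory(swap_kbps=12, balloon_mb=0, vm_mem_pct=60, host_mem_pct=90))
```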
Monitoring - Network
Primary Metric – Dropped Receive Packets, Dropped
Transmit Packets
Secondary Metrics – Send Rate, Received Rate
Rules
• If any packets are being dropped
– Look for errors on the Host’s NIC
– See if one NIC is getting all traffic
– Understand which VM is causing the most traffic and reduce it
• If Network Rate is getting close to maximum for hardware
– Understand which VM is causing load
– May need to get better network hardware
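A minimal sketch of the network rules above. The function name is hypothetical, and treating 80% of link speed as "close to maximum" is an assumption.

```python
def diagnose_network(dropped_rx, dropped_tx, rate_mbps, link_mbps,
                     near_fraction=0.8):
    """Apply the network rules; near_fraction is an assumed cutoff."""
    findings = []
    # Primary metric: any dropped receive or transmit packets.
    if dropped_rx > 0 or dropped_tx > 0:
        findings.append("Dropped packets: check host NIC errors and traffic balance")
    # Secondary metric: combined send/receive rate near the hardware maximum.
    if rate_mbps >= near_fraction * link_mbps:
        findings.append("Near link maximum: find the VM causing the load")
    return findings


print(diagnose_network(dropped_rx=5, dropped_tx=0, rate_mbps=100, link_mbps=1000))
```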
Monitoring - Storage
Primary Metrics – Host Highest Disk Latency, Host Device
Latency (by device), VM Command Latency (for all VMs)
Secondary Metrics – Disk Read Rate, Disk Write Rate
Rules
• If Host Latency >= 30 ms
– Review Disk Read / Write rates
– If Close to Storage Capacity - Overloaded Storage
– Otherwise - Slow Storage
• If VM Command Latency >= 30ms only for your VM
– Tune Disk I/O intensive processes on database
– Check whether Memory / CPU issues are causing the I/O problems
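The storage rules above can be sketched as follows. The 30 ms threshold comes from the rules; the function name, the VM names, and the 80% "close to capacity" cutoff are assumptions for illustration.

```python
def diagnose_storage(host_latency_ms, vm_latency_ms, my_vm,
                     disk_rate_mbs, storage_max_mbs,
                     latency_threshold_ms=30.0, near_fraction=0.8):
    """Apply the storage rules.

    vm_latency_ms maps VM name -> command latency in ms for all VMs.
    near_fraction is an assumed cutoff for "close to storage capacity".
    """
    findings = []
    if host_latency_ms >= latency_threshold_ms:
        # High host latency: read/write rates separate overloaded from slow.
        if disk_rate_mbs >= near_fraction * storage_max_mbs:
            findings.append("Overloaded storage")
        else:
            findings.append("Slow storage")
    # High command latency on this VM only points at its own workload.
    slow_vms = [vm for vm, ms in vm_latency_ms.items()
                if ms >= latency_threshold_ms]
    if slow_vms == [my_vm]:
        findings.append("Only this VM is slow: tune disk I/O on the database")
    return findings


latencies = {"sql1": 45, "web1": 5}  # hypothetical VM names
print(diagnose_storage(10, latencies, "sql1", 100, 500))
```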
Takeaway Points
SQL Server on VMware works great
• Go slow at first
• Be careful when virtualizing critical production
Configure Multiple Hosts in a DRS cluster
• Let VMware automatically balance load
When Installing SQL Server
• Use the same settings as you would on a physical machine
Monitoring
• Use Primary Metrics discussed – there are many more
• Use Ignite for SQL Server on VMware
• Full correlated view that vSphere cannot match for DBAs
IgniteVM
http://174.143.151.145:8133
• Username / Password – demo/demo
Layers and Annotations
Drilldown to Problem
Confio Software
Award Winning Performance Tools
Ignite for SQL Server, Oracle, DB2, Sybase
IgniteVM – Provides a view into the VMware “black
box” to see what you can’t see
• Currently looking for early release testers
Provides Answers for
• What changed recently that affected end users
• What layer is causing the problem
• Who should fix the problem, and how
Download free trial at
www.confio.com