Effective Virtual CPU Configuration with QEMU and libvirt

Kashyap Chamarthy <[email protected]>

Open Source Summit, Edinburgh, 2018

1 / 38
Timeline of recent CPU flaws, 2018 (a)

Jan 03 • Spectre v1: Bounds Check Bypass

Jan 03 • Spectre v2: Branch Target Injection

Jan 03 • Meltdown: Rogue Data Cache Load

May 21 • Spectre-NG: Speculative Store Bypass

Jun 21 • TLBleed: Side-channel attack over shared TLBs
2 / 38
Timeline of recent CPU flaws, 2018 (b)

Jun 29 • NetSpectre: Side-channel attack over local network

Jul 10 • Spectre-NG: Bounds Check Bypass Store

Aug 14 • L1TF: "L1 Terminal Fault"

... • ?

3 / 38
What this talk is not about

Out of scope:
Internals of various side-channel attacks
How to exploit Meltdown & Spectre variants
Details of performance implications

Related talks in the ‘References’ section

4 / 38
KVM-based virtualization components

[Diagram, built up layer by layer from bottom to top:
Linux with KVM
QEMU processes, one per guest (VM1 with Disk1, VM2 with Disk2), each talking to KVM via ioctl()
libvirtd, connected to each QEMU process over QMP
OpenStack, et al., driving libvirtd through a virt driver; libguestfs (guestfish), which runs its own custom appliance]


5 / 38
QEMU and KVM

[Diagram, top to bottom:
QEMU: guest RAM; emulated devices (e1000e, NVMe, Virtio-SCSI); vCPUs (vCPU-1, vCPU-2)
ioctl() → /dev/kvm
Host kernel [kvm.ko; kvm-intel.ko]: VMX modes: guest↔host (VMLAUNCH, ...); emulation: CPUID, irqchip
Hardware: Intel VMX extensions
Callout: To inspect, use Linux tools: top, kill, ...]


6 / 38
Hardware-based virtualization with KVM

[Flowchart of the vCPU run loop:
1. QEMU issues ioctl(KVM_RUN)
2. KVM prepares to enter CPU Guest Mode (VMENTER)
3. Guest code executes natively in Guest Mode (CPU with VMX)
4. VMEXIT: emulate in-kernel?
   Yes: perform in-kernel emulation, then back to step 2
   No: QEMU emulates hardware, then back to step 1]
7 / 38
Part I
Interfaces to configure vCPUs

8 / 38
x86: QEMU’s default CPU models (a)

The default models (qemu32, qemu64) work on any host CPU

But they are dreadful choices!


No AES / AES-NI: critical for TLS performance
No RDRAND: important for entropy
No PCID: performance- & security-critical (thanks, Meltdown)

9 / 38
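To see whether a guest actually received the three features named above, one option is to look at the "flags" line of /proc/cpuinfo inside the guest. The following is a minimal sketch, assuming an x86 Linux guest; the helper name and the shortened sample text are hypothetical and only serve as an illustration.

```python
# Flag names as they appear in the "flags" line of /proc/cpuinfo on x86 Linux.
WANTED_FLAGS = ("aes", "rdrand", "pcid")


def missing_cpu_flags(cpuinfo_text, wanted=WANTED_FLAGS):
    """Return the wanted flags that the first CPU in cpuinfo_text lacks."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            present = set(value.split())
            return [flag for flag in wanted if flag not in present]
    # No "flags" line found (e.g. not an x86 /proc/cpuinfo): report everything.
    return list(wanted)


# Hypothetical, heavily shortened sample; a real flags line is much longer.
SAMPLE_CPUINFO = (
    "processor\t: 0\n"
    "model name\t: QEMU Virtual CPU\n"
    "flags\t\t: fpu de pse tsc msr pae sse sse2 syscall nx lm\n"
)

if __name__ == "__main__":
    print(missing_cpu_flags(SAMPLE_CPUINFO))
```

Inside a real guest, the same helper could be fed the contents of /proc/cpuinfo, for example `missing_cpu_flags(open("/proc/cpuinfo").read())`.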
x86: QEMU’s default CPU models (b)

On a guest running with qemu64:

$ cd /sys/devices/system/cpu/vulnerabilities/
$ grep . *
l1tf:Mitigation: PTE Inversion
meltdown:Mitigation: PTI
spec_store_bypass:Vulnerable
spectre_v1:Mitigation: __user pointer sanitization
spectre_v2:Mitigation: Full generic retpoline

(spec_store_bypass is Spectre-NG: the guest is still vulnerable to it)

Always specify an explicit CPU model; or use libvirt’s host-model
10 / 38
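The grep output above is easy to check mechanically. As a rough sketch, the hypothetical helper below takes text in the same "name:status" shape and returns the entries whose status starts with "Vulnerable"; the sample text is the output shown on the slide.

```python
def vulnerable_entries(grep_output):
    """Return the entry names whose status starts with 'Vulnerable'."""
    vulnerable = []
    for line in grep_output.splitlines():
        # Split only at the first colon: the status itself may contain colons.
        name, sep, status = line.partition(":")
        if sep and status.strip().startswith("Vulnerable"):
            vulnerable.append(name)
    return vulnerable


# Output of `grep . *` as shown on the slide.
SLIDE_OUTPUT = """\
l1tf:Mitigation: PTE Inversion
meltdown:Mitigation: PTI
spec_store_bypass:Vulnerable
spectre_v1:Mitigation: __user pointer sanitization
spectre_v2:Mitigation: Full generic retpoline
"""

if __name__ == "__main__":
    print(vulnerable_entries(SLIDE_OUTPUT))
```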
Defaults of other architectures

AArch64: Doesn’t provide a default guest CPU; the default CPU depends on the machine type

$ qemu-system-aarch64 -machine virt -cpu help

ppc64: host for KVM; power8 for TCG (pure emulation)

s390x: host for KVM; qemu for TCG

11 / 38
Configure CPU on the command-line

On x86, by default, the qemu64 model is used:

$ qemu-system-x86_64 [...]

Specify a particular named CPU model:

$ qemu-system-x86_64 -cpu IvyBridge-IBRS [...]

12 / 38
Control guest CPU features

Enable or disable specific features for a vCPU model:

$ qemu-system-x86_64 \
-cpu Skylake-Client-IBRS,vmx=off,pcid=on [...]

Here Skylake-Client-IBRS is the named CPU model; vmx=off and pcid=on are granular CPU flags.

For a list of supported vCPU models, refer to:

$ qemu-system-x86_64 -cpu help
Or libvirt’s: ‘virsh cpu-models x86_64’

13 / 38
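When a management script assembles the QEMU command line, the -cpu value is just the model name followed by comma-separated flag=on|off pairs. A minimal sketch of that assembly follows; the function name is made up for this example, and the model and flags are the ones from the slide.

```python
def cpu_argument(model, features=None):
    """Build the value for QEMU's -cpu option: MODEL[,flag=on|off,...]."""
    parts = [model]
    for name, enabled in (features or {}).items():
        parts.append("{}={}".format(name, "on" if enabled else "off"))
    # No spaces: the whole value must reach QEMU as a single argument.
    return ",".join(parts)


if __name__ == "__main__":
    value = cpu_argument("Skylake-Client-IBRS", {"vmx": False, "pcid": True})
    print(["qemu-system-x86_64", "-cpu", value])
```

Passing the result as one element of an argument list (rather than pasting it into a shell string) avoids quoting problems.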
QEMU’s CPU-related run-time interfaces

Granular details about vCPU models, their capabilities & more:


query-cpu-definitions
query-cpu-model-expansion
query-hotpluggable-cpus
query-cpus-fast; device_{add,del}

libvirtd caches some of this data under


/var/cache/libvirt/qemu/capabilities/
14 / 38
Run-time: Probe QEMU for CPU model specifics

[Upstream-QEMU]$ ./qmp-shell -v -p /tmp/qmp-sock


(QEMU) query-cpu-definitions
...
"return": [
{ "typename": "Westmere-IBRS-x86_64-cpu",
"unavailable-features": [],
"migration-safe": true,
"static": false,
"name": "Westmere-IBRS" }]
... # Snip other CPU variants
15 / 38
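A reply like the one above can be filtered to find models that the current host can run. The sketch below assumes the reply has already been received and decoded from JSON; it treats an empty "unavailable-features" list as "runnable here" and a missing field as "unknown". The helper name and the second sample entry are hypothetical.

```python
import json


def runnable_models(reply, migration_safe_only=True):
    """Names of CPU models reported with no unavailable features."""
    names = []
    for model in reply["return"]:
        unavailable = model.get("unavailable-features")
        # None: no runnability information; non-empty: features are missing.
        if unavailable is None or unavailable:
            continue
        if migration_safe_only and not model.get("migration-safe", False):
            continue
        names.append(model["name"])
    return sorted(names)


# First entry is from the slide; the second is a hypothetical entry added
# only to show a model being filtered out.
SAMPLE_REPLY = json.loads("""
{"return": [
  {"typename": "Westmere-IBRS-x86_64-cpu",
   "unavailable-features": [],
   "migration-safe": true,
   "static": false,
   "name": "Westmere-IBRS"},
  {"typename": "Skylake-Server-x86_64-cpu",
   "unavailable-features": ["avx512f"],
   "migration-safe": true,
   "static": false,
   "name": "Skylake-Server"}
]}
""")

if __name__ == "__main__":
    print(runnable_models(SAMPLE_REPLY))
```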
Part II
CPU modes, models and flags

16 / 38
Host passthrough

Exposes the host CPU model, features, etc. as-is to the VM


$ qemu-system-x86_64 -cpu host [...]

Caveats:
No guarantee of a stable CPU for the guest
Live migration is a no go with mixed host CPUs

Most performant; ideal if live migration is not required


17 / 38
Host passthrough – when else to use it?

Data Center (Intel host CPUs)

[Diagram: eight hosts, every one a Broadwell]

Along with identical CPUs, identical kernel and microcode are a must for VM live migration!
18 / 38
QEMU’s named CPU models (a)
Virtual CPUs typically model physical CPUs

Add or remove CPU features:

$ qemu-system-x86_64 -cpu Broadwell-IBRS,\
vme=on,f16c=on,rdrand=on,\
tsc_adjust=on,xsaveopt=on,\
hypervisor=on,arat=off,\
pdpe1gb=on,abm=on [...]

More flexible in live migration than ‘host passthrough’


19 / 38
QEMU’s named CPU models (b)
QEMU is built with a number of pre-defined models:
$ qemu-system-x86_64 -cpu help
Available CPUs:
...
x86 Broadwell-IBRS Intel Core Processor (Broadwell, IBRS)
...
x86 EPYC AMD EPYC Processor
x86 EPYC-IBPB AMD EPYC Processor (with IBPB)
x86 Haswell Intel Core Processor (Haswell)
...
Recognized CPUID flags:
amd-ssbd apic arat arch-capabilities avx avx2 avx512-4fmaps
...
20 / 38
‘host-model’ – a libvirt abstraction

Tackles a few problems:


Maximum possible CPU features from the host
Live migration compatibility—with caveats
Auto-adds critical guest CPU flags (e.g. spec-ctrl), provided microcode, kernel, QEMU & libvirt are updated!

Aims for the best of ‘host passthrough’ and named CPU models

21 / 38
‘host-model’ – example libvirt config

From a libvirt guest definition:

<cpu mode='host-model'>
  <feature policy='require' name='vmx'/>
  <feature policy='disable' name='pdpe1gb'/>
  ...
</cpu>

libvirt will translate it into a suitable CPU model, based on: /usr/share/libvirt/cpu_map/*.xml

22 / 38
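If the guest definition is generated by a tool, a <cpu> element like the one above can be built with a standard XML library instead of string pasting. This is only a sketch using Python's standard library; the function name is hypothetical, and the element and attribute names are the ones shown on the slide.

```python
import xml.etree.ElementTree as ET


def host_model_cpu_xml(require=(), disable=()):
    """Return a <cpu mode='host-model'> element with feature policies."""
    cpu = ET.Element("cpu", {"mode": "host-model"})
    for name in require:
        ET.SubElement(cpu, "feature", {"policy": "require", "name": name})
    for name in disable:
        ET.SubElement(cpu, "feature", {"policy": "disable", "name": name})
    return ET.tostring(cpu, encoding="unicode")


if __name__ == "__main__":
    print(host_model_cpu_xml(require=["vmx"], disable=["pdpe1gb"]))
```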
‘host-model’ and live migration

As done by libvirt:
Source vCPU definition is transferred as-is to the target
On target: Migrated guest sees the same vCPU model
But: when the guest ‘cold boots’ on the target, it may pick up extra CPU features, which prevents migrating back to the source

Use host-model if live migration in both directions is not a requirement

23 / 38
OpenStack Nova and CPU models
Provides relevant config attributes:
cpu_mode
Can be: custom, host-passthrough, or host-model
cpu_model & cpu_model_extra_flags
Refer to libvirt’s /usr/share/libvirt/cpu_map/*.xml
Or QEMU’s: qemu-system-x86_64 -cpu help

Details in documentation of the above config attributes:
https://docs.openstack.org/nova/rocky/configuration/config.html

24 / 38
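As an illustration of how the three attributes fit together, the sketch below writes them as an INI section the way they might appear in nova.conf. It assumes the options live in the [libvirt] group and that cpu_model_extra_flags takes a comma-separated list; the helper name and the chosen model and flags are examples, not recommendations from the talk.

```python
import configparser
import io


def libvirt_cpu_section(cpu_model, extra_flags):
    """Render a [libvirt] section selecting a custom CPU model plus flags."""
    config = configparser.ConfigParser()
    config["libvirt"] = {
        "cpu_mode": "custom",
        "cpu_model": cpu_model,
        "cpu_model_extra_flags": ", ".join(extra_flags),
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    print(libvirt_cpu_section("IvyBridge-IBRS", ["pcid", "ssbd"]))
```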
Part III
Choosing CPU models & features

25 / 38
Finding compatible CPU models

Data Center (Intel host CPUs)

Haswell Westmere IvyBridge SandyBridge

Nehalem Broadwell Westmere Nehalem-IBRS

26 / 38
Finding compatible CPU models

Problem: Determine a compatible model among CPU variants

Enter libvirt’s APIs:

compareCPU() and baselineCPU()
compareHypervisorCPU() and baselineHypervisorCPU() (new in libvirt 4.4.0)

27 / 38
Intersection between these two host CPUs?
$ cat Multiple-Host-CPUs.xml
<cpu mode='custom' match='exact'>
  <model fallback='forbid'>Haswell-noTSX-IBRS</model>
  <vendor>Intel</vendor>
  <feature policy='require' name='vmx'/>
  <feature policy='require' name='rdrand'/>
</cpu>
<!-- Second CPU -->
<cpu mode='custom' match='exact'>
  <model fallback='forbid'>Skylake-Client-IBRS</model>
  <vendor>Intel</vendor>
  <feature policy='disable' name='pdpe1gb'/>
  <feature policy='disable' name='pcid'/>
</cpu>
28 / 38
Use baselineHypervisorCPU() to determine it

$ virsh hypervisor-cpu-baseline Multiple-Host-CPUs.xml

<cpu mode='custom' match='exact'>
  <model fallback='forbid'>Haswell-noTSX-IBRS</model>
  <vendor>Intel</vendor>
  <feature policy='require' name='rdrand'/>
  <feature policy='disable' name='pcid'/>
</cpu>

This is the intersection between our Haswell & Skylake variants: a “baseline” model that permits live migration

29 / 38
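The idea behind a baseline can be pictured as a set intersection: only features present on every host are safe to expose. The sketch below is a deliberately simplified illustration of that idea, not libvirt's algorithm (which also selects a named model and feature policies); the host names and feature sets are invented.

```python
from functools import reduce


def baseline_features(hosts):
    """Features common to every host: a plain set intersection."""
    feature_sets = [set(features) for features in hosts.values()]
    if not feature_sets:
        return []
    return sorted(reduce(set.intersection, feature_sets))


# Hypothetical hosts and feature lists, for illustration only.
HOSTS = {
    "host-a": {"aes", "avx", "avx2", "rdrand", "pcid"},
    "host-b": {"aes", "avx", "rdrand"},
    "host-c": {"aes", "avx", "avx2", "rdrand"},
}

if __name__ == "__main__":
    print(baseline_features(HOSTS))
```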
x86: QEMU’s “machine types”

Two main purposes:


Emulate different chipsets (and related devices)—e.g. Intel’s
i440FX (a.k.a ‘pc’) and Q35
Provide stable guest ABI—virtual hardware remains the same,
regardless of changes in host software or hardware

30 / 38
x86: QEMU’s “machine types” – versioned
$ qemu-system-x86_64 -machine help
...
pc Standard PC (i440FX + PIIX, 1996) (alias of pc-i440fx-3.0)
pc-i440fx-3.0 Standard PC (i440FX + PIIX, 1996) (default)
pc-i440fx-2.9 Standard PC (i440FX + PIIX, 1996)
...
q35 Standard PC (Q35 + ICH9, 2009) (alias of pc-q35-3.0)
pc-q35-3.0 Standard PC (Q35 + ICH9, 2009)
pc-q35-2.9 Standard PC (Q35 + ICH9, 2009)
pc-q35-2.8 Standard PC (Q35 + ICH9, 2009)
...

Callouts: the i440FX (‘pc’) family is labelled “Traditional”; the Q35 family is labelled “Recommended”

Versioned machine types provide stable guest ABI


31 / 38
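Because ‘pc’ and ‘q35’ are aliases, it can help to see which versioned machine type each one currently points at. The following sketch extracts the alias lines from text in the format shown above; the helper name and regular expression are assumptions tailored to that output format.

```python
import re

# Matches lines such as:
#   pc  Standard PC (i440FX + PIIX, 1996) (alias of pc-i440fx-3.0)
ALIAS_RE = re.compile(r"^(\S+)\s+.*\(alias of ([^)]+)\)\s*$")


def machine_aliases(help_text):
    """Map each alias (e.g. 'pc') to the versioned machine type it names."""
    aliases = {}
    for line in help_text.splitlines():
        match = ALIAS_RE.match(line)
        if match:
            aliases[match.group(1)] = match.group(2)
    return aliases


# Excerpt of the `-machine help` output shown on the slide.
SLIDE_HELP = """\
pc Standard PC (i440FX + PIIX, 1996) (alias of pc-i440fx-3.0)
pc-i440fx-3.0 Standard PC (i440FX + PIIX, 1996) (default)
pc-i440fx-2.9 Standard PC (i440FX + PIIX, 1996)
q35 Standard PC (Q35 + ICH9, 2009) (alias of pc-q35-3.0)
pc-q35-3.0 Standard PC (Q35 + ICH9, 2009)
"""

if __name__ == "__main__":
    print(machine_aliases(SLIDE_HELP))
```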
Machine types and CPU features
Changing machine types is guest-visible

After a QEMU upgrade, when using libvirt:

Need an explicit request for machine type upgrade
The guest needs a ‘cold-reboot’ (i.e. an explicit stop + start) to allow QEMU to re-exec()

Change machine types only after evaluating the guest workload: CPU features & devices can differ
32 / 38
x86: Recommended guest CPU models

Before configuring guest CPUs:


Update microcode, host & guest kernels; refer to /sys/devices/system/cpu/vulnerabilities/
Update libvirt & QEMU, and explicitly update guest CPUs to patched variants (e.g. the *-IBRS models)
Cold-reboot the guests to pick up new CPUID bits

Guidance: qemu/docs/qemu-cpu-models.texi (Thanks, Daniel Berrangé)
33 / 38
x86: Important CPU flags

To protect guests against multiple Spectre & Meltdown variants:

Intel: ssbd, pcid, spec-ctrl
AMD: virt-ssbd, amd-ssbd, amd-no-ssb, ibpb
Some are built into QEMU’s *-IBRS & *-IBPB CPU models
Details:
qemu/docs/qemu-cpu-models.texi
https://www.qemu.org/2018/02/14/qemu-2-11-1-and-spectre-update

34 / 38
Future ‘expectations’ from applications?

“QEMU and libvirt took the joint decision to stop adding new named CPU models when CPU vulnerabilities are discovered from this point forwards. Applications / users would be expected to turn on CPU features explicitly as needed and are considered broken if they don’t provide this functionality.”
From “CPU model versioning separate from machine type versioning”, on the ‘qemu-devel’ mailing list
35 / 38
References

CPU model configuration for QEMU/KVM x86 hosts, by Daniel Berrangé
https://www.berrange.com/posts/2018/06/29/cpu-model-configuration-for-qemu-kvm-on-x86-hosts

Mitigating Spectre and Meltdown (and L1TF), by David Woodhouse
https://kernel-recipes.org/en/2018/talks/mitigating-spectre-and-meltdown-vulnerabilities/

Exploiting modern microarchitectures: Meltdown, Spectre, and other hardware attacks, by Jon Masters
https://archive.fosdem.org/2018/schedule/event/closing_keynote

KVM and CPU feature enablement, by Eduardo Habkost
https://wiki.qemu.org/images/c/c8/Cpu-models-and-libvirt-devconf-2014.pdf

36 / 38
Questions?
E-mail: [email protected]
IRC: kashyap – Freenode & OFTC

37 / 38
Related talks at the KVM Forum

(1) Security in QEMU: How Virtual Machines Provide Isolation, by Stefan Hajnoczi
– Happening now, but it’s being recorded

(2) What Did Spectre and Meltdown Teach about CPU Models?, by Paolo Bonzini
– 26-OCT, Wednesday: 11:30 – 12:00

38 / 38
