Oracle Solaris ZFS
Objectives
Understand what makes ZFS unique
Know what facilities ZFS provides
Learn how to administer ZFS
Agenda
Introduction to ZFS
ZFS Setup
ZFS Components
ZFS Storage Pool and File System
ZFS Properties
Introduction to ZFS
First-of-its-kind 128-bit file system
Acronym for Zettabyte File System, or simply ZFS
Storage capacity of 256 quadrillion zettabytes
Directories with up to 256 trillion entries
No limit on the number of file systems or files
Dynamic metadata allocation, e.g. no inode pre-allocation
Data integrity management using 256-bit checksums
It is a(n):
  revolution over traditional file systems
  fundamentally new approach to data and volume management
  transactional file system with self-healing capabilities
  design for robustness, scalability, and easy administration
  architecture with a storage pool of heterogeneous devices
Setup Requirements
A SPARC or x86/x64 machine running Solaris 10 6/06 or newer
Minimum disk size of 128 MB for the ZFS environment
Minimum disk space of 64 MB required for a storage pool
Recommended memory of at least 1 GB
ZFS Components
ZFS components comprise virtual devices such as:
  Whole disk (recommended), disk slice, or file
Dataset names must begin with an alphanumeric character
Dataset names must not contain a percent symbol (%)
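The two naming rules above can be sketched as a small check. This is a minimal illustration only: the function name `is_valid_dataset_component` is hypothetical, and it covers just the two rules listed on this slide, not every rule ZFS itself enforces.

```python
def is_valid_dataset_component(name: str) -> bool:
    """Check one dataset name component against the two rules listed above."""
    if not name:
        return False
    # Rule 1: must begin with an alphanumeric character.
    # Restrict to ASCII so non-ASCII letters are not accepted by isalnum().
    first = name[0]
    if not (first.isascii() and first.isalnum()):
        return False
    # Rule 2: must not contain a percent symbol.
    return "%" not in name


for candidate in ["zfsN", "1data", "_hidden", "bad%name"]:
    print(candidate, is_valid_dataset_component(candidate))
```

A check like this could run before a name is handed to zfs create, so that an invalid name is rejected early.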
A zpool can contain heterogeneous storage devices, with no limitation
A zpool can be dynamically expanded without reconfiguration
Multiple ZFS file systems or datasets can be created in a zpool
The file system properties quota and reservation can be set
ZFS Properties
ZFS properties are of two types: native and user-defined
Native properties control file system behavior; user-defined properties do not
Native properties can be read-only or settable
Many settable properties are inherited from the parent
All settable properties have an associated source, which is one of:
  default (neither set locally nor inherited)
  local (explicitly set on the dataset)
  inherited from <dataset-name> (specifies the dataset source)
Native read-only properties include:
  available, compressratio, creation, mounted, origin, type, used
Settable properties include (see Appendix-I for the complete list):
  aclinherit, aclmode, canmount, checksum, compression, dedup, devices, encryption, mountpoint, quota, readonly, recordsize, reservation, setuid, sharenfs, snapdir, volsize, zoned
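The three property sources above (local, inherited, default) can be sketched as a lookup that walks up the dataset hierarchy. This is a toy model, not how ZFS is implemented: the function `resolve`, the `DEFAULTS` table, and the example settings are all hypothetical, and the model applies only to properties that are inheritable.

```python
from typing import Dict, Tuple

# Illustrative defaults for two inheritable settable properties (assumption).
DEFAULTS = {"compression": "off", "readonly": "off"}


def resolve(dataset: str, prop: str,
            local: Dict[str, Dict[str, str]]) -> Tuple[str, str]:
    """Return (value, source) for prop on dataset.

    `local` maps a dataset name to the properties explicitly set on it.
    """
    # 1. Explicitly set on the dataset itself.
    if prop in local.get(dataset, {}):
        return local[dataset][prop], "local"
    # 2. Walk up the parents: poolN/a/b -> poolN/a -> poolN.
    parent = dataset
    while "/" in parent:
        parent = parent.rsplit("/", 1)[0]
        if prop in local.get(parent, {}):
            return local[parent][prop], f"inherited from {parent}"
    # 3. Neither set locally nor inherited.
    return DEFAULTS[prop], "default"


settings = {"poolN/zfsN": {"compression": "on"}}
print(resolve("poolN/zfsN/child", "compression", settings))
print(resolve("poolN/zfsN", "compression", settings))
print(resolve("poolN", "compression", settings))
```

The three calls return one example of each source: inherited, local, and default.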
General Architecture
[Figure: ZFS layered architecture. User level: file system, device, GUI, and application consumers; JNI and libzfs. Kernel level: interface, ZFS volume, transactional objects, pooled storage, virtual device, configuration, Layered Driver Interface (LDI).]
pkg update creates a new boot environment (BE) automatically
Use a send stream to manage ZFS root properties
Delegation in a Zone
Use zonecfg with add device to make a ZFS volume available in a zone
Creating or modifying a zpool is not allowed within a zone
A privileged user in a zone can modify ZFS properties
  except sharenfs, zoned, quota, and reservation
Data is encrypted before it is stored in a ZFS file system
The encryption property is scoped to the file system and inherited by descendants
The file system owner's key is required to access encrypted data
A wrapping key encrypts the data encryption key
  The wrapping key is stored in a file (as raw or hex) or derived from a passphrase
Snapshot
Use zfs snapshot fs@snapN to create a snapshot
  Takes only one argument, the snapshot name fs@snapN
  Instantly creates a read-only snapshot of fs
fs@snapN is stored in the same zpool as fs
Initially the space is shared by fs and the snapshot
  fs@snapN initially consumes no additional disk space
  fs@snapN grows in size as the active dataset fs changes
Snapshots persist across system reboots
The directory .zfs/snapshot lists all snapshots
Use zfs list -t snapshot to list snapshots
Theoretical maximum N = 2^64
By default, rollback goes to the most recent snapshot
To roll back to an earlier snapshot N, the snapshots taken after it must be destroyed
To roll back, the file system is unmounted and remounted
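Why a new snapshot consumes no extra space, and then grows as the active dataset changes, can be illustrated with a toy copy-on-write model. This is a teaching sketch under simplifying assumptions (one block per file, invented class and method names), not a description of ZFS internals.

```python
class ToyFileSystem:
    """Toy copy-on-write model: a file system is a map of file name -> block id."""

    def __init__(self):
        self.live = {}        # current contents: file name -> block id
        self.snapshots = {}   # snapshot name -> frozen copy of the map
        self._next_block = 0

    def write(self, name):
        # Copy-on-write: every write allocates a new block, never overwrites.
        self.live[name] = self._next_block
        self._next_block += 1

    def snapshot(self, snap):
        # A snapshot only records references to blocks that already exist.
        self.snapshots[snap] = dict(self.live)

    def unique_blocks(self, snap):
        # Blocks referenced by this snapshot and by nothing else:
        # the space that the snapshot alone accounts for.
        held = set(self.snapshots[snap].values())
        others = set(self.live.values())
        for other, blocks in self.snapshots.items():
            if other != snap:
                others |= set(blocks.values())
        return len(held - others)


fs = ToyFileSystem()
fs.write("a")
fs.write("b")
fs.snapshot("snap1")
before = fs.unique_blocks("snap1")  # snapshot shares every block with fs
fs.write("a")                       # fs changes; the old block of "a" stays in snap1
after = fs.unique_blocks("snap1")
print(before, after)  # prints: 0 1
```

Right after creation the snapshot shares all blocks with the file system; once a file is rewritten, the old block is kept alive only by the snapshot.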
Clone
Use zfs clone pool/fs1@snapN pool/fs2 to create a clone
  Takes two arguments, the snapshot name and the new file system
A clone can be created only from a snapshot
Results in a new file system with the contents of the original file system
The new file system is writable
The snapshot cannot be destroyed as long as the clone exists
zfs send creates a stream representation of a snapshot for transfer
Incremental changes between snapshots can be saved
Individual files cannot be restored from a stream; the entire file system must be restored
zfs receive takes a full stream to recreate the entire file system
ZFS snapshot streams can be received with different property values
  Receive a stream with a property value different from the one sent
  Specify at receive time that the original property value is used
  Specify at receive time that a specific file system property is disabled
Delegated Administration
Refined permissions can be granted to a specific user, group, or everyone
ZFS supports two types of delegated permissions
  Individual permissions
    zfs allow satya create,destroy,mount,snapshot zfsN
  Permission sets
    zfs allow mystaff @myset zfsN
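The difference between individual permissions and permission sets can be sketched as a lookup in which names starting with @ expand to a list of permissions. This is an illustrative model only: the contents of @myset are not given on the slide and are assumed here, and the helper names `expand` and `is_allowed` are hypothetical.

```python
def expand(perms, permission_sets):
    """Expand set names such as '@myset' into individual permissions."""
    result = set()
    for perm in perms:
        if perm.startswith("@"):
            # A permission set stands for every permission it contains.
            result |= set(permission_sets[perm])
        else:
            result.add(perm)
    return result


# Assumed contents of @myset; the slide does not define them.
permission_sets = {"@myset": ["create", "destroy", "mount", "snapshot"]}

# Grants on dataset zfsN, mirroring the two zfs allow commands above.
grants = {
    "satya": ["create", "destroy", "mount", "snapshot"],  # individual permissions
    "mystaff": ["@myset"],                                # permission set
}


def is_allowed(who, action):
    return action in expand(grants.get(who, []), permission_sets)


print(is_allowed("satya", "snapshot"))
print(is_allowed("mystaff", "destroy"))
print(is_allowed("mystaff", "rollback"))
```

With a permission set, changing the set once changes what every grantee of that set may do, which is the main reason to prefer sets for groups of users.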
An ACL comprises multiple Access Control Entries (ACEs)
ACLs are fine-grained compared to standard file permissions
Use ACL-aware cp, mv, tar, cpio, or rcp to transfer UFS files to ZFS
  POSIX-draft ACLs are translated to the equivalent NFSv4 ACLs
Use ufsrestore to restore UFS data onto ZFS, unlike tar or cpio (UFS)
By default, ACLs are not inherited unless an inheritance flag is specified
  file_inherit, dir_inherit, inherit_only, no_propagate
Administration
ZFS supports both CLI and web-based administration
Use the CLI to create a ZFS pool poolN
  zpool create poolN c0t0d0 c0t1d0 c0t1d2
  zpool create poolN mirror c0t0d0 c0t1d0
Define log or cache devices in poolN
  zpool create poolN c0t0d0 log c0t1d0 cache c0t1d2
Create a file system in poolN
  zfs create poolN/zfsN
Add devices to poolN
  zpool add poolN c1t1d1
Use zfs set to set a file system property
Use zfs get to read a property value
  zfs get compressratio
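The shape of the zpool create command lines above (pool name, optional mirror keyword, data devices, then optional log and cache devices) can be sketched as a function that only assembles the argument list. The function name `zpool_create_argv` is hypothetical, and nothing is executed here; on a system with ZFS the list could be passed to a process launcher.

```python
from typing import List, Optional, Sequence


def zpool_create_argv(pool: str, data: Sequence[str], mirror: bool = False,
                      log: Optional[Sequence[str]] = None,
                      cache: Optional[Sequence[str]] = None) -> List[str]:
    """Build (but do not run) a zpool create command line."""
    if not data:
        raise ValueError("at least one data device is required")
    argv = ["zpool", "create", pool]
    if mirror:
        argv.append("mirror")       # data devices form one mirror vdev
    argv.extend(data)
    if log:
        argv += ["log", *log]       # separate intent log devices
    if cache:
        argv += ["cache", *cache]   # cache devices
    return argv


print(" ".join(zpool_create_argv("poolN", ["c0t0d0", "c0t1d0"], mirror=True)))
print(" ".join(zpool_create_argv("poolN", ["c0t0d0"],
                                 log=["c0t1d0"], cache=["c0t1d2"])))
```

The two printed lines reproduce the mirror and the log/cache examples from the slide.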
Web-Based Management
Use https://host:6789/zfs for ZFS administration
  Create a new storage pool
  Add capacity to an existing pool
  Export a zpool to another system
  Import a zpool from another system
  View and monitor storage pools
  Create a new file system
  Create a volume configuration
  Take snapshots
  Roll back using a snapshot
The web console server must be running to use this interface
ZFS Limitations
It is not possible to reduce the number of top-level vdevs in a zpool
It is not possible to add a disk as a column to a RAID-Z vdev
Virtual devices cannot be nested in a zpool
  A mirror or RAID-Z top-level vdev can contain only files or disks
ZFS cannot provide concurrent access from multiple hosts
ZFS expects a disk cache flush command to commit data to media
ZFS fragmentation can impact sequential read performance
  Block Pointer Rewrite functionality will eliminate the fragmentation issue
In a pool without redundancy, ZFS can only detect and report silent data corruption errors, not repair them
  unless copies=N (where N>1) is explicitly specified
RAID-Z resilvering may take a long time
ZFS does not support TRIM, which is used with SSDs
Best Practices
Create a zpool using whole disks instead of disk slices (EFI label)
  ZFS can then safely enable the disk write cache automatically
For a root pool, use a disk slice instead of a whole disk (SMI label)
  Allocate the entire disk capacity to slice 0
Create a zpool with several groups of vdevs instead of a single large vdev
  Improves IOPS performance
Do not create a zpool that contains components from another zpool
RAID-Z is not recommended for random reads, e.g. databases
Random and sequential read performance are interdependent
  Sequential reads of fragmented files adversely impact random reads
Match the ZFS record size to the database block size for OLTP workloads
Keep pool utilization under 80% to maintain performance
A mirrored pool or hardware RAID is preferred over RAID-Z
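Two of the practices above lend themselves to simple checks: the 80% utilization guideline, and matching recordsize to the database block size within the range listed in Appendix-I (512 to 128k, power of 2). The sketch below is illustrative; the function names and the sample numbers are hypothetical.

```python
def pool_utilization(used_bytes: int, size_bytes: int) -> float:
    """Return pool utilization as a percentage."""
    if size_bytes <= 0:
        raise ValueError("pool size must be positive")
    return 100.0 * used_bytes / size_bytes


def over_threshold(used_bytes: int, size_bytes: int, threshold: float = 80.0) -> bool:
    """True when the pool is at or above the utilization guideline."""
    return pool_utilization(used_bytes, size_bytes) >= threshold


def recordsize_for(db_block_bytes: int) -> int:
    """Return a recordsize equal to the database block size, if it is a legal value."""
    is_power_of_two = db_block_bytes > 0 and (db_block_bytes & (db_block_bytes - 1)) == 0
    if not is_power_of_two or not 512 <= db_block_bytes <= 128 * 1024:
        raise ValueError("recordsize must be a power of 2 from 512 to 128k")
    return db_block_bytes


print(over_threshold(850, 1000))   # 85% used
print(over_threshold(500, 1000))   # 50% used
print(recordsize_for(8192))        # 8k database block
```

The value returned by `recordsize_for` is what would be passed to zfs set for the recordsize property of the file system that holds the database files.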
Appendix-I
Use zfs set to change ZFS properties in Oracle Solaris 11 Express
PROPERTY              EDIT  INHERIT  VALUES
available             NO    NO       <size>
compressratio         NO    NO       <1.00x or higher if compressed>
creation              NO    NO       <date>
defer_destroy         NO    NO       yes | no
keystatus             NO    NO       undefined | unavailable | available
mounted               NO    NO       yes | no
origin                NO    NO       <snapshot>
referenced            NO    NO       <size>
rekeydate             NO    NO       <date>
type                  NO    NO       filesystem | volume | snapshot
used                  NO    NO       <size>
usedbychildren        NO    NO       <size>
usedbydataset         NO    NO       <size>
usedbyrefreservation  NO    NO       <size>
usedbysnapshots       NO    NO       <size>
userrefs              NO    NO       <count>
aclinherit            YES   YES      discard | noallow | restricted | passthrough | passthrough-x
atime                 YES   YES      on | off
canmount              YES   NO       on | off | noauto
casesensitivity       NO    YES      sensitive | insensitive | mixed
checksum              YES   YES      on | off | fletcher2 | fletcher4 | sha256
compression           YES   YES      on | off | lzjb | gzip | gzip-[1-9] | zle
copies                YES   YES      1 | 2 | 3
dedup                 YES   YES      on | off | verify | sha256[,verify]
devices               YES   YES      on | off
encryption            NO    YES      on | off | aes-128-ccm | aes-192-ccm | aes-256-ccm | aes-128-gcm | aes-192-gcm | aes-256-gcm
Appendix-I (continued)
PROPERTY              EDIT  INHERIT  VALUES
exec                  YES   YES      on | off
keysource             YES   YES      raw | hex | passphrase,prompt | file://<path>
logbias               YES   YES      latency | throughput
mlslabel              YES   YES      <sensitivity label>
mountpoint            YES   YES      <path> | legacy | none
nbmand                YES   YES      on | off
normalization         NO    YES      none | formC | formD | formKC | formKD
primarycache          YES   YES      all | none | metadata
quota                 YES   NO       <size> | none
readonly              YES   YES      on | off
recordsize            YES   YES      512 to 128k, power of 2
refquota              YES   NO       <size> | none
refreservation        YES   NO       <size> | none
reservation           YES   NO       <size> | none
rstchown              YES   YES      on | off
secondarycache        YES   YES      all | none | metadata
setuid                YES   YES      on | off
sharenfs              YES   YES      on | off | share(1M) options
sharesmb              YES   YES      on | off | sharemgr(1M) options
snapdir               YES   YES      hidden | visible
sync                  YES   YES      standard | always | disabled
utf8only              NO    YES      on | off
version               YES   NO       1 | 2 | 3 | 4 | current
volblocksize          NO    YES      512 to 128k, power of 2
volsize               YES   NO       <size>
vscan                 YES   YES      on | off
xattr                 YES   YES      on | off
zoned                 YES   YES      on | off
userused@...          NO    NO       <size>
groupused@...         NO    NO       <size>
userquota@...         YES   NO       <size> | none
groupquota@...        YES   NO       <size> | none
References
Download Oracle Solaris 11 Express
www.oracle.com/technetwork/server-storage/solaris11/overview/