The Z file system (ZFS), originally developed by Sun™, uses a pooled storage model in which space is allocated only as it is needed for data storage. It is also designed for maximum data integrity, supporting data snapshots, multiple copies, and data checksums. It uses a software data replication model known as RAID-Z, which provides redundancy similar to hardware RAID but is designed to prevent data write corruption and to overcome some of the limitations of hardware RAID.
Some of the features provided by ZFS are RAM-intensive, so some tuning may be required to provide maximum efficiency on systems with limited RAM.
At a bare minimum, the total system memory should be at least one gigabyte. The amount of recommended RAM depends upon the size of the pool and the ZFS features which are used. A general rule of thumb is 1GB of RAM for every 1TB of storage. If the deduplication feature is used, a general rule of thumb is 5GB of RAM per TB of storage to be deduplicated. While some users successfully use ZFS with less RAM, it is possible that when the system is under heavy load, it may panic due to memory exhaustion. Further tuning may be required for systems with less than the recommended RAM requirements.
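The rules of thumb above can be turned into a quick estimate. The following is only a sketch: the function name is made up for illustration, and adding the deduplication figure on top of the base figure is an assumption chosen to err on the high side, since the text does not say how the two rules combine.

```shell
#!/bin/sh
# Rough RAM estimate from the rules of thumb quoted above:
#   1 GB of RAM per 1 TB of pool storage, plus (assumption: additive)
#   5 GB of RAM per 1 TB of storage to be deduplicated,
# and never less than the 1 GB bare minimum.
# recommended_ram_gb is a hypothetical helper, not a FreeBSD tool.
recommended_ram_gb() {
    pool_tb=$1          # total pool size in whole TB
    dedup_tb=${2:-0}    # storage to be deduplicated, in whole TB
    ram=$(( pool_tb + dedup_tb * 5 ))
    if [ "$ram" -lt 1 ]; then
        ram=1
    fi
    echo "$ram"
}

# Example: recommended_ram_gb 4 2   (4 TB pool, 2 TB of it deduplicated)
```

The result is a starting point for sizing, not a guarantee. Actual memory use depends on the workload and on the features enabled.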
Due to the RAM limitations of the i386™ platform, users using ZFS on the i386™ architecture should add the following option to a custom kernel configuration file, rebuild the kernel, and reboot:
This option expands the kernel address space, allowing the vm.kvm_size tunable to be pushed beyond the currently imposed limit of 1 GB, or the limit of 2 GB for PAE. To find the most suitable value for this option, divide the desired address space in megabytes by four (4). In this example, it is 512 for 2 GB.
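The divide-by-four rule can be checked with shell arithmetic. This is a small sketch, and the function name is hypothetical.

```shell
#!/bin/sh
# Value for the kernel option described above: the desired kernel
# address space in megabytes, divided by four.
# option_value_for_mb is an illustrative name only.
option_value_for_mb() {
    echo $(( $1 / 4 ))
}

option_value_for_mb 2048   # 2 GB of address space, as in the text
```

For the 2 GB case from the text this yields 512, and 1024 MB would yield 256.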
The kmem address space can be increased on all FreeBSD architectures. On a test system with one gigabyte of physical memory, success was achieved with the following options added to /boot/loader.conf, and the system restarted:
For a more detailed list of recommendations for ZFS-related tuning, see http://wiki.freebsd.org/ZFSTuningGuide.
There is a startup mechanism that allows FreeBSD to mount ZFS pools during system initialization. To enable it, issue the following commands:
# echo 'zfs_enable="YES"' >> /etc/rc.conf
# service zfs start
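Running the echo command more than once leaves duplicate lines in /etc/rc.conf. The sketch below adds the setting only when no zfs_enable line exists yet. The function name is hypothetical, and the file is passed as an argument so the function can be tried on a scratch copy first.

```shell
#!/bin/sh
# Append zfs_enable="YES" to an rc.conf-style file only if the file
# has no zfs_enable line yet. An existing line (even one set to "NO")
# is left untouched and merely reported.
# enable_zfs_at_boot is an illustrative helper, not a FreeBSD tool.
enable_zfs_at_boot() {
    rc_file=$1
    if grep -q '^zfs_enable=' "$rc_file" 2>/dev/null; then
        echo "zfs_enable already present in $rc_file"
    else
        echo 'zfs_enable="YES"' >> "$rc_file"
        echo "added zfs_enable to $rc_file"
    fi
}

# Example (as root): enable_zfs_at_boot /etc/rc.conf
```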
The examples in this section assume three SCSI disks with the device names da0, da1, and da2. Users of IDE hardware should instead use ad device names.
To create a simple, non-redundant ZFS pool using a single disk device, use zpool:
# zpool create example /dev/da0
To view the new pool, review the output of df:
# df
Filesystem 1K-blocks Used Avail Capacity Mounted on
/dev/ad0s1a 2026030 235230 1628718 13% /
devfs 1 1 0 100% /dev
/dev/ad0s1d 54098308 1032846 48737598 2% /usr
example 17547136 0 17547136 0% /example
This output shows that the example pool has been created and mounted. It is now accessible as a file system. Files may be created on it and users can browse it, as seen in the following example:
# cd /example
# ls
# touch testfile
# ls -al
total 4
drwxr-xr-x 2 root wheel 3 Aug 29 23:15 .
drwxr-xr-x 21 root wheel 512 Aug 29 23:12 ..
-rw-r--r-- 1 root wheel 0 Aug 29 23:15 testfile
However, this pool is not taking advantage of any ZFS features. To create a dataset on this pool with compression enabled:
# zfs create example/compressed
# zfs set compression=gzip example/compressed
The example/compressed dataset is now a ZFS compressed file system. Try copying some large files to /example/compressed.
Compression can be disabled with:
# zfs set compression=off example/compressed
To unmount a file system, issue the following command and then verify by using df:
# zfs umount example/compressed
# df
Filesystem 1K-blocks Used Avail Capacity Mounted on
/dev/ad0s1a 2026030 235232 1628716 13% /
devfs 1 1 0 100% /dev
/dev/ad0s1d 54098308 1032864 48737580 2% /usr
example 17547008 0 17547008 0% /example
To re-mount the file system to make it accessible again, and verify with df:
# zfs mount example/compressed
# df
Filesystem 1K-blocks Used Avail Capacity Mounted on
/dev/ad0s1a 2026030 235234 1628714 13% /
devfs 1 1 0 100% /dev
/dev/ad0s1d 54098308 1032864 48737580 2% /usr
example 17547008 0 17547008 0% /example
example/compressed 17547008 0 17547008 0% /example/compressed
The pool and file system may also be observed by viewing the output from mount:
# mount
/dev/ad0s1a on / (ufs, local)
devfs on /dev (devfs, local)
/dev/ad0s1d on /usr (ufs, local, soft-updates)
example on /example (zfs, local)
example/data on /example/data (zfs, local)
example/compressed on /example/compressed (zfs, local)
ZFS datasets, after creation, may be used like any other file system. However, many other features are available which can be set on a per-dataset basis. In the following example, a new file system, data, is created. Important files will be stored here, so the file system is set to keep two copies of each data block:
# zfs create example/data
# zfs set copies=2 example/data
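Creating a dataset and setting its properties can also be done in one step, because zfs create accepts -o property=value. The wrapper below is a sketch with a hypothetical name. It only assembles and runs that one command.

```shell
#!/bin/sh
# Create a dataset and set any number of properties at creation time
# with "zfs create -o property=value".
# create_dataset is an illustrative wrapper, not part of ZFS.
create_dataset() {
    dataset=$1
    shift
    opts=""
    for prop in "$@"; do
        opts="$opts -o $prop"
    done
    # $opts is intentionally unquoted so it splits into separate arguments.
    zfs create $opts "$dataset"
}

# Example (as root): create_dataset example/data copies=2
```

Setting a property at creation time means it applies to every block written to the dataset, whereas a property changed later affects only newly written data.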
It is now possible to see the data and space utilization by issuing df:
# df
Filesystem 1K-blocks Used Avail Capacity Mounted on
/dev/ad0s1a 2026030 235234 1628714 13% /
devfs 1 1 0 100% /dev
/dev/ad0s1d 54098308 1032864 48737580 2% /usr
example 17547008 0 17547008 0% /example
example/compressed 17547008 0 17547008 0% /example/compressed
example/data 17547008 0 17547008 0% /example/data
Notice that each file system on the pool has the same amount of available space. This is the reason for using df in these examples, to show that the file systems use only the amount of space they need and all draw from the same pool. The ZFS file system does away with concepts such as volumes and partitions, and allows for several file systems to occupy the same pool.
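The shared-space observation can be checked mechanically by filtering df output. The following sketch uses a hypothetical function name and assumes the column layout shown in the df listings in this section, with Avail in the fourth column.

```shell
#!/bin/sh
# Read df output on stdin and report whether every file system that
# belongs to the named pool shows the same "Avail" value (column 4).
# pool_shares_space is an illustrative helper.
pool_shares_space() {
    pool=$1
    awk -v pool="$pool" '
        # Match the pool itself or any dataset below it (pool/...).
        $1 == pool || index($1, pool "/") == 1 {
            if (seen && $4 != avail) differ = 1
            avail = $4
            seen = 1
        }
        END {
            if (!seen)       print "no file systems found"
            else if (differ) print "available space differs"
            else             print "shared: " avail " 1K-blocks available"
        }'
}

# Example: df | pool_shares_space example
```

The values are only expected to match while no quotas or reservations are set on the individual datasets.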
Once they are no longer needed, destroy the file systems and then the pool:
# zfs destroy example/compressed
# zfs destroy example/data
# zpool destroy example
There is no way to prevent a disk from failing. One method of avoiding data loss due to a failed hard disk is to implement RAID. ZFS supports this feature in its pool design.
To create a RAID-Z pool, issue the following command and specify the disks to add to the pool:
# zpool create storage raidz da0 da1 da2
Sun™ recommends that the number of devices used in a RAID-Z configuration be between three and nine. For environments requiring a single pool consisting of 10 disks or more, consider breaking it up into smaller RAID-Z groups. If only two disks are available and redundancy is a requirement, consider using a ZFS mirror. Refer to zpool(8) for more details.
The zpool create command above creates the storage zpool. This may be verified using mount(8) and df(1). The following command makes a new file system in the pool called home:
# zfs create storage/home
It is now possible to enable compression and keep extra copies of directories and files using the following commands:
# zfs set copies=2 storage/home
# zfs set compression=gzip storage/home
To make this the new home directory for users, copy the user data to this directory, and create the appropriate symbolic links:
# cp -rp /home/* /storage/home
# rm -rf /home /usr/home
# ln -s /storage/home /home
# ln -s /storage/home /usr/home
Users should now have their data stored on the freshly created /storage/home. Test by adding a new user and logging in as that user.
Try creating a snapshot which may be rolled back later:
# zfs snapshot storage/home@08-30-08
Note that the snapshot option will only capture a real file system, not a home directory or a file. The @ character is a delimiter used between the file system or volume name and the snapshot name. When a user's home directory gets trashed, restore it with:
# zfs rollback storage/home@08-30-08
To get a list of all available snapshots, run ls in the file system's .zfs/snapshot directory. For example, to see the previously taken snapshot:
# ls /storage/home/.zfs/snapshot
It is possible to write a script to perform regular snapshots on user data. However, over time, snapshots may consume a great deal of disk space. The previous snapshot may be removed using the following command:
# zfs destroy storage/home@08-30-08
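A script for regular snapshots, as mentioned above, could look like the following sketch. The function names and the YYYY-MM-DD naming scheme are assumptions made for illustration. The only ZFS command used is zfs snapshot, exactly as in the manual example.

```shell
#!/bin/sh
# Build a snapshot name of the form dataset@YYYY-MM-DD. The date can be
# passed explicitly as the second argument; otherwise today's date is used.
# Both function names are hypothetical.
snapshot_name() {
    dataset=$1
    day=${2:-$(date +%Y-%m-%d)}
    echo "${dataset}@${day}"
}

# Take the snapshot. Arguments are passed through to snapshot_name.
take_daily_snapshot() {
    zfs snapshot "$(snapshot_name "$@")"
}

# Example (as root, for instance from cron): take_daily_snapshot storage/home
```

A year-first date keeps the names in chronological order when they are listed under .zfs/snapshot. Old snapshots still have to be removed with zfs destroy, or they will keep consuming space.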
After testing, /storage/home can be made the real /home using this command:
# zfs set mountpoint=/home storage/home
Run df and mount to confirm that the system now treats the file system as the real /home:
# mount
/dev/ad0s1a on / (ufs, local)
devfs on /dev (devfs, local)
/dev/ad0s1d on /usr (ufs, local, soft-updates)
storage on /storage (zfs, local)
storage/home on /home (zfs, local)
# df
Filesystem 1K-blocks Used Avail Capacity Mounted on
/dev/ad0s1a 2026030 235240 1628708 13% /
devfs 1 1 0 100% /dev
/dev/ad0s1d 54098308 1032826 48737618 2% /usr
storage 26320512 0 26320512 0% /storage
storage/home 26320512 0 26320512 0% /home
This completes the RAID-Z configuration. To receive status updates about the file systems as part of the nightly periodic(8) runs, issue the following command:
# echo 'daily_status_zfs_enable="YES"' >> /etc/periodic.conf
Every software RAID has a method of monitoring its state. The status of RAID-Z devices may be viewed with the following command:
# zpool status -x
If all pools are healthy and everything is normal, the following message will be returned:
If there is an issue, perhaps a disk has gone offline, the pool state will look similar to:
This indicates that the device was previously taken offline by the administrator using the following command:
# zpool offline storage da1
It is now possible to replace da1 after the system has been powered down. When the system is back online, the following command may be issued to replace the disk:
# zpool replace storage da1
From here, the status may be checked again, this time without the -x flag to get state information:
# zpool status storage
pool: storage
state: ONLINE
scrub: resilver completed with 0 errors on Sat Aug 30 19:44:11 2008
config:
NAME STATE READ WRITE CKSUM
storage ONLINE 0 0 0
raidz1 ONLINE 0 0 0
da0 ONLINE 0 0 0
da1 ONLINE 0 0 0
da2 ONLINE 0 0 0
errors: No known data errors
As shown in this example, everything appears to be normal.
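For unattended monitoring, the output of zpool status -x can be tested in a script. The sketch below assumes that a healthy system prints exactly the line "all pools are healthy". Anything else, including the case where no pools exist, is reported for a closer look. The function name is hypothetical.

```shell
#!/bin/sh
# Report pool health based on "zpool status -x".
# Assumption: a healthy system prints exactly "all pools are healthy".
# check_pools is an illustrative helper; it returns 0 when healthy
# and 1 otherwise, so it can be used in "if" or with "||".
check_pools() {
    status=$(zpool status -x 2>&1)
    if [ "$status" = "all pools are healthy" ]; then
        echo "OK"
    else
        echo "ATTENTION"
        echo "$status"
        return 1
    fi
}

# Example: check_pools || echo "check the ZFS pools" | mail -s "ZFS" root
```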
ZFS uses checksums to verify the integrity of stored data. These are enabled automatically upon creation of file systems and may be disabled using the following command:
# zfs set checksum=off storage/home
Doing so is not recommended, as checksums take very little storage space and are used to check data integrity in a process known as “scrubbing.” To verify the data integrity of the storage pool, issue this command:
# zpool scrub storage
This process may take considerable time depending on the amount of data stored. It is also very I/O intensive, so much so that only one scrub may be run at any given time. After the scrub has completed, the status is updated and may be viewed by issuing a status request:
# zpool status storage
pool: storage
state: ONLINE
scrub: scrub completed with 0 errors on Sat Jan 26 19:57:37 2013
config:
NAME STATE READ WRITE CKSUM
storage ONLINE 0 0 0
raidz1 ONLINE 0 0 0
da0 ONLINE 0 0 0
da1 ONLINE 0 0 0
da2 ONLINE 0 0 0
errors: No known data errors
The completion time is displayed and helps to ensure data integrity over a long period of time.
ZFS supports different types of quotas: the refquota, the general quota, the user quota, and the group quota. This section explains the basics of each type and includes some usage instructions.
Quotas limit the amount of space that a dataset and its descendants can consume, and enforce a limit on the amount of space used by filesystems and snapshots for the descendants. Quotas are useful to limit the amount of space a particular user can use.
Quotas cannot be set on volumes, as the volsize property acts as an implicit quota.
The refquota=size property limits the amount of space a dataset can consume by enforcing a hard limit on the space used. However, this hard limit does not include space used by descendants, such as file systems or snapshots.
To enforce a general quota of 10 GB for storage/home/bob, use the following:
# zfs set quota=10G storage/home/bob
User quotas limit the amount of space that can be used by the specified user. The general format is userquota@user=size, and the user's name must be in one of the following formats:
POSIX compatible name such as joe.
POSIX numeric ID such as 789.
SID name such as joe.bloggs@example.com.
SID numeric ID such as S-1-123-456-789.
For example, to enforce a quota of 50 GB for a user named joe on the storage/home/bob dataset, use the following:
# zfs set userquota@joe=50G storage/home/bob
To remove the quota or make sure that one is not set, instead use:
# zfs set userquota@joe=none storage/home/bob
User quota properties are not displayed by zfs get all. Non-root users can only see their own quotas unless they have been granted the userquota privilege. Users with this privilege are able to view and set everyone's quota.
The group quota limits the amount of space that a specified group can consume. The general format is groupquota@group=size.
To set the quota for the group firstgroup to 50 GB on the storage/home/bob dataset, use:
# zfs set groupquota@firstgroup=50G storage/home/bob
To remove the quota for the group firstgroup, or to make sure that one is not set, instead use:
# zfs set groupquota@firstgroup=none storage/home/bob
As with the user quota property, non-root users can only see the quotas associated with the groups that they belong to. However, root or a user with the groupquota privilege can view and set all quotas for all groups.
To display the amount of space consumed by each user on the specified file system or snapshot, along with any specified quotas, use zfs userspace. For group information, use zfs groupspace. For more information about supported options or how to display only specific options, refer to zfs(8).
Users with sufficient privileges and root can list the quota for storage/home/bob using:
# zfs get quota storage/home/bob
ZFS supports two types of space reservations. This section explains the basics of each and includes some usage instructions.
The reservation property makes it possible to reserve a minimum amount of space guaranteed for a dataset and its descendants. This means that if a 10 GB reservation is set on storage/home/bob and disk space gets low, at least 10 GB of space is reserved for this dataset. The refreservation property sets or indicates the minimum amount of space guaranteed to a dataset excluding descendants, such as snapshots. As an example, if a snapshot was taken of storage/home/bob, enough disk space would have to exist outside of the refreservation amount for the operation to succeed, because descendants of the main dataset are not counted by the refreservation amount and so do not encroach on the space set.
Reservations of any sort are useful in many situations, such as planning and testing the suitability of disk space allocation in a new system, or ensuring that enough space is available on file systems for system recovery procedures and files.
The general format of the reservation property is reservation=size, so to set a reservation of 10 GB on storage/home/bob, use:
# zfs set reservation=10G storage/home/bob
To make sure that no reservation is set, or to remove a reservation, use:
# zfs set reservation=none storage/home/bob
The same principle can be applied to the refreservation property for setting a refreservation, with the general format refreservation=size.
To check if any reservations or refreservations exist on storage/home/bob, execute one of the following commands:
# zfs get reservation storage/home/bob
# zfs get refreservation storage/home/bob
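Quotas and reservations are often set together when a dataset is provisioned for a user. The sketch below uses a hypothetical wrapper name. It applies both properties and then prints them with a single zfs get call, which accepts a comma-separated list of properties.

```shell
#!/bin/sh
# Apply a quota and a reservation to one dataset, then display both.
# Each step runs only if the previous one succeeded.
# set_space_limits is an illustrative wrapper, not part of ZFS.
set_space_limits() {
    dataset=$1
    quota=$2
    reservation=$3
    zfs set quota="$quota" "$dataset" &&
    zfs set reservation="$reservation" "$dataset" &&
    zfs get quota,reservation "$dataset"
}

# Example (as root): set_space_limits storage/home/bob 10G 5G
```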