2011-09-26

ZFS with 4k sectors on Debian GNU/kFreeBSD

Debian GNU/kFreeBSD is an operating system for the IA-32 and x86-64 computer architectures. It combines the GNU userland and Debian's APT package management with the kernel of FreeBSD. The 'k' in kFreeBSD stands for "kernel of", reflecting the fact that only the kernel of the complete FreeBSD operating system is used.

The current stable distribution (Squeeze) provides zpool version 14.
The current testing distribution (Wheezy) provides zpool version 15.
FreeBSD 9.0-BETA2 provides zpool version 28.

For some reason, the current kFreeBSD ZFS implementation creates zpools with the property whole_disk=0 even when the zpool is created on a whole-device vdev. As a workaround, one can use FreeBSD 9.0-BETA2 to create the zpool with whole_disk=1 and an older zpool version, then export it and import it on Debian.
# zpool create -o version=14 <pool> <vdev>

When creating raidz zpools on 4k Advanced Format hard drives, it is best to use raidz1 with 3, 5 or 9 disks, raidz2 with 6 or 10 disks, and raidz3 with 11 or 19 disks.
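
The disk counts above share one property: after subtracting the parity disks, 2, 4, 8 or 16 data disks remain. The usual rationale (an assumption here, the post does not state it) is that a power-of-two data width splits ZFS records evenly across 4k sectors. A minimal sketch of that check follows; is_pow2 and raidz_ok are hypothetical helper names, not ZFS commands.

```shell
# Sketch: does a raidz layout leave a power-of-two number of data disks?
# is_pow2 and raidz_ok are illustrative helpers, not part of ZFS.

# is_pow2 <n>: succeed if n is 1, 2, 4, 8, ...
is_pow2() {
    n=$1
    [ "$n" -ge 1 ] || return 1
    while [ $((n % 2)) -eq 0 ]; do
        n=$((n / 2))
    done
    [ "$n" -eq 1 ]
}

# raidz_ok <parity_disks> <total_disks>: succeed if at least two data
# disks remain and their count is a power of two.
raidz_ok() {
    data=$(( $2 - $1 ))
    [ "$data" -ge 2 ] && is_pow2 "$data"
}

# show <parity_disks> <total_disks>: report layouts that pass the check.
show() {
    if raidz_ok "$1" "$2"; then
        echo "raidz$1 with $2 disks: $(( $2 - $1 )) data disks"
    fi
}

# The layouts recommended in the text.
show 1 3; show 1 5; show 1 9
show 2 6; show 2 10
show 3 11; show 3 19
```

Every layout listed in the text passes this check, while for example raidz2 with 7 disks (5 data disks) does not.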

ZFS aligns zpools to the logical sector size reported by the hard disk. All current hard drives (including the 4k Advanced Format ones) report 512 bytes as the logical sector size, which results in ashift=9 (2⁹ = 512). To make ZFS align zpools to 4k sectors, the ashift value has to be 12 (2¹² = 4096).
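
The relation between sector size and ashift is just a base-2 logarithm. As a small illustration, the following sketch computes it with shell arithmetic; ashift_for is a hypothetical helper name and it assumes the input is a power of two.

```shell
# ashift_for <sector_size_in_bytes>: print log2 of the sector size.
# Illustrative helper only; assumes the size is a power of two.
ashift_for() {
    size=$1
    bits=0
    while [ "$size" -gt 1 ]; do
        size=$((size / 2))
        bits=$((bits + 1))
    done
    echo "$bits"
}

echo "512-byte sectors:  ashift=$(ashift_for 512)"
echo "4096-byte sectors: ashift=$(ashift_for 4096)"
```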

Using this method, I have built a 12 TB NAS from six Western Digital 3 TB disks in raidz2.
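
The 12 TB figure is the nominal capacity of the data disks: six disks minus two parity disks, times 3 TB each. A rough sketch of that arithmetic, ignoring filesystem overhead, with raidz_usable_tb as a hypothetical helper name:

```shell
# raidz_usable_tb <total_disks> <parity_disks> <disk_size_tb>
# Nominal capacity only: (total - parity) * size, no ZFS overhead included.
raidz_usable_tb() {
    echo $(( ($1 - $2) * $3 ))
}

echo "raidz2, 6 x 3 TB: $(raidz_usable_tb 6 2 3) TB"
```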

Install zfsutils, which provides the zpool, zfs and zdb commands, and freebsd-geom, which provides gnop (through geom nop).
# aptitude install zfsutils
# aptitude install freebsd-geom

Find out the device names in your system.
# atacontrol list

Or, if using a FreeBSD 9 kernel:
# camcontrol devlist

Create a NOP device that simulates a 4k sector size.
# geom nop create -v -S 4096 ad6

Create a ZFS pool on the NOP device and export it.
# zpool create datapool ad6.nop
# zpool export datapool

Destroy the NOP device since it's only needed to set ashift=12 when creating the pool.
# geom nop destroy -v ad6.nop

Import the pool back.
# zpool import datapool
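
For pools with more than one disk, the same create, export, destroy, import sequence applies to each device. The following dry-run sketch only prints the commands used above instead of running them; plan_4k_pool is a hypothetical helper, and treating mirror, raidz, raidz1, raidz2 and raidz3 as layout keywords (everything else as a disk name) is an assumption of this sketch.

```shell
# plan_4k_pool <pool> [layout keyword | disk]...
# Prints (does not execute) the command sequence for a 4k-aligned pool.
# Illustrative helper only; review the output before running anything.
plan_4k_pool() {
    pool=$1
    shift
    spec=""   # arguments for "zpool create", with disks replaced by .nop devices
    nops=""   # list of .nop devices to destroy afterwards
    for word in "$@"; do
        case $word in
            mirror|raidz|raidz1|raidz2|raidz3)
                spec="$spec $word"
                ;;
            *)
                echo "geom nop create -v -S 4096 $word"
                spec="$spec $word.nop"
                nops="$nops $word.nop"
                ;;
        esac
    done
    echo "zpool create $pool$spec"
    echo "zpool export $pool"
    echo "geom nop destroy -v$nops"
    echo "zpool import $pool"
}

# Reproduces the single-disk steps from the text.
plan_4k_pool datapool ad6
```

Printing first makes it easy to check the device names before anything destructive runs.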

Confirm that the ashift value is 12.
# zdb datapool | grep ashift
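
With several vdevs, zdb prints one ashift line per vdev, so it can help to check them all at once. The sketch below reads zdb output on standard input and succeeds only if every ashift line shows 12; all_ashift_12 is a hypothetical helper, and the accepted line formats ("ashift=12" or "ashift: 12") are an assumption about what zdb prints.

```shell
# all_ashift_12: read zdb output on stdin; succeed only if at least one
# ashift line is present and every ashift value is 12.
# Illustrative helper only, not part of zfsutils.
all_ashift_12() {
    awk '
        /ashift/ {
            seen = 1
            value = $0
            sub(/^.*ashift[=: ]+/, "", value)   # keep only what follows "ashift"
            if (value + 0 != 12) bad = 1
        }
        END { exit (seen && !bad) ? 0 : 1 }
    '
}

# Demonstration with sample input instead of a real pool.
printf 'ashift=12\nashift=12\n' | all_ashift_12 && echo "all vdevs use ashift=12"
```

On a real system the input would come from the command above, for example `zdb datapool | all_ashift_12`.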


For testing zpool creation with files instead of disks, one can use:
# dd if=/dev/zero of=<file> bs=1G seek=4096 count=0
# zpool create <pool> `mdconfig -f <file> -S 4096`
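
With count=0, dd copies no data; it only seeks to block 4096 of size 1G and sets the file length there, so the test file has an apparent size of 4096 GiB. The same technique can be tried at a much smaller scale, as in this sketch (the temporary file and the 1 KiB block size are choices made for the illustration only):

```shell
# Same technique, small scale: 1 KiB blocks, seek 4096 blocks, copy nothing.
# The file should end up with a length of 1024 * 4096 bytes.
tmpfile=$(mktemp)
dd if=/dev/zero of="$tmpfile" bs=1024 seek=4096 count=0 2>/dev/null
size=$(wc -c < "$tmpfile" | tr -d ' ')
echo "apparent size: $size bytes"
rm -f "$tmpfile"
```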


The zdb command still seems to have a bug in Solaris 11 Express when it is used with an older zpool version. Update: this is fixed in Solaris 11 EA.
# zpool create -o version=14 <pool> <vdev>
# zdb <pool>
Assertion failed: mp->initialized == B_TRUE, file ../common/kernel.c, line 127,
function mutex_enter


How to mount an ext2 filesystem with FreeBSD?
# kldload -v ext2fs
# mount -t ext2fs <device> <mountpoint>

2 comments:

  1. Anonymous, 18.10.11

    I was able to use the directions above to create a RAID 10 (Striped Mirrored, or perhaps it is a mirrored stripe?) for two Hitachi 7K2000 (4k sector) drives and two Hitachi 5K3000 (4k sector) to make a 4.5 terabyte mirrored stripe pool:


    [root@freenas] ~# zpool list
    no pools available
    [root@freenas] ~# geom nop create -v -S 4096 ada1
    Done.
    [root@freenas] ~# geom nop create -v -S 4096 ada2
    Done.
    [root@freenas] ~# geom nop create -v -S 4096 ada3
    Done.
    [root@freenas] ~# geom nop create -v -S 4096 ada4
    Done.
    [root@freenas] ~# zpool create pentagon mirror ada1.nop ada3.nop
    cannot mount '/pentagon': failed to create mountpoint
    [root@freenas] ~# zpool add pentagon mirror ada2.nop ada4.nop
    [root@freenas] ~# zpool export pentagon
    [root@freenas] ~# geom nop destroy -v ada1.nop ada2.nop ada3.nop ada4.nop
    Done.
    [root@freenas] ~# zdb pentagon | grep ashift
    ashift=12
    ashift=12
    [root@freenas] ~# zpool status -v
    pool: pentagon
    state: ONLINE
    scrub: none requested
    config:

    NAME        STATE     READ WRITE CKSUM
    pentagon    ONLINE       0     0     0
      mirror    ONLINE       0     0     0
        ada1    ONLINE       0     0     0
        ada3    ONLINE       0     0     0
      mirror    ONLINE       0     0     0
        ada2    ONLINE       0     0     0
        ada4    ONLINE       0     0     0

    errors: No known data errors

    Thanks a lot, this was very helpful.

  2. Nice to hear! Probably pretty good performance with that mirrored stripe setup too!
