Tuning a NAS for ZFS and 10G Networking

Notes on making a Linux-based NAS run a little faster.

These notes were written on Rocky Linux but should be broadly applicable.

TuneD

Install and enable TuneD if you don’t already have it:

sudo dnf install tuned
sudo systemctl enable --now tuned
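
Before creating a custom profile it can be useful to see which profiles are available and which one is currently active:

sudo tuned-adm list
sudo tuned-adm active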

TuneD Profile

Create a custom profile:

sudo mkdir /etc/tuned/file-server
sudo vim /etc/tuned/file-server/tuned.conf

Add contents:

[main]
include=throughput-performance

# Disk settings for ZFS
[disk]
replace=1
devices=!sd*,!nvme*

[disk_hdd]
type=disk
devices=sd*
readahead=4096
elevator=none

[disk_ssd]
type=disk
devices=nvme*
readahead=0
elevator=none

# Network tuning for 10G+
[sysctl]
replace=1
# allow testing with buffers up to 128MB
net.core.rmem_max=134217728
net.core.wmem_max=134217728
# increase Linux autotuning TCP buffer limit to 64MB
net.ipv4.tcp_rmem=4096 87380 67108864
net.ipv4.tcp_wmem=4096 65536 67108864
# increase the length of the processor input queue
net.core.netdev_max_backlog=250000
# recommended default congestion control is htcp
net.ipv4.tcp_congestion_control=htcp
net.core.default_qdisc=fq
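
Note that htcp is not the default congestion control on most distributions; to my understanding it ships as the tcp_htcp kernel module, so it is worth confirming it is available before relying on the sysctl above. A quick check (module name tcp_htcp assumed):

# list the congestion control algorithms the kernel currently offers
sysctl net.ipv4.tcp_available_congestion_control
# load the H-TCP module if it is missing
sudo modprobe tcp_htcp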

Activate the profile:

sudo tuned-adm profile file-server
# Verify
sudo tuned-adm active
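
Spot-checking a few of the sysctls from the profile confirms they were actually applied:

sysctl net.ipv4.tcp_congestion_control net.core.default_qdisc
sysctl net.core.rmem_max net.core.wmem_max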

ZFS Tuning

On Linux the default ZFS ARC will use up to 50% of memory. I like to set the ARC min/max manually for the system.
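
Before changing anything it is worth looking at what the ARC is currently doing; the kstats file below is the standard location for OpenZFS on Linux:

# current ARC size and the in-effect min/max limits, in bytes
awk '/^(size|c_min|c_max) / {print $1, $3}' /proc/spl/kstat/zfs/arcstats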

Create a modprobe config for the ZFS module:

sudo vim /etc/modprobe.d/zfs.conf

Add contents:

# Set Max ARC size => 72GB == 77309411328 Bytes
options zfs zfs_arc_max=77309411328

# Set Min ARC size => 32GB == 34359738368 Bytes
options zfs zfs_arc_min=34359738368
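
The byte values are just GiB multiples; shell arithmetic is a quick way to sanity-check them before saving the file:

echo $((72 * 1024 * 1024 * 1024))   # 77309411328 bytes = 72GB
echo $((32 * 1024 * 1024 * 1024))   # 34359738368 bytes = 32GB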

Rebuild the initramfs so the new module options are applied at boot:

cd /boot
sudo dracut -f initramfs-$(uname -r).img $(uname -r)
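
After rebooting, the values the module actually picked up can be read back from sysfs:

cat /sys/module/zfs/parameters/zfs_arc_max
cat /sys/module/zfs/parameters/zfs_arc_min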

Disk UDEV Rules

In my use case nr_requests was set differently for half the disks, so I added a UDEV rule to set them all the same.

[user@nas ~]$ lsblk -t
NAME        ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC ROTA SCHED RQ-SIZE   RA WSAME
sda                 0   4096      0    4096     512    1 none     2936 4096    0B
├─sda1              0   4096      0    4096     512    1 none     2936 4096    0B
└─sda9              0   4096      0    4096     512    1 none     2936 4096    0B
sdb                 0   4096      0    4096     512    1 none     2936 4096    0B
├─sdb1              0   4096      0    4096     512    1 none     2936 4096    0B
└─sdb9              0   4096      0    4096     512    1 none     2936 4096    0B
...
sdh                 0   4096      0    4096     512    1 none       32 4096    0B
├─sdh1              0   4096      0    4096     512    1 none       32 4096    0B
└─sdh9              0   4096      0    4096     512    1 none       32 4096    0B
sdi                 0   4096      0    4096     512    1 none       32 4096    0B
├─sdi1              0   4096      0    4096     512    1 none       32 4096    0B
└─sdi9              0   4096      0    4096     512    1 none       32 4096    0B
...

I used a bit of trial and error with echo xx | sudo tee /sys/block/sdX/queue/nr_requests to find the maximum acceptable value for all the disks.
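
A small loop makes that testing quicker; this is just a sketch using 32, the value that ended up in my rule below:

# try a candidate value on every whole sd* disk; tee echoes what was written
for q in /sys/block/sd*/queue/nr_requests; do
  echo 32 | sudo tee "$q"
done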

Create the UDEV rule:

sudo nano /etc/udev/rules.d/99-disk-queue.rules

Add contents:

ACTION=="add|change", SUBSYSTEM=="block", ENV{PARTN}=="", ENV{ID_SCSI}=="1", ATTR{queue/nr_requests}="32"

The rule should exclude any partition device (PARTN) and only select SCSI disks (ID_SCSI).
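
You can confirm those properties are actually present for your drives before relying on them; PARTN should only show up when querying a partition (device names here are examples):

udevadm info --query=property --name=/dev/sda  | grep -E 'ID_SCSI|PARTN'
udevadm info --query=property --name=/dev/sda1 | grep -E 'ID_SCSI|PARTN'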

Reload and activate rules:

sudo udevadm control --reload-rules
sudo udevadm trigger --type=devices --action=change

Verify the changes with lsblk -t.
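
Reading the sysfs values directly is another quick way to confirm every disk now agrees:

grep . /sys/block/sd*/queue/nr_requests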

Apply At Boot

I couldn't get nr_requests to apply at boot even though the UDEV rule worked when triggered manually, so I added a systemd oneshot unit that re-runs the trigger after ZFS has started.

Create systemd unit:

sudo nano /etc/systemd/system/disk-udev.service

Add contents:

[Unit]
Description=Apply disk udev rules
Requires=zfs.target
After=zfs.target

[Service]
Type=oneshot
ExecStartPre=sleep 30s
ExecStart=udevadm trigger --type=devices --action=change

[Install]
WantedBy=multi-user.target

Modify sleep time as needed.

Reload and enable:

sudo systemctl daemon-reload
sudo systemctl enable disk-udev.service
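
After the next boot, check both the unit and the resulting queue sizes. A oneshot unit without RemainAfterExit will show as inactive (dead) once it has run; the log lines in the status output tell you whether it succeeded:

systemctl status disk-udev.service
lsblk -o NAME,SCHED,RQ-SIZE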

Sources

Networking

https://fasterdata.es.net/host-tuning/linux/
https://darksideclouds.wordpress.com/2016/10/10/tuning-10gb-nics-highway-to-hell/
https://www.kernel.org/doc/ols/2009/ols2009-pages-169-184.pdf
https://cromwell-intl.com/open-source/performance-tuning/ethernet.html

ZFS

https://openzfs.readthedocs.io/en/latest/performance-tuning.html#whole-disks-vs-partitions
https://openzfs.github.io/openzfs-docs/Performance%20and%20Tuning/Workload%20Tuning.html#whole-disks-versus-partitions
https://openzfs.github.io/openzfs-docs/Performance%20and%20Tuning/Module%20Parameters.html#zfs-arc-max
