Manage Btrfs
Incus uses the second disk (/dev/sdb) for storage, and it is formatted with Btrfs. This filesystem is the most suitable and efficient for our needs (because it supports copy-on-write, deduplication, etc.). However, it is a little different from the traditional ext2/ext3/ext4, and we may run into problems if we don't know how to manage it properly.
1. Disable quotas
Quotas are usually used to restrict the size of users' home directories. On our server we don't have multiple users, so we don't need them. Besides, Btrfs quotas, in the current implementation, may cause high CPU utilization and performance issues, especially when creating or deleting snapshots.
So, let's make sure that we disable them:
# mount the disk used as a storage by incus
mkdir mnt
mount /dev/sdb mnt
ls mnt/
# disable quotas
btrfs qgroup show mnt/
btrfs quota disable mnt/
btrfs qgroup show mnt/
# unmount
umount mnt
ls mnt/
rmdir mnt/
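To check the quota state from a script, one option is to rely on the exit status of `btrfs qgroup show`, which, as far as we know, fails with an error when quotas are not enabled. This is a minimal sketch under that assumption. The function name `quota_state` and the `BTRFS` variable are hypothetical, and the latter exists only so that the command can be replaced while testing:

```shell
# Print "enabled" or "disabled" for the Btrfs filesystem mounted at $1.
# Assumption: 'btrfs qgroup show' exits non-zero when quotas are off.
# BTRFS is a hypothetical override for the btrfs binary (default: btrfs).
quota_state() {
    if "${BTRFS:-btrfs}" qgroup show "$1" >/dev/null 2>&1; then
        echo enabled
    else
        echo disabled
    fi
}
```

While the disk is mounted, it could be called as `quota_state mnt/`.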
2. Balance
2.1 Check
Because of the way Btrfs works, the output of the command df may not always be accurate. For example, it cannot tell how much unallocated disk space is available.
To see details on Btrfs disk usage, we need to use btrfs filesystem usage. This shows how each chunk type is allocated and how much unallocated space is available.
# overall check
btrfs filesystem show
# mount the disk used as a storage by incus
mkdir mnt
mount /dev/sdb mnt
ls mnt/
# check usage
btrfs filesystem df mnt/
btrfs filesystem usage mnt/
btrfs filesystem usage -T mnt/
# unmount
umount mnt
ls mnt/
rmdir mnt/
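For monitoring, the number of interest is how much of the device is still unallocated. The following sketch extracts it from the output of `btrfs filesystem usage -b` (sizes in bytes). It assumes that the "Overall" section contains "Device size:" and "Device unallocated:" lines with the byte count as the third field. The function name `unallocated_percent` is hypothetical:

```shell
# Read the output of 'btrfs filesystem usage -b <mountpoint>' on stdin and
# print the unallocated space as a whole percentage of the device size.
# Assumption: the "Overall" section has "Device size:" and
# "Device unallocated:" lines with the byte count in the third field.
unallocated_percent() {
    awk '
        /Device size:/        { size = $3 }
        /Device unallocated:/ { unalloc = $3 }
        END {
            if (size <= 0) exit 1
            printf "%d\n", unalloc * 100 / size
        }'
}
```

With the disk mounted, it could be used as `btrfs filesystem usage -b mnt/ | unallocated_percent`.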
2.2 Balance
Balancing is done with btrfs balance start. However, running it without any filters would rewrite every data and metadata chunk in the filesystem. Usually, this is not what we want. Instead, we use balance filters to limit which chunks are balanced.
With the option -dusage=5 we limit the balance to data chunks that are less than 5% full. This is a good start, and we can increase it to 10-15% or more if needed. The goal is to make sure there is enough unallocated space on each device in the filesystem to avoid out-of-space errors.
# mount the disk used as a storage by incus
mkdir mnt
mount /dev/sdb mnt
# balance
btrfs filesystem usage -T mnt/
btrfs balance start -dusage=5 mnt/
btrfs filesystem usage -T mnt/
# unmount
umount mnt
rmdir mnt/
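Raising the threshold gradually (5%, then 10%, then 15%) can be expressed as a small loop, so that the cheapest passes run first. This is only a sketch. The function name `balance_in_steps` and the `BTRFS` variable are hypothetical, and the variable exists only to make a dry run possible:

```shell
# Run a filtered data balance on mount point $1 once for each usage
# threshold given after it, in the given order, stopping at the first failure.
# BTRFS is a hypothetical override for the btrfs binary (default: btrfs).
balance_in_steps() {
    mnt=$1
    shift
    for pct in "$@"; do
        "${BTRFS:-btrfs}" balance start -dusage="$pct" "$mnt" || return 1
    done
}
```

With the disk mounted, `balance_in_steps mnt/ 5 10 15` would run three passes, while `BTRFS=echo balance_in_steps mnt/ 5 10 15` only prints the arguments that would be passed to btrfs.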
2.3 Automate
We can automate the commands above with a script:
mkdir -p /root/misc/
cat <<'EOF' > /root/misc/btrfs-balance.sh
#!/bin/bash -x
disk=${1:-/dev/sdb}
mnt="$(dirname "$0")/mnt"
mkdir -p "$mnt"
# stop here if the disk cannot be mounted
mount -t btrfs "$disk" "$mnt" || exit 1
btrfs balance start -dusage=10 "$mnt"
umount "$mnt"
rmdir "$mnt"
EOF
chmod +x /root/misc/btrfs-balance.sh
Then we can create a cron job to call it periodically:
cat <<EOF > /etc/cron.d/btrfs-balance
0 3 * * 6 root /root/misc/btrfs-balance.sh >/dev/null 2>&1
EOF
3. Deduplicate
Btrfs supports deduplication. According to the Btrfs docs:
Deduplication is the process of looking up identical data blocks tracked separately and creating a shared logical link while removing one of the copies of the data blocks. This leads to data space savings while it increases metadata consumption.
BTRFS provides the basic building blocks for deduplication allowing other tools to choose the strategy and scope of the deduplication.
So, to take advantage of deduplication in Btrfs, we have to use one of these deduplication tools.
We will use bees (Best-Effort Extent-Same). It is a block-oriented userspace deduplication agent designed for large Btrfs filesystems.
3.1 Installation
On a Debian 12 server we have to build it from source:
cd ~
git clone https://github.com/Zygo/bees
cd bees/
apt install -y build-essential btrfs-progs markdown
make
make install
which beesd
apt install -y uuid-runtime # it installs 'uuidparse'
3.2 Configuration
First we need to find out the UUIDs of the filesystems we want to run Bees on:
btrfs filesystem show
Then we should create a config file for each filesystem, like this:
cat <<EOF > /etc/bees/disk1.conf
UUID=91f2d0de-6678-4e89-9b0d-9ab8bdc724f2
OPTIONS="-P -v 6"
DB_SIZE=$((256*1024*1024))
EOF
nano /etc/bees/disk1.conf
The sample config file /etc/bees/beesd.conf.sample also has comments with some explanations.
For more details look at this page.
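If there are several filesystems, the config files can be generated instead of being written by hand. The sketch below writes the same three keys as the example config above. The function name `write_bees_conf` and its arguments are hypothetical:

```shell
# Write a bees config file for one filesystem.
#   $1 = filesystem UUID, $2 = hash table size in MiB, $3 = output file
# The keys mirror the example config above; the function name and its
# arguments are hypothetical.
write_bees_conf() {
    cat > "$3" <<EOF
UUID=$1
OPTIONS="-P -v 6"
DB_SIZE=$(( $2 * 1024 * 1024 ))
EOF
}
```

For example, `write_bees_conf "$(blkid -s UUID -o value /dev/sdb)" 256 /etc/bees/disk1.conf` would take the UUID directly from the disk and set a hash table size of 256 MiB.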
3.3 Running
We want to run Bees as a service (one for each filesystem):
cp ~/bees/scripts/beesd@.service /lib/systemd/system/
# the instance name is the UUID of the filesystem, as in its config file
systemctl enable --now beesd@91f2d0de-6678-4e89-9b0d-9ab8bdc724f2.service
systemctl status 'bees*'
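Since the service instance is named after the filesystem UUID, the unit name can be derived from the config file, which avoids copying the UUID by hand. This is a sketch that assumes the file contains a single line of the form UUID=<uuid>. The function name `bees_unit_for_conf` is hypothetical:

```shell
# Print the systemd unit name that matches a bees config file ($1).
# Assumption: the file contains a single line of the form UUID=<uuid>.
bees_unit_for_conf() {
    uuid=$(sed -n 's/^UUID=//p' "$1")
    [ -n "$uuid" ] || return 1
    echo "beesd@${uuid}.service"
}
```

It could be used as `systemctl enable --now "$(bees_unit_for_conf /etc/bees/disk1.conf)"`.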
After it has been running for some time (maybe a few hours or more), we will notice that the amount of used disk space has decreased:
btrfs filesystem show
The command top will also show that bees is working intensively.