Manage Btrfs

Incus uses the second disk (/dev/sdb) as storage, and this disk is formatted with Btrfs. This filesystem is the most suitable and efficient for our needs because it supports copy-on-write, deduplication, etc. However, it is a little different from the traditional ext2/ext3/ext4, and we may run into problems if we don't know how to manage it properly.

1. Disable quotas

Quotas are usually used to restrict the size of users' home directories. On our server we don't have multiple users, so we don't need them. Besides, the current implementation of quotas in Btrfs may cause high CPU utilization and performance issues, especially when creating or deleting snapshots.

So, let's make sure that we disable them:

# mount the disk used as a storage by incus
mkdir mnt
mount /dev/sdb mnt
ls mnt/

# disable quotas
btrfs qgroup show mnt/
btrfs quota disable mnt/
btrfs qgroup show mnt/

# unmount
umount mnt
ls mnt/
rmdir mnt/
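
The check-then-disable sequence above can also be wrapped in a small function. This is only a sketch: it assumes that `btrfs qgroup show` exits with an error when quotas are not enabled (which is how recent btrfs-progs versions behave), and both the function name and the `BTRFS` override variable are made up for this example.

```shell
# BTRFS is a hypothetical override hook (not a btrfs-progs feature), so the
# function can be tried out with a stub instead of the real binary.
BTRFS=${BTRFS:-btrfs}

# Disable quotas on a mounted Btrfs filesystem only if they seem enabled.
disable_quotas_if_enabled() {
    local mnt=$1
    # Assumption: 'btrfs qgroup show' fails when quotas are not enabled.
    # It also fails for other reasons (e.g. the path is not on Btrfs).
    if "$BTRFS" qgroup show "$mnt" >/dev/null 2>&1; then
        "$BTRFS" quota disable "$mnt" && echo "quotas disabled on $mnt"
    else
        echo "quotas not enabled (or not queryable) on $mnt"
    fi
}
```

With the disk mounted as shown above, it would be called as `disable_quotas_if_enabled mnt/`.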

2. Balance

2.1 Check

Because of the way that Btrfs works, the command df may not always be accurate. For example it cannot tell how much unallocated disk space is available.

To see details on Btrfs disk usage, we need to use btrfs filesystem usage. This shows how each chunk type is allocated and how much unallocated space is available.

# overall check
btrfs filesystem show

# mount the disk used as a storage by incus
mkdir mnt
mount /dev/sdb mnt
ls mnt/

# check usage
btrfs filesystem df mnt/
btrfs filesystem usage mnt/
btrfs filesystem usage -T mnt/

# unmount
umount mnt
ls mnt/
rmdir mnt/
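
If we want the unallocated space as a plain number (for example, to use it in a script), we can filter the output of `btrfs filesystem usage -b`, which prints sizes in raw bytes. The helper below is a sketch; `unallocated_bytes` is a name chosen for this example, and it relies on the `Device unallocated:` line of the "Overall" section.

```shell
# Read the output of 'btrfs filesystem usage -b <mountpoint>' on stdin and
# print the unallocated space in bytes.
# unallocated_bytes is a hypothetical helper name, not a btrfs command.
unallocated_bytes() {
    # The line looks like "    Device unallocated:   <bytes>",
    # so the number is the third whitespace-separated field.
    awk '/Device unallocated:/ { print $3; exit }'
}
```

While the disk is mounted, it can be used like this: `btrfs filesystem usage -b mnt/ | unallocated_bytes`.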

2.2 Balance

We can rebalance the filesystem with btrfs balance start. However, running it without any filters would rewrite every data and metadata chunk in the filesystem. Usually, this is not what we want. Instead, we use balance filters to limit which chunks are balanced.

With the option -dusage=5 we limit the balance to data chunks that are less than 5% full, so that their contents are compacted into fewer chunks. This is a good start, and we can increase it to 10-15% or more if needed. The goal is to keep enough unallocated space on each device in the filesystem to avoid out-of-space errors.

# mount the disk used as a storage by incus
mkdir mnt
mount /dev/sdb mnt

# balance
btrfs filesystem usage -T mnt/
btrfs balance start -dusage=5 mnt/
btrfs filesystem usage -T mnt/

# unmount
umount mnt
rmdir mnt/
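
Raising the usage threshold step by step, as described above, can be sketched as a loop. The function name `balance_stepwise` and the `BTRFS` override variable are assumptions made for this example; the only real command involved is `btrfs balance start -dusage=N`.

```shell
# BTRFS is a hypothetical override hook, so the loop can be tried with a stub.
BTRFS=${BTRFS:-btrfs}

# Run a data balance once for each usage threshold given after the mount point.
# Stops at the first balance that fails.
balance_stepwise() {
    local mnt=$1; shift
    local pct
    for pct in "$@"; do
        echo "balancing data chunks below ${pct}% usage"
        "$BTRFS" balance start -dusage="$pct" "$mnt" || return 1
    done
}
```

For example, `balance_stepwise mnt/ 5 10 15` runs three balances with increasing thresholds. Starting low keeps each step cheap, because nearly empty chunks are the fastest to relocate.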

2.3 Automate

We can automate the commands above with a script:

mkdir -p /root/misc/

cat <<'EOF' > /root/misc/btrfs-balance.sh
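#!/bin/bash -x

# disk to balance (first argument, default: /dev/sdb)
disk=${1:-/dev/sdb}

# temporary mount point next to this script
mnt="$(dirname "$0")/mnt"
mkdir -p "$mnt"

# stop here if the mount fails, so we never balance the wrong filesystem
mount -t btrfs "$disk" "$mnt" || { rmdir "$mnt"; exit 1; }
btrfs balance start -dusage=10 "$mnt"

umount "$mnt"
rmdir "$mnt"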
EOF

chmod +x /root/misc/btrfs-balance.sh

Then we can create a cron job to call it periodically (here, every Saturday at 03:00):

cat <<EOF > /etc/cron.d/btrfs-balance
0 3 * * 6 root /root/misc/btrfs-balance.sh >/dev/null 2>&1
EOF

3. Deduplicate

Btrfs supports deduplication. According to the Btrfs docs:

Deduplication is the process of looking up identical data blocks tracked separately and creating a shared logical link while removing one of the copies of the data blocks. This leads to data space savings while it increases metadata consumption.

BTRFS provides the basic building blocks for deduplication allowing other tools to choose the strategy and scope of the deduplication.

So, to take advantage of deduplication in Btrfs, we have to use one of these deduplication tools.

We will use BEES (Best-Effort Extent-Same). It is a block-oriented userspace deduplication agent designed for large btrfs filesystems.

3.1 Installation

On a Debian 12 server we have to build it from source:

cd ~
git clone https://github.com/Zygo/bees
cd bees/
apt install -y build-essential btrfs-progs markdown
make
make install
which beesd
apt install -y uuid-runtime # it installs 'uuidparse'

3.2 Configuration

First we need to find out the UUIDs of the filesystems we want to run Bees on:

btrfs filesystem show

Then we should create a config file for each filesystem, like this:

cat <<EOF > /etc/bees/disk1.conf
UUID=91f2d0de-6678-4e89-9b0d-9ab8bdc724f2
OPTIONS="-P -v 6"
DB_SIZE=$((256*1024*1024))
EOF

nano /etc/bees/disk1.conf
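
Note that the heredoc above uses an unquoted EOF, so the shell expands `$((256*1024*1024))` while writing the file, and the config ends up containing the plain number 268435456 (256 MiB). The same step can be sketched as a small function that takes the UUID and the target file as arguments; `write_bees_conf` is a name made up for this example, and the option values are simply the ones used above.

```shell
# Write a bees config file for one filesystem.
# write_bees_conf is a hypothetical helper, not part of bees.
#   $1: UUID of the Btrfs filesystem
#   $2: path of the config file to create
write_bees_conf() {
    local uuid=$1 conf_file=$2
    # Unquoted EOF: $uuid and $((...)) are expanded when the file is written.
    cat <<EOF > "$conf_file"
UUID=$uuid
OPTIONS="-P -v 6"
DB_SIZE=$((256*1024*1024))
EOF
}
```

Instead of copying the UUID by hand, it could be read from the disk, for example: `write_bees_conf "$(blkid -s UUID -o value /dev/sdb)" /etc/bees/disk1.conf`.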

The sample config file /etc/bees/beesd.conf.sample also contains comments that explain the options.

Note: For more details look at this page.

3.3 Running

We want to run Bees as a service (one for each filesystem):

cp ~/bees/scripts/beesd@.service /lib/systemd/system/

systemctl enable --now beesd@91f2d0de-6678-4e89-9b0d-9ab8bdc724f2.service

systemctl status 'bees*'

After it has been running for some time (maybe a few hours or more), we will notice that the amount of used disk space has decreased:

btrfs filesystem show

The command top will also show that bees is working intensively.