Not Enough Space on Btrfs Filesystem Due to Exhausted Metadata Block Group Allocation

April 25, 2023, 9 p.m.

The Btrfs filesystem is believed by some to be the next default filesystem for Linux, replacing the current de-facto standard, ext4. It has many advanced features not available in ext4, such as copy-on-write capability, which enables snapshots that allow rolling back a filesystem to a previous state. The drawback of the filesystem, however, is that it requires regular -- though automatable -- maintenance. One such operation is balancing, the reallocation of disk space among the various types of block group allocations used by Btrfs.

This article describes a situation in which a Linux system becomes unusable because the block groups allocated for metadata in the Btrfs filesystem have become exhausted, preventing new files from being written. This was observed on an Arch Linux system installed as described in An Arch Linux Installation on a Btrfs Filesystem with Snapper for System Snapshots and Rollbacks. The solution to the problem, which requires a particular use of btrfs balance to reallocate the available storage among the block group types, is presented.

Introduction

The Btrfs filesystem is believed by some to be the next default filesystem for Linux, replacing the current de-facto standard, ext4. One reason for its possible adoption is the increasing demand from data centers for larger storage capacity, the ability to handle more files, and filesystems that scale easily. Another is its fault tolerance, not only at the device level but at the individual file level. This is suggested by the Btrfs developers' website:

Linux has a wealth of filesystems from which to choose, but we are facing a number of challenges with scaling to the large storage subsystems that are becoming common in today's data centers. Filesystems need to scale in their ability to address and manage large storage, and also in their ability to detect, repair and tolerate errors in the data stored on disk.

Filesystem | Max File Size (standard units) | Max File Size (bytes) | Max Volume Size (standard units) | Max Volume Size (bytes) | Max Number of Files
Btrfs | 16 EiB (18.44 EB) | 2^64 | 16 EiB (18.44 EB) | 2^64 | 2^64
exFAT | 16 EiB (18.44 EB) | 2^64 | 64 ZiB (75.55 ZB) | 2^76 | 2^32
ext4 | 16 GiB (17.17 GB) to 16 TiB (17.59 TB), range due to variable allocation unit size | 2^34 to 2^44 | 1 EiB (1.15 EB) | 2^60 | 2^32
HFS+ | ~8 EiB | 2^63 | ~8 EiB | 2^63 |
NTFS | 16 TiB (17.59 TB) to 8 PiB (9.007 PB) | 2^44 to 2^53 | 16 TiB (17.59 TB) to 8 PiB (9.007 PB) | 2^44 to 2^53 | 2^32
ReFS | 16 EiB (18.44 EB) | 2^64 | 1 YiB (1.208 YB) | 2^80 |
UFS I | 16 GiB (17.17 GB) to 256 TiB (281.4 TB) | 2^34 to 2^48 | 16 EiB (18.44 EB) | 2^64 | 32,767 subdirectories per directory
UFS II | 512 GiB (549.7 GB) to 32 PiB (36.02 PB) | 2^39 to 2^55 | 512 ZiB (604.4 ZB) | 2^79 | 32,767 subdirectories per directory
XFS | 8 EiB (9.223 EB) | 2^63 | 8 EiB (9.223 EB) | 2^63 |
ZFS | 16 EiB (18.44 EB) | 2^64 | 281,474,976,710,656 YiB (340,282,366,920,938.463 YB) | 2^128 | 2^128

Based on the above table, adapted from Comparison of Filesystems - Wikipedia, it is clear that Btrfs offers greater storage capability than ext4 in terms of maximum file size and maximum volume size; it even surpasses XFS, which was developed by Silicon Graphics specifically for increased storage capacity and is the current default filesystem in Red Hat Enterprise Linux. Only ZFS, developed by Sun Microsystems in the 2000s, surpasses Btrfs; in fact, ZFS is vastly superior to all of the listed filesystems in storage capability.

Besides storage capability, ZFS has numerous advanced features that make it a next-generation filesystem. However, ZFS is not usable on Linux without contending with licensing issues and/or installing it from source, which may also require building the kernel from source. And, despite its superior storage capability and fault tolerance at the device level, it falls short in fault tolerance at the individual file level. Btrfs is fault tolerant and self-healing at the file level, as demonstrated in the excellent Ars Technica article, Bitrot and atomic COWs: Inside “next-gen” filesystems. Btrfs includes the following advanced features (some of which can also be found in ZFS):

  • Volume management to configure multiple block storage devices in Btrfs-RAID
  • Copy-on-write capability, in which, after a file is initially written, only modifications are subsequently stored, with references to the previous state of the file
  • Snapshots and rollbacks, made possible by the copy-on-write capability
  • Checksumming of files to detect file corruption
  • Multiple copies of block groups on a single device or multiple devices for data redundancy
  • Ability to clone or send a subvolume to a different device

There are drawbacks to the Btrfs filesystem, however. One is the lack of maturity and reliability of some higher RAID configurations. Another is that its advanced features require more space to be set aside for an installation: because of the additional space consumed by copy-on-write and snapshots, Btrfs tends to exhaust storage more quickly than other filesystems for the same amount of actual OS, application, and user data. To mitigate this, and for other reasons such as sustaining stability, regular maintenance is required.

The Problem

A practical consequence of this drawback is that without regular maintenance, a Btrfs filesystem may suddenly have no space left to write to storage, either because snapshots have filled the available space or because the space allocated for metadata has been exhausted. The latter recently occurred on my Arch Linux system, installed following the process described in An Arch Linux Installation on a Btrfs Filesystem with Snapper for System Snapshots and Rollbacks. During a recent update to the system, downloaded packages could not be extracted by the package manager, i.e., the extracted files could not be written to the disk. The error produced by pacman is shown in the following listing.

[2023-04-19T19:02:18-0400] [ALPM] upgraded libxnvctrl (525.89.02-1 -> 530.41.03-1)
[2023-04-19T19:02:19-0400] [ALPM] upgraded libxvmc (1.0.13-1 -> 1.0.13-2)
[2023-04-19T19:02:19-0400] [ALPM] upgraded licenses (20220125-1 -> 20220125-2)
[2023-04-19T19:02:40-0400] [ALPM] upgraded linux-headers (6.2.2.arch1-1 -> 6.2.11.arch1-1)
[2023-04-19T19:03:36-0400] [ALPM] error: could not extract /usr/lib/modules/6.1.24-1-lts/build/include/config/IP_NF_MATCH_ECN (Can't create '/usr/lib/modules/6.1.24-1-lts/build/include/config/IP_NF_MATCH_ECN')
[2023-04-19T19:03:36-0400] [ALPM] error: could not extract /usr/lib/modules/6.1.24-1-lts/build/include/config/IP_NF_MATCH_TTL (Can't create '/usr/lib/modules/6.1.24-1-lts/build/include/config/IP_NF_MATCH_TTL')
[2023-04-19T19:03:36-0400] [ALPM] error: could not extract /usr/lib/modules/6.1.24-1-lts/build/include/config/IP_NF_RAW (Can't create '/usr/lib/modules/6.1.24-1-lts/build/include/config/IP_NF_RAW')

... truncated ...

[2023-04-19T19:06:52-0400] [ALPM] error: could not extract /usr/lib/modules/6.1.24-1-lts/build/include/soc/ (Can't create '/usr/lib/modules/6.1.24-1-lts/build/include/soc')
[2023-04-19T19:06:54-0400] [ALPM] error: could not extract /usr/lib/modules/6.1.24-1-lts/build/include/sound/ (Can't create '/usr/lib/modules/6.1.24-1-lts/build/include/sound')
[2023-04-19T19:06:58-0400] [ALPM] error: could not extract /usr/lib/modules/6.1.24-1-lts/build/include/target/ (Can't create '/usr/lib/modules/6.1.24-1-lts/build/include/target')
[2023-04-19T19:06:58-0400] [ALPM] error: could not extract /usr/lib/modules/6.1.24-1-lts/build/include/trace/ (Can't create '/usr/lib/modules/6.1.24-1-lts/build/include/trace')
[2023-04-19T19:07:02-0400] [ALPM] error: could not extract /usr/lib/modules/6.1.24-1-lts/build/include/uapi/ (Can't create '/usr/lib/modules/6.1.24-1-lts/build/include/uapi')
[2023-04-19T19:07:25-0400] [ALPM] error: could not extract /usr/lib/modules/6.1.24-1-lts/build/include/xen/ (Can't create '/usr/lib/modules/6.1.24-1-lts/build/include/xen')
[2023-04-19T19:12:43-0400] [PACMAN] Running 'pacman -Syy'
[2023-04-19T19:12:43-0400] [PACMAN] synchronizing package lists

At this point the system became unusable, and I had to access the Arch installation's Btrfs filesystem from an openSUSE Tumbleweed installation on the same computer -- which also uses Btrfs and Snapper, and whose configuration the Arch system duplicates -- mounting all of the Btrfs subvolumes used in the Arch installation. I could see the problem by using

sudo btrfs fi us /mnt/arch

an alternate abbreviated form of

sudo btrfs filesystem usage /mnt/arch

where /mnt/arch is the mountpoint of the Arch installation's default subvolume and contains within it the mountpoints of the other subvolumes used in the installation.

Before describing the output, it is important to be aware of some concepts of the Btrfs filesystem in order to make the information contained in the output of the command meaningful.

Block Group Allocation Types

The filesystem allocates space on storage for three different types of block groups: Data, Metadata, and System. Data block groups store data, which, as the name suggests, is simply the files that make up the operating system, programs, and user data. Metadata block groups store information about the data in B-trees. System block groups store the structures that represent the filesystem as a logically linear array.
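
The allocation for each block group type on a mounted Btrfs filesystem can be inspected at any time. For example, the following command (assuming the filesystem is mounted at /) prints one line per block group type with its profile, the total space allocated to it, and how much of that allocation is actually used:

sudo btrfs filesystem df /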

Profiles

The filesystem can duplicate block groups of each type, either on a single storage device or across multiple storage devices. How a block group type is duplicated -- across various devices, on the same device, or not at all -- is described by its profile, and each block group type has its own profile. The available profiles are listed in the following table, based on a similar table in the Btrfs documentation.

Btrfs Block Group Type Profiles
Profile | Redundancy: Copies | Redundancy: Parity | Redundancy: Striping | Space Utilization | Min/Max Number of Devices
SINGLE (block group on a single device, no duplication) | 1 (no redundancy) | N/A | N/A | 100% | 1/unlimited
DUP (duplication on a single device) | 2 on one device (the original intent of the profile in earlier implementations); 2 on unlimited devices (newer implementations) | N/A | N/A | 50% | 1/unlimited
raid0 (no redundancy, but multiple storage devices are used logically as one device for more storage and higher performance) | 1 (no redundancy) | N/A | X | 100% | 1/unlimited
raid1 (stored information is duplicated on a minimum of two devices) | 2 | N/A | N/A | 50% | 2/unlimited
raid1c3 | 3 | N/A | N/A | 33% | 3/unlimited
raid1c4 | 4 | N/A | N/A | 25% | 4/unlimited
raid5 (multiple storage devices used logically as one device, storing a single copy of the data plus one device's worth of parity data; allows recovery from a single device failure) | 1 | 1 | X | X | 2/unlimited
raid6 (multiple storage devices used logically as one device, storing a single copy of the data plus two devices' worth of parity data; allows recovery from two device failures) | 1 | 2 | X | X | 3/unlimited
raid10 (a set of multiple devices used logically as one device, with data mirrored on multiple sets of other devices) | 2 | N/A | X | 50% | 2

Now that the information in the output of btrfs filesystem usage can be interpreted, we can discuss the output of the command, shown in the following image. It displays usage statistics in separate areas -- for the filesystem in general in the area labeled "Overall:", and for each of the Data, Metadata, and System block group allocation types in areas named for those types, which also indicate the profile of each block group type. There is also an area labeled "Unallocated:" which lists the size of unallocated space by device.

Output of btrfs filesystem usage /mnt/arch Showing There Is No Storage Space Left

  • In the "Overall" area, "Device Size" of 85 GiB represents the 85 GiB partition used for the Arch installation, except for /home which is on a separate non Btrfs partition.
  • All 85 GiB of the partition's space had been allocated to the three types of allocation in a Btrfs filesystem, Data, Metadata, and System as indicated by the "Device allocated" field in the Overall area. A negligible 1.00 MiB is still unallocated by the filesystem for any type of Block Groups, as indicated by the "Device unallocated" field.
  • The Data area, whose label indicates that this block group type has a "single" profile indicates that of the 85 GiB allocated, 76.98 GiB had been allocated for Data. And of this 76.98 GiB allocated space for Data, only 49.70 GiB was used (64.56% of space allocated for data was used).
  • The Metadata area, whose label indicates that this block group type has a "DUP" profile indicates that of the 85 GiB allocated, 4.00 GiB had been allocated for Metadata Block Group allocation type. And of this 4.00 GiB allocated space for Metadata, only 3.82 GiB was used (95.57% of space allocated for Metadata was used). This information is misleading in that because this profile is DUP on a single device, two areas of 4.00 GiB are allocated for metadata, each containing identical copies of information, and copy's allocated blocks 95.75% occupied.
  • The System area, whose label indicates that this block group type has a "DUP" profile indicates that of the 85 GiB allocated, 8 MiB had been allocated for System. And of this 8 MiB allocated space for System, only 32 KiB was used (0.39% of space allocated for data was used). Like the Metadata block group allocation, because the profile of this block group is DUP, two regions of storage of the size indicated are actually allocated, each containing identical copies of stored information, each 0.39% occupied.

If it is not taken into account that the Metadata and System allocation groups each have two allocations of the size indicated in the output of btrfs filesystem usage, the sum of the allocated space for the allocation types will not equal the indicated total allocated space on the device. However, if the duplicate allocations for Metadata and System are taken into account, the total is 85 GiB -- actually 84.996 GiB, the difference being due to rounding and the 1 MiB of unallocated space -- as indicated in the "Overall" area of the output.

Allocation Group | Allocated Space
Data | 76.98 GiB
Metadata | 4.00 GiB
Second Metadata Allocation (DUP) on 1 device | 4.00 GiB
System | 0.008 GiB (8.00 MiB)
Second System Allocation (DUP) on 1 device | 0.008 GiB (8.00 MiB)
Total of Allocations | 84.996 GiB
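
In other words, 76.98 GiB + (2 × 4.00 GiB) + (2 × 0.008 GiB) ≈ 84.996 GiB, which matches the "Device allocated" value in the "Overall" area.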

So, given the status of the filesystem as indicated by btrfs filesystem usage, it would seem that the problem of not enough storage space being available on the Btrfs filesystem is not caused by actual user, application, or OS data taking up all available storage space -- the allocated space for Data block groups is not exhausted -- but rather because the space allocated for the Metadata block group type has been depleted, preventing metadata for new or modified files from being recorded and thus preventing new files from being written or existing files from being modified. This usually happens when the number of files written is unusually high. And, because the space on the device is completely allocated, no unallocated space remains that could be allocated for Metadata.

The process to balance the Btrfs filesystem, including the attempts to free space and to add another device to the filesystem, is presented later in the article. The result of the process is shown in the following image: one Konsole window shows the filesystem status according to btrfs filesystem usage before the balance, and another Konsole window (from the rebalanced Arch system, after repairing the breakage I introduced when attempting to free space by deleting some Plasma cache files) shows the filesystem status after both balance operations.

Solution

The solution for the problem is to use the btrfs balance command to reallocate space on the storage device(s) for the three block group allocation types. The basic form of the command is

btrfs balance <subcommand>

where some subcommands take a path to the mounted Btrfs filesystem. In the case of the Arch installation that is the subject of this post, this is the default subvolume mounted at the filesystem hierarchy location /. Other subvolumes are mounted at locations under /.

The <subcommand> can be one of the following (usage examples follow the list):

  • start to start a balance operation; takes the path to the filesystem hierarchy location where the subvolume that contains the root of the filesystem hierarchy is mounted, and also accepts subcommand options
  • pause to pause a running balance operation; takes the same path
  • resume to resume a paused balance operation; takes the same path
  • cancel to cancel a running balance operation; takes the same path
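
A minimal sketch of each subcommand, assuming the Btrfs filesystem whose root subvolume is mounted at /:

sudo btrfs balance start /
sudo btrfs balance pause /
sudo btrfs balance resume /
sudo btrfs balance cancel /

There is also a status subcommand, used later in this article, which reports the progress of a running balance in the same way.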

Options to the start subcommand, called filters, can be used to specify the block group allocation type on which to act, to restrict balancing to block groups on a certain device, and to convert a block group allocation type's profile, among other things. Available filters are profiles, usage, devid, drange, vrange, convert, limit, stripes, and soft.

Some filters can be assigned parameter values as in

filter=parameter-value

Multiple filters can be entered as comma-separated lists. The block group type to which a filter applies is specified with one of three prefixes to the filter name, -d, -m, or -s, e.g., -dconvert to convert a Data block group allocation's profile, or -mconvert to convert a Metadata block group allocation's profile.
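
As a concrete sketch (the mount point and the percentage thresholds are only illustrative), the following command starts a balance that relocates only Data and Metadata block groups that are less than 50% used, a common light-weight maintenance operation:

sudo btrfs balance start -dusage=50 -musage=50 /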

The btrfs balance start command requires at least 1 GiB of free space in order to run successfully, so it was first necessary to mount the subvolumes of the Arch installation and delete some files. I first chose to delete some SDDM cache directories, which turned out to be a bad choice as it broke the graphical interface -- something I only realized when I finally attempted to log in to the Arch system and was presented with a console login prompt. I then deleted files in /var/cache/pacman/pkg/, which freed a considerable amount of space, but the balance command still would not execute, complaining of not enough space to perform the operation. To get around this problem I added another device (actually just a 9 GiB partition) to the Btrfs filesystem with the btrfs device add command, creating a two-device Btrfs filesystem consisting of the original Btrfs partition and the new partition.

The addition of the new partition to the filesystem allowed the balance command to run. The command used options to convert the Metadata and System block group allocation profiles from DUP to single. This operation reduced the total allocated space from 85.00 GiB to 48.03 GiB, reclaiming the unused space from the Data block group allocation. It also increased the allocation for System from 8.00 MiB to 32.00 MiB; however, it did not increase the allocation for Metadata from the original, nearly exhausted 4.00 GiB.
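
The new total is consistent with the per-type allocations reported after the balance: 44.00 GiB (Data) + 4.00 GiB (Metadata, now single) + 0.03 GiB (System, 32.00 MiB) ≈ 48.03 GiB.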

A second balance operation without any options increased the allocated space for Metadata from 4.00 GiB to 5.00 GiB. The following table shows the allocations before balancing and after the first balance operation (the state also captured in the chroot output shown later in this article).

Allocation Group | Allocated Space Before Balance | Allocated Space After Balance
Data | 76.98 GiB | 44.00 GiB
Metadata | 4.00 GiB | 4.00 GiB
Second Metadata Allocation (DUP) on 1 device | 4.00 GiB | none (profile converted to single)
System | 0.008 GiB (8.00 MiB) | 0.031 GiB (32.00 MiB)
Second System Allocation (DUP) on 1 device | 0.008 GiB (8.00 MiB) | none (profile converted to single)
Total of Allocations | 84.996 GiB | 48.03 GiB

A comparison between the original state and the final state of the filesystem is also displayed in the following image. The complete process -- from mounting the Arch installation's subvolumes in an openSUSE installation on the same computer and deleting files, to executing the balance operation and inspecting the result through a chroot into the Arch installation -- is presented later in this post.

Comparison of btrfs filesystem usage / Output Before and After Balancing

Finding and Deleting Files to Free Space to Allow btrfs balance to Run

According to ENOSPC - No available disk space | Forza's Ramblings, at least 1 GiB must be free in the Metadata allocation in order to run a balance operation, and the suggested remedy is to delete enough files to free up at least this much space. To that end, I decided to delete files from the Arch installation's /var/cache/pacman directory. Because the Arch installation was unusable due to the lack of space on the Btrfs filesystem, I mounted the Arch installation's Btrfs subvolumes on the openSUSE Tumbleweed installation on the same computer and deleted old files from there. Although a significant amount of space was freed, a btrfs balance operation still could not run, presumably because deleting the files did not free enough space specifically in the Metadata block group allocation.

The process used to delete the files may be informative anyway, so it is presented below. Later in the article, the process of adding another device to the Btrfs filesystem, which allowed the balance operation to run, is presented.

  1. Mount the subvolume that contains the Arch installation's filesystem hierarchy root in openSUSE's /mnt/arch directory -- created just for this purpose.
    sudo mount /dev/nvme1n1p6 /mnt/arch
  2. Mount the other subvolumes used in the Arch installation. (See An Arch Linux Installation on a Btrfs Filesystem with Snapper for System Snapshots and Rollbacks for details.)
    /mnt/arch🔒 ❯$ sudo mount UUID=75603708-44c4-4c7f-bed9-1a6f31127d22 -o subvol=@/.snapshots,compress=zstd /mnt/arch/.snapshots
    /mnt/arch🔒 ❯$ sudo mount UUID=75603708-44c4-4c7f-bed9-1a6f31127d22 -o subvol=@/boot/grub,compress=zstd /mnt/arch/boot/grub
    /mnt/arch🔒 ❯$ sudo mount UUID=75603708-44c4-4c7f-bed9-1a6f31127d22 -o subvol=@/opt,compress=zstd /mnt/arch/opt
    /mnt/arch🔒 ❯$ sudo mount UUID=75603708-44c4-4c7f-bed9-1a6f31127d22 -o subvol=@/root,compress=zstd /mnt/arch/root
    /mnt/arch🔒 ❯$ sudo mount UUID=75603708-44c4-4c7f-bed9-1a6f31127d22 -o subvol=@/srv,compress=zstd /mnt/arch/srv
    /mnt/arch🔒 ❯$ sudo mount UUID=75603708-44c4-4c7f-bed9-1a6f31127d22 -o subvol=@/tmp,compress=zstd /mnt/arch/tmp
    /mnt/arch🔒 ❯$ sudo mount UUID=75603708-44c4-4c7f-bed9-1a6f31127d22 -o subvol=@/usr/local,compress=zstd	/mnt/arch/usr/local
    /mnt/arch🔒 ❯$ sudo mount UUID=75603708-44c4-4c7f-bed9-1a6f31127d22 -o subvol=@/var/cache,nodatacow /mnt/arch/var/cache
    /mnt/arch🔒 ❯$ sudo mount UUID=75603708-44c4-4c7f-bed9-1a6f31127d22 -o subvol=@/var/log,nodatacow /mnt/arch/var/log
    /mnt/arch🔒 ❯$ sudo mount UUID=75603708-44c4-4c7f-bed9-1a6f31127d22 -o subvol=@/var/spool,nodatacow /mnt/arch/var/spool
    /mnt/arch🔒 ❯$ sudo mount UUID=75603708-44c4-4c7f-bed9-1a6f31127d22 -o subvol=@/var/tmp,nodatacow /mnt/arch/var/tmp
  3. Find large files in the installation. The largest unnecessary files will be the downloaded packages in the /var/cache/pacman directory. The following listing shows the output of du, sorted by size in descending order. Notably, it shows that the contents of /var/cache/pacman/pkg amount to about 12 GiB (du reports usage in 1 KiB units by default).
     75%  20:51:32  USER: brook HOST: 16ITH6-openSUSE   
    /mnt/arch🔒  ❯$ sudo du -a /mnt/arch/var/cache/pacman/ | sort -n -r | head -n 10
    12059072        /mnt/arch/var/cache/pacman/pkg
    12059072        /mnt/arch/var/cache/pacman/
    266544  /mnt/arch/var/cache/pacman/pkg/nvidia-utils-525.89.02-2-x86_64.pkg.tar.zst
    264860  /mnt/arch/var/cache/pacman/pkg/nvidia-utils-530.41.03-1-x86_64.pkg.tar.zst
    263812  /mnt/arch/var/cache/pacman/pkg/nvidia-utils-525.85.05-1-x86_64.pkg.tar.zst
    219604  /mnt/arch/var/cache/pacman/pkg/zoom-5.13.10-1-x86_64.pkg.tar.zst
    218636  /mnt/arch/var/cache/pacman/pkg/zoom-5.13.5-1-x86_64.pkg.tar.zst
    192944  /mnt/arch/var/cache/pacman/pkg/zoom-5.13.0-1-x86_64.pkg.tar.zst
    179576  /mnt/arch/var/cache/pacman/pkg/linux-6.2.2.arch1-1-x86_64.pkg.tar.zst
    177932  /mnt/arch/var/cache/pacman/pkg/linux-6.2.11.arch1-1-x86_64.pkg.tar.zst
    Files older than a certain date can also be listed. The following find command lists files in the pacman package cache that have not been modified in more than 60 days.
    sudo find /mnt/arch/var/cache/pacman/pkg/ -type f -mtime +60
  4. The same find command as above with the -delete option will delete files in the pacman cache directory that are older than 60 days.
    sudo find /mnt/arch/var/cache/pacman/pkg/ -type f -mtime +60 -delete
    This operation reduced the size of the pacman cache directory by more than half, but did not allow the balance operation to run.
     71%  20:58:45  USER: brook HOST: 16ITH6-openSUSE   
    PCD: 2s /mnt/arch🔒  ❯$ sudo du -a /mnt/arch/var/cache/pacman/ | sort -n -r | head -n 10
    5795584 /mnt/arch/var/cache/pacman/pkg
    5795584 /mnt/arch/var/cache/pacman/
    264860  /mnt/arch/var/cache/pacman/pkg/nvidia-utils-530.41.03-1-x86_64.pkg.tar.zst
    219604  /mnt/arch/var/cache/pacman/pkg/zoom-5.13.10-1-x86_64.pkg.tar.zst
    179576  /mnt/arch/var/cache/pacman/pkg/linux-6.2.2.arch1-1-x86_64.pkg.tar.zst
    177932  /mnt/arch/var/cache/pacman/pkg/linux-6.2.11.arch1-1-x86_64.pkg.tar.zst
    177780  /mnt/arch/var/cache/pacman/pkg/linux-6.2.7.arch1-1-x86_64.pkg.tar.zst
    167644  /mnt/arch/var/cache/pacman/pkg/linux-lts-6.1.15-1-x86_64.pkg.tar.zst
    167628  /mnt/arch/var/cache/pacman/pkg/linux-lts-6.1.20-1-x86_64.pkg.tar.zst
    167492  /mnt/arch/var/cache/pacman/pkg/linux-lts-6.1.24-1-x86_64.pkg.tar.zst
    
     71%  20:58:49  USER: brook HOST: 16ITH6-openSUSE   
    /mnt/arch🔒  ❯$ 

Adding Another Device to the Btrfs Filesystem

Since deleting over 5 GiB from pacman's cache did not free the necessary amount of space in the Metadata block group allocation, another method was required to make enough free space available, namely adding another device to the Arch installation's Btrfs filesystem using the command

btrfs device add

I used unformatted storage space on one of the disks of the Legion 5i Pro to create a Btrfs partition, as shown in the following listing, and added it to the Btrfs filesystem, after which a Btrfs balance operation was able to complete successfully. An actual separate disk, such as a USB thumb drive, could also have been used as the added device.

 66%  21:09:20  USER: brook HOST: 16ITH6-openSUSE   
PCD: 3s ~  ❯$ sudo blkid | grep btrfs-mai
/dev/nvme0n1p10: LABEL="btrfs-maintenance" UUID="ab444deb-e5ac-4fcd-9ede-a498b146e943" UUID_SUB="fe78738e-983b-4bac-9211-702da6d6eeef" BLOCK_SIZE="4096" TYPE="btrfs" PARTUUID="0b448eba-2d18-c547-ad99-ba6a5db33680"
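
The partition had been formatted as Btrfs beforehand. A minimal sketch of creating such a filesystem on a spare partition, assuming the device node and label shown above, would be:

sudo mkfs.btrfs -L btrfs-maintenance /dev/nvme0n1p10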
  1. Add the device with btrfs device add. If a filesystem already exists on the device, there will be an error, as shown in the following listing.
     66%  21:07:45  USER: brook HOST: 16ITH6-openSUSE   
    /mnt/arch🔒  ❯$ sudo btrfs device add /dev/nvme0n1p10 /mnt/arch
    ERROR: /dev/nvme0n1p10 appears to contain an existing filesystem (btrfs)
    ERROR: use the -f option to force overwrite of /dev/nvme0n1p10
    As the listing shows, the -f option is necessary to force the overwrite of the existing filesystem on the added device, so the correct command is
     66%  21:07:45  USER: brook HOST: 16ITH6-openSUSE   
     /mnt/arch🔒  ❯$ sudo btrfs device add -f /dev/nvme0n1p10 /mnt/arch
  2. After the additional device is added, the output of btrfs filesystem usage, shown in the following listing, indicates that the "Device size" has increased from 85 GiB (as shown previously) to 94.04 GiB, an increase equal to the size of the added partition. The allocated space for Metadata increased from 4.00 GiB to 4.50 GiB, but what matters for allowing the balance command to run (later in this article) is that, as shown in the "Unallocated:" region of the output, the two-device Btrfs filesystem now has 9.01 GiB of unallocated space on the original partition and another 9.04 GiB on the newly added partition, where previously there had been only 1 MiB.
    	 63%  21:13:58  USER: brook HOST: 16ITH6-openSUSE   
    PCD: 5s /mnt/arch🔒  ❯$ sudo btrfs filesystem usage /mnt/arch
    
    Overall:
        Device size:                  94.04GiB
        Device allocated:             75.99GiB
        Device unallocated:           18.06GiB
        Device missing:                  0.00B
        Device slack:                    0.00B
        Used:                         51.38GiB
        Free (estimated):             41.29GiB      (min: 32.26GiB)
        Free (statfs, df):            41.29GiB
        Data ratio:                       1.00
        Metadata ratio:                   2.00
        Global reserve:              190.47MiB      (used: 0.00B)
        Multiple profiles:                  no
    
    Data,single: Size:66.97GiB, Used:43.74GiB (65.31%)
       /dev/nvme1n1p6         66.97GiB
    
    Metadata,DUP: Size:4.50GiB, Used:3.82GiB (84.92%)
       /dev/nvme1n1p6          9.00GiB
    
    System,DUP: Size:8.00MiB, Used:32.00KiB (0.39%)
       /dev/nvme1n1p6         16.00MiB
    
    Unallocated:
       /dev/nvme1n1p6          9.01GiB
       /dev/nvme0n1p10         9.04GiB
    
    
    

Performing the Balance Operation

With another device added to the Btrfs filesystem, the balance operation can run.

  1.  53%  21:29:00  USER: brook HOST: 16ITH6-openSUSE   
     /mnt🔒  ❯$ sudo btrfs balance start -f -dconvert=single -mconvert=single /mnt/arch
    The command, taken from the SUSE Support Page btrfs - No space left on device | Support SUSE, uses the previously described -mconvert and -dconvert filters. The -dconvert option with the parameter value single does not really do anything in this case, since the Data block group allocation profile is already "single". The -mconvert filter with the "single" parameter value, however, does do something: it removes the duplicate copy of the Metadata block group allocation, converting the profile from "DUP" to "single". It was not necessary to use either of these options, but I used the command as presented by the SUSE support page before adequately researching it. It is possible to revert the Metadata profile to "DUP", as described later in the article.

    The operation took about ten minutes to complete. While it is running, the progress can be viewed with the btrfs balance status command:

     51%  21:30:47  USER: brook HOST: 16ITH6-openSUSE   
     ~  ❯$ sudo btrfs balance status /mnt/arch
    Balance on '/mnt/arch' is running
    20 out of about 74 chunks balanced (21 considered),  73% left

    When the operation completes, the output indicates the number of block groups (referred to as "chunks") that have been relocated.

     53%  21:29:00  USER: brook HOST: 16ITH6-openSUSE   
     /mnt🔒  ❯$ sudo btrfs balance start -f -dconvert=single -mconvert=single /mnt/arch
    Done, had to relocate 74 out of 74 chunks
    

  2. Remove the additional device from the filesystem using the command btrfs device delete.
     45%  21:36:47  USER: brook HOST: 16ITH6-openSUSE   
    PCD: 6s /mnt🔒  ❯$ sudo btrfs device delete /dev/nvme0n1p10 /mnt/arch
    [sudo] password for root: 
    
    
  3. The state of the filesystem as it was at this point is shown in the following output of btrfs filesystem usage generated from a chroot into the Arch system.
       (chroot) [root@16ITH6-openSUSE /]# btrfs filesystem usage /
    Overall:
        Device size:                  85.00GiB
        Device allocated:             48.03GiB
        Device unallocated:           36.97GiB
        Device missing:                  0.00B
        Device slack:                    0.00B
        Used:                         47.60GiB
        Free (estimated):             37.21GiB      (min: 37.21GiB)
        Free (statfs, df):            37.21GiB
        Data ratio:                       1.00
        Metadata ratio:                   1.00
        Global reserve:              203.91MiB      (used: 0.00B)
        Multiple profiles:                  no
    
    Data,single: Size:44.00GiB, Used:43.76GiB (99.46%)
       /dev/nvme1n1p6         44.00GiB
    
    Metadata,single: Size:4.00GiB, Used:3.83GiB (95.85%)
       /dev/nvme1n1p6          4.00GiB
    
    System,single: Size:32.00MiB, Used:32.00KiB (0.10%)
       /dev/nvme1n1p6         32.00MiB
    
    Unallocated:
       /dev/nvme1n1p6         36.97GiB
    (chroot) [root@16ITH6-openSUSE /]# 

Repairing the System

When attempting to free enough space by deleting unnecessary files, before deleting files from /var/cache/pacman/pkg, I deleted files from, among other places, the SDDM cache, which broke the graphical interface -- something I realized when I was presented with a console login prompt on rebooting into the Arch system after successfully balancing its Btrfs filesystem from openSUSE Tumbleweed. So I had to go back to Tumbleweed, chroot into the Arch system, and reinstall some packages. The process to chroot into the Arch system -- which uses the openSUSE Btrfs layout with Snapper integration -- is more involved than with other filesystems because of the many subvolumes that need to be mounted individually before the chroot is entered.

  1. The subvolumes should be mounted in the installation from which the chroot is made, as described in Steps 1 and 2 of the section Solution >> Finding and Deleting Files to Free Space to Allow btrfs balance to Run, above.
  2. Then change directory to the mount point.
     45% 00:52:23 USER: brook HOST: 16ITH6-openSUSE
    ~ ❯$ cd /mnt/arch
  3. Mount the other pseudo filesystems.
     47%  00:53:55  USER: brook HOST: 16ITH6-openSUSE   
    /mnt/arch🔒  ❯$ sudo mount -t proc /proc proc/
     47%  00:54:18  USER: brook HOST: 16ITH6-openSUSE   
    /mnt/arch🔒  ❯$ sudo mount -t sysfs /sys sys/
     47%  00:54:35  USER: brook HOST: 16ITH6-openSUSE   
    /mnt/arch🔒  ❯$ sudo mount -o bind /dev dev/
     48%  00:55:06  USER: brook HOST: 16ITH6-openSUSE   
    /mnt/arch🔒  ❯$ sudo mount --rbind /run run/
     48%  00:55:21  USER: brook HOST: 16ITH6-openSUSE   
    /mnt/arch🔒  ❯$ sudo mount -o bind /sys/firmware/efi/efivars sys/firmware/efi/efivars/
  4. Copy resolv.conf to be able to access the network from within the chroot environment.
     49% 00:55:59 USER: brook HOST: 16ITH6-openSUSE
    /mnt/arch🔒 ❯$ sudo cp /etc/resolv.conf etc/resolv.conf
  5. Execute the chroot command.
     50%  00:58:06  USER: brook HOST: 16ITH6-openSUSE   
     /mnt/arch🔒  ❯$ sudo chroot /mnt/arch /bin/bash
    [root@16ITH6-openSUSE /]# source /etc/profile
    [root@16ITH6-openSUSE /]# export PS1="(chroot) $PS1"
    (chroot) [root@16ITH6-openSUSE /]# 
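
From within the chroot, the broken components could then be reinstalled with pacman. The package names below are purely hypothetical placeholders, not the exact packages I reinstalled:

pacman -S sddm plasma-desktop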

Converting the Metadata Profile

The balance command used above included the -mconvert option with the value single. (See the sections Solution and Solution >> Performing the Balance Operation, above.) This was a mistake: I wanted to fix the problem immediately and followed the command given in the referenced SUSE support page without having read the documentation for the command. It did in fact fix the problem of not enough space on the filesystem by reallocating the storage space among the three block group allocation types. However, the option also removed the duplicate Metadata and System block group allocations that had been created by default by the mkfs.btrfs command during the original installation.

To remake the profiles for the Metadata and System allocation groups as "DUP", all that is required is to run the balance command with the option

-mconvert=DUP

which will convert the current single profiles of these two allocation groups to DUP, duplicating the allocation groups for these two types on a single device as they were originally. (When the Metadata profile is converted, the System profile is also converted.)
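
As a complete command, run from within the Arch system where the filesystem root is mounted at / (the convert filter's profile names are normally written in lowercase), this would look like:

sudo btrfs balance start -mconvert=dup /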

The following image shows the state of the filesystem as indicated by btrfs filesystem usage before and after the profile conversion; the Konsole window on the left shows the state before the conversion and the one on the right shows the state after it. Prior to the conversion, the Metadata and System sections of the output indicate "single", whereas afterwards they indicate "DUP". Also notable is the difference in the unallocated space: prior to the conversion, the "Overall" section of the output indicates a total unallocated space of 26.97 GiB, and after the conversion it indicates 19.94 GiB. This difference of 7.03 GiB is due to the duplicated Metadata (7 GiB) and System (0.032 GiB) block group allocations on the same device.

Converting the Metadata and System Block Group Allocations' Profiles from single to DUP
The btrfs balance command with the -mconvert=DUP option will convert the Metadata and System block group allocation profiles to DUP.

Conclusion

The Btrfs filesystem is a filesystem with numerous advanced features that make it worthy of being the next standard filesystem for Linux. Its drawback is that it requires its three types of block group allocations to be rebalanced, or reallocated, as stored data changes. Without this maintenance with the btrfs balance command, whether run manually or as a service, the system may become unusable when one of these block group allocations is depleted.
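
For example, assuming the btrfsmaintenance package is installed (openSUSE ships it by default), the systemd timer it provides can schedule periodic balancing:

sudo systemctl enable --now btrfs-balance.timer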

References