Btrfs

Revision as of 16:56, 24 September 2021 by Andreas85


Page under construction

btrfs is a modern CoW file system

A modern copy-on-write (CoW) file system for Linux that aims to implement advanced features while focusing on fault tolerance, repair, and easy administration. Btrfs is not only a file system: it also acts in part as a volume manager, software RAID, and backup tool, and it is flash-friendly.

Because Btrfs is different, some things may seem unfamiliar and strange at first. In that case, btrfs.wiki.kernel.org is a good starting point to search for answers. Development of Btrfs started in 2007, and it has been part of the mainline Linux kernel since 2009. It is under active development: the Btrfs code base is stable, but new features are still being added. Its main features and benefits are:

  • Snapshots that do not make a full copy of the files
  • RAID - support for software-based RAID 0, RAID 1, RAID 10
  • Self-healing - checksums for data and metadata, automatic detection of silent data corruption

See btrfs@kernel.org, Btrfs@ARC-wiki, Btrfs@wikipedia

Familiar with Btrfs slang?

Because Btrfs is different, some words have a special meaning when they are used for Btrfs. This can be a source of confusion.

▶ Btrfs volume
A volume is a pool of raw storage that consists of one or more devices. The size of the volume is the sum of the sizes of all devices that are part of it (unless you use RAID). In most cases you will use only one Btrfs volume. You can add or remove devices at any time. Usually you do not mount a Btrfs volume itself.
▶ Btrfs chunk
A chunk is simply a piece of storage that Btrfs can put data on. Think of a chunk (usually 1 GiB) as a page in a book: the book is the volume, and the chunk is one of its pages. When you start, all pages are empty. As you write data to the volume, one page (chunk) after the other is filled.
device
A device is a Linux block device. It may be a partition such as /dev/sdz1 or /dev/sdz2, or a raw disk such as /dev/sdz without any partitioning. A Btrfs volume consists of at least one device.
▶ Btrfs subvolume
A Btrfs subvolume is an independently mountable POSIX file tree, not a block device. It is the part of a volume that is mounted writable into your Linux system. By convention, the names of subvolumes start with @ (@, @home, @snapshots ...). All subvolumes share the space of the Btrfs volume, and you may create subvolumes at will. (You may think of subvolumes as a sort of "dynamic partitions" inside a Btrfs volume.)
default subvolume
The default subvolume of a Btrfs volume is special. When you mount, you normally have to name the subvolume to mount. If you don't name one, mount uses the default subvolume. Any subvolume can be made the default. It is advisable to set as default the subvolume that is mounted as the Linux root "/"; this is often the subvolume named "@".
▶ Btrfs volume-root "/", Btrfs layout
A Btrfs volume contains one or more subvolumes. They are not stored as a flat list but in a tree-like structure, as in a file system. The root of this tree is sometimes called the "top-level subvolume" or the "root of the volume". But be careful: this is not the Linux root "/", but the Btrfs volume root "/". There are several basic schemes for laying out subvolumes in a volume.
▶ Btrfs snapshot
A snapshot looks nearly the same as a subvolume, but don't get confused. When we talk about snapshots we usually mean a "read-only photograph of a subvolume". While the subvolume changes over time, a snapshot stays in the state the subvolume was in when the snapshot was made. You can mount snapshots into your Linux system, but you can only read their content, and that content never changes while the snapshot exists. When creating snapshots you have to pay attention to the Btrfs layout in use.

It is possible to make a writable subvolume out of a read-only snapshot. This is how a rollback works.


▶ Self-healing
Btrfs keeps checksums for data and metadata and uses them to detect silent data corruption automatically. Where a redundant copy exists (for example with RAID 1), a damaged block can be repaired from the good copy.
▶ Btrfs Scrub
A scrub reads all data and metadata of a volume and verifies them against their checksums. Damaged blocks are repaired if a good redundant copy is available.
▶ Btrfs Balance
A balance rewrites chunks and redistributes them across the devices of a volume, for example after a device has been added or removed.
▶ Btrfs Quota
Quotas limit the space that can be used. Quota support in Btrfs is implemented at subvolume level.

Parts of Btrfs

volume

A pool of raw storage that consists of one or more devices. The size of the volume is the sum of the sizes of all included devices, unless you use RAID.

If you use more than one device, please also read the section about RAID. You can add or remove devices at any time. By adding and removing devices it is also possible to move a volume from one device to another (without changing the UUID).
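
The size rule can be illustrated with a small Python sketch. It is a simplified estimate, not how Btrfs reports space: it ignores metadata, chunk granularity, and reserved space. The function name usable_capacity is made up for this example, and the RAID 1 rule (two copies of every chunk on two different devices) is the usual rule of thumb, assumed here.

```python
def usable_capacity(device_sizes, profile="single"):
    """Estimate the usable data capacity (same unit as the inputs).

    Simplified model: ignores metadata, chunk granularity and reserved space.
    """
    total = sum(device_sizes)
    if profile == "single":
        # Plain pooling: the device sizes simply add up.
        return total
    if profile == "raid1":
        # Two copies of every chunk, each on a different device.
        if len(device_sizes) < 2:
            raise ValueError("raid1 needs at least two devices")
        return min(total // 2, total - max(device_sizes))
    raise ValueError(f"profile not covered by this sketch: {profile}")


print(usable_capacity([500, 500]))                 # 1000
print(usable_capacity([500, 500], "raid1"))        # 500
print(usable_capacity([1000, 250, 250], "raid1"))  # 500
```

The last line shows why very unequal devices waste space with RAID 1: the large device can only mirror as much as the two small ones hold together.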

Usually you do not mount the Btrfs volume itself, but its subvolumes. There may be times when it is practical to mount the Btrfs volume itself, because then you can change the volume layout. All (writable) subvolumes inside a volume can be moved within the volume with mv. Moving a subvolume does not touch the data; it changes the volume layout in an instant.

Unless specified otherwise, additional devices are handled as a bunch of disks (JBOD).

subvolume

snapshot

A snapshot looks nearly the same as a subvolume, but snapshots really are "read-only photographs of a subvolume". While the subvolume changes over time, the snapshot is frozen in the state the subvolume was in when you made it. A snapshot is read-only and is therefore guaranteed not to change. In a snapshot you will find all files of the subvolume frozen in time.

where to place snapshots
When creating snapshots you have to watch out for the volume layout in use

Taking a snapshot is very fast and costs nearly nothing. After the snapshot is taken, all future writes proceed as usual with CoW. But none of the space occupied by files in the snapshot can be reused. As you write more and more new files, the used space grows, because the space of the files in the snapshot cannot be reused. Each new snapshot additionally freezes all files created or modified since the last snapshot, and so on. If you never release (delete) any snapshots, you will eventually run out of space (disk full).

Deleting a snapshot does not delete any files that are still in use by other snapshots or by the subvolume they were taken from. To free space, Btrfs has to check for every file in the snapshot whether it is still in use. If it is not, the space of this file or version is freed. (This is greatly simplified.) Removing snapshots is therefore costly, and Btrfs does this work in the background. You may notice this when you delete a snapshot: there is no immediate gain in free space, but after a while some space will have been freed.
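
The following Python sketch is a toy model of the behaviour described above, not of the real Btrfs data structures. Every file version is treated as one extent, and an extent is freed only when neither the subvolume nor any snapshot refers to it. The class ToyVolume and all its method names are invented for this illustration, and unlike Btrfs it frees space immediately instead of in the background.

```python
class ToyVolume:
    """Toy model: space is freed only when no tree references an extent."""

    def __init__(self):
        self.extents = {}        # extent id -> size
        self.trees = {"@": {}}   # tree name -> {filename: extent id}
        self._next_id = 0

    def write(self, name, size, tree="@"):
        # CoW: a new or rewritten file always gets a fresh extent.
        self.extents[self._next_id] = size
        self.trees[tree][name] = self._next_id
        self._next_id += 1
        self._collect()

    def delete(self, name, tree="@"):
        del self.trees[tree][name]
        self._collect()

    def snapshot(self, source, name):
        # Only the filename -> extent mapping is copied, no data.
        self.trees[name] = dict(self.trees[source])

    def delete_snapshot(self, name):
        del self.trees[name]
        self._collect()

    def _collect(self):
        # Keep only extents that some subvolume or snapshot still uses.
        live = {e for tree in self.trees.values() for e in tree.values()}
        self.extents = {e: s for e, s in self.extents.items() if e in live}

    def used(self):
        return sum(self.extents.values())


vol = ToyVolume()
vol.write("big.iso", 700)
vol.snapshot("@", "snap1")
vol.delete("big.iso")
print(vol.used())   # 700: the snapshot still holds the file
vol.delete_snapshot("snap1")
print(vol.used())   # 0: nothing refers to the extent any more
```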

Snapshots (if regularly made) may be used for:

  • comparing config files from different "times"
  • merging config files
  • recovering accidentally deleted/overwritten files
  • system roll back
  • anchor for a backup with send/receive
  • basis for a seed
  • what do you use snapshots for ?

Making and deleting snapshots is best done automatically:

  • snapper
  • timeshift

If you need to roll back to a snapshot, you have to replace the current subvolume with the chosen snapshot.

  • Make a snapshot of the current subvolume (for later reference)
  • Move the current subvolume out of its place
  • Create a new subvolume from the snapshot chosen for the rollback
  • Make the new subvolume the default
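
The four steps above can be sketched as a Python helper that only builds the commands and prints them, without running anything. The paths, the naming scheme for the kept copies, and the function name rollback_commands are assumptions made for this example. It also assumes the top-level subvolume is mounted at the given mount point. Check the commands against the btrfs-subvolume manual page and your own layout before using anything like this.

```python
import shlex


def rollback_commands(top, subvol, snapshot, stamp):
    """Return the commands for a rollback as argument lists (nothing is run).

    top:      mount point of the top-level subvolume (assumed path)
    subvol:   name of the subvolume to replace, e.g. "@"
    snapshot: path of the read-only snapshot to roll back to
    stamp:    label used to name the kept copies (hypothetical scheme)
    """
    current = f"{top}/{subvol}"
    return [
        # 1. Read-only snapshot of the current state, for later reference.
        ["btrfs", "subvolume", "snapshot", "-r", current,
         f"{top}/{subvol}-before-rollback-{stamp}"],
        # 2. Move the current subvolume out of its place.
        ["mv", current, f"{top}/{subvol}-old-{stamp}"],
        # 3. Writable subvolume created from the chosen snapshot.
        ["btrfs", "subvolume", "snapshot", snapshot, current],
        # 4. Make the new subvolume the default.
        ["btrfs", "subvolume", "set-default", current],
    ]


for cmd in rollback_commands("/mnt/top", "@", "/mnt/top/@snapshots/42", "2021-09-24"):
    print(shlex.join(cmd))
```

Printing instead of executing is deliberate: a rollback replaces the root of the running system, so it is worth reading every command before it is run by hand.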


Snapshots together with quotas
There are reports about massive problems when using quotas together with snapshots (snapper, timeshift)

quotas

Quota support in Btrfs is implemented at subvolume level.

For more info see Quota_support@btrfs.kernel.org

Reports about problems
There are reports about massive problems when using quotas (especially together with snapshots, snapper, timeshift)

RAID

balance

scrub

compression

GRUB needs to load the kernel and initrd
When the kernel, the initrd, or the GRUB config files are stored compressed, GRUB has to decompress them. If it cannot, you will not be able to boot. GRUB introduced zstd support in version 2.04.

encryption

send / receive

Btrfs Tools

btrfs

btrfsck

this is not what you think ;-)

Recommendations

We recommend using Btrfs with UEFI and GPT

Partition   Filesystem     Size                            Partition type
/dev/sda1   FAT32          1 GiB                           EFI system partition
/dev/sda2   Btrfs          1 GiB - 8 EiB                   Btrfs volume
/dev/sda3   swap           4 GiB, at least your RAM size   Swap partition (optional)

If you don't have UEFI, you may use Btrfs with BIOS and GPT

Partition   Filesystem     Size                            Partition type
/dev/sda1   (bootloader)   4 MiB                           BIOS boot partition
/dev/sda2   Btrfs          1 GiB - 8 EiB                   Btrfs volume
/dev/sda3   swap           4 GiB, at least your RAM size   Swap partition (optional)

Please be aware that the information on this page is a simplified version of reality. It is written to help the reader understand a little of these complex things. For an in-depth understanding it will be necessary to read further at btrfs.wiki.kernel.org or elsewhere.

additional Information

why not btrfs ?

A lot of people say: "I don't use Btrfs because it is experimental and not stable. You can't use it in production. It is not safe!"

not stable ?

The status of btrfs was experimental for a long time, but the core functionality is considered good enough for daily use. (from kernel.org)

If you see statements declaring Btrfs unstable, please check their date. Some seem to date from 10 years ago. If you want to give Btrfs a chance, look for newer statements. The best information about Btrfs is probably found at btrfs.wiki.kernel.org.

experimental ?

Btrfs is feature-rich! There are new features being implemented and these should be considered experimental for a few releases when the bugs get ironed out when number of brave users help stabilizing it.(from kernel.org)

Some features are not implemented yet. Others are only partly implemented. Some are experimental and not suggested for production use. As is always the case in Linux-land you decide what to use, and so you are responsible for your own decisions.

not usable for production ?

  • Distro support for Btrfs as main filesystem
  • Some firms do use btrfs in production@wiki.btrfs.kernel.org
  • Some manufacturers do deploy devices where btrfs is used inside

difficult to repair ?

Indeed, if you search for the usual ways to repair a file system such as FAT or Ext4, you will not find good information. This is not because Btrfs is difficult to repair, but because repairing Btrfs works very differently.

What's this "Copy on Write"

To get the most out of Btrfs you need to know some things about this file system. Then you can use it properly and to your advantage. Btrfs is not difficult, but it is different to some extent.

write in place (FAT32)

Most older file systems write "in place". This means that data or metadata is written "over" the previous data at the same location.

This is the case, for example, with FAT32 file systems. The File Allocation Table (FAT) is at a fixed location on the file system. When the FAT changes (because a file got bigger and needs more blocks), the new FAT must be written to the same location as before. If the disk is ejected before (or while) this data is written, the file system will be corrupted. And the FAT changes a lot.

The danger of corruption is especially big while metadata (like filename, permission, usage of disk space ...) is being written.
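
A tiny Python sketch can show why an interrupted in-place write is dangerous. It is only an illustration with a byte array standing in for the disk; the function write_in_place and its interrupted_after parameter are invented for this example.

```python
def write_in_place(disk, offset, new_data, interrupted_after=None):
    """Overwrite the old bytes one by one; optionally stop early to
    simulate an ejected disk or a power loss."""
    for i, byte in enumerate(new_data):
        if interrupted_after is not None and i >= interrupted_after:
            return False  # the write never finished
        disk[offset + i] = byte
    return True


disk = bytearray(b"AAAAAAAA")   # the old table
finished = write_in_place(disk, 0, b"BBBBBBBB", interrupted_after=3)
print(finished, bytes(disk))    # False b'BBBAAAAA'
```

After the interruption the "disk" holds neither the old table nor the new one, but a mix of both.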

write to a metadata-log (Ext4)

Newer file systems like Ext4 have a solution to this. Instead of writing metadata directly "in place", metadata is first written to an "endless" log (the journal). An interrupted write can then be detected and either completed or discarded, so the file system structure is not corrupted. This is feasible because metadata is only a very small part of the data in a file system.

An additional mechanism is needed to make this safe. It is sometimes called "barriers", and there have to be checksums that reveal when a part of the log is corrupted.

This protects the file system itself, but not the files in it. A file may still be overwritten in place; then the old file is lost, and the new one may not have been written completely.

Copy on Write! (Btrfs)

Copy on Write is a "new" concept. It means the file system will try to never write over existing data. How is this even possible?

  • Files are appended at the end of a "data page"
  • Metadata is appended at a "metadata page"
  • Inside a page nothing is ever overwritten
  • When a page is full the file system will use the next free page
  • Deleting a file does not write/clean its data, but writes metadata, that marks this file as deleted
  • Overwriting a file first appends the new file to the "data page", then writes the metadata for this file
  • Changing small parts of a file writes only the new parts, then links the rest to the old file
  • There are checksums for data and metadata
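
The list above can be condensed into a toy Python model. It is a sketch of the idea only and has nothing to do with the real Btrfs on-disk format: the class ToyCowStore is invented, one Python list stands in for the "pages", and CRC32 from the standard library stands in for the checksum.

```python
import zlib


class ToyCowStore:
    """Append-only store: existing records are never overwritten."""

    def __init__(self):
        self.log = []     # append-only list of (checksum, data) records
        self.index = {}   # metadata: filename -> position of current record

    def write(self, name, data):
        # Append the data first ...
        self.log.append((zlib.crc32(data), data))
        # ... then point the metadata at the new record.
        self.index[name] = len(self.log) - 1

    def read(self, name):
        checksum, data = self.log[self.index[name]]
        if zlib.crc32(data) != checksum:
            raise IOError(f"checksum mismatch in {name}")
        return data

    def delete(self, name):
        # Only metadata changes; the record stays until a clean-up runs.
        del self.index[name]


store = ToyCowStore()
store.write("notes.txt", b"version 1")
store.write("notes.txt", b"version 2")
print(store.read("notes.txt"))   # b'version 2'
print(store.log[0][1])           # b'version 1' is still there
```

The old version stays in the log untouched, which is what makes snapshots cheap and also why a clean-up process is needed later.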

downsides

  • Management of space is complex
  • There are 2 sorts of pages
  • There has to be a clean-up process that makes the space of deleted files reusable, so that the disk does not run out of free pages
  • Unnecessary writes must be avoided, because they would also make the clean-up very expensive

opportunities

  • It is possible to detect nearly any corruption because of the checksums
    • When the power is lost, or the disk is disconnected, all old data is safe. Why?
    • Every bit of "old" data from before the power loss or the disconnection is present because it is NOT overwritten
    • Only the newly written data may be partly damaged
    • The metadata may also be partly damaged
    • When mounting the volume it is possible by analysing checksums and metadata to find the point in the file system where all was good
    • Btrfs will automatically roll back to this point, then it can mount the file system writeable
  • CoW is a sound foundation to build upon
    • Snapshots
    • RAID
    • Volume management
    • Compression
    • Encryption (maybe some time in the future)
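
The recovery idea from the first group of bullets above (find the last point where everything was good) can be sketched in Python. This is a simplified illustration, not the actual Btrfs mount logic. The record layout, the "generation" numbering, and both function names are assumptions made for the example, and CRC32 stands in for the checksum.

```python
import zlib


def make_record(generation, payload):
    return {"generation": generation,
            "payload": payload,
            "checksum": zlib.crc32(payload)}


def last_good_generation(records):
    """Return the newest generation up to which every record verifies."""
    good = None
    for record in records:   # records are in write order, oldest first
        if zlib.crc32(record["payload"]) != record["checksum"]:
            break            # everything from here on is not trusted
        good = record["generation"]
    return good


records = [make_record(1, b"state one"),
           make_record(2, b"state two"),
           make_record(3, b"state three")]

# Simulate a power loss that damaged one byte of generation 3.
records[2]["payload"] = b"state thr\x00e"

print(last_good_generation(records))   # 2
```

Because generations 1 and 2 were never overwritten, they are still intact, and the file system can continue from generation 2.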


Don't disable CoW in Btrfs
It is possible to disable CoW in Btrfs, but then you lose most of the benefits of Btrfs. Data will not even be checksummed. If you don't like CoW, you are better off using another file system.

Use the Forum!

It is a good idea to search the forum for posts related to Btrfs.

Btrfs is fast moving! See Also: