Design and algorithms of `securefs`

Full format (format version 1,2,3)

ID

Each file, directory or symlink in the apparent filesystem corresponds to a pair of files in the underlying filesystem. The pair is identified by a 256-bit ID, generated by a cryptographically secure pseudo-random number generator (CSPRNG), except for the root directory, which always has an ID of zero.

The ID serves as the inode number, and it is also part of the associated data authenticated by the encryption scheme. The underlying directory structure is similar to the git object store.
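As a rough illustration, an ID could be generated and mapped to a git-style pair of paths as follows. The two-character directory split and the `.meta` suffix are assumptions made for this sketch; the text above only says that the layout resembles the git object store.

```python
import secrets
from typing import Tuple

ID_BYTES = 32               # 256-bit ID
ROOT_ID = bytes(ID_BYTES)   # the root directory always has an all-zero ID


def new_id() -> bytes:
    """Draw a fresh random ID from the operating system's CSPRNG."""
    return secrets.token_bytes(ID_BYTES)


def underlying_paths(file_id: bytes) -> Tuple[str, str]:
    """Map an ID to a (data, meta) path pair, git-object-store style.

    Assumption: the first hex byte names the directory, the remaining hex
    digits name the file, and the meta file has a hypothetical ".meta" suffix.
    """
    hex_id = file_id.hex()
    data_path = f"{hex_id[:2]}/{hex_id[2:]}"
    return data_path, data_path + ".meta"
```

With this layout, `underlying_paths(ROOT_ID)` yields a data path of `00/` followed by 62 zeros.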

Encryption and authentication

Each file, directory or symlink is a stream of bytes. The stream is divided into 4KiB blocks without padding, each of which is separately encrypted and authenticated with an AEAD cipher (currently AES256-GCM). Every time a block is added or modified, a new IV/nonce is generated by a CSPRNG, and the whole block is re-encrypted.
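To make the block granularity concrete, the sketch below computes which 4KiB blocks a write touches. Each of those blocks would receive a fresh IV and be re-encrypted in full. The function name is hypothetical.

```python
BLOCK_SIZE = 4096  # 4 KiB plaintext blocks, no padding


def blocks_touched(offset: int, length: int) -> range:
    """Return the indices of the blocks modified by writing `length` bytes at `offset`."""
    if length <= 0:
        return range(0)
    first = offset // BLOCK_SIZE
    last = (offset + length - 1) // BLOCK_SIZE
    return range(first, last + 1)
```

For example, a 10-byte write at offset 4090 straddles the boundary between blocks 0 and 1, so both blocks are reprocessed.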

The IV and the MAC of each block are stored in the associated meta file. The meta file starts with an HMAC-SHA256 of the rest of its contents to protect its integrity.

This two-level scheme ensures integrity as well as fast access. A single HMAC over the whole ciphertext stream would also be sufficient for integrity protection, but it would have to be recomputed over the entire file on every change, which is too slow for large files.
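The outer level of the scheme can be sketched with the standard library alone. Here `body` stands for the concatenated per-block IVs and MACs; its exact layout is not specified above and is left opaque. The function names are hypothetical.

```python
import hashlib
import hmac

MAC_LEN = 32  # HMAC-SHA256 output size in bytes


def seal_meta(mac_key: bytes, body: bytes) -> bytes:
    """Prefix the meta body (per-block IVs and MACs) with its HMAC-SHA256."""
    return hmac.new(mac_key, body, hashlib.sha256).digest() + body


def open_meta(mac_key: bytes, meta: bytes) -> bytes:
    """Verify the leading HMAC and return the body, or raise ValueError."""
    stored, body = meta[:MAC_LEN], meta[MAC_LEN:]
    expected = hmac.new(mac_key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(stored, expected):
        raise ValueError("meta file failed integrity check")
    return body
```

Changing one block only requires updating that block's entry and recomputing the HMAC over the comparatively small meta file, not over the whole ciphertext.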

Key derivation

The master key of the whole system is derived from the user's password. Because passwords usually have low entropy, they must be salted and stretched before being used as a key. Currently the default algorithm is argon2id. It is also possible to specify scrypt or pbkdf2-hmac instead.
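A minimal sketch of password stretching, using PBKDF2-HMAC-SHA256 because it is available in the Python standard library (argon2id would need a third-party package). The iteration count, salt length and function name are placeholders, not the values securefs uses.

```python
import hashlib
import os


def stretch_password(password: str, salt: bytes, iterations: int = 200_000) -> bytes:
    """Derive a 256-bit key from a password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )


salt = os.urandom(32)  # random salt; it must be stored so the key can be re-derived
master_key = stretch_password("correct horse battery staple", salt)
```

The same password and salt always give the same key, while a different salt gives an unrelated key, which defeats precomputed dictionaries.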

The master key is not used directly to encrypt data. Instead, each file has three separate keys derived from the master key with HKDF: one for encrypting the main data, one for the MAC of the metadata, and one for encrypting extended attributes.
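The per-file derivation might look like the sketch below, which implements HKDF with SHA-256 as defined in RFC 5869. Using the file ID as the salt and the three `info` labels are assumptions for illustration; the text above does not say how securefs separates the three keys.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int) -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract a PRK, then expand to `length` bytes."""
    prk = hmac.new(salt or bytes(32), ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_file_keys(master_key: bytes, file_id: bytes) -> dict:
    """Derive three independent 256-bit keys for one file (labels are hypothetical)."""
    return {
        purpose: hkdf_sha256(master_key, file_id, purpose.encode("ascii"), 32)
        for purpose in ("content", "meta-mac", "xattr")
    }
```

Because each key depends on a distinct label, learning one of them reveals nothing about the other two or about the master key.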

Extended attributes

If the underlying filesystem supports xattr, so will securefs. securefs only encrypts the contents, not the names, of xattr. This is because different systems impose different restrictions on xattr names, so it is hard to produce a valid name in a cross-platform manner.

On OS X, securefs will never set the xattr of com.apple.FinderInfo and com.apple.quarantine. These are workarounds for bugs.

One can disable xattr processing upon mounting.

Algorithms

The respective algorithms are:

Password stretching: PBKDF2-HMAC-SHA256
Regular key derivation: HKDF
Cipher and mode: AES256-GCM
MAC: HMAC-SHA256

Difference between 1, 2 and 3

Versions 2 and 3 differ from version 1 in that they use less paranoid parameters.

Version 3 differs from version 2 in that it stores timestamps in the meta file rather than relying on the underlying filesystem. This is not for security, as the underlying filesystem has timestamps anyway, but for synchronization of such data across cloud services.

Lite format (format version 4)

File contents

Each file starts with a 16-byte random block which is then encrypted by AES with the master content key (256 bits) to derive the file specific key (128 bits). The AES block cipher is a pseudorandom permutation so the derived key is still sufficiently random.

Each block (the block size is tunable at creation time) is re-encrypted with AES-GCM on every mutation, with a fresh IV each time. The block number is passed to AES-GCM as associated data, so blocks cannot be copied within or across files without verification failures.

Unlike the full format, which uses a meta file, the lite format prepends the IV and appends the tag to the ciphertext, then writes the result to the underlying file. Copying or moving blocks within a file will generate errors since the block number affects the tag, and copying or moving blocks across files will generate errors since the keys are different. However, an attacker may replace a file block with an older version of the block at the same position (which has a valid tag) without being detected by the security check.
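The resulting on-disk layout can be sketched as simple offset arithmetic. The 12-byte IV and 16-byte tag are assumed sizes, and the sketch also assumes that the configured block size counts plaintext bytes only; none of these values is stated above.

```python
from typing import Tuple

HEADER_SIZE = 16  # random block at the start of each file
IV_SIZE = 12      # assumed IV length
TAG_SIZE = 16     # assumed AES-GCM tag length


def underlying_block_offset(block_number: int, block_size: int = 4096) -> int:
    """Offset in the underlying file where the IV of `block_number` begins."""
    return HEADER_SIZE + block_number * (IV_SIZE + block_size + TAG_SIZE)


def split_stored_block(stored: bytes) -> Tuple[bytes, bytes, bytes]:
    """Split one stored block into (iv, ciphertext, tag)."""
    return stored[:IV_SIZE], stored[IV_SIZE:-TAG_SIZE], stored[-TAG_SIZE:]
```

Under these assumptions each 4096-byte plaintext block occupies 4124 bytes in the underlying file, so block 1 starts at offset 4140.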

All-zero blocks are passed through unchanged so that sparse files can be easily supported.
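One way to picture the pass-through on the read path is shown below. `decrypt` is a hypothetical callback standing in for AES-GCM verification and decryption, and the 28-byte overhead (12-byte IV plus 16-byte tag) is an assumed value.

```python
from typing import Callable

OVERHEAD = 12 + 16  # assumed IV size plus tag size per stored block


def read_block(stored: bytes, decrypt: Callable[[bytes], bytes]) -> bytes:
    """Return the plaintext for one stored block."""
    if not any(stored):
        # A hole in a sparse file reads back as zeros; treat it as
        # all-zero plaintext instead of attempting to authenticate it.
        return bytes(max(len(stored) - OVERHEAD, 0))
    return decrypt(stored)
```

The practical consequence is that holes created by the underlying filesystem never need an IV or a tag.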

The file-specific key is necessary because NIST recommends that a single key not be used with more than 2^32 IVs for AES-GCM. For this reason, file sizes are limited to 2^31 - 1 blocks (for the default block size of 4KiB, the maximum file size is about 8TiB), which leaves room for overwrites of the same blocks. In the catastrophic event that a file-specific key leaks (because too many IVs have been used), the master key remains safe and other files are still out of reach of the attackers.
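The size limit follows from straightforward arithmetic, reproduced here as a quick check. The variable names are illustrative only.

```python
MAX_BLOCKS = 2**31 - 1   # block limit per file
BLOCK_SIZE = 4 * 1024    # default block size, 4 KiB
IV_BUDGET = 2**32        # NIST-recommended maximum number of IVs per key

max_file_size = MAX_BLOCKS * BLOCK_SIZE
print(max_file_size)          # 8796093018112 bytes
print(max_file_size / 2**40)  # just under 8 TiB

# Average number of times every block of a maximum-size file can be
# encrypted before the IV budget of the file-specific key is used up.
writes_per_block = IV_BUDGET // MAX_BLOCKS
print(writes_per_block)       # 2
```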

Names of files, directories and symlinks

We cannot use probabilistic encryption for file names, because name lookups would then require a linear scan of the directory. We choose a deterministic authenticated encryption algorithm, AES-SIV, as defined by RFC 5297. This prevents the filenames from being deduced, except that identical filenames will show as identical encrypted names in the underlying directory.

The encrypted names are converted to ASCII with base32 encoding (in DUDE alphabet without padding). Base64 is not used because it distinguishes upper and lower case, so it won't work properly on case-insensitive filesystems.
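Unpadded base32 encoding and decoding can be sketched with the standard library. Note that `base64.b32encode` uses the RFC 4648 alphabet, which stands in here for the alphabet securefs actually uses; the function names are hypothetical.

```python
import base64


def encode_name(encrypted: bytes) -> str:
    """Base32-encode without '=' padding (RFC 4648 alphabet as a stand-in)."""
    return base64.b32encode(encrypted).decode("ascii").rstrip("=")


def decode_name(encoded: str) -> bytes:
    """Restore the padding that encode_name stripped, then decode."""
    padding = "=" * (-len(encoded) % 8)
    return base64.b32decode(encoded + padding, casefold=True)
```

Because the alphabet contains letters of a single case only, a name that a case-insensitive filesystem returns in a different case still decodes to the same bytes: `decode_name(encode_name(b"hello").lower())` gives back `b"hello"`.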

Because of the added IV and the base32 encoding, the underlying filename is longer than the virtual filename. Therefore the maximum filename length in the mounted filesystem is always shorter than that of its underlying filesystem. On Windows we use Unicode long names (prefixed with \\?\) but other programs may not work well with the encrypted filenames.
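The length overhead can be estimated as follows. AES-SIV (RFC 5297) prepends a 16-byte synthetic IV to a ciphertext of the same length as the plaintext, and base32 emits one character per 5 bits. The 255-byte limit of the underlying filesystem is an assumption typical of many filesystems, and the function names are hypothetical.

```python
SIV_SIZE = 16  # AES-SIV prepends a 16-byte synthetic IV (RFC 5297)


def encoded_name_length(name_bytes: int) -> int:
    """Length of the unpadded base32 name for a plaintext name of `name_bytes` bytes."""
    return -(-(name_bytes + SIV_SIZE) * 8 // 5)  # ceil(total bits / 5)


def max_virtual_name_length(underlying_limit: int = 255) -> int:
    """Longest plaintext name whose encoded form fits within `underlying_limit`."""
    return underlying_limit * 5 // 8 - SIV_SIZE
```

Under the 255-byte assumption, the longest name in the mounted filesystem is 143 bytes: it encodes to exactly 255 characters, while a 144-byte name would need 256.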

The key for name encryption is independent from the master content key.

Extended attributes

Extended attributes are supported on macOS since many applications won't work properly without them. The security of extended attributes is not a high concern for us, so only the values, not the names, of xattr are encrypted (with AES-GCM). To ensure that the weaker security of xattr encryption does not affect other parts, the key for xattr is separately generated.
