Systemd-boot and Full Disk Encryption in Tumbleweed and MicroOS

dongjincai · posted on 2023-12-27 15:53:54

Systemd-boot and Full Disk Encryption in Tumbleweed and MicroOS

20. Dec 2023 | Alberto Planas | CC-BY-SA-3.0

openSUSE Tumbleweed and MicroOS now deliver an image that uses systemd-boot as the boot loader and full disk encryption that is also based on systemd. The encrypted device can be unlocked with a traditional password, with a TPM2 (a crypto-device that is already present in your system) that will attach the device only if the system is in good health, or with a FIDO2 key that validates the ownership of a token.

There is a lot to explain here, but basically these changes move the distribution to a safer place. On one hand they make the design of the distribution much simpler, and on the other they follow current security trends that other distributions are also aligning with.

So, let's start at the beginning …

systemd-boot

We all know and love GRUB2. It is a good boot loader. It is also big, complex, rich, massive and tends to move slowly on the development side.

The openSUSE package for this boot loader contains more than 200 patches. Some of those patches have been there for the last 5, 6 … 10 years. That is both an indication of the talent of the maintainers and a sign of how slow the upstream contribution process can be.

GRUB2 supports all the relevant systems, including mainframes, arm or powerpc. It supports multiple types of file systems, including btrfs or NTFS. It contains a full network stack, a USB stack, a terminal, can be scripted … In some sense, it is almost a mini OS by itself.

But then UEFI happened 18 years ago, making almost all the features provided by GRUB2 somewhat redundant. The system firmware was already providing most of these functionalities as services that can be consumed by the operating system, the boot loader or any other user-provided application. And of course GRUB2 supported UEFI too.

Soon the Linux kernel gained the option of being compiled as an EFI binary, via a stub that can be attached to the kernel code. This means that the kernel itself can be launched directly by the firmware, making the boot loader optional in most cases.

Over time new and more straightforward boot loaders focused on UEFI appeared, like gummiboot[1]. Later this code was integrated into systemd and renamed as systemd-boot.

The code is very simple, many orders of magnitude simpler than GRUB2. It is basically a very small EFI binary that presents a menu with the different boot loader entries (text files described in the Boot Loader Specification[2], or BLS for short) and then calls the UEFI LoadImage function to delegate the execution to the selected kernel.
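To make the idea of a boot loader entry more concrete, here is a minimal Python sketch that parses a BLS Type #1 style entry into a dictionary. The key names (title, version, sort-key, linux, initrd, options) come from the Boot Loader Specification; the paths, the version and the UUID are made-up values for illustration, and this is not how systemd-boot itself is implemented.

```python
# Hypothetical BLS Type #1 entry: the paths, version and UUID are invented
# for this example and do not come from a real installation.
SAMPLE_ENTRY = """\
# Example boot loader entry
title      openSUSE Tumbleweed
version    6.6.7-1-default
sort-key   opensuse-tumbleweed
linux      /opensuse-tumbleweed/6.6.7-1-default/linux
initrd     /opensuse-tumbleweed/6.6.7-1-default/initrd
options    root=UUID=0000-EXAMPLE rootflags=subvol=@/.snapshots/1/snapshot
"""


def parse_bls_entry(text):
    """Parse 'key value' lines; repeated keys are collected in lists."""
    entry = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        entry.setdefault(key, []).append(value)
    return entry


entry = parse_bls_entry(SAMPLE_ENTRY)
print(entry["linux"][0])
print(entry["options"][0])
```

The "options" line is where the kernel command line lives, which is how one entry can point at one specific btrfs snapshot.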

This boot loader can also work with the new unified kernel images[3] (UKI), which are files that aggregate the kernel, the command line and the initrd in a single unit. Those UKIs can be very handy for image-based distributions, and openSUSE plans to support them as well.

Providing systemd-boot as an alternative to GRUB2 is something that openSUSE has wanted to do for a long time. In August 2023 there was an announcement[4] on the Factory mailing list about Tumbleweed supporting systemd-boot.

The announcement references a wiki entry[5] that explains how to migrate an installation using GRUB2 to systemd-boot manually. Soon after the announcement, yast-bootloader gained[6] support for it for new installations.

Supporting another boot loader comes with a cost. As argued, the code base is smaller, with fewer bugs, and easier to reason about. But the UEFI dependency reduces the number of supported architectures to x86-64 and aarch64. That problem can be very much alleviated by providing another patch for GRUB2 to support the BLS entries, so the architecture of the distribution after the boot loader can be independent of the boot loader itself. The good news is that the patch already exists, and could potentially be added to the package.

Another problem is that systemd-boot does not speak btrfs. As an EFI binary, it can read files only from a FAT32 file system. This limitation can be resolved by moving the kernel and the initrd into the EFI system partition (ESP).

Finally, there is also the consideration of supporting snapshots in Tumbleweed and transactions in MicroOS. From the boot loader the user should be able to select which snapshot to boot from, as is currently possible with GRUB2. Both concepts are implemented using btrfs subvolumes, and only a subset of the kernel, command line and initrd combinations is valid for each of those subvolumes.

For example, let’s say we have two snapshots in our system, and each of these represents a system that has two kernels installed. It is possible that those two kernels are not the same across all the snapshots. Maybe one of the upgrades replaced one kernel with a newer version. We need some tool that can do the bookkeeping required to associate the correct combination that will produce a successful boot into any of those snapshots, creating the boot entries under those restrictions.

This tool is sdbootutil[7]. Every time snapper creates or destroys a snapshot (for example, when the system gets updated), it calls this tool, which analyzes the content of the snapshots and makes sure that the corresponding kernel is installed in the ESP, that a valid initrd for this kernel is present (if not, it is created by calling mkinitrd) and that a boot entry is created that connects the kernel, the initrd and the snapshot via the command line. It also takes care of other details, like checking the free space on the partition.
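The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how sdbootutil is implemented: the snapshot numbers, the kernel versions and all function names are hypothetical.

```python
# Hypothetical data: snapshot number -> kernel versions installed in it.
snapshots = {
    1: ["6.6.6-1-default", "6.6.7-1-default"],
    2: ["6.6.7-1-default", "6.7.1-1-default"],
}


def boot_entries(snapshots):
    """One boot entry per valid (snapshot, kernel) combination."""
    return [(snap, kernel)
            for snap, kernels in sorted(snapshots.items())
            for kernel in kernels]


def kernels_needed_in_esp(snapshots):
    """Every kernel referenced by at least one snapshot must be in the ESP."""
    return sorted({k for kernels in snapshots.values() for k in kernels})


def kernels_to_remove(installed_in_esp, snapshots):
    """Kernels in the ESP that no snapshot references any more."""
    return sorted(set(installed_in_esp) - set(kernels_needed_in_esp(snapshots)))


for snap, kernel in boot_entries(snapshots):
    print(f"snapshot {snap}: kernel {kernel}")
print("stale:", kernels_to_remove(["6.6.5-1-default", "6.6.7-1-default"], snapshots))
```

Note that the shared kernel 6.6.7 is needed only once in the ESP, but it still gets one boot entry per snapshot, because the command line of each entry selects a different subvolume.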

Usually this process works transparently, but it is good to remember that we can force a clean state with:

sdbootutil add-all-kernels
sdbootutil remove-all-kernels

Just in case, you know …

Full disk encryption

The other aspect that we want to announce is the support of full disk encryption (FDE) based on systemd.

FDE is not the new kid on the block. GRUB2 has long been able to unlock LUKS volumes using the cryptomount command. Traditionally this requests the password from the user twice: once when the boot loader does the unlock and again when the initrd does the same later. The second request can be avoided by injecting the password into the initrd; if you are using the openSUSE package, this is done transparently.

Recently GRUB2 gained two new features: partial support of LUKS2 encrypted devices (using PBKDF2 as key derivation function instead of the more secure and recommended Argon2id) and a key protection mechanism that can store secrets in devices like the TPM2.

TPM2

Explaining how TPM2 works in detail is a topic for another post, but for now we can think of it as a crypto device that can be used to unlock secrets only when certain conditions related to the state of the system are met. The TPM2 will unlock the secret if the system is in a healthy state.

"Healthy" is a technical term here: it refers to asserting that the system is in a known good state. In other words, we know for sure that the firmware has not been tampered with, that the boot loader is the one that we installed and has not been replaced, that the kernel is exactly the one that comes from the distribution, that the kernel command line is the one that we expect, and that the initrd that we used does not contain any extra binary that we do not control.

Internally the TPM2 has some registers, known as platform configuration registers (PCRs). In the TPM2 specification there are 24 of them, and each one is large enough to store the output of a hash function, like SHA1 or SHA256. They are grouped into banks, one per supported hash function, but this is too much detail for now.

Those registers are kind of special. We can reset them, usually setting the value to 0. We can read the value, or we can "extend" them. The write operation is designed in a way that we cannot set an arbitrary value in the register: the only possible new value is the result of applying the associated hash function to the concatenation of the current PCR value and a new value provided by the user.

The current value of the PCR can only be produced by extending this register using exactly the same sequence of values. If we change even one bit of one of the values, we will produce a wildly different final result for the same PCR.
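The extend operation is easy to model in software. The following Python sketch assumes the SHA256 bank and hashes each piece of "measured" data before extending, which mirrors what the measuring component does before handing a digest to the TPM2; the event contents are made up, and a real TPM2 does all of this in hardware.

```python
import hashlib

PCR_SIZE = hashlib.sha256().digest_size  # 32 bytes in the SHA256 bank


def extend(pcr_value, data):
    """New PCR value = SHA256(current value || SHA256(data))."""
    measurement = hashlib.sha256(data).digest()
    return hashlib.sha256(pcr_value + measurement).digest()


def replay(events):
    """Start from the reset state (all zeros) and extend with each event."""
    pcr = bytes(PCR_SIZE)
    for event in events:
        pcr = extend(pcr, event)
    return pcr


# Made-up measurements, for illustration only.
good = replay([b"firmware", b"boot loader", b"kernel"])
same = replay([b"firmware", b"boot loader", b"kernel"])
reordered = replay([b"boot loader", b"firmware", b"kernel"])
tampered = replay([b"firmware", b"boot loader", b"kernel!"])

print(good.hex())
print(good == same, good == reordered, good == tampered)
```

The same sequence always gives the same value, while a different order or a single changed byte gives an unrelated one. This is exactly the property that lets a final PCR value stand for the whole chain of measurements.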

This feature is used in a process known as "measured boot"[8], where each stage in the boot chain is measured before it is executed. This means that before the initial stages of the firmware run, there is a process that calculates the hash of the code in memory and extends one of the PCRs using this value. This is repeated until the very end of the boot sequence: the kernel and the initrd.

When measured boot is in place, the final values of the first 10 PCRs will contain values that can only be predicted if the machine is using a well known version of the firmware, boot loader and kernel, together with the associated data like certificates, configuration files, or kernel parameters. If one of those elements changes (for example, by using a different secure boot certificate), it will generate PCR values different from the ones that we expect.

TPM2 chips are very interesting devices, and their set of features goes far beyond measured boot. If you want to learn more I recommend resources like [9] or [10].

TPM2 for FDE

Anyway, the gist here is that we can create a "policy" that instructs the TPM2 to decrypt a secret only if certain PCRs contain the expected values. The details are a bit different, but for now let's use this model as a good first approximation.

The idea is that we can encrypt a password with the values of certain PCR registers, so GRUB2 can later attach the LUKS2 device if the TPM2 can recover the password, validating the health of the system up to this point. If the TPM2 fails to decrypt it, that means that some PCR does not have the expected value and some stage in the boot process changed. In this situation GRUB2 will ask the user for the password to continue loading the kernel and the rest of the system. It delegates the trust in the new state to the user.

GRUB2 also provides a tool to seal secrets under the current values of a subset of PCRs. This is nice but also presents several problems. One is that we may be setting the system up in a way that we know the PCR values will change during the next boot (for example, during the first installation, a boot loader upgrade or a firmware update). In this case sealing the password using the current register values is useless: we need to be able to predict the new ones and use those hypothetical values to do the sealing.

The other problem is more insidious and will become critical later. The expected values can change frequently and may not be unique. Maybe there is a set of valid ones. We can choose to boot from a different kernel or from a different snapshot. The TPM2 provides a solution for this using something known as authorized policies. They are a way of creating policies that can change, but that are validated by a signature. In essence, we create a public and a private key, and we create multiple PCR policies that are signed using the private key. Now the TPM2 can validate the signature using the public part, and unseal the secret using the PCR values stored in the new policy.
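The roles in an authorized policy can be illustrated with a small Python model. Everything here is a simplification under loud assumptions: the RSA key is a toy built from two known Mersenne primes (far too small and unpadded, never use it for real), the "policy digest" is just a hash over the PCR indexes and values rather than a real TPM2 policy, and all names are hypothetical. The point is only to show that the sealing side is bound to the public key, while the set of accepted PCR values can grow as long as each new policy is signed with the private key.

```python
import hashlib

# Toy RSA key from two known Mersenne primes. NOT secure: illustration only.
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537                    # public part
D = pow(E, -1, (P - 1) * (Q - 1))      # private part (needs Python 3.8+)


def policy_digest(pcr_values):
    """Hypothetical policy: a hash over the selected PCR indexes and values."""
    h = hashlib.sha256()
    for index in sorted(pcr_values):
        h.update(index.to_bytes(1, "big") + pcr_values[index])
    return h.digest()


def _as_int(digest):
    return int.from_bytes(digest, "big") % N


def sign(digest):
    """Done by whoever holds the private key (the prediction tool)."""
    return pow(_as_int(digest), D, N)


def verify(digest, signature):
    """Done with the public part only (the TPM2 side in this model)."""
    return pow(signature, E, N) == _as_int(digest)


def can_unseal(current_pcrs, signed_policies):
    """Unseal only if a correctly signed policy matches the current PCRs."""
    current = policy_digest(current_pcrs)
    return any(digest == current and verify(digest, sig)
               for digest, sig in signed_policies)


# Two valid predictions, e.g. booting kernel A or kernel B (made-up values).
boot_a = {4: hashlib.sha256(b"kernel A").digest()}
boot_b = {4: hashlib.sha256(b"kernel B").digest()}
tampered = {4: hashlib.sha256(b"evil kernel").digest()}

signed = [(policy_digest(p), sign(policy_digest(p))) for p in (boot_a, boot_b)]
forged = signed + [(policy_digest(tampered), 12345)]  # no private key used

print(can_unseal(boot_a, signed), can_unseal(boot_b, signed))
print(can_unseal(tampered, signed), can_unseal(tampered, forged))
```

Adding a third valid boot combination only requires signing one more policy; the sealed secret itself does not need to be touched.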

Since early 2023 openSUSE provides the pcr-oracle[11] tool to help with the prediction of the PCR registers, and to encrypt a key under those values using either PCR policies or authorized policies. Using this tool we can now seal a secret under a set of PCR values that can change!

In the openSUSE wiki[12] we can find more documentation about those topics, including instructions about how to use it in our installation.

Using systemd for disk encryption

With GRUB2 the FDE is working properly, so why look for something else? One reason is very evident: this architecture can only work … well … only if our openSUSE GRUB2 version is used. It will not work for other boot loaders like systemd-boot. In fact it will not work with the upstream version of GRUB2 itself.

But there is a second reason: we can argue that there is not a full measured boot in place with GRUB2. If the boot loader needs to unlock the device before it can load the kernel, it is natural that the PCR policies that evaluate the health of the system cannot make assertions about the kernel, command line or initrd that will be used. Those will be loaded after the LUKS2 device has been opened.

The use of systemd-boot gives us an alternative architecture for FDE that can work properly with any boot loader that follows the BLS (remember, there is a patch for GRUB2 to support it somewhere, so it is not excluded a priori), and provides the chance to do a full measured boot attestation before unlocking the device.

One difference is that the kernel and the initrd will be placed in the unencrypted ESP, and the unlock of the sysroot will be done from inside the initrd using the different options that systemd-cryptsetup offers. Currently it can unlock the device using a normal password, a TPM2 with authorized policies (optionally with a PIN that must be entered by the user) or a FIDO2 key device. The unlocking mechanism needs to be described[13] in the /etc/crypttab file.
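As a rough illustration of what such a description looks like, the Python sketch below parses crypttab-style lines and reports which unlocking mechanism they select. The option names tpm2-device=, fido2-device= and x-initrd.attach are documented systemd crypttab options; the volume name, the UUID and the exact option set are assumptions made for this example and are not taken from the MicroOS image.

```python
# Hypothetical crypttab lines: name, device, key file, options.
SAMPLE_TPM2 = ("cr_root  UUID=00000000-0000-0000-0000-000000000000  none  "
               "x-initrd.attach,tpm2-device=auto")
SAMPLE_FIDO2 = ("cr_root  UUID=00000000-0000-0000-0000-000000000000  none  "
                "x-initrd.attach,fido2-device=auto")
SAMPLE_PASSWORD = "cr_root  UUID=00000000-0000-0000-0000-000000000000  none  x-initrd.attach"


def parse_crypttab_line(line):
    """Split a crypttab line into its four whitespace-separated fields."""
    fields = line.split()
    options = {}
    raw_options = fields[3] if len(fields) > 3 else ""
    for item in filter(None, raw_options.split(",")):
        key, _, value = item.partition("=")
        options[key] = value
    return {
        "name": fields[0],
        "device": fields[1],
        "key_file": fields[2] if len(fields) > 2 else "none",
        "options": options,
    }


def unlock_mechanism(options):
    """Very rough classification based on the options that are present."""
    if "tpm2-device" in options:
        return "tpm2"
    if "fido2-device" in options:
        return "fido2"
    return "password"


for sample in (SAMPLE_TPM2, SAMPLE_FIDO2, SAMPLE_PASSWORD):
    parsed = parse_crypttab_line(sample)
    print(parsed["name"], unlock_mechanism(parsed["options"]))
```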

pcr-oracle has been extended to support the creation of authorized policies that systemd can understand. They are stored in a JSON file that contains multiple predictions, each one of them indicating the PCRs involved, the TPM2 policy hash, the fingerprint of the public key and the signature of the policy. This, together with the public key PEM file, composes all the data required for systemd-cryptsetup to use the TPM2 for the unseal of the LUKS2 key.
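A sketch of how such a file could be read is shown below. The layout (one list of predictions per hash bank, with the fields "pcrs", "pkfp", "pol" and "sig") follows my understanding of the signature format that the systemd tools consume and should be treated as an assumption; the values are placeholders, not real hashes or signatures.

```python
import json

# Placeholder document: field names are assumed, values are not real.
SAMPLE = json.dumps({
    "sha256": [
        {"pcrs": [0, 2, 4, 7, 9], "pkfp": "<public key fingerprint>",
         "pol": "<policy hash for boot entry A>", "sig": "<signature A>"},
        {"pcrs": [0, 2, 4, 7, 9], "pkfp": "<public key fingerprint>",
         "pol": "<policy hash for boot entry B>", "sig": "<signature B>"},
    ]
})


def predictions_for(document, bank, pcrs):
    """Return the predictions of one bank that cover exactly these PCRs."""
    data = json.loads(document)
    wanted = sorted(pcrs)
    return [p for p in data.get(bank, []) if sorted(p["pcrs"]) == wanted]


matches = predictions_for(SAMPLE, "sha256", [9, 7, 4, 2, 0])
print(len(matches), [m["pol"] for m in matches])
```

Having several predictions in one file is what allows more than one boot entry (for example, different kernels or snapshots) to unlock the same device.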

The RSA 2048 key used to sign the policy can be created with openssl or with pcr-oracle itself. A note of caution: if the private key gets leaked, it is game over for the security that the TPM2 was expected to provide. Luckily the solution is cheap in this case: generate a new key, re-register the key in the LUKS2 key slot with systemd-cryptenroll and use sdbootutil to regenerate the predictions for each boot entry. Yeah … we will document the whole process in the "systemd-fde" wiki page[14] and provide better tools, but trust me, it is indeed a cheap operation.

openSUSE is providing a MicroOS image[15] named kvm-and-xen-sdboot[16] that shows how all of this is working. This image contains some of the already mentioned tools integrated and some other new ones:

systemd-boot: Boot loader used instead of the default GRUB2
sdbootutil: Helper scripts to synchronize the boot entries of the system
pcr-oracle: Predicts the PCR values for the next boot, and creates the authorized policies for systemd
disk-encryption-tool: Encrypts the device where sysroot is located on the first boot
dracut-pcr-signature: dracut module that will load the predictions into the initrd from the ESP

Those tools are designed to work together for this new FDE architecture. What follows is a brief description of how everything is connected.

Once we get the new MicroOS qcow2 image and we set up the VM, we can proceed with the boot process. If the VM has a virtual TPM2 device it will start measuring the executed code and data, extending the corresponding PCRs. Once systemd-boot has been reached, it will find the correct boot entry for this session and will read the corresponding kernel and initrd from it.

At this moment the image is not encrypted. Inside the initrd that is used during this first boot, the disk-encryption-tool script will be called. Using some heuristics it will find the partition that belongs to sysroot (where the system is located), and will resize it to reserve 32MB for the LUKS2 header. After that it will use all the magic that cryptsetup provides to re-encrypt the device using a locally generated password. As of today, this password corresponds to the recovery key that will be presented to the user at the end; the user should take note of it and keep it safe.

After the re-encryption, the system /etc/crypttab will be updated to communicate that this device is now encrypted and should be managed with different tools later.

At the end of the initrd we switch to the new sysroot, now finally located in an encrypted device. The disk-encryption-tool script already did its main job, but it also installed two modules for jeos-firstboot, which will be executed on the first boot of the system, which is currently happening!

The first module, enroll, will detect if there is a FIDO2 key inserted and a TPM2 available. If so it will present a dialog asking what you want to use to unlock the system. The second module will ask the user if the root password should also be enrolled in the LUKS2 header as a new key, and will show the recovery key generated earlier.

As of today it is not advisable to register both. As we described earlier, the FIDO2 key makes more sense if we are using a laptop or a desktop machine and we want to unlock the encrypted device with proof of a token that we own. This is an interactive process. The TPM2 makes more sense in situations where we do not want to interact with the system, and we want to automatically unlock the device only if we can assert the health of the system (no tampering occurred in the boot chain).

If we register the FIDO2 key, systemd-cryptenroll will be called, we will be asked to press the button two times, and the installation process will be over. At the next boot we will be required to present the key, and if the key is missing, the recovery password will be requested.

If we register the TPM2 device, a new RSA 2048 key gets generated and stored (both the public and the private parts) in /etc/systemd, and systemd-cryptenroll will be used to enroll the public key and to annotate the PCRs that are used in the sealing of the LUKS2 key. By default we will be using 0, 2, 4, 7, and 9. You can check the meaning in [17]. PCRs 0 and 2 will measure all the UEFI firmware code. PCR 4 will measure the boot loader (systemd-boot) and the kernel (which is also a UEFI binary). PCR 7 will register all the secure boot certificates, and PCR 9 will be used by the kernel to measure the command line and the initrd.

This covers pretty much all that can make sense, but it is the user who has the final word on what to measure. The reason is that the predictions are done inside sdbootutil which, remember, is automatically executed after each change in the system (updates, package removal, snapshot management, etc.), and this tool will produce predictions only for the PCRs registered in the LUKS2 header.

Regardless of the selected unlocking mechanism, the /etc/crypttab file will be updated with this selection and a new initrd will be generated to contain this information for the next boot.

Finally, the last component, dracut-pcr-signature, is responsible for making sure that during the subsequent boots all the information that systemd-cryptsetup requires for the unlock is present "on-the-fly" inside the initrd. It should be noted that the initrd requires the JSON file with the policies and the key, but those cannot be included in the initrd! The moment that we make a prediction of a PCR that is extended with the hash of the initrd, the initrd is frozen: we cannot touch it anymore, as this would produce a new hash and automatically invalidate the prediction.
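The circular dependency can be demonstrated with a short Python sketch. It is a simplification: the "initrd" is just a byte string, the PCR is modelled as a single SHA256 extend from the reset state (in reality PCR 9 also receives the command line), and the file contents are invented.

```python
import hashlib


def extend(pcr_value, data):
    """Simplified PCR extend: SHA256(current value || SHA256(data))."""
    return hashlib.sha256(pcr_value + hashlib.sha256(data).digest()).digest()


initrd = b"...initrd contents (placeholder)..."

# Step 1: predict the PCR value that booting this initrd will produce.
predicted = extend(bytes(32), initrd)
prediction_file = b'{"predicted": "' + predicted.hex().encode() + b'"}'

# Step 2 (wrong): embedding the prediction changes the initrd, so the value
# measured at boot no longer matches the prediction that was embedded.
initrd_with_prediction = initrd + prediction_file
measured_if_embedded = extend(bytes(32), initrd_with_prediction)

# Step 2 (right): keep the prediction outside, next to the initrd in the ESP.
measured_if_external = extend(bytes(32), initrd)

print(measured_if_embedded == predicted)   # False: prediction invalidated
print(measured_if_external == predicted)   # True: initrd left untouched
```

This is why the predictions live as a separate file in the ESP and are only picked up once the initrd is already running.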

This dracut module will be executed before the systemd-cryptsetup generator for any encrypted devices has started, and will search the ESP partition for a tpm2-pcr-signature.json file that contains all the valid predictions for the current boot. Once this file is in place, systemd-cryptsetup will be able to assert that the current state of the system is the expected one, and the boot process can continue until the end.

Future

The image is here, and it is a sound PoC. It provides a much simpler architecture and places some components in the correct place. This will help a lot in the next stages, as there are some other things that we want to do with the distribution in relation to FDE.

One pretty clear point is that disk-encryption-tool has limited use outside image-based installations. Part of this code should be living in YaST and in Agama. The installer is already creating LUKS2 devices, so it should be "easy" to extend it in a way that works for us.

Ideally, the jeos-firstboot modules should also live in the installer, but somehow they make sense here too. In any case the functionality should not be separated, and both should be merged.

The encryption tool is doing something right from the very start: the master key, together with all the user keys, is generated during installation time. One possible improvement is generating the recovery key a bit later using the systemd tools. It is a small detail, but separating system keys from user keys can simplify the architecture.

Another aspect to improve is that the user may want to use the TPM2 and the FIDO2 key at the same time. For example, the TPM2 is used by default, and if the state changed in a way that makes the prediction fail (or a security breach has been detected), the user can delegate the unlock to the FIDO2 key instead of using a password.

The sdbootutil script contains a bunch of features that should be also living in systemd. Working with upstream will make this tool obsolete with time, which would be more good news.

Another improvement that we can help with in systemd is the diagnosis of the reasons that make the TPM2 reject the unseal of the LUKS2 key. Today we get a general failure message that does not report which PCR, or which measured component inside the PCR, has a different hash than the one predicted. This would help a lot in understanding what went wrong. Was the boot loader changed? Or something in the firmware?

pcr-oracle is a very good tool for predicting the next PCR values. It was very easy to extend it to parse the new events in the log related to the full measured boot process, including the kernel, the systemd-boot extensions on PCR 12, and the generation of the JSON document required by systemd. The new systemd 255 (released a week before the time of writing) includes a similar tool named systemd-pcrlock that can help us provide the improved diagnosis that we are looking for. Evaluating this tool to do the predictions will be done soon too.

As of today, Type#1 and Type#2 entries from the BLS are not isomorphic. There are sections in the EFI file of the UKI format that do not exist in the text representation. Maybe we will decide to use UKIs in the future, or maybe not. So a good improvement is to help with this unification, which will (among other things) provide a standard way of splitting the JSON file and associating the predictions with each boot loader entry.

Generating and registering a new key, or selecting a different set of PCRs is today a manual process. The current tools can be extended to help in those processes, or better documentation could be provided.

The new approach for FDE is not about excluding GRUB2 from the equation. It is about providing a chance of using different boot loaders that follow the BLS. Validating that a properly patched (duh!) GRUB2 can work with all this is still something to be done.

Also, another thing that needs to be validated and improved is installations with multiple encrypted disks. In principle the design and the code support it (even when the PCR registers per volume are different). openQA will do wonders here.

And finally, we should rethink if the UKIs do make sense for openSUSE or not. If we go in that direction, the private key used for signing the policies will be kept in OBS and those policies will also be generated in the build service, using a different set of PCR values.

In any case, there is a bunch of work ahead of us.

References

[1] https://cgit.freedesktop.org/gummiboot/

[2] https://uapi-group.org/specifica ... ader_specification/

[3] https://uapi-group.org/specifications/specs/unified_kernel_image/

[4] https://lists.opensuse.org/archi ... JVFNPN7PXWHZZRU5H5/

[5] https://en.opensuse.org/Systemd-boot

[6] https://github.com/yast/yast-bootloader/pull/686

[7] https://github.com/openSUSE/sdbootutil

[8] https://en.opensuse.org/Portal:M ... ation#Measured_boot

[9] https://developers.tpm.dev/

[10] https://trustedcomputinggroup.or ... l-guide-to-tpm-2-0/