
Releases: systemd/systemd

systemd v254-rc2

15 Jul 00:24
v254-rc2
Pre-release

systemd System and Service Manager

CHANGES WITH 254 in spe:

Announcements of Future Feature Removals and Incompatible Changes:

    * The next release (v255) will remove support for split-usr (/usr/
      mounted separately during late boot, instead of being mounted by the
      initrd before switching to the rootfs) and unmerged-usr (parallel
      directories /bin/ and /usr/bin/, /lib/ and /usr/lib/, …). For more
      details, see:
      https://lists.freedesktop.org/archives/systemd-devel/2022-September/048352.html

    * We intend to remove cgroup v1 support from a systemd release after
      the end of 2023. If you run services that make explicit use of
      cgroup v1 features (i.e. the "legacy hierarchy" with separate
      hierarchies for each controller), please implement compatibility with
      cgroup v2 (i.e. the "unified hierarchy") sooner rather than later.
      Most of Linux userspace has been ported over already.

    * Support for System V service scripts is now deprecated and will be
      removed in a future release. Please make sure to update your software
      *now* to include a native systemd unit file instead of a legacy
      System V script to retain compatibility with future systemd releases.

    * EnvironmentFile= now treats the line following a comment line
      ending with an escape (backslash) as a non-comment line. For details, see:
      https://github.com/systemd/systemd/issues/27975

    * Behaviour of sandboxing options for the per-user service manager
      units has changed. They now imply PrivateUsers=yes, which means user
      namespaces will be implicitly enabled when a sandboxing option is
      enabled in a user unit. Enabling user namespaces has the drawback
      that system users will no longer be visible (and processes/files will
      appear as owned by 'nobody') in the user unit.

      By definition a sandboxed user unit should run with reduced
      privileges, so impact should be small. This will remove a great
      source of confusion that has been reported by users over the years,
      due to how these options require an extra setting to be manually
      enabled when used in the per-user service manager, which is not
      needed in the system service manager. For more details, see:
      https://lists.freedesktop.org/archives/systemd-devel/2022-December/048682.html

Security Relevant Changes:

    * pam_systemd will now by default pass the CAP_WAKE_ALARM ambient
      process capability to invoked session processes of regular users on
      local seats (as well as to systemd --user), unless configured
      otherwise via data from JSON user records, or via the PAM module's
      parameter list. This is useful in order to allow desktop tools such as
      GNOME's Alarm Clock application to set a timer for
      CLOCK_REALTIME_ALARM that wakes up the system when it elapses. A
      per-user service unit file may thus use AmbientCapabilities= to pass
      the capability to invoked processes. Note that this capability is
      relatively narrow in focus (in particular compared to other process
      capabilities such as CAP_SYS_ADMIN) and we already — by default —
      permit more impactful operations such as system suspend to local
      users.
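As an added example (not systemd's or GNOME's code), the following C program is what a process started from a user unit with AmbientCapabilities=CAP_WAKE_ALARM might run to confirm the capability arrived and to show what it gates: creating a CLOCK_REALTIME_ALARM timerfd, which the kernel refuses with EPERM when CAP_WAKE_ALARM is absent. The function names and the 3600-second test interval are constructed for this example.

```c
/* Added example, not systemd's or GNOME's code: what a process started
 * from a user unit with AmbientCapabilities=CAP_WAKE_ALARM can check at
 * runtime, and what the capability gates. CLOCK_REALTIME_ALARM timers
 * (here via timerfd, so no librt is needed) need CAP_WAKE_ALARM; the
 * kernel answers EPERM without it. Function names are constructed. */
#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <sys/prctl.h>
#include <sys/timerfd.h>
#include <time.h>
#include <unistd.h>
#include <linux/capability.h>

#ifndef CLOCK_REALTIME_ALARM
#define CLOCK_REALTIME_ALARM 8        /* value from <linux/time.h> */
#endif
#ifndef PR_CAP_AMBIENT
#define PR_CAP_AMBIENT 47             /* values from <linux/prctl.h> */
#define PR_CAP_AMBIENT_IS_SET 1
#endif

/* 1 if CAP_WAKE_ALARM is in the ambient set (the set that lets an
 * unprivileged user's process keep it across execve(), which is what
 * pam_systemd and AmbientCapabilities= populate), 0 if not, -1 if the
 * kernel has no ambient capabilities. */
int has_ambient_wake_alarm(void)
{
    return prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_IS_SET, CAP_WAKE_ALARM, 0, 0);
}

/* Arm a one-shot timer 'seconds' ahead on the wake-capable clock.
 * Returns the timer fd (caller closes it, which also cancels the alarm),
 * or -errno: -EPERM when CAP_WAKE_ALARM is missing, -EINVAL when the
 * kernel or sandbox does not provide CLOCK_REALTIME_ALARM. */
int arm_wake_alarm(int seconds)
{
    struct itimerspec its = { .it_value.tv_sec = seconds };
    int fd = timerfd_create(CLOCK_REALTIME_ALARM, TFD_CLOEXEC);

    if (fd < 0)
        return -errno;
    if (timerfd_settime(fd, 0, &its, NULL) < 0) {
        int e = errno;
        close(fd);
        return -e;
    }
    return fd;
}

/* Health-check summary: 1 = this process may wake the system, 0 = denied
 * for lack of CAP_WAKE_ALARM, -1 = alarm clock unavailable here. */
int wake_alarm_status(void)
{
    int fd = arm_wake_alarm(3600);   /* constructed test interval */

    if (fd >= 0) {
        close(fd);
        return 1;
    }
    return fd == -EPERM ? 0 : -1;
}

int main(void)
{
    printf("CAP_WAKE_ALARM in ambient set: %d\n", has_ambient_wake_alarm());
    switch (wake_alarm_status()) {
    case 1:
        puts("CLOCK_REALTIME_ALARM timer armed: wake-ups are permitted");
        break;
    case 0:
        puts("EPERM: CAP_WAKE_ALARM not granted (AmbientCapabilities= / pam_systemd)");
        break;
    default:
        puts("CLOCK_REALTIME_ALARM unavailable in this environment");
    }
    return 0;
}
```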

Service Manager:

    * "Startup" memory settings are now supported. Previously IO and CPU
      settings were already supported via StartupCPUWeight= and similar.
      The same logic has been added for the various per-unit memory
      settings StartupMemoryMax= and related.

    * The service manager gained support for enqueuing POSIX signals to
      services that carry an additional integer value, exposing the
      sigqueue() system call. This is accessible via new D-Bus calls
      org.freedesktop.systemd1.Manager.QueueSignalUnit() and
      org.freedesktop.systemd1.Unit.QueueSignal(), as well as in systemctl
      via the new --kill-value= option.
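An added example, not code from the systemd project: a C service-side receiver for the integer that QueueSignalUnit()/QueueSignal() and systemctl's --kill-value= attach through sigqueue(). The signal number SIGRTMIN+3 and the payloads are constructed for the self-contained round trip, which also shows that a plain kill() carries no value and must not be mistaken for one.

```c
/* Added example, not systemd's own code: a service reading the integer
 * that QueueSignalUnit() / systemctl --kill-value= attaches to a signal
 * through sigqueue(2). The signal number and payloads are constructed
 * for the self-contained round trip. */
#define _GNU_SOURCE
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Wait synchronously for one occurrence of 'signo', which the caller must
 * have blocked, and store its payload. Returns 0 when a sigqueue()-style
 * value was present, 1 when the signal came from plain kill() and carries
 * no value, -1 on error. */
int receive_signal_value(int signo, int *out_value)
{
    sigset_t set;
    siginfo_t info;

    sigemptyset(&set);
    sigaddset(&set, signo);
    if (sigwaitinfo(&set, &info) < 0)
        return -1;

    /* Only SI_QUEUE senders (sigqueue(), hence PID 1's QueueSignal) fill
     * si_value; for SI_USER (kill()) the field is undefined, so a service
     * must check si_code before acting on the number. */
    if (info.si_code != SI_QUEUE)
        return 1;

    *out_value = info.si_value.sival_int;
    return 0;
}

/* Block 'signo' for the calling thread. Returns 0 on success. */
static int block_signal(int signo, sigset_t *old)
{
    sigset_t set;

    sigemptyset(&set);
    sigaddset(&set, signo);
    return sigprocmask(SIG_BLOCK, &set, old);
}

/* Queue 'signo' to ourselves with 'value', as the service manager does on
 * QueueSignalUnit(), and read it back. Returns the value received, or -1
 * on failure (so -1 is not a usable payload in this demo). */
int demo_queue_roundtrip(int signo, int value)
{
    sigset_t old;
    union sigval sv;
    int got = 0, r;

    if (block_signal(signo, &old) < 0)
        return -1;
    sv.sival_int = value;
    r = sigqueue(getpid(), signo, sv) < 0 ? -1
        : receive_signal_value(signo, &got);
    sigprocmask(SIG_SETMASK, &old, NULL);
    return r == 0 ? got : -1;
}

/* Same round trip with plain kill(): no value travels, so the result is
 * receive_signal_value()'s 1 rather than a number. */
int demo_plain_kill(int signo)
{
    sigset_t old;
    int got = 0, r;

    if (block_signal(signo, &old) < 0)
        return -1;
    r = kill(getpid(), signo) < 0 ? -1 : receive_signal_value(signo, &got);
    sigprocmask(SIG_SETMASK, &old, NULL);
    return r;
}

int main(void)
{
    int signo = SIGRTMIN + 3;   /* constructed choice for the example */

    printf("sigqueue(42) -> received %d\n", demo_queue_roundtrip(signo, 42));
    printf("kill()       -> status %d (no value)\n", demo_plain_kill(signo));
    return 0;
}
```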

    * systemctl gained a new "list-paths" verb, which shows all currently
      active .path units, similarly to how "systemctl list-timers" shows
      active timers, and "systemctl list-sockets" shows active sockets.

    * systemctl gained a new --when= switch which is honoured by the various
      forms of shutdown (i.e. reboot, kexec, poweroff, halt) and allows
      scheduling these operations by time, similar in fashion to how this
      has been supported by SysV shutdown.

    * If MemoryDenyWriteExecute= is enabled for a service and the kernel
      supports the new PR_SET_MDWE prctl() call, it is used instead of the
      seccomp()-based system call filter to achieve the same effect.

    * A new set of kernel command line options is now understood:
      systemd.tty.term.<name>=, systemd.tty.rows.<name>=,
      systemd.tty.columns.<name>= allow configuring the TTY type and
      dimensions for the tty specified via <name>. When systemd invokes a
      service on a tty (via TTYName=) it will look for these and configure
      the TTY accordingly. This is particularly useful in VM environments
      to propagate host terminal settings into the appropriate TTYs of the
      guest.

    * A new RootEphemeral= setting is now understood in service units. It
      takes a boolean argument. If enabled for services that use RootImage=
      or RootDirectory= an ephemeral copy of the disk image or directory
      tree is made when the service is started. It is removed automatically
      when the service is stopped. That ephemeral copy is made using
      btrfs/xfs reflinks or btrfs snapshots, if available.

    * The service activation logic gained new settings RestartSteps= and
      RestartMaxDelaySec= which allow exponentially-growing restart
      intervals for Restart=.

    * The service activation logic gained a new setting RestartMode= which
      can be set to 'direct' to skip the inactive/failed states when
      restarting, so that dependent units are not notified until the service
      converges to a final (successful or failed) state. For example, this
      means that OnSuccess=/OnFailure= units will not be triggered until the
      service state has converged.

    * PID 1 will now automatically load the virtio_console kernel module
      during early initialization if running in a suitable VM. This is done
      so that early-boot logging can be written to the console if available.

    * Similarly, virtio-vsock support is loaded early in suitable VM
      environments. PID 1 will send sd_notify() notifications via AF_VSOCK
      to the VMM if configured, thus loading this early is beneficial.

    * A new verb "fdstore" has been added to systemd-analyze to show the
      current contents of the file descriptor store of a unit. This is
      backed by a new D-Bus call DumpUnitFileDescriptorStore() provided by
      the service manager.

    * The service manager will now set a new $FDSTORE environment variable
      when invoking processes for services that have the file descriptor
      store enabled.

    * A new service option FileDescriptorStorePreserve= has been added that
      allows tuning the life-cycle of the per-service file descriptor
      store. If set to "yes", the entries in the fd store are retained even
      after the service has been fully stopped.

    * The "systemctl clean" command may now be used to clear the fdstore of
      a service.

    * Unit *.preset files gained a new directive "ignore", in addition to
      the existing "enable" and "disable". As the name suggests, matching
      units are left unchanged, i.e. neither enabled nor disabled.

    * Service units gained a new setting DelegateSubgroup=. It takes the
      name of a sub-cgroup to place any processes the service manager forks
      off in. Previously, the service manager would place all service
      processes directly in the top-level cgroup it created for the
      service. This usually meant that the main process in a service with
      delegation enabled would first have to create a subgroup and move
      itself down into it, in order to not conflict with the "no processes
      in inner cgroups" rule of cgroup v2. With this option, this step is
      now handled by PID 1.

    * The service manager will now look for .upholds/ directories,
      similarly to the existing support for .wants/ and .requires/
      directories. Symlinks in this directory result in Upholds=
      dependencies.

      The [Install] section of unit files gained support for a new
      UpheldBy= directive to generate .upholds/ symlinks automatically when
      a unit is enabled.

    * The service manager now supports a new kernel command line option
      systemd.default_device_timeout_sec=, which may be used to override
      the default timeout for .device units.

    * A new "soft-reboot" mechanism has been added to the service manager.
      A "soft reboot" is similar to a regular reboot, except that it
      affects userspace only: the service manager shuts down any running
      services and other units, then optionally switches into a new root
      file system (mounted to /run/nextroot/), and then passes control to a
      systemd instance in the new file system which then starts the system
      up again. The kernel is not reb...

systemd v254-rc1

06 Jul 20:01
v254-rc1
Pre-release


systemd v253

15 Feb 19:26
v253

systemd System and Service Manager

CHANGES WITH 253:

Announcements of Future Feature Removals and Incompatible Changes:

    * We intend to remove cgroup v1 support from a systemd release after the
      end of 2023. If you run services that make explicit use of cgroup v1
      features (i.e. the "legacy hierarchy" with separate hierarchies for
      each controller), please implement compatibility with cgroup v2 (i.e.
      the "unified hierarchy") sooner rather than later. Most of Linux
      userspace has been ported over already.

    * We intend to remove support for split-usr (/usr mounted separately
      during boot) and unmerged-usr (parallel directories /bin and
      /usr/bin, /lib and /usr/lib, etc). This will happen in the second
      half of 2023, in the first release that falls into that time window.
      For more details, see:
      https://lists.freedesktop.org/archives/systemd-devel/2022-September/048352.html

    * We intend to change behaviour w.r.t. units of the per-user service
      manager and sandboxing options, so that they work without having to
      manually enable PrivateUsers= as well, which is not required for
      system units. To make this work, we will implicitly enable user
      namespaces (PrivateUsers=yes) when a sandboxing option is enabled in a
      user unit. The drawback is that system users will no longer be visible
      (and appear as 'nobody') to the user unit when a sandboxing option is
      enabled. By definition a sandboxed user unit should run with reduced
      privileges, so impact should be small. This will remove a great source
      of confusion that has been reported by users over the years, due to
      how these options require an extra setting to be manually enabled when
      used in the per-user service manager, as opposed to the system
      service manager. We plan to enable this change in the next release
      later this year. For more details, see:
      https://lists.freedesktop.org/archives/systemd-devel/2022-December/048682.html

Deprecations and incompatible changes:

    * systemctl will now warn when invoked without /proc/ mounted
      (e.g. when invoked after chroot() into a directory tree without the
      API mount points like /proc/ being set up). Operation in such an
      environment is not fully supported.

    * The return value of 'systemctl is-active|is-enabled|is-failed' for
      unknown units is changed: previously 1 or 3 were returned, but now 4
      (EXIT_PROGRAM_OR_SERVICES_STATUS_UNKNOWN) is used as documented.

    * 'udevadm hwdb' subcommand is deprecated and will emit a warning.
      systemd-hwdb (added in 2014) should be used instead.

    * 'bootctl --json' now outputs a single JSON array, instead of a stream
      of newline-separated JSON objects.

    * Udev rules in 60-evdev.rules have been changed to load hwdb
      properties for all modalias patterns. Previously only the first
      matching pattern was used. This could change what properties are
      assigned if the user has more and less specific patterns that could
      match the same device, but it is expected that the change will have
      no effect for most users.

    * systemd-networkd-wait-online exits successfully when all interfaces
      are ready or unmanaged. Previously, if neither '--any' nor
      '--interface=' options were used, at least one interface had to be in
      configured state. This change allows the case where systemd-networkd
      is enabled, but no interfaces are configured, to be handled
      gracefully. It may occur in particular when a different network
      manager is also enabled and used.

    * Some compatibility helpers were dropped: EmergencyAction= in the user
      manager, as well as measuring kernel command line into PCR 8 in
      systemd-stub, along with the -Defi-tpm-pcr-compat compile-time
      option.

    * The '-Dupdate-helper-user-timeout=' build-time option has been
      renamed to '-Dupdate-helper-user-timeout-sec=', and now takes an
      integer as parameter instead of a string.

    * The DDI image dissection logic (which backs RootImage= in service
      unit files, the --image= switch in various tools such as
      systemd-nspawn, as well as systemd-dissect) will now only mount file
      systems of types btrfs, ext4, xfs, erofs, squashfs, vfat. This list
      can be overridden via the $SYSTEMD_DISSECT_FILE_SYSTEMS environment
      variable. These file systems are fairly well supported and maintained
      in current kernels, while others are usually more niche, exotic or
      legacy and thus typically do not receive the same level of security
      support and fixes.

    * The default per-link multicast DNS mode is changed to "yes"
      (that was previously "no"). As the default global multicast DNS mode
      has been "yes" (but can be changed by the build option), now the
      multicast DNS is enabled on all links by default. You can disable the
      multicast DNS on all links by setting MulticastDNS= in resolved.conf,
      or on an interface by calling "resolvectl mdns INTERFACE no".

New components:

    * A new tool, 'ukify', to build, measure, and sign Unified Kernel Images
      (UKIs) has been added. This replaces functionality provided by
      'dracut --uefi' and extends it with automatic calculation of PE file
      offsets, insertion of signed PCR policies generated by
      systemd-measure, support for initrd concatenation, signing of the
      embedded Linux image and the combined image with sbsign, and
      heuristics to autodetect the kernel uname and verify the splash
      image.

Changes in systemd and units:

    * A new service type Type=notify-reload is defined. When such a unit is
      reloaded a UNIX process signal (typically SIGHUP) is sent to the main
      service process. The manager will then wait until it receives a
      "RELOADING=1" followed by a "READY=1" notification from the unit as
      response (via sd_notify()). Otherwise, this type is the same as
      Type=notify. A new setting ReloadSignal= may be used to change the
      signal to send from the default of SIGHUP.

      user@.service, systemd-networkd.service, systemd-udevd.service, and
      systemd-logind have been updated to this type.
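An added example, not systemd's own code: the sd_notify() exchange described above for Type=notify-reload, implemented in C directly on the $NOTIFY_SOCKET datagram socket (no libsystemd) together with a stand-in manager so it runs offline. The abstract socket name is constructed; the MONOTONIC_USEC= field accompanying RELOADING=1 is what sd_notify(3) requires for this service type, not something stated on this page.

```c
/* Added example, not systemd's own code: the sd_notify(3) exchange a
 * Type=notify-reload service performs around ReloadSignal= (SIGHUP by
 * default). The sender writes to the AF_UNIX datagram socket named by
 * $NOTIFY_SOCKET itself, so libsystemd is not needed; a stand-in for the
 * manager binds a constructed abstract socket for the round trip. */
#define _GNU_SOURCE
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/un.h>
#include <time.h>
#include <unistd.h>

/* Fill sockaddr_un for a path; a leading '@' selects the abstract
 * namespace, as systemd accepts. Returns the address length. */
static socklen_t notify_addr(const char *path, struct sockaddr_un *addr)
{
    memset(addr, 0, sizeof *addr);
    addr->sun_family = AF_UNIX;
    memcpy(addr->sun_path, path, strlen(path));
    if (path[0] == '@')
        addr->sun_path[0] = 0;
    return offsetof(struct sockaddr_un, sun_path) + strlen(path);
}

/* Send one notification datagram. Returns 0 on success, -1 on failure. */
int notify_send(const char *path, const char *msg)
{
    struct sockaddr_un addr;
    int fd, r = -1;

    if (!path || !*path || strlen(path) >= sizeof addr.sun_path)
        return -1;
    socklen_t len = notify_addr(path, &addr);
    fd = socket(AF_UNIX, SOCK_DGRAM | SOCK_CLOEXEC, 0);
    if (fd < 0)
        return -1;
    if (sendto(fd, msg, strlen(msg), 0, (struct sockaddr *)&addr, len) ==
        (ssize_t)strlen(msg))
        r = 0;
    close(fd);
    return r;
}

/* First half of the handshake. sd_notify(3) requires RELOADING=1 to carry
 * a CLOCK_MONOTONIC timestamp for this type, so the manager can tell a
 * fresh READY=1 from one queued before the reload started. */
int reload_begin(const char *path)
{
    struct timespec ts;
    char msg[64];
    clock_gettime(CLOCK_MONOTONIC, &ts);
    snprintf(msg, sizeof msg, "RELOADING=1\nMONOTONIC_USEC=%llu",
             (unsigned long long)ts.tv_sec * 1000000ULL +
             (unsigned long long)ts.tv_nsec / 1000);
    return notify_send(path, msg);
}

/* Second half: configuration re-read, the service is serving again. */
int reload_done(const char *path)
{
    return notify_send(path, "READY=1");
}

/* Stand-in manager: bind, let the "service" run the handshake, then check
 * order and content. Returns 1 for a correct handshake, 0 for a wrong one,
 * -1 on a socket error. A real service calls reload_begin()/reload_done()
 * from its SIGHUP handling with the path from getenv("NOTIFY_SOCKET"). */
int demo_reload_handshake(void)
{
    char path[64], buf[2][128];
    struct sockaddr_un addr;
    struct timeval tv = { .tv_sec = 2 };   /* never hang on a lost datagram */
    int fd, i;
    snprintf(path, sizeof path, "@notify-reload-demo-%d", (int)getpid());
    socklen_t len = notify_addr(path, &addr);
    fd = socket(AF_UNIX, SOCK_DGRAM | SOCK_CLOEXEC, 0);
    if (fd < 0)
        return -1;
    setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);
    if (bind(fd, (struct sockaddr *)&addr, len) < 0 ||
        reload_begin(path) < 0 || reload_done(path) < 0) {
        close(fd);
        return -1;
    }
    for (i = 0; i < 2; i++) {
        ssize_t n = recv(fd, buf[i], sizeof buf[i] - 1, 0);
        if (n < 0) { close(fd); return -1; }
        buf[i][n] = 0;
    }
    close(fd);
    return strncmp(buf[0], "RELOADING=1\n", 12) == 0 &&
           strstr(buf[0], "\nMONOTONIC_USEC=") != NULL &&
           strcmp(buf[1], "READY=1") == 0;
}

int main(void)
{
    int ok = demo_reload_handshake();
    printf("notify-reload handshake %s\n", ok == 1 ? "ok" : "failed");
    return ok == 1 ? 0 : 1;
}
```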

    * Initrd environments which are not on a pure memory file system (e.g.
      overlayfs combination as opposed to tmpfs) are now supported. With
      this change, during the initrd → host transition ("switch root")
      systemd will erase all files of the initrd only when the initrd is
      backed by a memory file system such as tmpfs.

    * New per-unit MemoryZSwapMax= option has been added to configure
      memory.zswap.max cgroup properties (the maximum amount of zswap
      used).

    * A new LogFilterPatterns= option has been added for units. It may be
      used to specify accept/deny regular expressions for log messages
      generated by the unit, that shall be enforced by systemd-journald.
      Rejected messages are neither stored in the journal nor forwarded.
      This option may be used to suppress noisy or uninteresting messages
      from units.

    * The manager has a new
      org.freedesktop.systemd1.Manager.GetUnitByPIDFD() D-Bus method to
      query process ownership via a PIDFD, which is more resilient against
      PID recycling issues.

    * Scope units now support OOMPolicy=. Login session scopes default to
      OOMPolicy=continue, allowing login scopes to survive the OOM killer
      terminating some processes in the scope.

    * systemd-fstab-generator now supports x-systemd.makefs option for
      /sysroot/ (in the initrd).

    * The maximum rate at which daemon reloads are executed can now be
      limited with the new ReloadLimitIntervalSec=/ReloadLimitBurst=
      options. (Or the equivalent on the kernel command line:
      systemd.reload_limit_interval_sec=/systemd.reload_limit_burst=). In
      addition, systemd now logs the originating unit and PID when a reload
      request is received over D-Bus.

    * When enabling a swap device systemd will now reinitialize the device
      when the page size of the swap space does not match the page size of
      the running kernel. Note that this requires the 'swapon' utility to
      provide the '--fixpgsz' option, as implemented by util-linux, and it
      is not supported by busybox at the time of writing.

    * systemd now executes generator programs in a mount namespace
      "sandbox" with most of the file system read-only and write access
      restricted to the output directories, and with a temporary /tmp/
      mount provided. This provides a safeguard against programming errors
      in the generators, but also fixes here-docs in shells, which
      previously didn't work in early boot when /tmp/ wasn't available
      yet. (This feature has no security implications, because the code is
      still privileged and can trivially exit the sandbox.)

    * The system manager will now parse a new "vmm.notify_socket"
      system credential, which may be supplied to a VM via SMBIOS. If
      found, the manager will send a "READY=1" notification on the
      specified socket after boot is comple...

systemd v253-rc3

10 Feb 17:17
v253-rc3
Pre-release

systemd System and Service Manager

CHANGES WITH 253 in spe:

Announcements of Future Feature Removals and Incompatible Changes:

    * We intend to remove cgroup v1 support from systemd release after the
      end of 2023. If you run services that make explicit use of cgroup v1
      features (i.e. the "legacy hierarchy" with separate hierarchies for
      each controller), please implement compatibility with cgroup v2 (i.e.
      the "unified hierarchy") sooner rather than later. Most of Linux
      userspace has been ported over already.

    * We intend to remove support for split-usr (/usr mounted separately
      during boot) and unmerged-usr (parallel directories /bin and
      /usr/bin, /lib and /usr/lib, etc). This will happen in the second
      half of 2023, in the first release that falls into that time window.
      For more details, see:
      https://lists.freedesktop.org/archives/systemd-devel/2022-September/048352.html

    * We intend to change behaviour w.r.t. units of the per-user service
      manager and sandboxing options, so that they work without having to
      manually enable PrivateUsers= as well, which is not required for
      system units. To make this work, we will implicitly enable user
      namespaces (PrivateUsers=yes) when a sandboxing option is enabled in a
      user unit. The drawback is that system users will no longer be visible
      (and appear as 'nobody') to the user unit when a sandboxing option is
      enabled. By definition a sandboxed user unit should run with reduced
      privileges, so impact should be small. This will remove a great source
      of confusion that has been reported by users over the years, due to
      how these options require an extra setting to be manually enabled when
      used in the per-user service manager, as opposed as to the system
      service manager. We plan to enable this change in the next release
      later this year. For more details, see:
      https://lists.freedesktop.org/archives/systemd-devel/2022-December/048682.html

Deprecations and incompatible changes:

    * systemctl will now warn when invoked without /proc/ mounted
      (e.g. when invoked after chroot() into an directory tree without the
      API mount points like /proc/ being set up.)  Operation in such an
      environment is not fully supported.

    * The return value of 'systemctl is-active|is-enabled|is-failed' for
      unknown units is changed: previously 1 or 3 were returned, but now 4
      (EXIT_PROGRAM_OR_SERVICES_STATUS_UNKNOWN) is used as documented.

    * 'udevadm hwdb' subcommand is deprecated and will emit a warning.
      systemd-hwdb (added in 2014) should be used instead.

    * 'bootctl --json' now outputs a single JSON array, instead of a stream
      of newline-separated JSON objects.

    * Udev rules in 60-evdev.rules have been changed to load hwdb
      properties for all modalias patterns. Previously only the first
      matching pattern was used. This could change what properties are
      assigned if the user has more and less specific patterns that could
      match the same device, but it is expected that the change will have
      no effect for most users.

    * systemd-networkd-wait-online exits successfully when all interfaces
      are ready or unmanaged. Previously, if neither '--any' nor
      '--interface=' options were used, at least one interface had to be in
      configured state. This change allows the case where systemd-networkd
      is enabled, but no interfaces are configured, to be handled
      gracefully. It may occur in particular when a different network
      manager is also enabled and used.

    * Some compatibility helpers were dropped: EmergencyAction= in the user
      manager, as well as measuring kernel command line into PCR 8 in
      systemd-stub, along with the -Defi-tpm-pcr-compat compile-time
      option.

    * The '-Dupdate-helper-user-timeout=' build-time option has been
      renamed to '-Dupdate-helper-user-timeout-sec=', and now takes an
      integer as parameter instead of a string.

    * The DDI image dissection logic (which backs RootImage= in service
      unit files, the --image= switch in various tools such as
      systemd-nspawn, as well as systemd-dissect) will now only mount file
      systems of types btrfs, ext4, xfs, erofs, squashfs, vfat. This list
      can be overridden via the $SYSTEMD_DISSECT_FILE_SYSTEMS environment
      variable. These file systems are fairly well supported and maintained
      in current kernels, while others are usually more niche, exotic or
      legacy and thus typically do not receive the same level of security
      support and fixes.

New components:

    * A new tool, 'ukify', to build, measure, and sign Unified Kernel Images
      (UKIs) has been added. This replaces functionality provided by
      'dracut --uefi' and extends it with automatic calculation of PE file
      offsets, insertion of signed PCR policies generated by
      systemd-measure, support for initrd concatenation, signing of the
      embedded Linux image and the combined image with sbsign, and
      heuristics to autodetect the kernel uname and verify the splash
      image.

Changes in systemd and units:

    * A new service type Type=notify-reload is defined. When such a unit is
      reloaded a UNIX process signal (typically SIGHUP) is sent to the main
      service process. The manager will then wait until it receives a
      "RELOADING=1" followed by a "READY=1" notification from the unit as
      response (via sd_notify()). Otherwise, this type is the same as
      Type=notify. A new setting ReloadSignal= may be used to change the
      signal to send from the default of SIGHUP.

      user@.service, systemd-networkd.service, systemd-udevd.service, and
      systemd-logind have been updated to this type.

    * Initrd environments which are not on a pure memory file system (e.g.
      overlayfs combination as opposed to tmpfs) are now supported. With
      this change, during the initrd → host transition ("switch root")
      systemd will erase all files of the initrd only when the initrd is
      backed by a memory file system such as tmpfs.

    * New per-unit MemoryZSwapMax= option has been added to configure
      memory.zswap.max cgroup properties (the maximum amount of zswap
      used).

    * A new LogFilterPatterns= option has been added for units. It may be
      used to specify accept/deny regular expressions for log messages
      generated by the unit, that shall be enforced by systemd-journald.
      Rejected messages are neither stored in the journal nor forwarded.
      This option may be used to suppress noisy or uninteresting messages
      from units.

    * The manager has a new
      org.freedesktop.systemd1.Manager.GetUnitByPIDFD() D-Bus method to
      query process ownership via a PIDFD, which is more resilient against
      PID recycling issues.

    * Scope units now support OOMPolicy=. Login session scopes default to
      OOMPolicy=continue, allowing login scopes to survive the OOM killer
      terminating some processes in the scope.

    * systemd-fstab-generator now supports x-systemd.makefs option for
      /sysroot/ (in the initrd).

    * The maximum rate at which daemon reloads are executed can now be
      limited with the new ReloadLimitIntervalSec=/ReloadLimitBurst=
      options. (Or the equivalent on the kernel command line:
      systemd.reload_limit_interval_sec=/systemd.reload_limit_burst=). In
      addition, systemd now logs the originating unit and PID when a reload
      request is received over D-Bus.

    * When enabling a swap device systemd will now reinitialize the device
      when the page size of the swap space does not match the page size of
      the running kernel. Note that this requires the 'swapon' utility to
      provide the '--fixpgsz' option, as implemented by util-linux, and it
      is not supported by busybox at the time of writing.

    * systemd now executes generator programs in a mount namespace
      "sandbox" with most of the file system read-only and write access
      restricted to the output directories, and with a temporary /tmp/
      mount provided. This provides a safeguard against programming errors
      in the generators, but also fixes here-docs in shells, which
      previously didn't work in early boot when /tmp/ wasn't available
      yet. (This feature has no security implications, because the code is
      still privileged and can trivially exit the sandbox.)

    * The system manager will now parse a new "vmm.notify_socket"
      system credential, which may be supplied to a VM via SMBIOS. If
      found, the manager will send a "READY=1" notification on the
      specified socket after boot is complete. This allows readiness
      notification to be sent from a VM guest to the VM host over a VSOCK
      socket.

    * The sample PAM configuration file for systemd-user@.service now
      includes a call to pam_namespace. This puts children of user@.service
      in the expected namespace. (Many distributions replace their file
      with something custom, so this change has limited effect.)

    * A new e...

systemd v253-rc2

02 Feb 17:30
v253-rc2
Pre-release

systemd System and Service Manager

CHANGES WITH 253 in spe:

Deprecations and incompatible changes:

    * systemctl will now warn when invoked without /proc/ mounted
      (e.g. when invoked after chroot() into a directory tree without the
      API mount points like /proc/ being set up.) Operation in such an
      environment is not fully supported.

    * The return value of 'systemctl is-active|is-enabled|is-failed' for
      unknown units is changed: previously 1 or 3 were returned, but now 4
      (EXIT_PROGRAM_OR_SERVICES_STATUS_UNKNOWN) is used as documented.

    * 'udevadm hwdb' subcommand is deprecated and will emit a warning.
      systemd-hwdb (added in 2014) should be used instead.

    * 'bootctl --json' now outputs a single JSON array, instead of a stream
      of newline-separated JSON objects.

    * Udev rules in 60-evdev.rules have been changed to load hwdb
      properties for all modalias patterns. Previously only the first
      matching pattern was used. This could change what properties are
      assigned if the user has more and less specific patterns that could
      match the same device, but it is expected that the change will have
      no effect for most users.

    * systemd-networkd-wait-online exits successfully when all interfaces
      are ready or unmanaged. Previously, if neither '--any' nor
      '--interface=' options were used, at least one interface had to be in
      configured state. This change allows the case, where systemd-networkd
      is enabled but no interfaces are configured, to be handled
      gracefully. It may occur in particular when a different network
      manager is also enabled and used.

    * Some compatibility helpers were dropped: EmergencyAction= in the user
      manager, as well as measuring kernel command line into PCR 8 in
      systemd-stub, along with the -Defi-tpm-pcr-compat compile-time
      option.

    * The '-Dupdate-helper-user-timeout=' build-time option has been
      renamed to '-Dupdate-helper-user-timeout-sec=', and now takes an
      integer as parameter instead of a string.

    * The DDI image dissection logic (which backs RootImage= in service
      unit files, the --image= switch in various tools such as
      systemd-nspawn, as well as systemd-dissect) will now only mount file
      systems of types btrfs, ext4, xfs, erofs, squashfs, vfat. This list
      can be overridden via the $SYSTEMD_DISSECT_FILE_SYSTEMS environment
      variable. These file systems are fairly well supported and maintained
      in current kernels, while others are usually more niche, exotic or
      legacy and thus typically do not receive the same level of security
      support and fixes.

New components:

    * A new tool, 'ukify', to build, measure, and sign Unified Kernel Images
      (UKIs) has been added. This replaces functionality provided by
      'dracut --uefi' and extends it with automatic calculation of PE file
      offsets, insertion of signed PCR policies generated by
      systemd-measure, support for initrd concatenation, signing of the
      embedded Linux image and the combined image with sbsign, and
      heuristics to autodetect the kernel uname and verify the splash
      image.

Changes in systemd and units:

    * A new service type Type=notify-reload is defined. When such a unit is
      reloaded a UNIX process signal (typically SIGHUP) is sent to the main
      service process. The manager will then wait until it receives a
      "RELOADING=1" followed by a "READY=1" notification from the unit as
      response (via sd_notify()). Otherwise, this type is the same as
      Type=notify. A new setting ReloadSignal= may be used to change the
      signal to send from the default of SIGHUP.

      user@.service, systemd-networkd.service, systemd-udevd.service, and
      systemd-logind have been updated to this type.

    * Initrd environments which are not on a pure memory file system (e.g.
      overlayfs combination as opposed to tmpfs) are now supported. With
      this change, during the initrd → host transition ("switch root")
      systemd will no longer erase all files of the initrd unless it's
      backed by a memory file system such as tmpfs.

    * New per-unit MemoryZSwapMax= option has been added to configure
      memory.zswap.max cgroup properties (the maximum amount of zswap
      used).

    * A new LogFilterPatterns= option has been added for units. It may be
      used to specify accept/deny regular expressions for log messages
      generated by the unit, that shall be enforced by systemd-journald.
      Rejected messages are neither stored in the journal nor forwarded.
      This option may be used to suppress noisy or uninteresting messages
      from units.

    * The manager has a new
      org.freedesktop.systemd1.Manager.GetUnitByPIDFD() D-Bus method to
      query process ownership via a PIDFD, which is more resilient against
      PID recycling issues.

    * Scope units now support OOMPolicy=. Login session scopes default to
      OOMPolicy=continue, allowing login scopes to survive the OOM killer
      terminating some processes in the scope.

    * systemd-fstab-generator now supports x-systemd.makefs option for
      /sysroot/ (in the initrd).

    * The maximum rate at which daemon reloads are executed can now be
      limited with the new ReloadLimitIntervalSec=/ReloadLimitBurst=
      options. (Or the equivalent on the kernel command line:
      systemd.reload_limit_interval_sec=/systemd.reload_limit_burst=).  In
      addition, systemd now logs the originating unit and PID when a reload
      request is received over D-Bus.

    * When enabling a swap device systemd will now reinitialize the device
      when the page size of the swap space does not match the page size of
      the running kernel.

    * systemd now executes generator programs in a mount namespace
      "sandbox" with most of the file system read-only and write access
      restricted to the output directories, and with a temporary /tmp/
      mount provided. This provides a safeguard against programming errors
      in the generators, but also fixes here-docs in shells, which
      previously didn't work in early boot when /tmp/ wasn't available
      yet. (This feature has no security implications, because the code is
      still privileged and can trivially exit the sandbox.)

    * The system manager will now parse a new "vmm.notify_socket"
      system credential, which may be supplied to a VM via SMBIOS. If
      found, it will send a "READY=1" notification on the specified socket
      after boot is complete. This allows readiness notification to be sent
      from a VM guest to the VM host over a VSOCK socket.

    * The sample PAM configuration file for systemd-user@.service now
      includes a call to pam_namespace. This puts children of user@.service
      in the expected namespace. (Many distributions replace their file
      with something custom, so this change has limited effect.)

    * A new environment variable $SYSTEMD_DEFAULT_MOUNT_RATE_LIMIT_BURST
      can be used to override the mount units' burst rate limit for
      parsing '/proc/self/mountinfo', which was introduced in
      v249. Defaults to 5.

    * Drop-ins for init.scope changing control group resource limits are
      now applied, while they were previously ignored.

    * New build-time configuration options '-Ddefault-timeout-sec=' and
      '-Ddefault-user-timeout-sec=' have been added, to let distributions
      choose the default timeout for starting/stopping/aborting system and
      user units respectively.

    * Service units gained a new setting OpenFile= which may be used to
      open arbitrary files in the file system (or connect to arbitrary
      AF_UNIX sockets in the file system), and pass the open file
      descriptor to the invoked process via the usual file descriptor
      passing protocol. This is useful to give unprivileged services access
      to select files which have restrictive access modes that would
      normally not allow this. It's also useful in case RootDirectory= or
      RootImage= is used to allow access to files from the host environment
      (which is after all not visible from the service if these two options
      are used.)

Changes in udev:

    * The new net naming scheme "v253" has been introduced. In the new
      scheme, ID_NET_NAME_PATH is also set for USB devices not connected via
      a PCI bus. This extends the coverage of predictable interface names
      in some embedded systems.

      The "amba" bus path is now included in ID_NET_NAME_PATH, resulting in
      a more informative path on some embedded systems.

    * Partition block devices will now also get symlinks in
      /dev/disk/by-diskseq/<seq>-part<n>, which may be used to reference
      block device nodes via the kernel's "diskseq" value. Previously those
      symlinks were only created for the main block device.

    * A new operator '-=' is supported for SYMLINK variables. This allows
      symlinks to be unconfigured even if an earlier rule added them.

    * 'udevadm --trigger --settle' now also works for network devices
  ...

systemd v253-rc1

24 Jan 23:14
v253-rc1
Pre-release

systemd System and Service Manager

CHANGES WITH 253 in spe:

Deprecations and incompatible changes:

    * systemctl will now warn when invoked without /proc mounted (e.g. when
      invoked after chroot into an image without the API mount points like
      /proc being set up.)  Operation in such an environment is not fully
      supported.

    * The return value of 'systemctl is-active|is-enabled|is-failed' for
      unknown units is changed: previously 1 or 3 were returned, but now 4
      (EXIT_PROGRAM_OR_SERVICES_STATUS_UNKNOWN) is used as documented.

    * 'udevadm hwdb' subcommand is deprecated and will emit a warning.
      systemd-hwdb (added in 2014) should be used instead.

    * 'bootctl --json' now outputs well-formed JSON, instead of a stream
      of newline-separated JSON objects.

    * Udev rules in 60-evdev.rules have been changed to load hwdb properties
      for all modalias patterns. Previously only the first matching pattern
      was used. This could change what properties are assigned if the user
      has more and less specific patterns that could match the same device,
      but it is expected that the change will have no effect for most users.

    * systemd-networkd-wait-online exits successfully when all interfaces
      are ready or unmanaged. Previously, if neither '--any' nor
      '--interface=' options were used, at least one interface had to be in
      configured state. This change allows the case, where systemd-networkd
      is enabled but no interfaces are configured, to be handled
      gracefully. It may occur in particular when a different network
      manager is also enabled and used.

    * Some compatibility helpers were dropped: EmergencyAction= in the user
      manager, measuring kernel command line into PCR 8 along with the
      -Defi-tpm-pcr-compat compile-time option.

New components:

    * A new tool, 'ukify', to build, measure, and sign Unified Kernel Images
      (UKIs) has been added. This replaces functionality provided by
      'dracut --uefi' and extends it with automatic calculation of offsets,
      insertion of signed PCR policies generated by systemd-measure,
      support for initrd concatenation, signing of the embedded Linux image
      and the combined image with sbsign, and heuristics to autodetect the
      kernel uname and verify the splash image.

Changes in systemd and units:

    * A new unit type Type=notify-reload is defined. When such a unit is
      reloaded via a signal, the manager will wait until it receives a
      "READY=1" notification from the unit. Otherwise, this type is the
      same as Type=notify.

      user@.service, systemd-networkd.service, systemd-udevd.service, and
      systemd-logind have been updated to this type; their reloads are now
      synchronous.
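A minimal Type=notify-reload service in Python, as an added example; it is not systemd's code and not part of these release notes. It follows the reload contract described in this entry and in the v254-rc2 and v253-rc2 entries above: the manager sends SIGHUP (or the configured ReloadSignal=) to the main process and then waits for "RELOADING=1" followed by "READY=1" over the sd_notify() socket. Two details come from systemd.service(5) and sd_notify(3) rather than from the notes: the RELOADING=1 message must carry MONOTONIC_USEC= with the current CLOCK_MONOTONIC time in microseconds, and a $NOTIFY_SOCKET value starting with '@' names a Linux abstract socket. The key=value config format, the Service class and the unit snippet in the trailing comment are constructed for illustration.

```python
"""Skeleton of a Type=notify-reload service (added example, not systemd code).

Assumptions taken from systemd.service(5) / sd_notify(3), not from the release
notes: RELOADING=1 must be accompanied by MONOTONIC_USEC=, and a NOTIFY_SOCKET
value starting with '@' is an abstract AF_UNIX socket.
"""
import os
import signal
import socket
import time


def notify_socket_address(env_value):
    """Translate $NOTIFY_SOCKET into an AF_UNIX address; '@name' denotes a
    Linux abstract socket, whose address starts with a NUL byte."""
    if env_value.startswith("@"):
        return "\0" + env_value[1:]
    return env_value


def send_notify(state, env=None):
    """Send one sd_notify(3) datagram to the service manager. Returns False
    when $NOTIFY_SOCKET is unset (no manager is listening), like sd_notify()."""
    env = os.environ if env is None else env
    path = env.get("NOTIFY_SOCKET")
    if not path:
        return False
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.connect(notify_socket_address(path))
        sock.sendall(state.encode())
    return True


def reload_messages(monotonic_usec):
    """The two notifications a reload must produce, in this order."""
    return [f"RELOADING=1\nMONOTONIC_USEC={monotonic_usec}", "READY=1"]


def parse_config(text):
    """Tiny key=value parser standing in for the service's real configuration."""
    cfg = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        cfg[key.strip()] = value.strip()
    return cfg


class Service:
    def __init__(self, config_source, notify=send_notify):
        self.config_source = config_source  # callable returning config text
        self.notify = notify                # callable taking one state string
        self.config = {}
        self.reloads = 0

    def load_config(self):
        self.config = parse_config(self.config_source())
        return self.config

    def handle_reload(self, *_signal_args):
        """SIGHUP handler: announce the reload, apply the new configuration,
        then re-announce readiness. The manager's reload job completes only
        when READY=1 arrives, so a configuration that fails to load keeps the
        previous one and still reports ready, with the failure in STATUS=."""
        now_usec = time.clock_gettime_ns(time.CLOCK_MONOTONIC) // 1000
        reloading, ready = reload_messages(now_usec)
        self.notify(reloading)
        try:
            self.load_config()
        except OSError as err:
            ready += f"\nSTATUS=reload failed, keeping old config: {err}"
        self.notify(ready)
        self.reloads += 1
        return self.config

    def run(self):
        self.load_config()
        signal.signal(signal.SIGHUP, self.handle_reload)
        self.notify("READY=1")
        while True:
            signal.pause()


if __name__ == "__main__":
    # Constructed unit for this script:
    #   [Service]
    #   Type=notify-reload
    #   ExecStart=/usr/bin/python3 /path/to/this/script.py /etc/example.conf
    import sys
    config_path = sys.argv[1]
    Service(lambda: open(config_path).read()).run()
```

With the unit above, `systemctl reload` sends SIGHUP and blocks until READY=1 comes back. Reporting READY=1 even when the new configuration cannot be read is a deliberate choice: the service is still serving with its old configuration, and withholding READY=1 would leave the reload job hanging until the manager's timeout fires.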

    * Initrd environments which are not on a temporary file system (for
      example an overlayfs combination) are now supported. Systemd will only
      skip removal of the files in the initrd if it doesn't detect a
      temporary file system.

    * New MemoryZSwapMax= option has been added to configure
      memory.zswap.max cgroup properties (the maximum amount of zswap used).

    * New LogFilterPatterns= option can be used to specify regexp
      accept/deny patterns for log entries generated by the unit. Based on
      the option value, the manager sets the
      user.journald_log_filter_patterns extended attribute on the unit
      cgroup. systemd-journald checks for this attribute when receiving
      messages, and will filter messages by matching the MESSAGE= part.
      Rejected messages are neither stored in the journal nor forwarded.
      This option can be used to filter noisy or uninteresting messages
      from units.
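The accept/deny semantics of LogFilterPatterns= replayed over a few constructed log messages, as an added example; this is not systemd-journald's implementation. The precedence is taken from systemd.exec(5), not from this note: deny patterns (prefixed with '~') are tested first, then, if any allow patterns are configured, a message must match at least one of them to be kept, and an empty assignment resets the list. Matching is unanchored, as with journald's regex matching. The sample messages and the pattern sets DENY_DEBUG and AUTH_ONLY are constructed for this example.

```python
"""LogFilterPatterns= accept/deny semantics over constructed log lines
(added example, not journald's code). Precedence per systemd.exec(5):
deny first, then allow if any allow pattern exists, empty value resets."""
import re


def parse_log_filter_patterns(assignments):
    """Turn a list of LogFilterPatterns= values into (deny, allow) regex lists."""
    deny, allow = [], []
    for value in assignments:
        if value == "":
            deny, allow = [], []        # "LogFilterPatterns=" resets everything
        elif value.startswith("~"):
            deny.append(re.compile(value[1:]))
        else:
            allow.append(re.compile(value))
    return deny, allow


def accept_message(message, deny, allow):
    """True if journald would store/forward the MESSAGE=, False if rejected."""
    if any(p.search(message) for p in deny):
        return False
    if allow and not any(p.search(message) for p in allow):
        return False
    return True


def filter_log(messages, assignments):
    """Apply a unit's LogFilterPatterns= list to a sequence of MESSAGE= values."""
    deny, allow = parse_log_filter_patterns(assignments)
    return [m for m in messages if accept_message(m, deny, allow)]


# Constructed sample MESSAGE= values from a hypothetical unit.
SAMPLE_MESSAGES = [
    "Accepted connection from 192.0.2.10",
    "debug: heartbeat tick",
    "Failed password for invalid user admin",
    "debug: cache flushed",
    "Listening on port 8080",
]

# Drop only the noisy debug lines, as the release note suggests.
DENY_DEBUG = ["~^debug:"]
# Keep only security-relevant lines: deny debug, then allow two patterns.
AUTH_ONLY = ["~^debug:", "Failed password", "^Accepted connection"]


if __name__ == "__main__":
    for m in filter_log(SAMPLE_MESSAGES, AUTH_ONLY):
        print(m)
```

Messages rejected this way are gone for good, neither stored nor forwarded, so allow lists should stay broad enough to retain authentication and error lines; AUTH_ONLY shows the shape of such a list, while DENY_DEBUG is the safer default for merely suppressing noise.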

    * The manager has a new
      org.freedesktop.systemd1.Manager.GetUnitByPIDFD() method to query
      process ownership via a PIDFD, which is more resilient against PID
      recycling issues.

    * Scope units now support OOMPolicy=. Login session scopes default to
      OOMPolicy=continue, allowing login scopes to survive the OOM killer
      terminating some processes in the scope.

    * systemd-fstab-generator now supports x-systemd.makefs option for
      /sysroot (in the initrd).

    * The maximum rate at which daemon reloads are executed can now be
      limited with the new ReloadLimitIntervalSec=/ReloadLimitBurst=
      options. (Or the equivalent on the kernel command line:
      systemd.reload_limit_interval_sec=/systemd.reload_limit_burst=).
      In addition, systemd now logs the originating unit and PID when
      a reload request is received over D-Bus.

    * When enabling a swap device, instead of failing, systemd will now
      reinitialize the device when the page size of the swap space does not
      match the page size of the running kernel.

    * Systemd now executes generators in a mount namespace "sandbox" with
      most of the file system read-only, but with write access to the
      output directories, and with a temporary /tmp/ mount provided. This
      provides a safeguard against programming errors in the generators,
      but also fixes here-docs in shells, which previously didn't work in
      early boot when /tmp/ wasn't available yet. (This feature has no
      security implications, because the code is still privileged and can
      trivially exit the sandbox.)

    * The manager will load the vmm.notify_socket credential. If found,
      it will send a "READY=1" notification on the specified socket after
      boot is complete. This allows readiness notification to be sent
      from a VM guest to the host over a VSOCK socket.

    * The sample PAM configuration file for systemd-user@.service now
      includes a call to pam_namespace. This puts children of user@.service
      in the expected namespace. (Many distributions replace their file
      with something custom, so this change has limited effect.)

    * A new environment variable $SYSTEMD_DEFAULT_MOUNT_RATE_LIMIT_BURST can
      be used to override the mount units' burst rate limit for parsing
      '/proc/self/mountinfo', which was introduced in v249. Defaults to 5.

    * Drop-ins for init.scope changing control group resource limits are
      now applied, while they were previously ignored.

Changes in udev:

    * The new net naming scheme "v253" has been introduced. In the new
      scheme, ID_NET_NAME_PATH is also set for USB devices not connected via
      a PCI bus. This extends the coverage of predictable interface names
      in some embedded systems.

      The "amba" bus path is now included in ID_NET_NAME_PATH, resulting in
      a more informative path on some embedded systems.

    * Block partitions will now also get symlinks in
      /dev/disk/by-diskseq/<seq>-part<n>, which may be used to reference
      block device nodes via the kernel's "diskseq" value. Previously those
      symlinks were only created for the main block device.

    * A new operator '-=' is supported for SYMLINK variables. This allows
      symlinks to be unconfigured even if an earlier rule added them.

    * 'udevadm --trigger --settle' now also works for network devices
      that are being renamed.

Changes in sd-boot, bootctl, and the Boot Loader Specification:

    * systemd-boot now passes its random seed directly to the kernel's RNG
      via the LINUX_EFI_RANDOM_SEED_TABLE_GUID configuration table, which
      means the RNG gets seeded very early in boot before userspace has
      started.

    * systemd-boot will pass a random seed when secure boot is enabled if
      it can additionally get a random seed from EFI itself, via EFI's RNG
      protocol or a prior seed in LINUX_EFI_RANDOM_SEED_TABLE_GUID from a
      preceding bootloader.

    * systemd-boot-system-token.service was renamed to
      systemd-boot-random-seed.service and extended to always save the
      random seed to ESP on every boot when a compatible boot loader is
      used. This allows a refreshed random seed to be used in the boot
      loader.

    * systemd-boot handles various seed inputs using a domain- and
      field-separated hashing scheme.

    * systemd-boot's 'random-seed-mode' option has been removed. A system
      token is now always required to be present for random seeds to be
      used.

    * systemd-boot now supports being loaded not from the ESP, for example
      for direct kernel boot under QEMU or when embedded into the firmware.

    * systemd-boot now parses SMBIOS info to detect virtualization. This
      information is used to skip some warnings which are not useful in a
      VM and to conditionalize other aspects of behaviour.

    * systemd-stub now processes random seeds in the same way as
      systemd-boot, in case a unified kernel image is being used from a
      different bootloader than systemd-boot.

    * bootctl will now generate a system token on all EFI systems, even
      virtualized ones; this is triggered when the system token is missing
      on systems booted via either sd-boot or sd-stub.

    * bootctl now implements two new verbs: 'kernel-identify' prints the
      type of a kernel image, and 'kern...

systemd v252

31 Oct 19:38
v252

systemd System and Service Manager

CHANGES WITH 252 🎃:

Announcements of Future Feature Removals:

    * We intend to remove cgroup v1 support from a systemd release after the
      end of 2023. If you run services that make explicit use of cgroup v1
      features (i.e. the "legacy hierarchy" with separate hierarchies for
      each controller), please implement compatibility with cgroup v2 (i.e.
      the "unified hierarchy") sooner rather than later. Most of Linux
      userspace has been ported over already.

    * We intend to remove support for split-usr (/usr mounted separately
      during boot) and unmerged-usr (parallel directories /bin and
      /usr/bin, /lib and /usr/lib, etc). This will happen in the second
      half of 2023, in the first release that falls into that time window.
      For more details, see:
      https://lists.freedesktop.org/archives/systemd-devel/2022-September/048352.html

Compatibility Breaks:

    * ConditionKernelVersion= checks that use the '=' or '!=' operators
      will now do simple string comparisons (instead of version comparisons
      à la strverscmp()). Version comparisons are still done for the
      ordering operators '<', '>', '<=', '>='. Moreover, if no operator is
      specified, a shell-style glob match is now done. This creates a minor
      incompatibility compared to older systemd versions when the '*', '?',
      '[', ']' characters are used, as these will now match as shell globs
      instead of literally. Given that kernel version strings typically do
      not include these characters we expect little breakage through this
      change.

    * The service manager will now read the SELinux label used for SELinux
      access checks from the unit file at the time it loads the file.
      Previously, the label would be read at the moment of the access
      check, which was problematic since at that time the unit file might
      already have been updated or removed.

New Features:

    * systemd-measure is a new tool for calculating and signing expected
      TPM2 PCR values for a given unified kernel image (UKI) booted via
      sd-stub. The public key used for the signature and the signed
      expected PCR information can be embedded inside the UKI. This
      information can be extracted from the UKI by external tools and code
      in the image itself and is made available to userspace in the booted
      kernel.

      systemd-cryptsetup, systemd-cryptenroll, and systemd-creds have been
      updated to make use of this information if available in the booted
      kernel: when locking an encrypted volume/credential to the TPM
      systemd-cryptenroll/systemd-creds will use the public key to bind the
      volume/credential to any kernel that carries PCR information signed
      by the same key pair. When unlocking such volumes/credentials
      systemd-cryptsetup/systemd-creds will use the signature embedded in
      the booted UKI to gain access.

      Binding TPM-based disk encryption to public keys/signatures of PCR
      values, instead of literal PCR values, addresses the inherent
      "brittleness" of traditional PCR-bound TPM disk encryption schemes:
      disks remain accessible even if the UKI is updated, without any TPM
      specific preparation during the OS update, as long as each UKI
      carries the necessary PCR signature information.

      Net effect: if you boot a properly prepared kernel, TPM-bound disk
      encryption now defaults to be locked to kernels which carry PCR
      signatures from the same key pair. Example: if a hypothetical distro
      FooOS prepares its UKIs like this, TPM-based disk encryption is now –
      by default – bound to only FooOS kernels, and encrypted volumes bound
      to the TPM cannot be unlocked on kernels from other sources. (But do
      note this behaviour requires preparation/enabling in the UKI, and of
      course users can always enroll non-TPM ways to unlock the volume.)

    * systemd-pcrphase is a new tool that is invoked at six places during
      system runtime, and measures additional words into TPM2 PCR 11, to
      mark milestones of the boot process. This allows binding access to
      specific TPM2-encrypted secrets to specific phases of the boot
      process. (Example: LUKS2 disk encryption key only accessible in the
      initrd, but not later.)

Changes in systemd itself, i.e. the manager and units:

    * The cpu controller is delegated to user manager units by default, and
      CPUWeight= settings are applied to the top-level user slice units
      (app.slice, background.slice, session.slice). This provides a degree
      of resource isolation between different user services competing for
      the CPU.

    * Systemd can optionally do a full preset in the "first boot" condition
      (instead of just enable-only). This behaviour is controlled by the
      compile-time option -Dfirst-boot-full-preset. Right now it defaults
      to 'false', but the plan is to switch it to 'true' for the subsequent
      release.

    * Drop-ins are now allowed for transient units too.

    * Systemd will set the taint flag 'support-ended' if it detects that
      the OS image is past its end-of-support date. This date is declared
      in a new /etc/os-release field SUPPORT_END= described below.

    * Two new settings ConditionCredential= and AssertCredential= can be
      used to skip or fail units if a certain system credential is not
      provided.

    * ConditionMemory= accepts size suffixes (K, M, G, T, …).

    * DefaultSmackProcessLabel= can be used in system.conf and user.conf to
      specify the SMACK security label to use when not specified in a unit
      file.

    * DefaultDeviceTimeoutSec= can be used in system.conf and user.conf to
      specify the default timeout when waiting for device units to
      activate.

    * C.UTF-8 is used as the default locale if nothing else has been
      configured.

    * [Condition|Assert]Firmware= have been extended to support certain
      SMBIOS fields. For example

        ConditionFirmware=smbios-field(board_name = "Custom Board")

      conditionalizes the unit to run only when
      /sys/class/dmi/id/board_name contains "Custom Board" (without the
      quotes).

    * ConditionFirstBoot= now correctly evaluates as true only during the
      boot phase of the first boot. A unit executed later, after booting
      has completed, will no longer evaluate this condition as true.

    * Socket units will now create sockets in the SELinuxContext= of the
      associated service unit, if any.

    * Boot phase transitions (start initrd → exit initrd → boot complete →
      shutdown) will be measured into TPM2 PCR 11, so that secrets can be
      bound to a specific runtime phase. E.g.: a LUKS encryption key can be
      unsealed only in the initrd.

    * Service credentials (i.e. SetCredential=/LoadCredential=/…) will now
      also be provided to ExecStartPre= processes.

    * Various units are now correctly ordered against
      initrd-switch-root.target where previously a conflict without
      ordering was configured. A stop job for those units would be queued,
      but without the ordering it could be executed only after
      initrd-switch-root.service, leading to units not being restarted in
      the host system as expected.

    * In order to fully support the IPMI watchdog driver, which has not yet
      been ported to the new common watchdog device interface,
      /dev/watchdog0 will be tried first and systemd will silently fall back
      to /dev/watchdog if it is not found.

    * New watchdog-related D-Bus properties are now published by systemd:
      WatchdogDevice, WatchdogLastPingTimestamp,
      WatchdogLastPingTimestampMonotonic.

    * At shutdown, API virtual file systems (proc, sys, etc.) will be
      unmounted lazily.

    * At shutdown, systemd will now log about processes blocking unmounting
      of file systems.

    * A new meson build option 'clock-valid-range-usec-max' was added to
      allow disabling system time correction if RTC returns a timestamp far
      in the future.

    * Propagated restart jobs will no longer be discarded while a unit is
      activating.

    * PID 1 will now import system credentials from SMBIOS Type 11 fields
      ("OEM vendor strings"), in addition to qemu_fwcfg. This provides a
      simple, fast and generic path for supplying credentials to a VM,
      without involving external tools such as cloud-init/ignition.

    * The CPUWeight= setting of unit files now accepts a new special value
      "idle", which configures "idle" level scheduling for the unit.

    * Service processes that are activated due to a .timer or .path unit
      triggering will now receive information about this via environment
      variables. Note that this information is lossy, as activation
      might be coalesced and only one of the activating triggers will be
      reported. This is hence more suited for debugging or tracing rather
      than for behaviour decisions.

    * The riscv_flush_icache(2) system call has been added to the list of
      system calls allowed by default when ...

systemd v252-rc3

24 Oct 21:09
v252-rc3
876d7e0
systemd v252-rc3 Pre-release

systemd System and Service Manager

CHANGES WITH 252 in spe:

Announcements of Future Feature Removals:

    * We intend to remove cgroup v1 support from a systemd release after the
      end of 2023. If you run services that make explicit use of cgroup v1
      features (i.e. the "legacy hierarchy" with separate hierarchies for
      each controller), please implement compatibility with cgroup v2 (i.e.
      the "unified hierarchy") sooner rather than later. Most of Linux
      userspace has been ported over already.

    * We intend to remove support for split-usr (/usr mounted separately
      during boot) and unmerged-usr (parallel directories /bin and
      /usr/bin, /lib and /usr/lib, etc). This will happen in the second
      half of 2023, in the first release that falls into that time window.
      For more details, see:
      https://lists.freedesktop.org/archives/systemd-devel/2022-September/048352.html

Compatibility Breaks:

    * ConditionKernelVersion= checks that use the '=' or '!=' operators
      will now do simple string comparisons (instead of version comparisons
      à la strverscmp()). Version comparisons are still done for the
      ordering operators '<', '>', '<=', '>='. Moreover, if no operator is
      specified, a shell-style glob match is now done. This creates a minor
      incompatibility compared to older systemd versions when the '*', '?',
      '[', ']' characters are used, as these will now match as shell globs
      instead of literally. Given that kernel version strings typically do
      not include these characters we expect little breakage through this
      change.

    * The service manager will now read the SELinux label used for SELinux
      access checks from the unit file at the time it loads the file.
      Previously, the label would be read at the moment of the access
      check, which was problematic since at that time the unit file might
      already have been updated or removed.

New Features:

    * systemd-measure is a new tool for calculating and signing expected
      TPM2 PCR values for a given unified kernel image (UKI) booted via
      sd-stub. The public key used for the signature and the signed
      expected PCR information can be embedded inside the UKI. This
      information can be extracted from the UKI by external tools and code
      in the image itself and is made available to userspace in the booted
      kernel.

      systemd-cryptsetup, systemd-cryptenroll, and systemd-creds have been
      updated to make use of this information if available in the booted
      kernel: when locking an encrypted volume/credential to the TPM
      systemd-cryptenroll/systemd-creds will use the public key to bind the
      volume/credential to any kernel that carries PCR information signed
      by the same key pair. When unlocking such volumes/credentials
      systemd-cryptsetup/systemd-creds will use the signature embedded in
      the booted UKI to gain access.

      Binding TPM-based disk encryption to public keys/signatures of PCR
      values – instead of literal PCR values – addresses the inherent
      "brittleness" of traditional PCR-bound TPM disk encryption schemes:
      disks remain accessible even if the UKI is updated, without any TPM
      specific preparation during the OS update – as long as each UKI
      carries the necessary PCR signature information.

      Net effect: if you boot a properly prepared kernel, TPM-bound disk
      encryption now defaults to be locked to kernels which carry PCR
      signatures from the same key pair. Example: if a hypothetical distro
      FooOS prepares its UKIs like this, TPM-based disk encryption is now –
      by default – bound to only FooOS kernels, and encrypted volumes bound
      to the TPM cannot be unlocked on kernels from other sources. (But do
      note this behaviour requires preparation/enabling in the UKI, and of
      course users can always enroll non-TPM ways to unlock the volume.)

    * systemd-pcrphase is a new tool that is invoked at six places during
      system runtime, and measures additional words into TPM2 PCR 11, to
      mark milestones of the boot process. This allows binding access to
      specific TPM2-encrypted secrets to specific phases of the boot
      process. (Example: LUKS2 disk encryption key only accessible in the
      initrd, but not later.)

Changes in systemd itself, i.e. the manager and units:

    * The cpu controller is delegated to user manager units by default, and
      CPUWeight= settings are applied to the top-level user slice units
      (app.slice, background.slice, session.slice). This provides a degree
      of resource isolation between different user services competing for
      the CPU.

    * Systemd can optionally do a full preset in the "first boot" condition
      (instead of just enable-only). This behaviour is controlled by the
      compile-time option -Dfirst-boot-full-preset. Right now it defaults
      to 'false', but the plan is to switch it to 'true' for the subsequent
      release.

    * Drop-ins are now allowed for transient units too.

    * Systemd will set the taint flag 'support-ended' if it detects that
      the OS image is past its end-of-support date. This date is declared
      in a new /etc/os-release field SUPPORT_END= described below.

    * Two new settings ConditionCredential= and AssertCredential= can be
      used to skip or fail units if a certain system credential is not
      provided.

    * ConditionMemory= accepts size suffixes (K, M, G, T, …).

    * DefaultSmackProcessLabel= can be used in system.conf and user.conf to
      specify the SMACK security label to use when not specified in a unit
      file.

    * DefaultDeviceTimeoutSec= can be used in system.conf and user.conf to
      specify the default timeout when waiting for device units to
      activate.

    * C.UTF-8 is used as the default locale if nothing else has been
      configured.

    * [Condition|Assert]Firmware= have been extended to support certain
      SMBIOS fields. For example

        ConditionFirmware=smbios-field(board_name = "Custom Board")

      conditionalizes the unit to run only when
      /sys/class/dmi/id/board_name contains "Custom Board" (without the
      quotes).

    * ConditionFirstBoot= now correctly evaluates as true only during the
      boot phase of the first boot. A unit executed later, after booting
      has completed, will no longer evaluate this condition as true.

    * Socket units will now create sockets in the SELinuxContext= of the
      associated service unit, if any.

    * Boot phase transitions (start initrd → exit initrd → boot complete →
      shutdown) will be measured into TPM2 PCR 11, so that secrets can be
      bound to a specific runtime phase. E.g.: a LUKS encryption key can be
      unsealed only in the initrd.

    * Service credentials (i.e. SetCredential=/LoadCredential=/…) will now
      also be provided to ExecStartPre= processes.

    * Various units are now correctly ordered against
      initrd-switch-root.target where previously a conflict without
      ordering was configured. A stop job for those units would be queued,
      but without the ordering it could be executed only after
      initrd-switch-root.service, leading to units not being restarted in
      the host system as expected.

    * In order to fully support the IPMI watchdog driver, which has not yet
      been ported to the new common watchdog device interface,
      /dev/watchdog0 will be tried first and systemd will silently fall back
      to /dev/watchdog if it is not found.

    * New watchdog-related D-Bus properties are now published by systemd:
      WatchdogDevice, WatchdogLastPingTimestamp,
      WatchdogLastPingTimestampMonotonic.

    * At shutdown, API virtual file systems (proc, sys, etc.) will be
      unmounted lazily.

    * At shutdown, systemd will now log about processes blocking unmounting
      of file systems.

    * A new meson build option 'clock-valid-range-usec-max' was added to
      allow disabling system time correction if RTC returns a timestamp far
      in the future.

    * Propagated restart jobs will no longer be discarded while a unit is
      activating.

    * PID 1 will now import system credentials from SMBIOS Type 11 fields
      ("OEM vendor strings"), in addition to qemu_fwcfg. This provides a
      simple, fast and generic path for supplying credentials to a VM,
      without involving external tools such as cloud-init/ignition.

    * The CPUWeight= setting of unit files now accepts a new special value
      "idle", which configures "idle" level scheduling for the unit.

    * Service processes that are activated due to a .timer or .path unit
      triggering will now receive information about this via environment
      variables. Note that this information is lossy, as activation
      might be coalesced and only one of the activating triggers will be
      reported. This is hence more suited for debugging or tracing rather
      than for behaviour decisions.

    * The riscv_flush_icache(2) system call has been added to the list of
      system calls allowed by default when ...

systemd v252-rc2

18 Oct 22:12
v252-rc2
systemd v252-rc2 Pre-release

CHANGES WITH 252 in spe:

Announcements of Future Feature Removals:

    * We intend to remove cgroup v1 support from a systemd release after the
      end of 2023. If you run services that make explicit use of cgroup v1
      features (i.e. the "legacy hierarchy" with separate hierarchies for
      each controller), please implement compatibility with cgroup v2 (i.e.
      the "unified hierarchy") sooner rather than later. Most of Linux
      userspace has been ported over already.

    * We intend to remove support for split-usr (/usr mounted separately
      during boot) and unmerged-usr (parallel directories /bin and
      /usr/bin, /lib and /usr/lib, etc). This will happen in the second
      half of 2023, in the first release that falls into that time window.
      For more details, see:
      https://lists.freedesktop.org/archives/systemd-devel/2022-September/048352.html

Compatibility Breaks:

    * ConditionKernelVersion= checks that use the '=' or '!=' operators
      will now do simple string comparisons (instead of version comparisons
      à la strverscmp()). Version comparisons are still done for the
      ordering operators '<', '>', '<=', '>='. Moreover, if no operator is
      specified, a shell-style glob match is now done. This creates a minor
      incompatibility compared to older systemd versions when the '*', '?',
      '[', ']' characters are used, as these will now match as shell globs
      instead of literally. Given that kernel version strings typically do
      not include these characters we expect little breakage through this
      change.

    * The service manager will now read the SELinux label used for SELinux
      access checks from the unit file at the time it loads the file.
      Previously, the label would be read at the moment of the access
      check, which was problematic since at that time the unit file might
      already have been updated or removed.

New Features:

    * systemd-measure is a new tool for calculating and signing expected
      TPM2 PCR values for a given unified kernel image (UKI) booted via
      sd-stub. The public key used for the signature and the signed
      expected PCR information can be embedded inside the UKI. This
      information can be extracted from the UKI by external tools and code
      in the image itself and is made available to userspace in the booted
      kernel.

      systemd-cryptsetup, systemd-cryptenroll, and systemd-creds have been
      updated to make use of this information if available in the booted
      kernel: when locking an encrypted volume/credential to the TPM
      systemd-cryptenroll/systemd-creds will use the public key to bind the
      volume/credential to any kernel that carries PCR information signed
      by the same key pair. When unlocking such volumes/credentials
      systemd-cryptsetup/systemd-creds will use the signature embedded in
      the booted UKI to gain access.

      Binding TPM-based disk encryption to public keys/signatures of PCR
      values – instead of literal PCR values – addresses the inherent
      "brittleness" of traditional PCR-bound TPM disk encryption schemes:
      disks remain accessible even if the UKI is updated, without any TPM
      specific preparation during the OS update – as long as each UKI
      carries the necessary PCR signature information.

      Net effect: if you boot a properly prepared kernel, TPM-bound disk
      encryption now defaults to be locked to kernels which carry PCR
      signatures from the same key pair. Example: if a hypothetical distro
      FooOS prepares its UKIs like this, TPM-based disk encryption is now –
      by default – bound to only FooOS kernels, and encrypted volumes bound
      to the TPM cannot be unlocked on kernels from other sources. (But do
      note this behaviour requires preparation/enabling in the UKI, and of
      course users can always enroll non-TPM ways to unlock the volume.)

    * systemd-pcrphase is a new tool that is invoked at six places during
      system runtime, and measures additional words into TPM2 PCR 11, to
      mark milestones of the boot process. This allows binding access to
      specific TPM2-encrypted secrets to specific phases of the boot
      process. (Example: LUKS2 disk encryption key only accessible in the
      initrd, but not later.)

Changes in systemd itself, i.e. the manager and units:

    * The cpu controller is delegated to user manager units by default, and
      CPUWeight= settings are applied to the top-level user slice units
      (app.slice, background.slice, session.slice). This provides a degree
      of resource isolation between different user services competing for
      the CPU.

    * Systemd can optionally do a full preset in the "first boot" condition
      (instead of just enable-only). This behaviour is controlled by the
      compile-time option -Dfirst-boot-full-preset. Right now it defaults
      to 'false', but the plan is to switch it to 'true' for the subsequent
      release.

    * Drop-ins are now allowed for transient units too.

    * Systemd will set the taint flag 'support-ended' if it detects that
      the OS image is past its end-of-support date. This date is declared
      in a new /etc/os-release field SUPPORT_END= described below.

    * Two new settings ConditionCredential= and AssertCredential= can be
      used to skip or fail units if a certain system credential is not
      provided.

    * ConditionMemory= accepts size suffixes (K, M, G, T, …).

    * DefaultSmackProcessLabel= can be used in system.conf and user.conf to
      specify the SMACK security label to use when not specified in a unit
      file.

    * DefaultDeviceTimeoutSec= can be used in system.conf and user.conf to
      specify the default timeout when waiting for device units to
      activate.

    * C.UTF-8 is used as the default locale if nothing else has been
      configured.

    * [Condition|Assert]Firmware= have been extended to support certain
      SMBIOS fields. For example

        ConditionFirmware=smbios-field(board_name = "Custom Board")

      conditionalizes the unit to run only when
      /sys/class/dmi/id/board_name contains "Custom Board" (without the
      quotes).

    * ConditionFirstBoot= now correctly evaluates as true only during the
      boot phase of the first boot. A unit executed later, after booting
      has completed, will no longer evaluate this condition as true.

    * Socket units will now create sockets in the SELinuxContext= of the
      associated service unit, if any.

    * Boot phase transitions (start initrd → exit initrd → boot complete →
      shutdown) will be measured into TPM2 PCR 11, so that secrets can be
      bound to a specific runtime phase. E.g.: a LUKS encryption key can be
      unsealed only in the initrd.

    * Service credentials (i.e. SetCredential=/LoadCredential=/…) will now
      also be provided to ExecStartPre= processes.

    * Various units are now correctly ordered against
      initrd-switch-root.target where previously a conflict without
      ordering was configured. A stop job for those units would be queued,
      but without the ordering it could be executed only after
      initrd-switch-root.service, leading to units not being restarted in
      the host system as expected.

    * In order to fully support the IPMI watchdog driver, which has not yet
      been ported to the new common watchdog device interface,
      /dev/watchdog0 will be tried first and systemd will silently fall back
      to /dev/watchdog if it is not found.

    * New watchdog-related D-Bus properties are now published by systemd:
      WatchdogDevice, WatchdogLastPingTimestamp,
      WatchdogLastPingTimestampMonotonic.

    * At shutdown, API virtual file systems (proc, sys, etc.) will be
      unmounted lazily.

    * At shutdown, systemd will now log about processes blocking unmounting
      of file systems.

    * A new meson build option 'clock-valid-range-usec-max' was added to
      allow disabling system time correction if RTC returns a timestamp far
      in the future.

    * Propagated restart jobs will no longer be discarded while a unit is
      activating.

    * PID 1 will now import system credentials from SMBIOS Type 11 fields
      ("OEM vendor strings"), in addition to qemu_fwcfg. This provides a
      simple, fast and generic path for supplying credentials to a VM,
      without involving external tools such as cloud-init/ignition.

    * The CPUWeight= setting of unit files now accepts a new special value
      "idle", which configures "idle" level scheduling for the unit.

    * Service processes that are activated due to a .timer or .path unit
      triggering will now receive information about this via environment
      variables. Note that this information is lossy, as activation
      might be coalesced and only one of the activating triggers will be
      reported. This is hence more suited for debugging or tracing rather
      than for behaviour decisions.

    * The riscv_flush_icache(2) system call has been added to the list of
      system calls allowed by default when SystemCallFilter= is used.

    ...

systemd v252-rc1

07 Oct 15:27
v252-rc1
systemd v252-rc1 Pre-release

CHANGES WITH 252 in spe:

Announcement of Future Feature Removal:

    * We intend to remove cgroup v1 support from a systemd release after the
      end of 2023. If you run services that make explicit use of cgroup v1
      features (i.e. the "legacy hierarchy" with separate hierarchies for
      each controller), please implement compatibility with cgroup v2 (i.e.
      the "unified hierarchy") sooner rather than later. Most of Linux
      userspace has been ported over already.

    * We intend to remove support for split-usr (/usr mounted separately
      during boot) and unmerged-usr (parallel directories /bin and
      /usr/bin, /lib and /usr/lib, etc). This will happen in the second
      half of 2023, in the first release that falls into that time window.
      For more details, see:
      https://lists.freedesktop.org/archives/systemd-devel/2022-September/048352.html

Compatibility Breaks:

    * ConditionKernelVersion= checks that use the '=' or '!=' operators
      will now do simple string comparisons (instead of version comparisons
      à la strverscmp()). Version comparisons are still done for the
      ordering operators '<', '>', '<=', '>='. Moreover, if no operator is
      specified, a shell-style glob match is now done. This creates a minor
      incompatibility compared to older systemd versions when the '*', '?',
      '[', ']' characters are used, as these will now match as shell globs
      instead of literally. Given that kernel version strings typically do
      not include these characters we expect little breakage through this
      change.
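The three comparison modes described above (literal for '=' and '!=', version ordering for the relational operators, shell glob when no operator is given) as a small evaluator. This is an added example, not systemd's own code: compareVersions is a simplified strverscmp()-style ordering written for this illustration, and the kernel release string is a constructed sample.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// splitDigits returns the leading run of ASCII digits in s and the remainder.
func splitDigits(s string) (digits, rest string) {
	i := 0
	for i < len(s) && s[i] >= '0' && s[i] <= '9' {
		i++
	}
	return s[:i], s[i:]
}

// compareNumeric compares two digit strings by value (leading zeros ignored).
func compareNumeric(a, b string) int {
	a, b = strings.TrimLeft(a, "0"), strings.TrimLeft(b, "0")
	if len(a) != len(b) {
		if len(a) < len(b) {
			return -1
		}
		return 1
	}
	return strings.Compare(a, b)
}

// compareVersions is a simplified strverscmp()-style ordering (a constructed
// helper, not systemd's implementation): digit runs compare numerically,
// everything else compares byte-wise. Returns -1, 0 or 1.
func compareVersions(a, b string) int {
	for a != "" && b != "" {
		da, ra := splitDigits(a)
		db, rb := splitDigits(b)
		if da != "" && db != "" {
			if c := compareNumeric(da, db); c != 0 {
				return c
			}
			a, b = ra, rb
			continue
		}
		if a[0] != b[0] {
			if a[0] < b[0] {
				return -1
			}
			return 1
		}
		a, b = a[1:], b[1:]
	}
	switch {
	case a == "" && b == "":
		return 0
	case a == "":
		return -1
	default:
		return 1
	}
}

// kernelVersionMatches evaluates a ConditionKernelVersion= expression against
// a kernel release string with the v252 rules from the release notes:
// "=" and "!=" compare the strings literally, "<" "<=" ">" ">=" compare as
// versions, and an expression without any operator is a shell-style glob.
func kernelVersionMatches(expr, kernel string) (bool, error) {
	expr = strings.TrimSpace(expr)
	// Two-character operators are tried first so "<=" is not read as "<".
	for _, op := range []string{"<=", ">=", "!=", "=", "<", ">"} {
		if !strings.HasPrefix(expr, op) {
			continue
		}
		want := strings.TrimSpace(expr[len(op):])
		switch op {
		case "=":
			return kernel == want, nil
		case "!=":
			return kernel != want, nil
		}
		c := compareVersions(kernel, want)
		switch op {
		case "<":
			return c < 0, nil
		case "<=":
			return c <= 0, nil
		case ">":
			return c > 0, nil
		default: // ">="
			return c >= 0, nil
		}
	}
	// No operator: '*', '?', '[' and ']' are now wildcards, not literal text.
	return path.Match(expr, kernel)
}

func main() {
	kernel := "5.15.0-91-generic" // constructed sample of a kernel release string
	for _, expr := range []string{"5.15*", "=5.15*", ">=5.10", "<6.0"} {
		ok, err := kernelVersionMatches(expr, kernel)
		fmt.Printf("%-8s -> %v %v\n", expr, ok, err)
	}
}
```

The second expression in main is the compatibility break in one line: "=5.15*" no longer matches a 5.15 kernel because '=' now compares the literal string, while the bare "5.15*" matches as a glob.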

    * The service manager will now read the SELinux label used for SELinux
      access checks from the unit file at the time it loads the file.
      Previously, the label would be read at the moment of the access
      check, which was problematic since at that time the unit file might
      already have been updated or removed.

New Features:

    * systemd-measure is a new tool for precalculating and signing expected
      TPM2 PCR values seen once a given unified kernel image (UKI) with
      systemd-stub is booted. This is useful for implementing TPM2 policies
      for LUKS encrypted volumes and encrypted system/service credentials,
      that robustly bind to kernels carrying appropriate PCR signature
      information. The signed expected PCR information may be embedded
      inside UKI images for this purpose so that it is automatically
      available in userspace, once the UKI is booted.

      systemd-cryptsetup, systemd-cryptenroll and systemd-creds have been
      updated to make use of this information if available in the booted
      kernel.

      Net effect: if you boot a properly prepared kernel, TPM-bound disk
      encryption now defaults to be locked to kernels which carry PCR
      signatures from the same signature key pair. Example: if a
      hypothetical distro FooOS prepares its UKI kernels like this,
      TPM-based disk encryption is now – by default – bound to only FooOS
      kernels, and encrypted volumes bound to the TPM cannot be unlocked on
      other kernels from other sources. (But do note this behaviour
      requires preparation/enabling in the UKI, and of course users can
      always enroll non-TPM ways to unlock the volume.)

      Binding TPM-based disk encryption to public keys/signatures of PCR
      values – instead of literal PCR values – addresses the inherent
      "brittleness" of traditional PCR-bound TPM disk encryption schemes:
      disks remain accessible even if the UKI image is updated, without any
      preparation during the update scheme – as long as each UKI carries the
      necessary PCR signature information.

    * systemd-pcrphase is a new tool that is invoked at 4 places during
      system runtime, and measures additional words into TPM2 PCR 11, to
      mark milestones of the boot process. This allows binding access to
      specific TPM2-encrypted secrets to specific phases of the boot
      process. (Think: LUKS2 disk encryption key only accessible in the
      initrd, but not later.)
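A software model of the two mechanisms described in the systemd-measure and systemd-pcrphase entries above: PCR 11 is extended first with the UKI and then with one word per boot phase, and the unlock policy is bound to a signing key rather than to a literal PCR value. This is an added illustration, not systemd's code or its on-disk format: ed25519 stands in for the vendor key pair, the phase words "enter-initrd" and "leave-initrd" and the UKI contents are assumed sample values, and the policy check that the TPM would enforce is done in plain Go.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// extend models TPM2_PCR_Extend for a SHA-256 bank: new = H(old || H(event)).
func extend(pcr [32]byte, event []byte) [32]byte {
	ev := sha256.Sum256(event)
	return sha256.Sum256(append(pcr[:], ev[:]...))
}

// expectedPCR11 replays what the release notes describe for PCR 11: the stub
// measures the booted UKI, then every boot-phase word is measured as that
// phase is reached. The phase words passed in are assumptions for the model.
func expectedPCR11(uki []byte, phases []string) [32]byte {
	var pcr [32]byte // PCR 11 starts out all-zero at power-on
	pcr = extend(pcr, uki)
	for _, p := range phases {
		pcr = extend(pcr, []byte(p))
	}
	return pcr
}

// signedPCR is the vendor-produced "expected PCR value plus signature" that
// travels with a UKI (a constructed stand-in for systemd-measure's output).
type signedPCR struct {
	value [32]byte
	sig   []byte
}

func signExpected(priv ed25519.PrivateKey, uki []byte, phases []string) signedPCR {
	v := expectedPCR11(uki, phases)
	return signedPCR{value: v, sig: ed25519.Sign(priv, v[:])}
}

// unsealAllowed models a policy bound to a public key rather than to a literal
// PCR value: the secret is released when the live PCR 11 equals some value
// that carries a valid signature from the enrolled key. On real hardware the
// TPM enforces this; here the check is done in software.
func unsealAllowed(pub ed25519.PublicKey, live [32]byte, s signedPCR) bool {
	return live == s.value && ed25519.Verify(pub, s.value[:], s.sig)
}

// outcome collects the four cases the release notes argue about.
type outcome struct {
	InInitrd      bool // same UKI, still in the initrd phase
	AfterInitrd   bool // same UKI, after the "leave initrd" measurement
	UpdatedUKI    bool // new UKI whose PCR values were signed by the same key
	ForeignSigner bool // UKI whose PCR values were signed by another key
}

func runScenario() outcome {
	vendorPub, vendorPriv, _ := ed25519.GenerateKey(rand.Reader)
	_, otherPriv, _ := ed25519.GenerateKey(rand.Reader)

	ukiV1 := []byte("constructed UKI image, version 1")
	ukiV2 := []byte("constructed UKI image, version 2")
	initrd := []string{"enter-initrd"}                // assumed phase word
	host := []string{"enter-initrd", "leave-initrd"} // assumed phase words

	// The vendor only signs the initrd-phase value, so the volume key is
	// unsealable in the initrd but not once the host phase has been entered.
	v1 := signExpected(vendorPriv, ukiV1, initrd)
	v2 := signExpected(vendorPriv, ukiV2, initrd)
	foreign := signExpected(otherPriv, ukiV2, initrd)

	return outcome{
		InInitrd:      unsealAllowed(vendorPub, expectedPCR11(ukiV1, initrd), v1),
		AfterInitrd:   unsealAllowed(vendorPub, expectedPCR11(ukiV1, host), v1),
		UpdatedUKI:    unsealAllowed(vendorPub, expectedPCR11(ukiV2, initrd), v2),
		ForeignSigner: unsealAllowed(vendorPub, expectedPCR11(ukiV2, initrd), foreign),
	}
}

func main() {
	o := runScenario()
	fmt.Printf("in initrd: %v, after initrd: %v, updated UKI: %v, foreign signer: %v\n",
		o.InInitrd, o.AfterInitrd, o.UpdatedUKI, o.ForeignSigner)
	v := expectedPCR11([]byte("constructed UKI image, version 1"), []string{"enter-initrd"})
	fmt.Println("sample PCR 11 after initrd entry:", hex.EncodeToString(v[:]))
}
```

Two properties carry the argument: a PCR can only be extended, never rolled back, so once "leave-initrd" has been measured no later state reproduces the initrd-phase value; and because the policy trusts the key rather than one value, the updated UKI unlocks without re-enrolling the volume while a UKI signed by someone else does not.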

Changes in systemd itself, i.e. the manager and units:

    * The cpu controller is delegated to user manager units by default, and
      CPUWeight= settings are applied to the top-level user slice units
      (app.slice, background.slice, session.slice). This provides a degree
      of resource isolation between different user services competing for
      the CPU.

    * Systemd can optionally do a full preset in the "first boot" condition
      (instead of just enable-only). This behaviour is controlled by the
      compile-time option -Dfirst-boot-full-preset. Right now it defaults
      to 'false', but the plan is to switch it to 'true' for the subsequent
      release.

    * Systemd will set the taint flag 'support-ended' if it detects that
      the OS image is past its end-of-support date. This date is declared
      in a new /etc/os-release field SUPPORT_END= described below.

    * Two new settings ConditionCredential= and AssertCredential= can be
      used to skip or fail units if a certain system credential is not
      provided.

    * ConditionMemory= accepts size suffixes (K, M, G, T, …).

    * DefaultSmackProcessLabel= can be used in system.conf and user.conf to
      specify the SMACK security label to use when not specified in a unit
      file.

    * DefaultDeviceTimeoutSec= can be used in system.conf and user.conf to
      specify the default timeout when waiting for device units to
      activate.

    * C.UTF-8 is used as the default locale if nothing else has been
      configured.

    * [Condition|Assert]Firmware= have been extended to support certain
      SMBIOS fields. For example

        ConditionFirmware=smbios-field(board_name = "Custom Board")

      conditionalizes the unit to run only when
      /sys/class/dmi/id/board_name contains "Custom Board" (without the
      quotes).

    * ConditionFirstBoot= now correctly evaluates as true only during the
      boot phase of the first boot. A unit executed later, after booting
      has completed, will no longer evaluate this condition as true.

    * Socket units will now create sockets in the SELinuxContext= of the
      associated service unit, if any.

    * Boot phase transitions (start initrd → exit initrd → boot complete →
      shutdown) will be measured into TPM2 PCR 11, so that secrets can be
      bound to a specific runtime phase. E.g.: a LUKS encryption key can be
      unsealed only in the initrd.

    * Service credentials (i.e. SetCredential=/LoadCredential=/…) will now
      also be provided to ExecStartPre= processes.

    * Various units are now correctly ordered against
      initrd-switch-root.target where previously a conflict without
      ordering was configured. A stop job for those units would be queued,
      but without the ordering it could be executed only after
      initrd-switch-root.service, leading to units not being restarted in
      the host system as expected.

    * In order to fully support the IPMI watchdog driver, which has not yet
      been ported to the new common watchdog device interface,
      /dev/watchdog0 will be tried first and systemd will silently fall back
      to /dev/watchdog if it is not found.

    * New watchdog-related D-Bus properties are now published by systemd:
      WatchdogDevice, WatchdogLastPingTimestamp,
      WatchdogLastPingTimestampMonotonic.

    * At shutdown, API virtual file systems (proc, sys, etc.) will be
      unmounted lazily.

    * At shutdown, systemd will now log about processes blocking unmounting
      of file systems.

    * A new meson build option 'clock-valid-range-usec-max' was added to
      allow disabling system time correction if RTC returns a timestamp far
      in the future.

    * Propagated restart jobs will no longer be discarded while a unit is
      activating.

    * PID 1 will now import system credentials from SMBIOS Type 11 fields
      ("OEM vendor strings"), in addition to qemu_fwcfg. This provides a
      simple, fast and generic path for supplying credentials to a VM,
      without involving external tools such as cloud-init/ignition.
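How credentials delivered through SMBIOS Type 11 strings are picked apart, as an added example rather than PID 1's own code. The string prefixes io.systemd.credential: and io.systemd.credential.binary: are taken from systemd's credentials documentation, not from this page; the OEM string table, the credential names and the name-validation rule are constructed for the illustration.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// Prefixes systemd documents for credentials passed as SMBIOS Type 11 strings
// (assumed here from the systemd credentials documentation; not on this page).
const (
	textPrefix   = "io.systemd.credential:"
	binaryPrefix = "io.systemd.credential.binary:"
)

// validCredentialName applies a filename-like sanity check (a constructed
// approximation of systemd's rule): credentials end up as files in a
// per-service directory, so a name must not be able to escape it.
func validCredentialName(name string) bool {
	if name == "" || name == "." || name == ".." || len(name) > 255 {
		return false
	}
	if strings.ContainsAny(name, "/\x00") {
		return false
	}
	for _, r := range name {
		if r < 0x20 || r == 0x7f {
			return false
		}
	}
	return true
}

// parseSMBIOSCredentials turns a list of SMBIOS Type 11 OEM strings into a
// credential map. Strings without a credential prefix are ignored because
// firmware may carry unrelated OEM strings; malformed credential strings are
// reported as errors rather than silently dropped.
func parseSMBIOSCredentials(oemStrings []string) (map[string][]byte, error) {
	creds := make(map[string][]byte)
	for _, s := range oemStrings {
		var rest string
		binary := false
		switch {
		case strings.HasPrefix(s, binaryPrefix):
			rest, binary = s[len(binaryPrefix):], true
		case strings.HasPrefix(s, textPrefix):
			rest = s[len(textPrefix):]
		default:
			continue
		}
		parts := strings.SplitN(rest, "=", 2)
		if len(parts) != 2 {
			return nil, fmt.Errorf("credential string without '=': %q", s)
		}
		name, payload := parts[0], parts[1]
		if !validCredentialName(name) {
			return nil, fmt.Errorf("invalid credential name %q", name)
		}
		value := []byte(payload)
		if binary {
			decoded, err := base64.StdEncoding.DecodeString(payload)
			if err != nil {
				return nil, fmt.Errorf("credential %q: bad base64: %w", name, err)
			}
			value = decoded
		}
		if _, dup := creds[name]; dup {
			return nil, fmt.Errorf("credential %q supplied twice", name)
		}
		creds[name] = value
	}
	return creds, nil
}

// sampleOEMStrings is a constructed Type 11 table: one unrelated vendor
// string, one text credential and one binary credential.
var sampleOEMStrings = []string{
	"Constructed vendor string, not a credential",
	textPrefix + "example.token=constructed-value",
	binaryPrefix + "example.key=" + base64.StdEncoding.EncodeToString([]byte{0x01, 0x02, 0xff}),
}

func main() {
	creds, err := parseSMBIOSCredentials(sampleOEMStrings)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for name, value := range creds {
		fmt.Printf("%s = %q\n", name, value)
	}
}
```

Rejecting a duplicated name is this model's choice, made so that a second string cannot quietly override an earlier one; the name check matters because a credential name becomes a file name in the directory handed to the service.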

    * The CPUWeight= setting of unit files now accepts a new special value
      "idle", which configures "idle" level scheduling for the unit.

    * Service processes that are activated due to a .timer or .path unit
      triggering will now receive information about this via environment
      variables. Note that this information is lossy, as activation
      might be coalesced and only one of the activating triggers will be
      reported. This is hence more suited for debugging or tracing rather
      than for behaviour decisions.

Changes in sd-boot, bootctl, and the Boot Loader Specification:

    * The Boot Loader Specification has been cleaned up and clarified.
      Various corner cases in version string comparisons have been fixed
      (e.g. comparisons for empty strings). Boot counting is now part of
      the main specification.

    * New PCR measurements are performed during boot: PCR 11 for the
      kernel+initrd combo, PCR 13 for any sysext images. If a m...