Requirements Overview
Dividing resources among multiple workloads in a controlled manner requires comprehensive control strategies covering all resources: CPU, IO, and memory. Many of the components and features required to implement these strategies are new, and there are currently a substantial number of specific system and configuration requirements for the demo to run.
resctl-demo checks for all needed requirements and configures the ones it can, but some requirements can't be satisfied without system-level changes. The second "config" row of the top left summary panel shows the number of satisfied and missed requirements followed by resource control enable status per resource type.
If any of the requirements are unmet, you’ll see an error message at the top of the right pane of the demo window.
warning
The failed requirements are marked in red. You can ignore some of the system configuration failures with --force, but resource isolation may then not work as expected.
As resource control becomes more widely adopted, the need for specific configurations will diminish: some will be adopted as standard practices, while others will become unnecessary as the underlying features grow more versatile.
System and Configuration Requirements
Each requirement for running the demo is described below, along with its purpose, the automatic configurations resctl-demo may apply, and information on meeting the requirement if it's currently unmet.
Cgroup2 Controllers:
- Purpose: cgroup2 provides the foundation for resource control.
- Requirement: resctl-demo requires systemd to manage the system using cgroup2 with all three major local resource controllers (cpu, memory, and io) enabled.
- If requirement unmet: If /sys/fs/cgroup/cgroup.controllers is not present, reboot the system with systemd.unified_cgroup_hierarchy=1 specified as a boot parameter. If the file is present but doesn't contain all the controllers, either the kernel doesn't have them enabled, they're disabled with the cgroup_disable boot parameter, or cgroup1 hierarchies are using them. Resolve the cause and restart resctl-demo.
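The check described above can be sketched in a few lines of Python. This is only an illustration of the logic, not how resctl-demo itself performs the check; the function names are hypothetical.

```python
from pathlib import Path

# The three controllers named in the requirement above.
REQUIRED = {"cpu", "memory", "io"}


def missing_controllers(controllers_text):
    """Return the required controllers absent from cgroup.controllers content."""
    return REQUIRED - set(controllers_text.split())


def check_root_controllers(path="/sys/fs/cgroup/cgroup.controllers"):
    """Return a short status string for the root cgroup2 controllers file."""
    p = Path(path)
    if not p.exists():
        return "cgroup2 not found; boot with systemd.unified_cgroup_hierarchy=1"
    missing = missing_controllers(p.read_text())
    if missing:
        return "missing controllers: " + ", ".join(sorted(missing))
    return "ok"


if __name__ == "__main__":
    print(check_root_controllers())
```

Keeping the parsing in missing_controllers() separate from the file access makes the logic easy to try on sample text without touching /sys.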
Freezer:
- Purpose: cgroup2 freezer is used to strictly limit the impact of side workloads under heavy load.
- Available in kernels >= v5.2.
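Several requirements in this list are tied to a minimum kernel version (v5.2 here, v5.6 and v5.10 further down). A minimal sketch for comparing the running kernel against such a threshold follows; kernel_at_least is a hypothetical helper, and a version comparison is only a rough proxy because distribution kernels may backport or disable features.

```python
import platform
import re


def kernel_at_least(release, major, minor):
    """Return True if a release string such as '5.10.0-8-amd64' is >= major.minor."""
    m = re.match(r"(\d+)\.(\d+)", release)
    if m is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return (int(m.group(1)), int(m.group(2))) >= (major, minor)


if __name__ == "__main__":
    # platform.release() returns the same string as `uname -r` on Linux.
    print("freezer-capable kernel:", kernel_at_least(platform.release(), 5, 2))
```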
MemCgRecursiveProt:
- Purpose: Recursive propagation for memory controller's memory.min/low protections. This greatly simplifies protection configurations.
- Auto Configuration: If available, resctl-demo will automatically remount cgroup2 fs with the mount option. For details, see https://lkml.org/lkml/2019/12/19/1272
- Available in kernels >= v5.6.
- Enable it with the cgroup2 memory_recursiveprot mount option.
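Whether the option is currently active can be read from the mount table. The sketch below assumes the standard /proc/mounts layout (device, mount point, filesystem type, comma-separated options); cgroup2_has_recursiveprot is a hypothetical name.

```python
def cgroup2_has_recursiveprot(mounts_text):
    """Return True if any cgroup2 mount lists the memory_recursiveprot option."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip malformed or empty lines
        fstype, options = fields[2], fields[3].split(",")
        if fstype == "cgroup2" and "memory_recursiveprot" in options:
            return True
    return False


if __name__ == "__main__":
    with open("/proc/mounts") as f:
        print("memory_recursiveprot active:", cgroup2_has_recursiveprot(f.read()))
```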
IoCost:
- Purpose: blk-iocost is the new IO controller which can comprehensively control IO capacity distribution proportionally.
- Enable with CONFIG_BLK_CGROUP_IOCOST. For details, see https://lwn.net/Articles/793460/
IoCostVer:
- blk-iocost received significant updates to improve control quality and visibility during the v5.10 development cycle. A kernel with these updates is recommended. For details, see https://lwn.net/Articles/830397/
NoOtherIoControllers:
- Requirement: The other IO controllers (io.max and io.latency) can interfere and shouldn't have active configurations. If configured through systemd, remove all IO{Read|Write}{Bandwidth|IOPS}Max and IODeviceLatencyTargetSec configurations.
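One way to look for leftover configurations is to walk the cgroup2 hierarchy and report every io.max or io.latency file with non-empty content. This is a hedged sketch rather than the check resctl-demo performs; it assumes an unconfigured control file reads back empty, and active_io_configs is a hypothetical name.

```python
import os


def active_io_configs(cgroup_root="/sys/fs/cgroup"):
    """Return sorted paths of io.max / io.latency files that hold a configuration."""
    hits = []
    for dirpath, _dirs, files in os.walk(cgroup_root):
        for name in ("io.max", "io.latency"):
            if name not in files:
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path) as f:
                    if f.read().strip():
                        hits.append(path)
            except OSError:
                pass  # cgroup may have disappeared or be unreadable
    return sorted(hits)


if __name__ == "__main__":
    for path in active_io_configs():
        print("active IO configuration:", path)
```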
AnonBalance:
- Kernel memory management received a major update during the v5.8 development cycle which put anonymous memory on an equal footing with page cache and made swap useful, especially on SSDs. For details, see https://lwn.net/Articles/821105/
Btrfs:
- Purpose: Working IO isolation requires support from filesystem to avoid priority inversions. Currently, btrfs is the only supported filesystem.
- Requirement: The OS must be installed with btrfs as the root filesystem.
BtrfsAsyncDiscard:
- Purpose: Many SSDs show significant latency spikes when discards are issued in bulk, which can lead to severe priority inversions. Async discard is a btrfs feature that paces and reduces the total amount of discards.
- Auto Configuration: If available, resctl-demo will automatically remount the filesystem with the mount option. For details, see https://lwn.net/Articles/805300/
- You can enable it with the discard=async mount option on kernels >= v5.6.
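The Btrfs and BtrfsAsyncDiscard requirements can both be read off the mount entry for the root filesystem. The following is a minimal sketch assuming the usual /proc/mounts format; root_mount and btrfs_async_discard_enabled are hypothetical names.

```python
def root_mount(mounts_text):
    """Return (fstype, options) for the last mount on '/', or None if absent."""
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == "/":
            # Later entries shadow earlier ones, so keep the last match.
            found = (fields[2], fields[3].split(","))
    return found


def btrfs_async_discard_enabled(mounts_text):
    """Return True if '/' is btrfs and mounted with discard=async."""
    info = root_mount(mounts_text)
    return info is not None and info[0] == "btrfs" and "discard=async" in info[1]


if __name__ == "__main__":
    with open("/proc/mounts") as f:
        text = f.read()
    print("root mount:", root_mount(text))
    print("async discard:", btrfs_async_discard_enabled(text))
```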
NoCompositeStorage:
- Currently, composite block devices, such as dm and md, break the chain of custody for IOs, allowing cgroups to escape IO control and cause severe priority inversions.
- Requirement: The filesystem must be on a physical device.
IoSched:
- Requirement: The bfq IO scheduler's implementation of proportional IO control conflicts with blk-iocost and breaks IO isolation. Use mq-deadline.
- Auto Configuration: resctl-demo automatically switches to mq-deadline if available.
- The IO scheduler can be selected by writing to /sys/block/$DEV/queue/scheduler.
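The scheduler file lists the available schedulers on one line with the active one in square brackets, for example "mq-deadline kyber [bfq] none". A small sketch for reading and switching it follows; the helper names are hypothetical, and writing to the file requires root.

```python
import re
from pathlib import Path


def active_scheduler(scheduler_text):
    """Return the bracketed (active) scheduler name, or None if none is marked."""
    m = re.search(r"\[([^\]]+)\]", scheduler_text)
    return m.group(1) if m else None


def available_schedulers(scheduler_text):
    """Return all scheduler names listed in the file content."""
    return scheduler_text.replace("[", "").replace("]", "").split()


def select_scheduler(dev, name="mq-deadline"):
    """Switch the scheduler of a block device such as 'nvme0n1' (needs root)."""
    path = Path(f"/sys/block/{dev}/queue/scheduler")
    if name not in available_schedulers(path.read_text()):
        raise ValueError(f"{name} is not available on {dev}")
    path.write_text(name)
```

For instance, select_scheduler("sda") would move sda away from bfq if mq-deadline is offered by the kernel.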
NoWbt:
- Requirement: Write-Back-Throttling (WBT) should be disabled. WBT is a block layer mechanism to prevent writebacks from overwhelming IO devices, which may interfere with IO control.
- Auto Configuration: resctl-demo automatically disables wbt.
- You can disable it by writing 0 to /sys/block/$DEV/queue/wbt_lat_usec.
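A short sketch of checking and disabling WBT for one device is shown below. It assumes that a value of 0 read back from wbt_lat_usec means WBT is off; the helper names are hypothetical, and the write requires root.

```python
from pathlib import Path


def wbt_path(dev):
    """Return the sysfs path of the WBT latency knob for a device like 'sda'."""
    return f"/sys/block/{dev}/queue/wbt_lat_usec"


def wbt_enabled(wbt_lat_usec_text):
    """Return True if the knob's content is a nonzero latency target."""
    return int(wbt_lat_usec_text.strip()) != 0


def disable_wbt(dev):
    """Write 0 to the knob if WBT is currently enabled (needs root)."""
    path = Path(wbt_path(dev))
    if wbt_enabled(path.read_text()):
        path.write_text("0")
```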
Swap:
- Requirement: Swap must be enabled with the default swappiness, and its size must be at least the smaller of one third of the system memory and 32G. See SwapOnScratch.
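The size rule works out to min(memory / 3, 32G). The sketch below applies it to values from /proc/meminfo; it assumes "32G" means 32 GiB, which the text does not spell out, and the function names are hypothetical.

```python
GIB = 1 << 30


def meminfo_bytes(meminfo_text, key):
    """Return a /proc/meminfo field (reported in kB) converted to bytes."""
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        if name == key:
            return int(rest.split()[0]) * 1024
    raise KeyError(key)


def required_swap_bytes(mem_total_bytes):
    """Smaller of one third of memory and 32 GiB (assumed meaning of '32G')."""
    return min(mem_total_bytes // 3, 32 * GIB)


def swap_is_sufficient(mem_total_bytes, swap_total_bytes):
    return swap_total_bytes >= required_swap_bytes(mem_total_bytes)


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        text = f.read()
    mem = meminfo_bytes(text, "MemTotal")
    swap = meminfo_bytes(text, "SwapTotal")
    print("required swap (bytes):", required_swap_bytes(mem))
    print("sufficient:", swap_is_sufficient(mem, swap))
```

Under this reading, a machine with 16 GiB of memory needs a little over 5 GiB of swap, while any machine with 96 GiB or more needs 32 GiB.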
SwapOnScratch:
- Requirement: Swap must be on the same device as the root filesystem. The recommended configuration is btrfs root filesystem, which serves both the scratch directory and swap file. This isn't an inherent requirement of resource control but exists to simplify experiments.
- Setting up btrfs swapfiles: https://wiki.archlinux.org/index.php/Btrfs#Swap_file
Oomd:
- Requirement: An OOMD binary with version >= 0.3.0, excluding 0.4.0, must be present. 0.4.0 is excluded due to a bug in its Senpai implementation. See https://github.com/facebookincubator/oomd.
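The version rule (at least 0.3.0, but not 0.4.0) can be expressed as a tuple comparison. This sketch takes the version as a plain string such as "0.4.1"; how that string is obtained from the OOMD binary is left out here, and the function names are hypothetical.

```python
def parse_version(text):
    """Parse 'X.Y.Z' (optionally prefixed with 'v') into a 3-tuple of ints."""
    parts = [int(p) for p in text.strip().lstrip("v").split(".")]
    parts += [0] * (3 - len(parts))  # pad short versions such as '0.4'
    return tuple(parts[:3])


def oomd_version_ok(version_text):
    """Return True for versions >= 0.3.0 other than the excluded 0.4.0."""
    version = parse_version(version_text)
    return version >= (0, 3, 0) and version != (0, 4, 0)
```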
NoSysOomd:
- Requirement: Disable system-level OOMD and earlyoom services. Instances of OOMD or earlyoom at the system level may interfere and should be disabled. They usually run as a systemd service of the same name. You can use systemctl to locate and stop the services.
- Auto Configuration: resctl-demo automatically stops and restarts the system-level OOMD instance.
HostCriticalServices:
- Requirement: sshd.service, systemd-journald.service, dbus.service, dbus-broker.service must be in hostcritical.
- Auto Configuration: resctl-demo automatically creates the needed configurations, but for the changes to take effect, either the machine or the services need to be restarted.
DepsBase:
- Requirement: python3 must be available on the system.
DepsIoCostCoefGen:
- Requirement: findmnt, dd, fio and stdbuf must be available to generate iocost parameters with iocost_coef_gen.py.
DepsSide:
- Requirement: stress must be available for some of the side/sysloads.
DepsLinuxBuild:
- Requirement: gcc, ld, make, bison, flex, pkg-config, libssl and libelf must be available for linux build sys/sideloads.
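The command-line dependencies in the Deps* requirements can be checked with a PATH lookup. The sketch below covers only the executables; libssl and libelf are libraries and are not detected this way. The DEPS table and missing_commands are hypothetical names used for illustration.

```python
import shutil

# Executables named in the Deps* requirements (libraries are not included).
DEPS = {
    "DepsBase": ["python3"],
    "DepsIoCostCoefGen": ["findmnt", "dd", "fio", "stdbuf"],
    "DepsSide": ["stress"],
    "DepsLinuxBuild": ["gcc", "ld", "make", "bison", "flex", "pkg-config"],
}


def missing_commands(commands, which=shutil.which):
    """Return the commands that cannot be found on PATH, in input order."""
    return [cmd for cmd in commands if which(cmd) is None]


if __name__ == "__main__":
    for requirement, commands in DEPS.items():
        missing = missing_commands(commands)
        print(requirement, "ok" if not missing else "missing: " + ", ".join(missing))
```

The lookup function is passed in as a parameter so the logic can be exercised without depending on what is installed on the machine.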