From: Asier Gutierrez

Add documentation for the DAMON-based Hugepage Management
(DAMON_HUGEPAGE) feature, which automatically manages huge pages by
identifying hot memory regions and collapsing them into huge pages.
The documentation covers the module's features, operation, and all
available module parameters.

Signed-off-by: Asier Gutierrez
---
 .../admin-guide/mm/damon/hugepage.rst (new) | 258 ++++++++++++++++++
 1 file changed, 258 insertions(+)

diff --git a/Documentation/admin-guide/mm/damon/hugepage.rst b/Documentation/admin-guide/mm/damon/hugepage.rst
new file mode 100644
index 000000000000..ee50cfa79281
--- /dev/null
+++ b/Documentation/admin-guide/mm/damon/hugepage.rst
@@ -0,0 +1,258 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+================================
+DAMON-based huge page collapsing
+================================
+
+DAMON-based huge page collapsing (DAMON_HUGEPAGE) is a static kernel module
+that aims to collapse hot regions into huge pages.
+
+Where Is Proactive Huge Page Collapsing Required?
+=================================================
+
+The amount of available memory grows faster than the number of TLB entries.
+This leads to more TLB misses and excessive wasted cycles. Huge pages are
+meant to solve this problem. However, huge pages usually lead to memory
+fragmentation and memory waste.
+
+Selectively collapsing hot regions of a specific process can avoid heavy
+memory fragmentation while improving TLB performance.
+
+DAMON_HUGEPAGE solves this by:
+
+- Identifying hot regions that have been accessed for a configured time
+- Automatically collapsing those regions into huge pages
+- Auto-tuning the huge page usage ratio to meet desired targets
+- Controlling the collapse rate with configurable quotas to avoid performance
+  degradation
+
+
+How Does It Work?
+=================
+
+DAMON_HUGEPAGE uses kdamond to identify anonymous memory regions that:
+
+1. Are large enough to be backed by huge pages (``PMD_SIZE`` or larger)
+2. Have been accessed for a configured time period
+
+Once identified, DAMON_HUGEPAGE triggers synchronous partial collapse of those
+regions. The collapse operation is controlled by quotas to limit the impact on
+system performance.
+
+The module also supports automatic tuning of the collapse rate to achieve a
+desired huge page usage ratio. Administrators can configure a target percentage
+of huge page usage relative to total anonymous memory usage.
+
+Additionally, the module accepts manual feedback from system administrators to
+adjust the effective quota level based on observed system behavior.
+
+Interface: Module Parameters
+============================
+
+To use this feature, you should first ensure your system is running on a kernel
+that is built with ``CONFIG_DAMON_HUGEPAGE=y``.
+
+To let sysadmins enable or disable it and tune it for the given system,
+DAMON_HUGEPAGE utilizes module parameters. That is, you can put
+``damon_hugepage.<parameter>=<value>`` on the kernel boot command line or write
+proper values to ``/sys/module/damon_hugepage/parameters/<parameter>`` files.
+
+Below are the descriptions of each parameter.
+
+enabled
+-------
+
+Enable or disable DAMON_HUGEPAGE.
+
+You can enable DAMON_HUGEPAGE by setting the value of this parameter as ``Y``.
+Setting it as ``N`` disables DAMON_HUGEPAGE. Note that DAMON_HUGEPAGE may do
+no real monitoring or collapsing, depending on the activation condition.
+
+commit_inputs
+-------------
+
+Make DAMON_HUGEPAGE read the input parameters again, except ``enabled``.
+
+Input parameters that are updated while DAMON_HUGEPAGE is running are not
+applied by default. Once this parameter is set as ``Y``, DAMON_HUGEPAGE reads
+the values of parameters except ``enabled`` again. Once the re-reading is done,
+this parameter is set as ``N``. If invalid parameters are found during the
+re-reading, DAMON_HUGEPAGE will be disabled.
+
+Once ``Y`` is written to this parameter, the user must not write to any
+parameters until reading ``commit_inputs`` again returns ``N``. If users
+violate this rule, the kernel may exhibit undefined behavior.
+
+min_age
+-------
+
+Time threshold for hot memory region identification in microseconds.
+
+100 milliseconds by default.
+
+quota_ms
+--------
+
+Limit of time for the collapse in milliseconds.
+
+DAMON_HUGEPAGE tries to use only up to this time within a time window
+(quota_reset_interval_ms) for trying to collapse hot pages. This can be
+used for limiting CPU consumption of DAMON_HUGEPAGE. If the value is zero, the
+limit is disabled.
+
+10 ms by default.
+
+quota_reset_interval_ms
+-----------------------
+
+The time/size quota charge reset interval in milliseconds.
+
+The charge reset interval for the quota of time (quota_ms) and size
+(quota_sz). That is, DAMON_HUGEPAGE does not try collapsing for more than
+quota_ms milliseconds or quota_sz bytes within quota_reset_interval_ms
+milliseconds.
+
+1 second by default.
+
+quota_autotune_feedback
+-----------------------
+
+User-specifiable feedback for auto-tuning of the effective quota.
+
+While keeping the caps that are set by other quotas, DAMON_HUGEPAGE
+automatically increases and decreases the effective level of the quota, aiming
+to receive this feedback with a value of ``10,000`` from the user.
+DAMON_HUGEPAGE assumes the feedback value and the quota are positively
+proportional. A value of zero disables this auto-tuning feature.
+
+Disabled by default.
+
+monitored_pid
+-------------
+
+PID of the task that is going to be monitored for hot regions.
+
+quota_percentage_hugepage
+-------------------------
+
+Goal for the ratio of huge page consumption to total anonymous memory
+consumption, in basis points (bp, 1/10,000). DAMON_HUGEPAGE automatically
+increases and decreases page collapse aggressiveness to achieve the given value.
+
+sample_interval
+---------------
+
+Sampling interval for the monitoring in microseconds.
+
+The sampling interval of DAMON for the hot memory monitoring. Please refer to
+the DAMON documentation (:doc:`usage`) for more detail.
+
+aggr_interval
+-------------
+
+Aggregation interval for the monitoring in microseconds.
+
+The aggregation interval of DAMON for the hot memory monitoring. Please
+refer to the DAMON documentation (:doc:`usage`) for more detail.
+
+min_nr_regions
+--------------
+
+Minimum number of monitoring regions.
+
+The minimum number of monitoring regions of DAMON for the hot memory
+monitoring. This can be used to set a lower bound on the monitoring quality.
+However, setting this too high could result in increased monitoring overhead.
+Please refer to the DAMON documentation (:doc:`usage`) for more detail.
+
+Note that this must be 3 or higher. Please refer to the :ref:`Monitoring
+` section of the design document for the rationale
+behind this lower bound.
+
+max_nr_regions
+--------------
+
+Maximum number of monitoring regions.
+
+The maximum number of monitoring regions of DAMON for the hot memory
+monitoring. This can be used to set an upper bound on the monitoring overhead.
+However, setting this too low could result in bad monitoring quality. Please
+refer to the DAMON documentation (:doc:`usage`) for more detail.
+
+addr_unit
+---------
+
+A scale factor for memory addresses and bytes.
+
+This parameter is for setting and getting the :ref:`address unit
+` parameter of the DAMON instance for DAMON_HUGEPAGE.
+
+``monitor_region_start`` and ``monitor_region_end`` should be provided in this
+unit. For example, let's suppose ``addr_unit``, ``monitor_region_start`` and
+``monitor_region_end`` are set as ``1024``, ``0`` and ``10``, respectively.
+Then DAMON_HUGEPAGE will work on a 10 KiB physical address range that
+starts from address zero (``[0 * 1024, 10 * 1024)`` in bytes).
+
+``bytes_hugepage_tried_regions`` and ``bytes_hugepage_regions`` are also in
+this unit. For example, let's suppose the values of ``addr_unit``,
+``bytes_hugepage_tried_regions`` and ``bytes_hugepage_regions`` are
+``1048576``, ``42``, and ``32``, respectively. Then it means DAMON_HUGEPAGE
+tried to collapse 42 MiB of memory and successfully collapsed 32 MiB of memory
+in total.
+
+If unsure, use only the default value (``1``) and forget about this.
+
+
+kdamond_pid
+-----------
+
+PID of the DAMON thread.
+
+If DAMON_HUGEPAGE is enabled, this becomes the PID of the worker thread.
+Otherwise, it is -1.
+
+nr_hugepage_tried_regions
+-------------------------
+
+Number of memory regions that DAMON_HUGEPAGE tried to collapse.
+
+bytes_hugepage_tried_regions
+----------------------------
+
+Total bytes of memory regions that DAMON_HUGEPAGE tried to collapse.
+
+nr_hugepage_regions
+-------------------
+
+Number of memory regions that DAMON_HUGEPAGE successfully collapsed.
+
+bytes_hugepage_regions
+----------------------
+
+Total bytes of memory regions that DAMON_HUGEPAGE successfully collapsed.
+
+nr_quota_exceeds
+----------------
+
+Number of times that the time/space quota limits have been exceeded.
+
+Example
+=======
+
+The runtime example commands below make DAMON_HUGEPAGE find memory regions of
+the task with PID 1234 that have been accessed for 100 milliseconds or more
+and collapse them into huge pages. The collapsing is limited to at most 1 GiB
+per second to avoid DAMON_HUGEPAGE consuming too much CPU time for the collapse
+operation. ::
+
+    # cd /sys/module/damon_hugepage/parameters
+    # echo 100000 > min_age
+    # echo $((1 * 1024 * 1024 * 1024)) > quota_sz
+    # echo 1000 > quota_reset_interval_ms
+    # echo 1234 > monitored_pid
+    # echo Y > enabled
+
+Note that this module (damon_hugepage) cannot run simultaneously with other
+DAMON-based special-purpose modules. Refer to :ref:`DAMON design special
+purpose modules exclusivity `
+for more details.
-- 
2.43.0
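
As a complement to the Example section of the patch, the following is a minimal
sketch (not part of the patch) of how the documented ``commit_inputs`` rule and
the statistics parameters might be used from a shell. The helper names
(``set_param``, ``commit_and_wait``, ``collapse_success_percent``), the
``PARAMS`` override variable, and the 10-second timeout are assumptions made
for illustration; only the parameter file names come from the documentation
above.

```shell
#!/bin/sh
# Sketch only: helper names and the PARAMS override are illustrative
# assumptions, not part of DAMON_HUGEPAGE itself.
PARAMS="${PARAMS:-/sys/module/damon_hugepage/parameters}"

# Write one module parameter: set_param <name> <value>
set_param() {
    printf '%s\n' "$2" > "$PARAMS/$1"
}

# Ask a running DAMON_HUGEPAGE to re-read its inputs, then wait until
# commit_inputs reads back as N. Per the documentation, no other parameter
# may be written before that. Gives up after about 10 seconds (assumed limit).
commit_and_wait() {
    set_param commit_inputs Y
    tries=0
    while [ "$(cat "$PARAMS/commit_inputs")" != "N" ]; do
        tries=$((tries + 1))
        if [ "$tries" -ge 10 ]; then
            return 1
        fi
        sleep 1
    done
}

# Print the share of attempted bytes that were actually collapsed, as an
# integer percentage. Both counters are in addr_unit, so the unit cancels.
collapse_success_percent() {
    tried=$(cat "$PARAMS/bytes_hugepage_tried_regions")
    collapsed=$(cat "$PARAMS/bytes_hugepage_regions")
    if [ "$tried" -eq 0 ]; then
        echo 0
    else
        echo $((collapsed * 100 / tried))
    fi
}
```

With these definitions sourced, updating a parameter on a running instance
would look like ``set_param min_age 200000`` followed by ``commit_and_wait``.
For the counter values used in the ``addr_unit`` example (42 tried, 32
collapsed), ``collapse_success_percent`` prints 76.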