diff options
Diffstat (limited to 'target/linux/generic/backport-5.15/020-v6.1-10-mm-multigenerational-lru-documentation.patch')
-rw-r--r-- | target/linux/generic/backport-5.15/020-v6.1-10-mm-multigenerational-lru-documentation.patch | 161 |
1 files changed, 161 insertions, 0 deletions
diff --git a/target/linux/generic/backport-5.15/020-v6.1-10-mm-multigenerational-lru-documentation.patch b/target/linux/generic/backport-5.15/020-v6.1-10-mm-multigenerational-lru-documentation.patch new file mode 100644 index 0000000000..f4716fb68d --- /dev/null +++ b/target/linux/generic/backport-5.15/020-v6.1-10-mm-multigenerational-lru-documentation.patch @@ -0,0 +1,161 @@ +From f59c618ed70a1e48accc4cad91a200966f2569c9 Mon Sep 17 00:00:00 2001 +From: Yu Zhao <yuzhao@google.com> +Date: Tue, 2 Feb 2021 01:27:45 -0700 +Subject: [PATCH 10/10] mm: multigenerational lru: documentation + +Add Documentation/vm/multigen_lru.rst. + +Signed-off-by: Yu Zhao <yuzhao@google.com> +Tested-by: Konstantin Kharlamov <Hi-Angel@yandex.ru> +Change-Id: I1902178bcbb5adfa0a748c4d284a6456059bdd7e +--- + Documentation/vm/index.rst | 1 + + Documentation/vm/multigen_lru.rst | 132 ++++++++++++++++++++++++++++++ + 2 files changed, 133 insertions(+) + create mode 100644 Documentation/vm/multigen_lru.rst + +--- a/Documentation/vm/index.rst ++++ b/Documentation/vm/index.rst +@@ -17,6 +17,7 @@ various features of the Linux memory man + + swap_numa + zswap ++ multigen_lru + + Kernel developers MM documentation + ================================== +--- /dev/null ++++ b/Documentation/vm/multigen_lru.rst +@@ -0,0 +1,132 @@ ++.. SPDX-License-Identifier: GPL-2.0 ++ ++===================== ++Multigenerational LRU ++===================== ++ ++Quick Start ++=========== ++Build Configurations ++-------------------- ++:Required: Set ``CONFIG_LRU_GEN=y``. ++ ++:Optional: Set ``CONFIG_LRU_GEN_ENABLED=y`` to turn the feature on by ++ default. ++ ++Runtime Configurations ++---------------------- ++:Required: Write ``1`` to ``/sys/kernel/mm/lru_gen/enable`` if the ++ feature was not turned on by default. ++ ++:Optional: Write ``N`` to ``/sys/kernel/mm/lru_gen/min_ttl_ms`` to ++ protect the working set of ``N`` milliseconds. The OOM killer is ++ invoked if this working set cannot be kept in memory. ++ ++:Optional: Read ``/sys/kernel/debug/lru_gen`` to confirm the feature ++ is turned on. This file has the following output: ++ ++:: ++ ++ memcg memcg_id memcg_path ++ node node_id ++ min_gen birth_time anon_size file_size ++ ... ++ max_gen birth_time anon_size file_size ++ ++``min_gen`` is the oldest generation number and ``max_gen`` is the ++youngest generation number. ``birth_time`` is in milliseconds. ++``anon_size`` and ``file_size`` are in pages. ++ ++Phones/Laptops/Workstations ++--------------------------- ++No additional configurations required. ++ ++Servers/Data Centers ++-------------------- ++:To support more generations: Change ``CONFIG_NR_LRU_GENS`` to a ++ larger number. ++ ++:To support more tiers: Change ``CONFIG_TIERS_PER_GEN`` to a larger ++ number. ++ ++:To support full stats: Set ``CONFIG_LRU_GEN_STATS=y``. ++ ++:Working set estimation: Write ``+ memcg_id node_id max_gen ++ [swappiness] [use_bloom_filter]`` to ``/sys/kernel/debug/lru_gen`` to ++ invoke the aging, which scans PTEs for accessed pages and then ++ creates the next generation ``max_gen+1``. A swap file and a non-zero ++ ``swappiness``, which overrides ``vm.swappiness``, are required to ++ scan PTEs mapping anon pages. Set ``use_bloom_filter`` to 0 to ++ override the default behavior which only scans PTE tables found ++ populated. ++ ++:Proactive reclaim: Write ``- memcg_id node_id min_gen [swappiness] ++ [nr_to_reclaim]`` to ``/sys/kernel/debug/lru_gen`` to invoke the ++ eviction, which evicts generations less than or equal to ``min_gen``. ++ ``min_gen`` should be less than ``max_gen-1`` as ``max_gen`` and ++ ``max_gen-1`` are not fully aged and therefore cannot be evicted. ++ Use ``nr_to_reclaim`` to limit the number of pages to evict. Multiple ++ command lines are supported, so does concatenation with delimiters ++ ``,`` and ``;``. ++ ++Framework ++========= ++For each ``lruvec``, evictable pages are divided into multiple ++generations. The youngest generation number is stored in ++``lrugen->max_seq`` for both anon and file types as they are aged on ++an equal footing. The oldest generation numbers are stored in ++``lrugen->min_seq[]`` separately for anon and file types as clean ++file pages can be evicted regardless of swap and writeback ++constraints. These three variables are monotonically increasing. ++Generation numbers are truncated into ++``order_base_2(CONFIG_NR_LRU_GENS+1)`` bits in order to fit into ++``page->flags``. The sliding window technique is used to prevent ++truncated generation numbers from overlapping. Each truncated ++generation number is an index to an array of per-type and per-zone ++lists ``lrugen->lists``. ++ ++Each generation is divided into multiple tiers. Tiers represent ++different ranges of numbers of accesses from file descriptors only. ++Pages accessed ``N`` times via file descriptors belong to tier ++``order_base_2(N)``. Each generation contains at most ++``CONFIG_TIERS_PER_GEN`` tiers, and they require additional ++``CONFIG_TIERS_PER_GEN-2`` bits in ``page->flags``. In contrast to ++moving between generations which requires list operations, moving ++between tiers only involves operations on ``page->flags`` and ++therefore has a negligible cost. A feedback loop modeled after the PID ++controller monitors refaulted % across all tiers and decides when to ++protect pages from which tiers. ++ ++The framework comprises two conceptually independent components: the ++aging and the eviction, which can be invoked separately from user ++space for the purpose of working set estimation and proactive reclaim. ++ ++Aging ++----- ++The aging produces young generations. Given an ``lruvec``, the aging ++traverses ``lruvec_memcg()->mm_list`` and calls ``walk_page_range()`` ++to scan PTEs for accessed pages (a ``mm_struct`` list is maintained ++for each ``memcg``). Upon finding one, the aging updates its ++generation number to ``max_seq`` (modulo ``CONFIG_NR_LRU_GENS``). ++After each round of traversal, the aging increments ``max_seq``. The ++aging is due when ``min_seq[]`` reaches ``max_seq-1``. ++ ++Eviction ++-------- ++The eviction consumes old generations. Given an ``lruvec``, the ++eviction scans pages on the per-zone lists indexed by anon and file ++``min_seq[]`` (modulo ``CONFIG_NR_LRU_GENS``). It first tries to ++select a type based on the values of ``min_seq[]``. If they are ++equal, it selects the type that has a lower refaulted %. The eviction ++sorts a page according to its updated generation number if the aging ++has found this page accessed. It also moves a page to the next ++generation if this page is from an upper tier that has a higher ++refaulted % than the base tier. The eviction increments ``min_seq[]`` ++of a selected type when it finds all the per-zone lists indexed by ++``min_seq[]`` of this selected type are empty. ++ ++To-do List ++========== ++KVM Optimization ++---------------- ++Support shadow page table walk. |