summaryrefslogtreecommitdiff
path: root/Documentation/mm/damon/monitoring_intervals_tuning_example.rst
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/mm/damon/monitoring_intervals_tuning_example.rst')
-rw-r--r--Documentation/mm/damon/monitoring_intervals_tuning_example.rst247
1 files changed, 247 insertions, 0 deletions
diff --git a/Documentation/mm/damon/monitoring_intervals_tuning_example.rst b/Documentation/mm/damon/monitoring_intervals_tuning_example.rst
new file mode 100644
index 000000000000..334a854efb40
--- /dev/null
+++ b/Documentation/mm/damon/monitoring_intervals_tuning_example.rst
@@ -0,0 +1,247 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=================================================
+DAMON Moniting Interval Parameters Tuning Example
+=================================================
+
+DAMON's monitoring parameters need tuning based on given workload and the
+monitoring purpose. There is a :ref:`tuning guide
+<damon_design_monitoring_params_tuning_guide>` for that. This document
+provides an example tuning based on the guide.
+
+Setup
+=====
+
+For below example, DAMON of Linux kernel v6.11 and `damo
+<https://github.com/damonitor/damo>`_ (DAMON user-space tool) v2.5.9 was used to
+monitor and visualize access patterns on the physical address space of a system
+running a real-world server workload.
+
+5ms/100ms intervals: Too Short Interval
+=======================================
+
+Let's start by capturing the access pattern snapshot on the physical address
+space of the system using DAMON, with the default interval parameters (5
+milliseconds and 100 milliseconds for the sampling and the aggregation
+intervals, respectively). Wait ten minutes between the start of DAMON and
+the capturing of the snapshot, to show a meaningful time-wise access patterns.
+::
+
+ # damo start
+ # sleep 600
+ # damo record --snapshot 0 1
+ # damo stop
+
+Then, list the DAMON-found regions of different access patterns, sorted by the
+"access temperature". "Access temperature" is a metric representing the
+access-hotness of a region. It is calculated as a weighted sum of the access
+frequency and the age of the region. If the access frequency is 0 %, the
+temperature is multipled by minus one. That is, if a region is not accessed,
+it gets minus temperature and it gets lower as not accessed for longer time.
+The sorting is in temperature-ascendint order, so the region at the top of the
+list is the coldest, and the one at the bottom is the hottest one. ::
+
+ # damo report access --sort_regions_by temperature
+ 0 addr 16.052 GiB size 5.985 GiB access 0 % age 5.900 s # coldest
+ 1 addr 22.037 GiB size 6.029 GiB access 0 % age 5.300 s
+ 2 addr 28.065 GiB size 6.045 GiB access 0 % age 5.200 s
+ 3 addr 10.069 GiB size 5.983 GiB access 0 % age 4.500 s
+ 4 addr 4.000 GiB size 6.069 GiB access 0 % age 4.400 s
+ 5 addr 62.008 GiB size 3.992 GiB access 0 % age 3.700 s
+ 6 addr 56.795 GiB size 5.213 GiB access 0 % age 3.300 s
+ 7 addr 39.393 GiB size 6.096 GiB access 0 % age 2.800 s
+ 8 addr 50.782 GiB size 6.012 GiB access 0 % age 2.800 s
+ 9 addr 34.111 GiB size 5.282 GiB access 0 % age 2.300 s
+ 10 addr 45.489 GiB size 5.293 GiB access 0 % age 1.800 s # hottest
+ total size: 62.000 GiB
+
+The list shows not seemingly hot regions, and only minimum access pattern
+diversity. Every region has zero access frequency. The number of region is
+10, which is the default ``min_nr_regions value``. Size of each region is also
+nearly idential. We can suspect this is because “adaptive regions adjustment”
+mechanism was not well working. As the guide suggested, we can get relative
+hotness of regions using ``age`` as the recency information. That would be
+better than nothing, but given the fact that the longest age is only about 6
+seconds while we waited about ten minuts, it is unclear how useful this will
+be.
+
+The temperature ranges to total size of regions of each range histogram
+visualization of the results also shows no interesting distribution pattern. ::
+
+ # damo report access --style temperature-sz-hist
+ <temperature> <total size>
+ [-,590,000,000, -,549,000,000) 5.985 GiB |********** |
+ [-,549,000,000, -,508,000,000) 12.074 GiB |********************|
+ [-,508,000,000, -,467,000,000) 0 B | |
+ [-,467,000,000, -,426,000,000) 12.052 GiB |********************|
+ [-,426,000,000, -,385,000,000) 0 B | |
+ [-,385,000,000, -,344,000,000) 3.992 GiB |******* |
+ [-,344,000,000, -,303,000,000) 5.213 GiB |********* |
+ [-,303,000,000, -,262,000,000) 12.109 GiB |********************|
+ [-,262,000,000, -,221,000,000) 5.282 GiB |********* |
+ [-,221,000,000, -,180,000,000) 0 B | |
+ [-,180,000,000, -,139,000,000) 5.293 GiB |********* |
+ total size: 62.000 GiB
+
+In short, the parameters provide poor quality monitoring results for hot
+regions detection. According to the :ref:`guide
+<damon_design_monitoring_params_tuning_guide>`, this is due to the too short
+aggregation interval.
+
+100ms/2s intervals: Starts Showing Small Hot Regions
+====================================================
+
+Following the guide, increase the interval 20 times (100 milliseocnds and 2
+seconds for sampling and aggregation intervals, respectively). ::
+
+ # damo start -s 100ms -a 2s
+ # sleep 600
+ # damo record --snapshot 0 1
+ # damo stop
+ # damo report access --sort_regions_by temperature
+ 0 addr 10.180 GiB size 6.117 GiB access 0 % age 7 m 8 s # coldest
+ 1 addr 49.275 GiB size 6.195 GiB access 0 % age 6 m 14 s
+ 2 addr 62.421 GiB size 3.579 GiB access 0 % age 6 m 4 s
+ 3 addr 40.154 GiB size 6.127 GiB access 0 % age 5 m 40 s
+ 4 addr 16.296 GiB size 6.182 GiB access 0 % age 5 m 32 s
+ 5 addr 34.254 GiB size 5.899 GiB access 0 % age 5 m 24 s
+ 6 addr 46.281 GiB size 2.995 GiB access 0 % age 5 m 20 s
+ 7 addr 28.420 GiB size 5.835 GiB access 0 % age 5 m 6 s
+ 8 addr 4.000 GiB size 6.180 GiB access 0 % age 4 m 16 s
+ 9 addr 22.478 GiB size 5.942 GiB access 0 % age 3 m 58 s
+ 10 addr 55.470 GiB size 915.645 MiB access 0 % age 3 m 6 s
+ 11 addr 56.364 GiB size 6.056 GiB access 0 % age 2 m 8 s
+ 12 addr 56.364 GiB size 4.000 KiB access 95 % age 16 s
+ 13 addr 49.275 GiB size 4.000 KiB access 100 % age 8 m 24 s # hottest
+ total size: 62.000 GiB
+ # damo report access --style temperature-sz-hist
+ <temperature> <total size>
+ [-42,800,000,000, -33,479,999,000) 22.018 GiB |***************** |
+ [-33,479,999,000, -24,159,998,000) 27.090 GiB |********************|
+ [-24,159,998,000, -14,839,997,000) 6.836 GiB |****** |
+ [-14,839,997,000, -5,519,996,000) 6.056 GiB |***** |
+ [-5,519,996,000, 3,800,005,000) 4.000 KiB |* |
+ [3,800,005,000, 13,120,006,000) 0 B | |
+ [13,120,006,000, 22,440,007,000) 0 B | |
+ [22,440,007,000, 31,760,008,000) 0 B | |
+ [31,760,008,000, 41,080,009,000) 0 B | |
+ [41,080,009,000, 50,400,010,000) 0 B | |
+ [50,400,010,000, 59,720,011,000) 4.000 KiB |* |
+ total size: 62.000 GiB
+
+DAMON found two distinct 4 KiB regions that pretty hot. The regions are also
+well aged. The hottest 4 KiB region was keeping the access frequency for about
+8 minutes, and the coldest region was keeping no access for about 7 minutes.
+The distribution on the histogram also looks like having a pattern.
+
+Especially, the finding of the 4 KiB regions among the 62 GiB total memory
+shows DAMON’s adaptive regions adjustment is working as designed.
+
+Still the number of regions is close to the ``min_nr_regions``, and sizes of
+cold regions are similar, though. Apparently it is improved, but it still has
+rooms to improve.
+
+400ms/8s intervals: Pretty Improved Results
+===========================================
+
+Increase the intervals four times (400 milliseconds and 8 seconds
+for sampling and aggregation intervals, respectively). ::
+
+ # damo start -s 400ms -a 8s
+ # sleep 600
+ # damo record --snapshot 0 1
+ # damo stop
+ # damo report access --sort_regions_by temperature
+ 0 addr 64.492 GiB size 1.508 GiB access 0 % age 6 m 48 s # coldest
+ 1 addr 21.749 GiB size 5.674 GiB access 0 % age 6 m 8 s
+ 2 addr 27.422 GiB size 5.801 GiB access 0 % age 6 m
+ 3 addr 49.431 GiB size 8.675 GiB access 0 % age 5 m 28 s
+ 4 addr 33.223 GiB size 5.645 GiB access 0 % age 5 m 12 s
+ 5 addr 58.321 GiB size 6.170 GiB access 0 % age 5 m 4 s
+ [...]
+ 25 addr 6.615 GiB size 297.531 MiB access 15 % age 0 ns
+ 26 addr 9.513 GiB size 12.000 KiB access 20 % age 0 ns
+ 27 addr 9.511 GiB size 108.000 KiB access 25 % age 0 ns
+ 28 addr 9.513 GiB size 20.000 KiB access 25 % age 0 ns
+ 29 addr 9.511 GiB size 12.000 KiB access 30 % age 0 ns
+ 30 addr 9.520 GiB size 4.000 KiB access 40 % age 0 ns
+ [...]
+ 41 addr 9.520 GiB size 4.000 KiB access 80 % age 56 s
+ 42 addr 9.511 GiB size 12.000 KiB access 100 % age 6 m 16 s
+ 43 addr 58.321 GiB size 4.000 KiB access 100 % age 6 m 24 s
+ 44 addr 9.512 GiB size 4.000 KiB access 100 % age 6 m 48 s
+ 45 addr 58.106 GiB size 4.000 KiB access 100 % age 6 m 48 s # hottest
+ total size: 62.000 GiB
+ # damo report access --style temperature-sz-hist
+ <temperature> <total size>
+ [-40,800,000,000, -32,639,999,000) 21.657 GiB |********************|
+ [-32,639,999,000, -24,479,998,000) 17.938 GiB |***************** |
+ [-24,479,998,000, -16,319,997,000) 16.885 GiB |**************** |
+ [-16,319,997,000, -8,159,996,000) 586.879 MiB |* |
+ [-8,159,996,000, 5,000) 4.946 GiB |***** |
+ [5,000, 8,160,006,000) 260.000 KiB |* |
+ [8,160,006,000, 16,320,007,000) 0 B | |
+ [16,320,007,000, 24,480,008,000) 0 B | |
+ [24,480,008,000, 32,640,009,000) 0 B | |
+ [32,640,009,000, 40,800,010,000) 16.000 KiB |* |
+ [40,800,010,000, 48,960,011,000) 8.000 KiB |* |
+ total size: 62.000 GiB
+
+The number of regions having different access patterns has significantly
+increased. Size of each region is also more varied. Total size of non-zero
+access frequency regions is also significantly increased. Maybe this is already
+good enough to make some meaningful memory management efficieny changes.
+
+800ms/16s intervals: Another bias
+=================================
+
+Further double the intervals (800 milliseconds and 16 seconds for sampling
+and aggregation intervals, respectively). The results is more improved for the
+hot regions detection, but starts looking degrading cold regions detection. ::
+
+ # damo start -s 800ms -a 16s
+ # sleep 600
+ # damo record --snapshot 0 1
+ # damo stop
+ # damo report access --sort_regions_by temperature
+ 0 addr 64.781 GiB size 1.219 GiB access 0 % age 4 m 48 s
+ 1 addr 24.505 GiB size 2.475 GiB access 0 % age 4 m 16 s
+ 2 addr 26.980 GiB size 504.273 MiB access 0 % age 4 m
+ 3 addr 29.443 GiB size 2.462 GiB access 0 % age 4 m
+ 4 addr 37.264 GiB size 5.645 GiB access 0 % age 4 m
+ 5 addr 31.905 GiB size 5.359 GiB access 0 % age 3 m 44 s
+ [...]
+ 20 addr 8.711 GiB size 40.000 KiB access 5 % age 2 m 40 s
+ 21 addr 27.473 GiB size 1.970 GiB access 5 % age 4 m
+ 22 addr 48.185 GiB size 4.625 GiB access 5 % age 4 m
+ 23 addr 47.304 GiB size 902.117 MiB access 10 % age 4 m
+ 24 addr 8.711 GiB size 4.000 KiB access 100 % age 4 m
+ 25 addr 20.793 GiB size 3.713 GiB access 5 % age 4 m 16 s
+ 26 addr 8.773 GiB size 4.000 KiB access 100 % age 4 m 16 s
+ total size: 62.000 GiB
+ # damo report access --style temperature-sz-hist
+ <temperature> <total size>
+ [-28,800,000,000, -23,359,999,000) 12.294 GiB |***************** |
+ [-23,359,999,000, -17,919,998,000) 9.753 GiB |************* |
+ [-17,919,998,000, -12,479,997,000) 15.131 GiB |********************|
+ [-12,479,997,000, -7,039,996,000) 0 B | |
+ [-7,039,996,000, -1,599,995,000) 7.506 GiB |********** |
+ [-1,599,995,000, 3,840,006,000) 6.127 GiB |********* |
+ [3,840,006,000, 9,280,007,000) 0 B | |
+ [9,280,007,000, 14,720,008,000) 136.000 KiB |* |
+ [14,720,008,000, 20,160,009,000) 40.000 KiB |* |
+ [20,160,009,000, 25,600,010,000) 11.188 GiB |*************** |
+ [25,600,010,000, 31,040,011,000) 4.000 KiB |* |
+ total size: 62.000 GiB
+
+It found more non-zero access frequency regions. The number of regions is still
+much higher than the ``min_nr_regions``, but it is reduced from that of the
+previous setup. And apparently the distribution seems bit biased to hot
+regions.
+
+Conclusion
+==========
+
+With the above experimental tuning results, we can conclude the theory and the
+guide makes sense to at least this workload, and could be applied to similar
+cases.