- Oct 2024
-
elixir.bootlin.com
-
/* Soft offline could migrate non-LRU movable pages */
if ((flags & MF_SOFT_OFFLINE) && __PageMovable(page))
    return true;
The code includes a policy for soft offline pages (MF_SOFT_OFFLINE). This is a feature where the kernel attempts to migrate pages to avoid using faulty memory areas. The policy allows non-LRU movable pages (pages that aren’t in the Least Recently Used list) to be migrated.
-
if (TestSetPageHWPoison(p)) {
    pr_err("%#lx: already hardware poisoned\n", pfn);
    res = -EHWPOISON;
    if (flags & MF_ACTION_REQUIRED)
        res = kill_accessing_process(current, pfn, flags);
    if (flags & MF_COUNT_INCREASED)
        put_page(p);
    goto unlock_mutex;
}
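The page isolation policy marks faulty pages as hardware poisoned. TestSetPageHWPoison sets the poison flag and returns its previous value, so a true result means the page was already poisoned. In that case the kernel logs an error and returns -EHWPOISON; if MF_ACTION_REQUIRED is set, it also calls kill_accessing_process for the current task, and if MF_COUNT_INCREASED is set, it drops the extra page reference. This keeps a faulty page from being handled twice or reused by the system, which avoids further corruption.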
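The test-and-set idiom described above can be sketched in userspace C. This is a minimal illustration under stated assumptions, not kernel code: `struct fake_page`, `FAKE_PG_HWPOISON`, and `fake_test_set_hwpoison` are hypothetical names, and C11 atomics stand in for the kernel's page-flag operations.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a page descriptor; not the kernel's struct page. */
struct fake_page {
    atomic_ulong flags;
};

/* Hypothetical bit position for the poison flag. */
#define FAKE_PG_HWPOISON (1UL << 0)

/*
 * Atomically set the poison bit and report whether it was already set.
 * Returns true if the page had been poisoned before this call.
 */
static bool fake_test_set_hwpoison(struct fake_page *p)
{
    unsigned long old = atomic_fetch_or(&p->flags, FAKE_PG_HWPOISON);

    return (old & FAKE_PG_HWPOISON) != 0;
}
```

The first caller sees false and proceeds with handling the page; any later caller sees true and can bail out early, which mirrors the "already hardware poisoned" branch.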
-
-
elixir.bootlin.com
-
/*
 * Should not even be attempting cluster allocations when huge
 * page swap is disabled. Warn and fail the allocation.
 */
if (!IS_ENABLED(CONFIG_THP_SWAP)) {
    VM_WARN_ON_ONCE(1);
    return 0;
}
The policy dictates that if huge page swapping is disabled at compile time (via the CONFIG_THP_SWAP configuration option), the kernel should not attempt to allocate swap clusters for huge pages and should fail the operation with a warning.
-
#ifdef CONFIG_HIBERNATION
If CONFIG_HIBERNATION is defined, the kernel compiles in the swap-side support for hibernation, which saves the system memory state to swap before powering down. This involves allocating swap slots for the saved memory state and ensuring that the data is properly stored.
-
if (count == SWAP_HAS_CACHE && !swap_page_trans_huge_swapped(p, entry))
    __try_to_reclaim_swap(p, swp_offset(entry), TTRS_UNMAPPED | TTRS_FULL);
This policy optimizes memory usage: when the only remaining user of a swap entry is the swap cache (count == SWAP_HAS_CACHE), the kernel tries to free both the swap entry and the cached page. The system avoids keeping cached pages in memory unnecessarily, thus freeing up resources for other processes.
-
#define SWAPFILE_CLUSTER 256
This configuration defines the size of a swap cluster, which is the number of swap pages that the system tries to allocate as a unit. It is set to 256.
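To put the cluster size in bytes, a small sketch follows. The helper `cluster_bytes` is a hypothetical name, and the 4096-byte page size in the usage note is an assumption; page size depends on the architecture and configuration.

```c
#include <stddef.h>

#define SWAPFILE_CLUSTER 256

/* Bytes covered by one swap cluster for a given page size (hypothetical helper). */
static size_t cluster_bytes(size_t page_size)
{
    return (size_t)SWAPFILE_CLUSTER * page_size;
}
```

With an assumed 4096-byte page, `cluster_bytes(4096)` is 1048576, i.e. one cluster spans 1 MiB of swap.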
-
/*
 * Use percpu scan base for SSD to reduce lock contention on
 * cluster and swap cache. For HDD, sequential access is more
 * important.
 */
if (si->flags & SWP_SOLIDSTATE)
    scan_base = this_cpu_read(*si->cluster_next_cpu);
else
    scan_base = si->cluster_next;
offset = scan_base;
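Algorithmic policy: choose the scan starting point based on the device type. For an SSD (SWP_SOLIDSTATE), each CPU keeps its own scan position, so different CPUs scan different parts of the swap map and contend less on the cluster and swap cache. For an HDD, all CPUs share one position so that slots are handed out sequentially.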
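The choice between a shared cursor and per-CPU cursors can be sketched as below. This is a simplified userspace model, not the kernel's data structure: `struct fake_swap_info`, `FAKE_NR_CPUS`, and `pick_scan_base` are hypothetical, and a plain array indexed by CPU number stands in for the kernel's per-CPU variable.

```c
#include <stdbool.h>

#define FAKE_NR_CPUS 4 /* hypothetical CPU count for the sketch */

/* Hypothetical, simplified view of a swap device. */
struct fake_swap_info {
    bool solid_state;                            /* stands in for SWP_SOLIDSTATE */
    unsigned int cluster_next;                   /* shared cursor, used for HDD */
    unsigned int cluster_next_cpu[FAKE_NR_CPUS]; /* per-CPU cursors, used for SSD */
};

/* Return the offset at which the given CPU should start scanning. */
static unsigned int pick_scan_base(const struct fake_swap_info *si, unsigned int cpu)
{
    if (si->solid_state)
        return si->cluster_next_cpu[cpu % FAKE_NR_CPUS];
    return si->cluster_next;
}
```

On the SSD path two CPUs usually start at different offsets and rarely touch the same cluster; on the HDD path every CPU continues from the same offset, which keeps writes sequential.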
-
/* reuse swap entry of cache-only swap if not busy. */
if (vm_swap_full() && si->swap_map[offset] == SWAP_HAS_CACHE) {
    int swap_was_freed;
    unlock_cluster(ci);
    spin_unlock(&si->lock);
    swap_was_freed = __try_to_reclaim_swap(si, offset, TTRS_ANYWAY);
    spin_lock(&si->lock);
    /* entry was freed successfully, try to use this again */
    if (swap_was_freed)
        goto checks;
    goto scan; /* check next one */
}
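This is an algorithmic policy. When swap space is running low (vm_swap_full()) and the slot at this offset is held only by the swap cache (SWAP_HAS_CACHE), the kernel tries to reclaim that slot. If the reclaim succeeds, the slot is rechecked and reused; otherwise the scan moves on to the next slot.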
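The decision in this snippet can be modeled as a small predicate. The sketch below rests on two assumptions that should be checked against the kernel version being read: that vm_swap_full() reports true once less than half of the total swap pages remain free, and that SWAP_HAS_CACHE has the value 0x40. All `fake_`-prefixed names and `should_reclaim_slot` are hypothetical.

```c
#include <stdbool.h>

/* Assumed value of the cache-only marker in the swap map. */
#define FAKE_SWAP_HAS_CACHE 0x40

/* Assumed rule: swap counts as "full" when fewer than half the pages are free. */
static bool fake_vm_swap_full(long free_pages, long total_pages)
{
    return free_pages * 2 < total_pages;
}

/* Reclaim a slot only under swap pressure and only if the cache alone holds it. */
static bool should_reclaim_slot(long free_pages, long total_pages,
                                unsigned char slot)
{
    return fake_vm_swap_full(free_pages, total_pages) &&
           slot == FAKE_SWAP_HAS_CACHE;
}
```

Slots that are still mapped by a process (a nonzero use count) are never candidates here, and no reclaim is attempted while plenty of swap remains free.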
-
#define LATENCY_LIMIT 256
LATENCY_LIMIT is an upper bound on how many slots scan_swap_map_slots can check before it yields the CPU. It is set to 256. This policy prevents the function from monopolizing the CPU.
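A scan loop with this kind of budget can be sketched as follows, using the value 256 from the #define shown above. This is a userspace approximation, not the kernel loop: `fake_yield`, `yield_count`, and `scan_for_free_slot` are hypothetical, and the counter merely records where a real implementation would give up the CPU.

```c
#include <stddef.h>

#define LATENCY_LIMIT 256

/* Hypothetical yield hook: counts how often the scan would give up the CPU. */
static unsigned int yield_count;

static void fake_yield(void)
{
    yield_count++;
}

/*
 * Scan map[0..n) for the first free slot (value 0), yielding after every
 * LATENCY_LIMIT busy slots examined. Returns the index found, or n if none.
 */
static size_t scan_for_free_slot(const unsigned char *map, size_t n)
{
    int latency_ration = LATENCY_LIMIT;

    for (size_t i = 0; i < n; i++) {
        if (map[i] == 0)
            return i;
        if (--latency_ration <= 0) {
            fake_yield();
            latency_ration = LATENCY_LIMIT;
        }
    }
    return n;
}
```

The budget bounds the work done between yields regardless of how large or how full the swap map is.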
-
node = numa_node_id();
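This algorithmic policy chooses which node's list of swap devices to look at.
In particular, allocation walks the list of available swap devices that belongs to the NUMA node of the current CPU (swap_avail_heads[node]). As far as the kernel's swap NUMA documentation describes it, each per-node list orders devices so that those local to the node are preferred, rather than excluding devices on other nodes.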
-
plist_for_each_entry_safe(si, next, &swap_avail_heads[node], avail_lists[node]) {
    /* requeue si to after same-priority siblings */
    plist_requeue(&si->avail_lists[node], &swap_avail_heads[node]);
This is an algorithmic policy that decides which swap device on that NUMA node's list should supply swap pages. What's interesting here is that once a device is visited, it is requeued behind its same-priority siblings, so the load is spread evenly over all devices with the same priority on that node.
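The round-robin effect of requeueing can be sketched with a small sorted array. This is an illustrative model rather than the kernel's plist: `struct fake_swap_dev` and `pick_and_requeue` are hypothetical, the array is assumed to be sorted with the preferred device first, and a larger `prio` means "preferred" in this sketch only.

```c
#include <stddef.h>

/* Hypothetical swap device entry. */
struct fake_swap_dev {
    int id;
    int prio; /* larger value = preferred, in this sketch only */
};

/*
 * Pick the head of devs (sorted by descending prio, n >= 1) and move it
 * behind the other entries that share its priority. Returns the id picked.
 */
static int pick_and_requeue(struct fake_swap_dev *devs, size_t n)
{
    struct fake_swap_dev head = devs[0];
    size_t i = 1;

    /* Shift same-priority siblings forward by one position. */
    while (i < n && devs[i].prio == head.prio) {
        devs[i - 1] = devs[i];
        i++;
    }
    devs[i - 1] = head;
    return head.id;
}
```

With two devices at the top priority and one below, repeated calls alternate between the two top devices and never reach the lower-priority one, which is the load-spreading behavior the requeue is meant to give.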
-
n_goal = min3((long)n_goal, (long)SWAP_BATCH, avail_pgs);
SWAP_BATCH: This is a configuration policy (a macro) that caps the number of swap entries that can be allocated in a single request. It's set to 64. The request is also capped by the number of available swap pages (avail_pgs), which limits how much swap is handed out at once.
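The three-way minimum can be worked through with a few numbers. The helper `clamp_goal` below is a hypothetical userspace stand-in for the min3() expression, with SWAP_BATCH taken as 64 as stated above.

```c
#define SWAP_BATCH 64

/* Clamp a requested entry count to the batch limit and to what is available. */
static long clamp_goal(long n_goal, long avail_pgs)
{
    long m = n_goal < (long)SWAP_BATCH ? n_goal : (long)SWAP_BATCH;

    return m < avail_pgs ? m : avail_pgs;
}
```

For example, a request for 100 entries with 1000 pages available is cut to 64, a request for 10 stays at 10, and a request for 100 with only 5 pages available is cut to 5.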
-