The Linux Memory Manager
I am writing a book about the Linux memory management subsystem, which I anticipate will be ready around mid-2024. I post periodic updates in my development diary and more frequent ones on Mastodon or Twitter.
Please register your interest in the book to receive an email as soon as it is released (and optionally infrequent updates on progress).
- Allocators - A detailed description of how dynamic memory allocation works, the trade-offs and the algorithms - essentially the what, why and how of memory allocation. [INCOMPLETE]
- Physical memory - The buddy allocator, struct page*, struct folio*, the kernel functions that provide memory (e.g. alloc_pages()), nodes, zones (in brief, more detailed in NUMA chapter), watermarks, migrate types, page blocks, GFP flags, detailed analysis of physical page allocation and freeing. [ROUGH DRAFT COMPLETE]
- Virtual memory - The why, what and how, page tables, page table flags, page table sizes, virtual memory layout, direct mapping, kernel/userland split, vmalloc(), kernel process address space. [ROUGH DRAFT COMPLETE]
- Process memory - mm_struct, process VMAs, page faults, demand paging, dynamic stack allocation, how memory is copied between userland/kernel, how malloc() actually feeds into the kernel (e.g. sbrk(), mmap()), copy-on-write, forking, basics of huge pages (covered in more detail in huge pages chapter), rmaps, pagevecs, LRU + lruvecs, GUP. [CURRENTLY WORKING ON]
- Appendix: Physical page flags - A list of all physical page (i.e. struct page) flags with descriptions. [INCOMPLETE]
- Memory pressure/reclaim - What memory pressure is, how it fits in with demand paging, higher order page starvation, compaction, reclaim, etc.
- Swap - Internals (the description for this chapter has been swapped out).
- Slab allocator - slab/slub (not slob), kmalloc()/kfree().
- Page cache - Discussion of the page cache and how this is maintained and operated (and which tunables impact it), dirty pages, internals and interactions with issues such as memory pressure.
- Compaction and migration - How direct and indirect page compaction works, kcompactd and page migration.
- NUMA - Nodes in much more detail, NUMA rebalance, internals, etc.
- Out of memory killer - How the OOM killer works, how to decode an OOM dmesg report, analysis of code, oom_score_adj, etc.
- cgroups - How to use memory cgroups, how they are implemented internally, etc.
- Decoding the memory manager: procfs, tunables, dmesg, tracepoints, eBPF and more (this title probably needs work :) - A description of the procfs/sysfs/sysrq-m interfaces and how to use and decode them, generally a practical 'how to' for sysadmins/developers. Also covers DAMON and eBPF.
- Huge pages - Why they're useful, Transparent Huge Pages/hugetlb internals, tunables.
- Early memory management - memblock, bitmap, ELF sections marked init, transition to core memory manager, algorithms.
- Debugging the memory manager - Use of CONFIG_DEBUG_PAGEALLOC, KASAN and friends to discover bugs.