Lorenzo Stoakes

The Linux Memory Manager: Development Diary

I also have a twitter account where I post fragmented updates and thoughts on the book-writing process.

October 3rd 2022 (pages: 172, chapters: 1)

It's finally done! I have completed the first chapter of the book (well it will end up as the second chronooogically, but it is the first temporally).

It's been a real journey - getting used to writing has been challenging, finding the spare time to do it even more so and deciding what to cut/pass over possibly the hardest thing of all.

I have discovered a great many things about physical memory allocation (and indeed, a large part of the point of the exercise is to increase my depth of knowledge), largely that it is oh so incredibly heuristic.

One of the odd observations as well is just how difficult it is to make nice diagrams - I have taken pains to visualise the allocator and freeing logic because I know there is power in doing so, but getting the diagrams accurate, comprehensible and lined up correctly is more time consuming than you could possibly imagine.

During the course of the work I have got myself a Macbook M2 air, so have gained some insights into how things are on an arm64 architecture. I did cheat a little by building a custom kernel with CONFIG_NUMA set, but most shocking of all was how ZONE_DMA spans all the memory on it.

OK so the previous sentence really does speak to quite how specialist this area is, and is not perhaps what most people would focus on when using a new computer.

Linus did have the absolute cheek to release linux v6 last Sunday which really I do have to tie the book to. After all it isn't due to be finished for at least another year and a half and us soft humany creatures are a sucker for round numbers.

On that front, I've decided to leave the physical allocation chapter at 5.19 for now, I will update it later, and target the remaining chapters at 6. Oh Linus!

On to the next chapter - virtual memory!

August 25th 2022 (pages: 107)

It has been quite some time since I updated this! Since my last update I have been focusing entirely on the physical allocator chapter; doing so has taught me a lot of things the hard way - lessons I will carry through to subsequent chapters.

The most important of these is the need to plan a chapter out from the top-level first before filling in the details. Failing to do so leaves you at risk of spending a great deal of time focusing on very specific details and overrunning dreadfully on the rest of the chapter. Often it feels like what you leave out matters more than what you leave in.

My workflow is now well established - pdfLaTeX for layout, emacs for editing, mupdf for previewing, scripts to update things as I type (via inotify) and TikZ for diagrams within LaTeX.

For confirming that what I'm saying is true of course the kernel source remains the canonical reference, however it is often necessary to see what is happening in a live kernel at which point I use qemu and my kernel scripts along with gdb (qemu makes plugging gdb into the kernel a doddle) alongside some well placed breakpoints/pr_err() statements.

On the userland side I can use the copious mm procfs endpoints to examine state, for example kpageflags.

For the time being I am officially targeting kernel 5.19, however for purely human reasons I am likely to shift to 6 at some point in the future.

A key thing to note is that as much as possible I provide links to the functions/data structures I reference so somebody in possession of an electronic copy of the book can see what I'm referring to for themselves directly and in context.

I know this update is a bit brief and impersonal but I hope to say quite a bit more once I have completed my first chapter! Despite a slow start, and this necessarily being a part-time hobby project, I still do hope that the planned 2 year timeline is feasible. Time will tell.

June 6th 2022 (pages: 11)

Unfortunately the Queen's jubilee was spent being quite ill rather than working on the book so less was achieved than hoped.

While I've done a great deal of research on allocators (see previous posts) I've decided the research-y nature of that is slowing me down right at the start where I most need traction so I'm going to work on that chapter piecemeal while attacking others which actually involve the kernel.

Therefore my first chapter of actual reading-the-kernel-source work will commence with the physical memory allocation chapter. I am excited to get cracking on that while updating the allocator chapter to describe buddy allocators in general.

I also intend my pages/week rate to increase from... ~2/wk to... considerably more! That I'm digging into the kernel now means I will likely be targeting 5.18 but if it remains feasible to switch I might hopscotch kernel versions until it's not practical for me to do it any more.

May 31st 2022

I've now worked my way through the oldmalloc allocator from musl which, combined with the absolutely excellent meta paper Dynamic Storage Allocation: A Survey and Critical Review, has provided a really firm grounding in the theory and practice of implementing a userland dynamic memory allocator.

However having said that, reality has hit like a ton of bricks. I am doing this work part-time and hold the heavyweight world belt in procrastination and to say my productivity has been up and down over the past couple of weeks would be an understatement.

While my initial aim was to complete the first chapter by the end of May, that target has been missed with a capital M. I am endeavouring to start anew in June and have thus rescheduled chapter 1 for end of that month.

I was reading a Hacker News post about writing books and what stood out for me what not only the article but comment[at]ors were saying - the only thing that really matters is writing the damn words. Playing with tooling is fun, sitting there and planning things out and imagining things is fun but none of that really matters, all that really matters is sitting down and doing the damn work.

I struggle not to distract myself with myriad things - playing with my cats, gardening, more realistically hours and hours and HOURS of watching trash on youtube and chatting away on various apps (and let's not even get into what I do on Amazon/eBay).

The reality is that I have to force myself to sit in my damn seat and get on with my damn work. So, from tomorrow I will be making changes to ensure I do exactly that. I will update this diary as to how well that works out!

May 17th 2022

Firstly a word on timelines - I am targeting roughly a chapter a month (with 20+ chapters you can see why the target sits at around 2 years) but of course with this as a part-time project and a demanding day job timelines may flex.

On implementation - I am using LaTeX because it produces unfathomably beautiful documents. I do intend at some point to produce a lovely HTML version (inspired by the incredible work on RNG enhancements in kernels 5.17, 5.18) but I am not sure how I am going to do it and for now it'll remain an exercise left to the writer in some distant, possibly dystopian future.

So back to what I've actually been working on - I am part way through the research and write-up of the first chapter on allocators as a whole. I have been researching the musl malloc implementation, specifically the 'oldmalloc' version of it as a real-world albeit minimalist reference point, which has proved fruitful.

I am not sure I'd quite say the bitwise tricks embedded throughout (without being separated into named functions/macros) were fruitful to work with but I powered my way through.

Most interesting things I've discovered so far are:-

  • Multiple free lists are used and blocks assigned to them non-linearly (the number of block sizes assigned to a free list increases with, well, block size, read the book when I'm done if this seems confusing),
  • How we compact the prior and successive contiguous freelist blocks if we can opportunistically (rendering this a O(1) time cost),
  • Using mmap() for larger allocations,
  • General aversion to anything whiffing even slightly of O(n),
  • The wonders of overcommit and the really rather powerful madvise(MADV_DONTNEED).

The intent is to give a general intuitive understanding of how allocators are designed, the trade-offs, the patterns and 'shape' of an allocator and how we tackle fragmentation while remaining efficient.

By doing this the reader gains an understanding of how we generally tackle the problem of dynamic allocation which lays the groundwork for examining how the physical memory allocator is implemented in the kernel as well as the slab allocator and drawing analogies between where the 'rubber hits the road' in userland and what's going on in the kernel.

The approach of the book as a whole is to try to give an intuitive understanding to a motivated developer and to take them on a journey into more and more detail, as well as being a reference in case somebody wants to understand a specific detail quickly while also (good lord - perhaps I am trying to be everything to everybody) providing good practical chapters on things like how to decode OOM dumps and how to use procfs and sysfs to gain understanding of the system (and customise it via tunables).

I feel in any case that I have a good idea of how to move forward - a chapter at a time, dear boy!