Lorenzo Stoakes

The Linux Memory Manager: Development Diary

I also have a twitter account where I post fragmented updates and thoughts on the book-writing process.

June 6th 2022 (chapters done: 0/2x, pages: 11)

Unfortunately the Queen's jubilee was spent being quite ill rather than working on the book so less was achieved than hoped. I'm now tracking the number of pages here to make the '0' chapter figure less dreadful :)

While I've done a great deal of research on allocators (see previous posts) I've decided the research-y nature of that is slowing me down right at the start where I most need traction so I'm going to work on that chapter piecemeal while attacking others which actually involve the kernel.

Therefore my first chapter of actual reading-the-kernel-source work will commence with the physical memory allocation chapter. I am excited to get cracking in that, while updating the allocator chapter describing buddy allocators in general.

I also intend my pages/week rate to increase from... ~2/wk to... considerably more! That I'm digging into the kernel now means I will likely be targeting 5.18 but if it remains feasible to switch I might hopscotch kernel versions until it's not practical for me to do it any more.

May 31st 2022 (chapters done: 0/21)

I've now worked my way through the oldmalloc allocator from musl which, combined with the absolutely excellent meta paper Dynamic Storage Allocation: A Survey and Critical Review, has provided a really firm grounding in the theory and practice of implementing a userland dynamic memory allocator.

However having said that, reality has hit like a ton of bricks. I am doing this work part-time and hold the heavyweight world belt in procrastination and to say my productivity has been up and down over the past couple of weeks would be an understatement.

While my initial aim was to complete the first chapter by the end of May, that target has been missed with a capital M. I am endeavouring to start anew in June and have thus rescheduled chapter 1 for end of that month.

I was reading a Hacker News post about writing books and what stood out for me what not only the article but comment[at]ors were saying - the only thing that really matters is writing the damn words. Playing with tooling is fun, sitting there and planning things out and imagining things is fun but none of that really matters, all that really matters is sitting down and doing the damn work.

I struggle not to distract myself with myriad things - playing with my cats, gardening, more realistically hours and hours and HOURS of watching trash on youtube and chatting away on various apps (and let's not even get into what I do on Amazon/eBay).

The reality is that I have to force myself to sit in my damn seat and get on with my damn work. So, from tomorrow I will be making changes to ensure I do exactly that. I will update this diary as to how well that works out!

May 17th 2022 (chapters done: 0/21)

Firstly a word on timelines - I am targeting roughly a chapter a month (with 20+ chapters you can see why the target sits at around 2 years) but of course with this as a part-time project and a demanding day job timelines may flex.

On implementation - I am using LaTeX because it produces unfathomably beautiful documents. I do intend at some point to produce a lovely HTML version (inspired by the incredible work on RNG enhancements in kernels 5.17, 5.18) but I am not sure how I am going to do it and for now it'll remain an exercise left to the writer in some distant, possibly dystopian future.

So back to what I've actually been working on - I am part way through the research and write-up of the first chapter on allocators as a whole. I have been researching the musl malloc implementation, specifically the 'oldmalloc' version of it as a real-world albeit minimalist reference point, which has proved fruitful.

I am not sure I'd quite say the bitwise tricks embedded throughout (without being separated into named functions/macros) were fruitful to work with but I powered my way through.

Most interesting things I've discovered so far are:-

  • Multiple free lists are used and blocks assigned to them non-linearly (the number of block sizes assigned to a free list increases with, well, block size, read the book when I'm done if this seems confusing),
  • How we compact the prior and successive contiguous freelist blocks if we can opportunistically (rendering this a O(1) time cost),
  • Using mmap() for larger allocations,
  • General aversion to anything whiffing even slightly of O(n),
  • The wonders of overcommit and the really rather powerful madvise(MADV_DONTNEED).

The intent is to give a general intuitive understanding of how allocators are designed, the trade-offs, the patterns and 'shape' of an allocator and how we tackle fragmentation while remaining efficient.

By doing this the reader gains an understanding of how we generally tackle the problem of dynamic allocation which lays the groundwork for examining how the physical memory allocator is implemented in the kernel as well as the slab allocator and drawing analogies between where the 'rubber hits the road' in userland and what's going on in the kernel.

The approach of the book as a whole is to try to give an intuitive understanding to a motivated developer and to take them on a journey into more and more detail, as well as being a reference in case somebody wants to understand a specific detail quickly while also (good lord - perhaps I am trying to be everything to everybody) providing good practical chapters on things like how to decode OOM dumps and how to use procfs and sysfs to gain understanding of the system (and customise it via tunables).

I feel in any case that I have a good idea of how to move forward - a chapter at a time, dear boy!