June 6th 2022 (chapters done: 0/2x, pages: 11)
Unfortunately the Queen's jubilee was spent being
quite ill rather than working on the book so less
was achieved than hoped. I'm now tracking the number
of pages here to make the '0' chapter figure less
dreadful :)
While I've done a great deal of research on
allocators (see previous posts) I've decided the
research-y nature of that is slowing me down right
at the start where I most need traction so I'm going
to work on that chapter piecemeal while attacking
others which actually involve the kernel.
Therefore my first chapter of actual
reading-the-kernel-source work will commence with
the physical memory allocation chapter. I am excited
to get cracking in that, while updating the
allocator chapter describing buddy allocators in
general.
I also intend my pages/week rate to increase
from... ~2/wk to... considerably more! That I'm
digging into the kernel now means I
will likely be targeting 5.18 but if it
remains feasible to switch I might hopscotch kernel
versions until it's not practical for me to do it
any more.
May 31st 2022 (chapters done: 0/21)
I've now worked my way through the oldmalloc allocator from musl which, combined with the absolutely excellent meta paper
Dynamic Storage Allocation: A
Survey and Critical Review, has provided a
really firm grounding in the theory and practice
of implementing a userland dynamic memory
allocator.
However having said that, reality has hit like a
ton of bricks. I am doing this work part-time and
hold the heavyweight world belt in procrastination
and to say my productivity has been up and down over
the past couple of weeks would be an
understatement.
While my initial aim was to complete the first
chapter by the end of May, that target has been
missed with a capital M. I am endeavouring to start
anew in June and have thus rescheduled chapter 1 for
end of that month.
I was reading
a Hacker News post about writing
books and what stood out for me what not only the
article but comment[at]ors were saying - the only
thing that really matters is writing the damn
words. Playing with tooling is fun, sitting
there and planning things out and imagining things
is fun but none of that really matters, all that
really matters is sitting down and doing the damn
work.
I struggle not to distract myself with myriad
things - playing with my cats, gardening, more
realistically hours and hours and HOURS of watching
trash on youtube and chatting away on various apps
(and let's not even get into what I do on
Amazon/eBay).
The reality is that I have to force myself to sit
in my damn seat and get on with my damn work. So,
from tomorrow I will be making changes to ensure I
do exactly that. I will update this diary as to how
well that works out!
May 17th 2022 (chapters done: 0/21)
Firstly a word on timelines - I am targeting
roughly a chapter a month (with 20+ chapters you can
see why the target sits at around 2 years) but of
course with this as a part-time project and a
demanding day job timelines may flex.
On implementation - I am
using LaTeX
because it produces unfathomably beautiful
documents. I do intend at some point to produce a
lovely HTML version (inspired by the incredible
work
on RNG
enhancements in kernels 5.17, 5.18) but I am not
sure how I am going to do it and for now it'll remain
an exercise left to the writer in some distant,
possibly dystopian future.
So back to what I've actually been working on - I
am part way through the research and write-up of
the first chapter on allocators as a
whole. I have been researching
the musl malloc implementation,
specifically the
'oldmalloc' version of it as a
real-world albeit minimalist reference point,
which has proved fruitful.
I am not sure I'd quite say the bitwise tricks
embedded throughout (without being separated into
named functions/macros) were fruitful to work with but
I powered my way through.
Most interesting things I've discovered so far are:-
- Multiple free lists are used and blocks assigned
to them non-linearly (the number of block sizes
assigned to a free list increases with, well, block
size, read the book when I'm done if this seems
confusing),
- How we compact the prior and successive
contiguous freelist blocks if we can
opportunistically (rendering this a O(1) time cost),
- Using mmap() for larger allocations,
- General aversion to anything whiffing even
slightly of O(n),
- The wonders of overcommit and the really
rather powerful madvise(MADV_DONTNEED).
The intent is to give a general intuitive
understanding of how allocators are designed, the
trade-offs, the patterns and 'shape' of an allocator
and how we tackle fragmentation while remaining
efficient.
By doing this the reader gains an understanding of
how we generally tackle the problem of dynamic
allocation which lays the groundwork for examining how
the physical memory allocator is implemented in
the kernel as well as the slab allocator and drawing
analogies between where the 'rubber hits the road' in
userland and what's going on in the kernel.
The approach of the book as a whole is to try to
give an intuitive understanding to a motivated
developer and to take them on a journey into more and
more detail, as well as being a reference in case
somebody wants to understand a specific detail quickly
while also (good lord - perhaps I am trying to be
everything to everybody) providing good practical
chapters on things like how to decode OOM dumps and
how to use procfs and sysfs to gain understanding of
the system (and customise it via tunables).
I feel in any case that I have a good idea of how
to move forward - a chapter at a time, dear boy!