The Linux Kernel Mentorship Program

A whirlwind introduction to contributing to the Linux kernel

When I first read about the Linux Kernel Mentorship Program (LKMP) I had just finished my Master’s in Logic at the University of Amsterdam. I had wanted to do a PhD in the philosophy of mathematics (specifically the philosophy of set theory), but got bogged down when writing my thesis during the pandemic. I felt that I needed some proper ‘IRL’ experience before deciding that I really wanted to spend several more years studying a subject whose experts would all easily fit in that small (but very nice!) church building in front of the Andrew Wiles Building at the University of Oxford.

Ambition and academia

I have been programming since I was about 13. I wrote small VBScripts that would open the CD tray if you entered a certain command, and that would have Microsoft Sam speak admonishingly if you provided any wrong input. Eventually my mother saw that I had a knack for computer programming and arranged for an acquaintance of hers, who taught computer science at a nearby college, to tutor me. In the years that followed I learned to program in C, studied several of the Tanenbaum books, and learned to work with Arduino boards. I loved it.

One thing that I had noticed while studying computer science was that knowledge of mathematics was often the limiting factor in advancing through the material. I distinctly remember being 16 and trying to learn about algorithms and getting my brain in a knot over the proof of the master theorem. It was too much. I decided that if I really wanted to get good, I needed to study mathematics, not computer science.

Well, many years later, having done exactly that, I can tell you two things about studying pure mathematics:

  1. You will be able to understand the proof of the master theorem
  2. There are surprisingly few applications for most advanced pure mathematics

This leaves the ambitious graduating student in a bit of a conundrum: how do you satisfy your curiosity and need to ‘dive deep’ while still doing something practical (i.e., getting paid for what you do)? Unless you manage to reach the ranks of tenured faculty, doing a PhD only postpones this problem. And a professorship is a very uncertain road, in particular in pure mathematics, and especially in logic. I have seen many very talented men and women falter after their third temporary postdoc position, nearing the age of 40, failing to get even an assistant professorship (let alone tenure). And I know of at least one extreme case of a number theorist who became homeless because, on the one hand, they did not have the directly applicable skills that many tech companies demand, but on the other hand were considered overqualified for everything else (including a role as cashier at a supermarket). (1)

(1) Yitang Zhang, who proved a weak version of the twin prime conjecture, considered at the time outside the scope of current mathematics, purportedly lived in his car for some time and worked as a delivery driver and at Subway (after getting his PhD).

I could continue on this topic, and frankly it is a conversation that we should have some day: why are we (at least in the EU) currently educating so many specialists in very technical fields? We even have special visa programs and scholarships to attract and promote talent in these areas, but there is often no real demand from businesses or society. Moreover, there are many very acute and complex problems that remain largely unaddressed. However, in this blog I want to focus on the positive.

Plunging into the deep

Being in this gloomy mindset after finishing my studies, I happened to come across Javier Carrasco’s article on the Linux kernel mentorship program. I was curious, but also uncertain of my ability to take part, since by then I had not been actively programming for almost a decade. The Linux kernel is huge and very complex, and has a reputation for being a bit of a bastion. But this is also part of its appeal. There are only a handful of companies that employ Linux kernel engineers. These are challenging and important jobs that demand continuous development of skills and knowledge, and active participation in a wider community. This is exactly the kind of environment that can satisfy my curiosity and ambition, while still being meaningfully engaged with the world.

I took the leap this summer and took the first steps towards becoming part of the kernel community by applying for the mentorship. I had a year of professional software engineering experience under my belt and felt a bit more confident. The program typically runs several times a year and is led by Shuah Khan, a very experienced kernel maintainer, (2) with the assistance of one or more co-mentors.

(2) She is also the third ever Linux Foundation Fellow.

One important aspect you should know about the program, if you are considering applying, is that it is very much hands-off. The only concrete structure that is provided by the mentors is a weekly office hour session. In these sessions Shuah demonstrates relevant tools and skills, such as how to do cross-compilation or set up ctags in Vim to navigate the kernel sources. She also takes questions from the mentees. That’s it. What to work on, or how to go about it, is entirely up to you. There is very little hand-holding.

The recommendation is to focus your efforts on one or two subsystems. I would also add that you should not go into subsystems that have tentacles throughout the kernel code (such as memory management) or that require other deep knowledge or trust that simply cannot be attained within the time frame of the program (like functionality in kernel/). Some mentees in my class have worked successfully on drm and net, or have written a device driver. I chose to work on fs (file systems).

Syzbot

One of the options open to mentees is to work on syzbot bugs. Syzbot runs cloud instances of syzkaller, a program that automatically fuzzes the Linux kernel on many different types of machines, using various configurations and compilation environments. Fuzzing here means feeding the kernel a large number of generated inputs in order to exercise as many code paths as possible, and detecting and logging any problems encountered along the way. Syzkaller attempts, for instance, to test the code of various system calls by invoking them on a diverse sample of the possible argument space, and reports any warnings and panics that are raised. It also mounts purposefully corrupted images and tries to perform file system operations on them. (3) This type of testing is called dynamic analysis and contrasts with static analysis. In static analysis, code is inspected by an algorithm without ever being executed. (4)

(3) This helps to test uncommon code paths. It also allows the file system to respond to possible corruption and prevent further loss of data, for instance by remounting the suspect file system read-only.

(4) Static analysis cannot be expected to replace dynamic analysis because most questions about programs that we would like to be able to answer in general are uncomputable. Although it could be that, in theory, these questions can be answered for all practical programs, these facts do indicate some bounds.

When I first looked through the syzbot log, most bugs appeared quite daunting: long stack traces involving many different moving parts, and all of them unfamiliar. I had no idea where to even start. In other words, I lacked context. In such situations, however, some wisdom from software engineering can help: comprehension of the operation and goal of a piece of code often follows naturally from understanding the involved data structures.

What’s going on?

The first syzbot bug that I tackled was a warning triggered in minix_rmdir. (5) Using kgdb I discovered that the bug originated from a corrupted nlink field: the link count was already zero when minix_rmdir tried to decrement it again, which triggers a WARN_ON. This is the call trace of that bug:

(5) Minix is a very old Unix file system that was developed for the Minix OS. It has remained in the kernel for compatibility reasons, but is not used much.

 inode_dec_link_count include/linux/fs.h:2634 [inline]
 minix_rmdir+0xa8/0xd0 fs/minix/namei.c:170
 vfs_rmdir+0x3b7/0x520 fs/namei.c:4470
 do_rmdir+0x2ac/0x630 fs/namei.c:4525
 __do_sys_rmdir fs/namei.c:4544 [inline]
 __se_sys_rmdir fs/namei.c:4542 [inline]
 __x64_sys_rmdir+0x47/0x50 fs/namei.c:4542
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Finally, inode_dec_link_count calls into drop_nlink, which triggers the WARN_ON:

/**
 * drop_nlink - directly drop an inode's link count
 * @inode: inode
 *
 * This is a low-level filesystem helper to replace any
 * direct filesystem manipulation of i_nlink.  In cases
 * where we are attempting to track writes to the
 * filesystem, a decrement to zero means an imminent
 * write when the file is truncated and actually unlinked
 * on the filesystem.
 */
void drop_nlink(struct inode *inode)
{
	WARN_ON(inode->i_nlink == 0);
	inode->__i_nlink--;
	if (!inode->i_nlink)
		atomic_long_inc(&inode->i_sb->s_remove_count);
}
EXPORT_SYMBOL(drop_nlink);
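
To see why the check matters, here is a minimal user-space sketch of the same pattern. It is not kernel code: struct toy_inode and toy_drop_nlink are hypothetical names made up for illustration, and a message on stderr stands in for WARN_ON. The point is only that the link count is unsigned, so decrementing it at zero wraps around to a huge value instead of going negative.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct inode; only the link count is modelled. */
struct toy_inode {
	unsigned int nlink;
};

/*
 * Same shape as drop_nlink(): complain if the count is already zero,
 * then decrement regardless. Returns 1 if the warning fired, 0 otherwise.
 */
int toy_drop_nlink(struct toy_inode *inode)
{
	int warned;

	assert(inode != NULL);
	warned = (inode->nlink == 0);
	if (warned)
		fprintf(stderr, "warning: nlink is already 0\n");
	inode->nlink--;	/* unsigned arithmetic: 0 - 1 wraps to UINT_MAX */
	return warned;
}
```

Starting from a link count of 1, the first call brings the count to 0 without a warning; a second call warns and leaves the count at UINT_MAX, which is the kind of state a corrupted on-disk value can lead to.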

It is not hard to fix this bug; it takes just a few lines of code. Likewise, the bugs you might attempt to fix during your mentorship will also not be hard. The challenge for us kernel newbies is not merely to understand what changes to make. More important, and more difficult at this stage, is to grasp how your changes may affect the surrounding code, and how you will convince yourself (and the community!) that they do not result in any regressions. This means you really need to understand what you are doing and have a plan for how to test your fix.

The minix_rmdir bug involves the data structure struct inode and some jumping around in fs/namei.c. So this is the context one has to grasp before even getting started on a solution. Luckily, many concepts and techniques used in the kernel are rooted in operating system theory and have had long gestation periods in various Unixes. You can read, for instance, about inodes in the OSTEP book; Love’s Linux Kernel Development still explains how the Virtual File System (VFS) dispatches file system specific operations, despite its 2010 publication date; and details of the Minix filesystem can be found in Tanenbaum’s Minix book. This background gives you the scaffolding you need to read the current kernel code and understand the relevant call paths. For each task, be it a bug or something else, you need to build such awareness.
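
The dispatch idea behind the call trace (vfs_rmdir calling into minix_rmdir) can be sketched in plain C with a table of function pointers. This is a simplified user-space illustration, not the kernel's actual definitions: toy_dir, toy_dir_ops, toyfs_rmdir and toy_vfs_rmdir are hypothetical names, and the sanity check on the link count is my own assumption for the example, not the fix that went upstream.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_dir;

/* Per-file-system operation table, loosely modelled on inode operations. */
struct toy_dir_ops {
	int (*rmdir)(struct toy_dir *dir, const char *name);
};

struct toy_dir {
	const struct toy_dir_ops *ops;	/* filled in by the file system */
	unsigned int nlink;		/* 2 for an empty directory, +1 per subdirectory */
};

/* File system specific implementation for a hypothetical "toyfs". */
int toyfs_rmdir(struct toy_dir *dir, const char *name)
{
	(void)name;		/* lookup of the entry is omitted in this sketch */
	if (dir->nlink <= 2)	/* no subdirectory can exist: treat as corruption */
		return -EIO;
	dir->nlink--;		/* the removed subdirectory's ".." no longer links here */
	return 0;
}

const struct toy_dir_ops toyfs_dir_ops = { .rmdir = toyfs_rmdir };
const struct toy_dir_ops nodir_ops = { .rmdir = NULL };

/* Generic layer: knows nothing about toyfs, only follows the table. */
int toy_vfs_rmdir(struct toy_dir *dir, const char *name)
{
	assert(dir != NULL && dir->ops != NULL);
	if (!dir->ops->rmdir)	/* file system does not support rmdir */
		return -EPERM;
	return dir->ops->rmdir(dir, name);
}
```

The generic function never names a concrete file system; which rmdir runs is decided entirely by the table attached to the directory. That is why a stack trace through fs/namei.c ends up in fs/minix/namei.c.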

The kernel community

I fixed the minix_rmdir bug and another similar syzbot bug in the minix file system. But instead of just sending these patches straightaway, I wanted to adopt a more general approach. My idea was to port the way that ext4 handles corruption to minix. (6) As I wrote in my first patch:

(6) This is indicative of the freedom that the mentorship provides. If you see an opportunity to contribute, take it!

This patch sets up generic handling of errors such as filesystem corruption which are frequently raised by syzbot. Towards this aim it adds the following mount options to the minix filesystem: errors=continue/panic/remount-ro and (no)warn-on-error, with semantics similar to ext4. When a minix_error() or minix_error_inode() is raised, the error is reported and action is taken according to which of these mount options is set (errors=continue,nowarn-on-error are the default).

As an example, this patch fixes a drop_nlink warning in rmdir exposed by syzbot, originating from a corrupted nlink field of a directory.

The changes were tested using the syzbot reproducer with the various new mount options. I also handcrafted a similar corrupted fs but with the minix v3 format (the reproducer uses v1).
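
To make the intended semantics of the errors= options concrete, here is a rough user-space sketch of such a policy switch. It is not the code from the patch: toy_sb, toy_errors_policy and toy_fs_error are hypothetical names, and a flag stands in for an actual kernel panic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* One value per errors= mount option described above. */
enum toy_errors_policy {
	TOY_ERRORS_CONTINUE,
	TOY_ERRORS_REMOUNT_RO,
	TOY_ERRORS_PANIC,
};

/* Hypothetical, heavily reduced superblock state. */
struct toy_sb {
	enum toy_errors_policy policy;
	bool read_only;
	bool panicked;		/* stands in for a real panic() */
	unsigned int error_count;
};

/* Report the error, then act according to the configured policy. */
void toy_fs_error(struct toy_sb *sb, const char *msg)
{
	assert(sb != NULL);
	sb->error_count++;
	fprintf(stderr, "toyfs error: %s\n", msg);

	switch (sb->policy) {
	case TOY_ERRORS_CONTINUE:
		break;			/* log only, keep going */
	case TOY_ERRORS_REMOUNT_RO:
		sb->read_only = true;	/* prevent further damage */
		break;
	case TOY_ERRORS_PANIC:
		sb->panicked = true;	/* a real driver would halt here */
		break;
	}
}
```

The design choice is that every place that detects corruption calls one reporting function, and the reaction is decided centrally by the mount option rather than at each call site.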

Jan Kara, who is among other things a reviewer for VFS, responded the next day:

The patch looks ok to me but since minix filesystem driver is in the kernel mostly to allow mounting ancient unix filesystems I don’t quite understand the motivation for adding the new mount options. Why not just fixup minix_rmdir() to better handle corrupted filesystems?

This happens. You have an idea and work on a patch, but the community is ahead of you or does not like the change. (7) In the kernel community you communicate through patches. No patch means no substance, and you are likely to get no response. Unfortunately, this means that sometimes you will work on a patch that turns out not to be needed, or not to be the correct or desired way to approach the problem. However, if you do your due diligence, the community is generally very friendly to beginners.

(7) This also makes working on syzbot bugs quite hard. If you fix a bug in a fairly active subsystem, chances are that someone more experienced has beaten you to it. There is no way to know in general who is working on what, but be sure to keep an eye on the relevant mailing lists. Also, failed patch attempts are still very useful as learning experiences and a way to get feedback from the community.

In my case, it turned out that at last year’s LSF/MM conference, there had been a discussion on how to deal with old and rarely used file systems that are still in the kernel, such as minix. (8) The conclusion was that they should be deprecated as they pose a large burden on VFS maintainers, since any API change in the VFS must be accommodated in every file system, including these old ones. However, through my interaction with the community the idea was floated that I could work on deprecating minix.ko by writing a FUSE user space driver. I suggested that we could provide an in-tree but out-of-kernel implementation, similar to how other user space code (like kselftest) is already integrated. This idea was well received. So you see that even with a rejected patch you can get your foot in the door and get started on more significant contributions. If you do well, this is how you can slowly build a reputation in the community.

(8) These also include, for instance, hfs and jfs.

I spent the remainder of the mentorship understanding the code of the minix file system in detail. To port the current code to a FUSE implementation you must know how it interacts with the various kernel caches, such as the page cache, and kernel APIs, like the writeback API. In particular, you must understand which of these tasks FUSE already does for you and what you need to do yourself. Shuah has supported me in this endeavor, and I will continue this work past the mentorship.

During my reading of the minix code I came across several issues for which I have submitted patches. I also worked on other file system syzbot bugs. This is an overview of my patches:

Upstream:

Fix two syzbot corruption bugs in minix filesystem (series with 3 patches)

Reviewed but not yet included upstream:

minix: Add required sanity checking to minix_check_superblock()

Corrected errno in minix_new_inode

hfs: Replace BUG_ON with error handling for CNID count checks

Waiting for review:

nlink overflow in jfs_rename

dtInsertEntry can result in buffer overflow on corrupted jfs filesystems

Thanks

I especially want to thank Shuah Khan for her time and support. Having a mentor who encourages your ambitions matters, especially if you come from an underprivileged background where such support is rare. Thanks also to the co-mentors David Hunter and Khalid Aziz.

I further want to thank Jan Kara and Viacheslav Dubeyko for their patient advice and extensive feedback on my patches.