Advanced Kernel Programming
This is the home page for the "Advanced Kernel Programming" course.
Here, you can find information about the lessons and all the material
used during the course.
For the previous editions of the course, check the old websites: 2019/2020
and 2022/2023 (WARNING: that edition of the course was attended by students who had not attended the Linux Kernel Programming course, so I was forced to repeat topics from that course).
Lessons:
- First lesson: 2024/11/04, 9:30 -> 11:30
- Introduction to the course
- Recap of some basic concepts about Linux
- SCHED_DEADLINE
- Second lesson: 2024/11/07, 9:30 -> 11:30
- Affinity masks, and their usage
- Multi-core scheduling, push/pull functions
- Some more details about the scheduler implementation (scheduling classes, some of the functions they point to, how to block/wake up a task, the schedule() function)
- Transforming SCHED_DEADLINE into a partitioned scheduler
- Third lesson: 2024/11/11, 9:30 -> 11:30
- Fourth lesson: 2024/11/14, 9:30 -> 11:30
Fifth lesson: 2024/11/18, 9:30 -> 11:30
- Fifth lesson: 2024/11/21, 9:30 -> 11:30
- Processes and virtual memory
- Handling page faults
- Anonymous memory and memory-mapped files
- mlockall() and disabling lazy memory allocation
- Sixth lesson: 2024/11/25, 9:30 -> 11:30
- More on mapping virtual memory pages to physical memory pages
- Experiments with mmap() and virtual memory areas
- Example of mmappable device
- Seventh lesson: 2024/11/28, 9:30 -> 11:30
- Other examples of mmap()
- Example of mmappable device with shared memory
- Eighth lesson: 2024/12/02, 9:30 -> 11:30
- Ninth lesson: 2024/12/05, 9:30 -> 11:30
Tenth lesson 2024/12/09, 9:30 -> 11:30
- Tenth lesson: 2024/12/16, 9:30 -> 11:30
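The lessons on virtual memory mention lazy memory allocation and mlockall(). As a rough user-space illustration (not material from the course), the following sketch maps anonymous memory with mmap() and uses mincore() to count how many pages are resident before and after they are written. The function names resident_pages() and lazy_alloc_demo() are hypothetical, chosen only for this example, and the code assumes Linux with glibc.

```c
#define _DEFAULT_SOURCE 1
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: number of pages of [addr, addr + len) that are
 * resident in RAM according to mincore(), or -1 on error. */
long resident_pages(void *addr, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t n = (len + (size_t)page - 1) / (size_t)page;
    unsigned char *vec = malloc(n);
    long count = 0;

    if (vec == NULL)
        return -1;
    if (mincore(addr, len, vec) < 0) {
        free(vec);
        return -1;
    }
    for (size_t i = 0; i < n; i++)
        count += vec[i] & 1;    /* bit 0 is set when the page is resident */
    free(vec);
    return count;
}

/* Hypothetical demo: map npages of anonymous memory and report how many
 * pages are resident before and after writing to each of them.
 * Returns 0 on success, -1 if mmap() fails. */
int lazy_alloc_demo(size_t npages, long *before, long *after)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t len = npages * (size_t)page;
    unsigned char *p;

    assert(before != NULL && after != NULL);
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    *before = resident_pages(p, len);
    for (size_t i = 0; i < npages; i++)
        p[i * (size_t)page] = 1;    /* each first write triggers a page fault */
    *after = resident_pages(p, len);

    munmap(p, len);
    return 0;
}
```

On a typical Linux system, lazy_alloc_demo(64, &before, &after) would be expected to report few or no resident pages before the writes and all 64 afterwards. If mlockall(MCL_CURRENT | MCL_FUTURE) is called first (and RLIMIT_MEMLOCK allows it), the mapping should already be populated when mmap() returns, which is one way to observe lazy allocation being disabled.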
Interesting Kernel CallChains:
Downloads:
Ideas about Possible Projects for the Exam
Some projects are simpler than others; if you decide to work on a project, please
contact me first; if you have ideas about different projects, contact me to discuss them.
- Modify the SCHED_DEADLINE migration mechanism to implement Adaptive Partitioning
(note: an updated version of the paper contains better implementation
details; ask me for it if you plan to work on this project).
Project already taken
- When forced threadirqs are used, try to move all the network processing to the
network interrupt handler thread, removing the network softirqs
- Related to the previous project: look at the threaded NAPI patchset, and design
an integration with threaded irq handlers / forced threadirqs
- Compare the performance of SLAB, SLUB and SLOB through a set of experiments
(note: to really measure the slab allocator performance, and not some
random noise, you need to carefully design the experiments!)
- Experimentally verify how many times the buddy allocator is used to allocate
one single physical memory page, and how many times it is used for higher-order
allocations
(use an appropriate workload, otherwise you will not be able to measure anything).
Also check which kernel subsystems need higher-order allocations (you can use
ftrace to find out this information)
- Implement some kind of IPC mechanism based on shared memory, using a kernel
module (you can extend and improve the shared mmap() example seen during the
course). Then, compare the performance of this IPC mechanism with that of
an equivalent IPC mechanism based on the read() and write() system calls
- Analyze the behaviour of a Linux kernel under a high network load (using netperf,
or similar tools) and check which fraction of the CPU time is spent in the network
softirqs (NET_RX_SOFTIRQ and NET_TX_SOFTIRQ) and which fraction of CPU time is
consumed by the application sending and receiving data. Is it possible to move
processing time from NET_RX_SOFTIRQ to the application using some form of early
demultiplexing?
- Under high network load, try to isolate one or more cores from network processing
by modifying the CONFIG_RPS code
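For the IPC project above, one possible starting point for the read()/write() side of the comparison is a ping-pong over two pipes. The sketch below is only an assumption about how such a baseline could be measured, not a prescribed methodology: pipe_pingpong_ns() is a hypothetical name, messages are a single byte, and a real experiment would also need CPU pinning, warm-up rounds and repeated runs. The shared-memory mechanism implemented in the kernel module is not shown.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical baseline: average round-trip time, in nanoseconds, of a
 * 1-byte ping-pong between parent and child over two pipes.
 * Returns -1.0 on error. */
double pipe_pingpong_ns(int rounds)
{
    int ping[2], pong[2];
    char byte = 'x';
    struct timespec t0, t1;
    pid_t child;
    int done;

    assert(rounds > 0);
    if (pipe(ping) < 0)
        return -1.0;
    if (pipe(pong) < 0) {
        close(ping[0]);
        close(ping[1]);
        return -1.0;
    }

    child = fork();
    if (child < 0) {
        close(ping[0]);
        close(ping[1]);
        close(pong[0]);
        close(pong[1]);
        return -1.0;
    }
    if (child == 0) {
        /* Child: echo every byte back until the parent closes its end. */
        close(ping[1]);
        close(pong[0]);
        while (read(ping[0], &byte, 1) == 1)
            if (write(pong[1], &byte, 1) != 1)
                break;
        _exit(0);
    }

    /* Parent: send a byte, wait for the echo, repeat. */
    close(ping[0]);
    close(pong[1]);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (done = 0; done < rounds; done++)
        if (write(ping[1], &byte, 1) != 1 || read(pong[0], &byte, 1) != 1)
            break;
    clock_gettime(CLOCK_MONOTONIC, &t1);

    close(ping[1]);     /* EOF on the pipe makes the child exit */
    close(pong[0]);
    waitpid(child, NULL, 0);

    if (done != rounds)
        return -1.0;
    return ((t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec)) / rounds;
}
```

A call such as pipe_pingpong_ns(100000) gives a per-round-trip cost that mostly reflects system call and context switch overhead; the same ping-pong pattern could then be repeated on top of the shared-memory mechanism to obtain comparable numbers.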
Interesting Papers:
- Bonwick, Jeff. "The slab allocator: An object-caching kernel memory allocator." USENIX summer. Vol. 16. 1994.
- Bonwick, Jeff, and Jonathan Adams. "Magazines and Vmem: Extending the Slab Allocator to Many CPUs and Arbitrary Resources." USENIX Annual Technical Conference, General Track. 2001.
- Mogul, Jeffrey C., and K. K. Ramakrishnan. "Eliminating receive livelock in an interrupt-driven kernel." ACM Transactions on Computer Systems 15.3 (1997): 217-252.