Assigned: Oct 20 Due: Nov 3
Note - the book Understanding the Linux Virtual Memory Manager, by Mel Gorman (Gorman04), will be very helpful in doing this assignment. It is available for download from the publisher's website here.
In this assignment you will instrument the Linux virtual memory system in order to collect page lifetime statistics. To do so you will need to study the code which the Linux kernel uses to allocate, free, and swap pages, as well as to initialize virtual memory areas and handle page faults. You will then need to write an interface so that an application can collect these statistics.
Linux virtual memory areas must be populated with physical pages in order to be accessed by the processor. These pages come from the following sources:
Pages are inserted into a virtual memory area lazily - i.e. at page fault time. On 32-bit Intel (i386) systems, these page faults are handled by do_page_fault() in arch/i386/mm/fault.c. The logic followed by the fault handler is shown on page 81 of Gorman04. Pages are reclaimed by kswapd, and are either freed directly (if they are clean) or must first be written to swap. Although the process of allocating and (especially) deallocating a page is complex, the beginning and end of a page's active lifetime are both in mm/page_alloc.c, in __get_free_pages() and __free_pages_bulk() respectively. (Note - __get_free_pages() is also used to allocate many other pages - e.g. for filesystem buffers - which we do not want to include in our statistics.)
Instrument the memory manager software to determine the occupancy time of each page - i.e. the time from when it is first allocated and assigned to a virtual memory area until it is freed. To do this you will need to record the time at which each page is first mapped, and then the time at which it is finally freed. Keep a histogram of occupancy times, and create a /proc entry which allows this histogram to be retrieved. (Note that you do not need to provide a reset function, as a user process can keep past values and subtract them.) It would be preferable to use a histogram with varying bucket sizes, so that it can track short times with relatively high precision while still representing long times.
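One common way to get varying bucket sizes is to make each bucket twice as wide as the previous one. The following is a minimal user-space sketch of that idea, not kernel code and not a required design. The names lifetime_hist, lifetime_bucket(), and lifetime_record(), the bucket count of 32, and the use of an abstract "tick" as the time unit are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical layout (an assumption, not part of the assignment):
 *   bucket 0           holds lifetimes of exactly 0 ticks
 *   bucket k (k >= 1)  holds lifetimes in [2^(k-1), 2^k) ticks
 *   the last bucket    also absorbs everything longer
 */
#define LIFETIME_NBUCKETS 32

static unsigned long lifetime_hist[LIFETIME_NBUCKETS];

/* Map a lifetime to its bucket: the number of significant bits, clamped. */
static unsigned int lifetime_bucket(uint64_t ticks)
{
    unsigned int b = 0;

    while (ticks != 0) {
        ticks >>= 1;
        b++;
    }
    if (b >= LIFETIME_NBUCKETS)
        b = LIFETIME_NBUCKETS - 1;
    return b;
}

/* Called when a page is freed; both timestamps are in the same tick unit. */
static void lifetime_record(uint64_t alloc_time, uint64_t free_time)
{
    assert(free_time >= alloc_time);
    lifetime_hist[lifetime_bucket(free_time - alloc_time)]++;
}
```

With this layout a lifetime of 1000 ticks lands in bucket 10, which covers 512 to 1023 ticks, so the relative error stays within a factor of two at every scale. Inside the kernel the counters would also need to be protected against concurrent updates, which this sketch ignores.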
struct page.
To test this we will need to produce some memory pressure. Set the virtual machine memory size to a small value (96MB should work, I think) and run a kernel compile after make clean. Collect statistics every 5 seconds during the compile. Submit these statistics as well as a graph of the mean and median occupancy time, as calculated from the histograms.
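Since the kernel counters are never reset, the statistics for each 5-second interval can be obtained by subtracting the previous snapshot from the current one. The sketch below shows one way a user program might do that and then estimate the mean and median. It assumes power-of-two buckets where bucket k (k >= 1) covers [2^(k-1), 2^k) ticks and bucket 0 holds zero-length lifetimes, and it uses each bucket's midpoint as a stand-in for the samples in it, so the results are approximations. All function names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Midpoint of bucket k, assuming bucket k >= 1 covers [2^(k-1), 2^k). */
static double bucket_mid(size_t k)
{
    if (k == 0)
        return 0.0;
    return 1.5 * (double)(1ULL << (k - 1));
}

/* Per-interval histogram: current snapshot minus the previous one. */
static void hist_delta(const unsigned long *cur, const unsigned long *prev,
                       unsigned long *out, size_t n)
{
    for (size_t k = 0; k < n; k++) {
        assert(cur[k] >= prev[k]);   /* counters only ever increase */
        out[k] = cur[k] - prev[k];
    }
}

/* Approximate mean: weight each bucket midpoint by its count. */
static double hist_mean(const unsigned long *h, size_t n)
{
    double sum = 0.0;
    unsigned long total = 0;

    for (size_t k = 0; k < n; k++) {
        sum += (double)h[k] * bucket_mid(k);
        total += h[k];
    }
    return total ? sum / (double)total : 0.0;
}

/* Approximate median: midpoint of the bucket holding the middle sample. */
static double hist_median(const unsigned long *h, size_t n)
{
    unsigned long total = 0, seen = 0;

    for (size_t k = 0; k < n; k++)
        total += h[k];
    if (total == 0)
        return 0.0;

    unsigned long half = (total + 1) / 2;   /* rank of the median sample */
    for (size_t k = 0; k < n; k++) {
        seen += h[k];
        if (seen >= half)
            return bucket_mid(k);
    }
    return 0.0;   /* not reached when total > 0 */
}
```

A finer estimate of the median could interpolate within the bucket instead of returning its midpoint; the midpoint is used here only to keep the example short.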
Please email the following: