This page last updated: December 2018.
Paging, Part 1: Managing the MMU
The code for this set of exercises is a modified version of the example-idt codebase that you worked on in the context switching lab; the modifications in this file include some changes to initialize the paging mechanisms (as described in the Week 4 lecture) and the addition of some TODO items to guide you through these exercises. You can download a copy of this code from paging.tar.gz, and should unpack it (using tar xvzf paging.tar.gz) in your main llp folder (i.e, the one with subdirectories for simpleio, mimg, example-idt, switching, etc.). This will create a new paging folder where you'll do your work in this lab, with subfolders for kernel and user code, following the pattern you've seen previously.
Because there are some fairly complicated steps in this work, we'll be splitting these exercises over two weeks:
All of the work for this first week will be done on the files in the kernel subdirectory, but you may wish to keep one window open in the main paging directory (so that you can "make run" for the purposes of testing), and another in paging/kernel so that you are in the most convenient place to edit the files.
The provided code should compile and run as supplied, but it is incomplete and has multiple sections with TODO comments describing code that you will need to write. This document is designed to be read in conjunction with those source files and annotations. You are strongly suggested to follow the specific sequence of steps described below to complete these tasks (augmented with your own tests along the way, as you see fit), but you may choose to proceed in a different way if you prefer.
As you work on these exercises, you'll probably want to remind yourself about the bit layout patterns for page directory and page table entries. Here is a snapshot of the table in the Intel documentation that might be useful for that purpose (clickable for a full size version):
[Aside: This diagram includes some details for an optional feature of current Intel architectures called PSE-36 that allows programmers to specify physical addresses with up to 40 bit addresses for 4MB super pages. This feature uses bits 12-20 in the second row of the diagram above. We are not using that feature here, however, so we will always set those bits to zero.]
Ok, let's get started!
Step 1: Review
Review the supplied code, particularly the assembly language sequence at the start of kernel/init.S that enables the IA32's paging mechanism. Much of this should look familiar from the code that we have discussed in the Week 4 lecture. The code that creates the initial page directory is particularly relevant (i.e., worth your time and careful study) because you'll be doing something very similar in C later on ...
[Aside: You might be wondering if there is any significance to the fact that the file name kernel/init.S ends with a capital S, unlike the assembly files that we've seen previously with a lower case s suffix. This is a way of specifying that the assembly code in that file should be run through the C preprocessor before it is fed to the assembler. In particular, this allows use to use directives like #include "memory.h" or #define PAGESIZE 12 in our assembly code file, so that we can use the same symbolic constants as in the accompanying C code.]
Run the program. How can you tell that paging has been enabled??? (Hint: the clue you're looking for is just one character ...)
You'll notice that the program also displays a textual description of the initial page directory that is used by the code in kernel/init.S when the memory management/paging support is first turned on. The output looks like this:
initial page directory is at 0xc0102000 Page directory at c0102000 0: [0-3fffff] => [0-3fffff], superpage 1: [400000-7fffff] => [400000-7fffff], superpage 2: [800000-bfffff] => [800000-bfffff], superpage 3: [c00000-ffffff] => [c00000-ffffff], superpage 4: [1000000-13fffff] => [1000000-13fffff], superpage 5: [1400000-17fffff] => [1400000-17fffff], superpage 6: [1800000-1bfffff] => [1800000-1bfffff], superpage 7: [1c00000-1ffffff] => [1c00000-1ffffff], superpage 300: [c0000000-c03fffff] => [0-3fffff], superpage 301: [c0400000-c07fffff] => [400000-7fffff], superpage 302: [c0800000-c0bfffff] => [800000-bfffff], superpage 303: [c0c00000-c0ffffff] => [c00000-ffffff], superpage 304: [c1000000-c13fffff] => [1000000-13fffff], superpage 305: [c1400000-c17fffff] => [1400000-17fffff], superpage 306: [c1800000-c1bfffff] => [1800000-1bfffff], superpage 307: [c1c00000-c1ffffff] => [1c00000-1ffffff], superpageThis output (produced by the showPdir() function that is defined in kernel/paging.c) shows that the initial page directory includes mappings for a total of sixteen superpages: This includes:
In fact, you should recognize the above as a textual description of the initial page directory described in the lecture that maps the same region of physical memory in to virtual memory at two different addresses: a 1:1 mapping allows us to access these memory locations using low virtual addresses that correspond exactly to the underlying physical addresses, while the second group of mappings provide access to the same memory using addresses in the kernel space region. Take time to make sure you understand what the text above is describing: we're going to be creating and manipulating new page directory structures as we proceed, so it's important to have a clear idea of what they describe.
Ok, it's time to move on. But a quick heads up before we go further: you're going to be doing quite a lot of work to get this code up and running properly. It won't be easy, and you'll deserve some credit. So open up kernel/kernel.c in your text editor and add your name at the top of the file! This will almost certainly be this easiest of all the changes that you make to the code during this lab, but it's always nice to start with something simple :-)
It is fairly common for low-level code to use "bit twiddling", which involves maninpulating the bit-level representations of values using appropriate combinations of logical & (and), | (or), ^ (xor) and ~ (not) operations as well as bit shift operations like << (left shift) and >> (right shift). For example: we can extract the least significant 4 bits of a value x using x & 0xf; we can set bit 7 in x to one using x | (1<<7); or reset it using x & ~(1<<7). Pay careful attention to operator precedence in examples like this, and don't be shy about using parentheses where it is either necessary or just useful to clarify what is expected. For example, if you want to test whether bit 3 in a value x, you can use (x&0x8)!=0, but you should not omit the parentheses here because x&0x8!=0 is treated in the same way as x&(0x8!=0), which does not give the intended result!
For some practice in bit twiddling, your next task is to implement the macros that are defined/commented out at the end of memory.h. The descriptions for each macro/function given there should provide all the detail that you need, and none of them require anything very complicated. (Although a clear head will be useful!) Test your implementations to make sure they work as you expect because you'll be relying on them soon. (Hint: you can always #include "memory.h" in a regular C program that you can run directly from Linux to help with your tests; you might even find a suitable program lurking in the kernel folder ...)
The following diagram may help to provide some graphical intuition for these bit-twiddling macros:
Specifically, the green region shown here represents a block of memory running from the address lo to the address hi, inclusive. The vertical ticks indicate page boundaries, spaced at 4K intervals.
It will be important in the following that you understand what the numbers produced by these macros represent; they will be needed in later parts of this exercise! With that in mind, note that the smallest set of (full) pages that includes all of the memory in the green region runs from pageStart(lo) to pageEnd(hi). Conversely, the largest set of full pages that is completely contained within the green region runs from firstPageAfter(lo) to endPageBefore(hi). In this particular example, firstPageAfter(lo) is also the same as pageNext(lo) and pageNext(pageStart(lo)). Note, however, that pageStart(a) and firstPageAfter(a) are not always a whole page apart: if a is a multiple of 4K, then pageStart(a) = a = firstPageAfter(a). In case it is useful, here are some specific examples to make all this more concrete:
pageStart(0x12345678) = 0x12345000 firstPageAfter(0x12345678) = 0x12346000 endPageBefore(0x12345678) = 0x12344fff pageEnd(0x12345678) = 0x12345fff pageStart(0xabcde000) = 0xabcde000 firstPageAfter(0xabcde000) = 0xabcde000 endPageBefore(0xabcde000) = 0xabcddfff pageEnd(0xabcde000) = 0xabcdefff
Another way to think of these examples here is that every 32 bit address, such as 0x12345678 can be thought of as a combination of a 20 bit page number, such as 0x12345 and a 12 bit offset, such as 0x678. For example, if we want to find the start of a page from an arbitrary address, then we just need to replace the least significant 12 bits with zeros. Conversely, if we want to find the address of the last byte in a a page, then we need to replace the least significant 12 bits with ones, corresponding to 0xfff in hex notation.
Test your code for these macros carefully, and keep them in mind as you proceed with subsequent exercises in this lab and the next ... you can expect to find practical uses for all of these macros in what is still to come!
Step 3: Scanning the memory map
We're going to need to allocate some memory for page directory and page table structures. Study the code that prints out a summary of the memory map at the start of the kernel() function in kernel/kernel.c and then start on the first few TODO items in the body of that function. The algorithm described in those TODOs is not very sophisticated, but it will be sufficient for our purposes.
To be specific, the high level goal here is to identify a block of pages that the kernel can then use as a pool to allocate new page table or page directory structures (or other blocks of memory, in units of 4KB that start on a 4KB boundary). Be careful to note that a memory map entry spanning from lo to hi must be trimmed to the potentially smaller region between firstPageAfter(lo) and endPageBefore(hi) before it can be used for allocating whole pages of memory. (See the diagram in Step 2.)
Debugging and development hints:
Note that, for the purposes of testing, you can add a call to the halt() function at any point in your code; this will prevent execution from continuing to parts of the program that you are not ready to run just yet. (And if your program is "blowing up" for some unknown reason, you can experiment with adding calls to halt() to figure out how far it is getting before things go wrong!) Note also that you can insert printf() calls in your code to print out the values that it is working with; this can often be useful as a way to monitor your code and check that your algorithms are giving the results you expect.
As always, it is also a good idea to break this task down in to steps. For example, you might start by writing some code that just prints out the memory map, showing each region that we might consider in turn and confirming that we are at least working from the correct input. In my testing, this produced output that looks like the following (yours, of course, may vary a little, and that doesn't necessarily mean that there is anything wrong):
Considering [0-9fbff] Considering [9fc00-9ffff] Considering [f0000-fffff] Considering [100000-1fdffff] Considering [1fe0000-1ffffff] Considering [fffc0000-ffffffff]Because these regions of memory do not always match with page boundaries, we may have to do a little rounding (remember Step 2). To check I had the right approach there, I made further modifications to my code to calculate and display appropriate page boundaries within each page. My output at this stage was as follows (and again, yours may well vary, depending on the input data and the details of how you wrote your formatting code):
Considering [0-9fbff], full pages [0-9efff] Considering [9fc00-9ffff], full pages [a0000-9ffff] Considering [f0000-fffff], full pages [f0000-fffff] Considering [100000-1fdffff], full pages [100000-1fdffff] Considering [1fe0000-1ffffff], full pages [1fe0000-1ffffff] Considering [fffc0000-ffffffff], full pages [fffc0000-ffffffff]We can quickly confirm that all of the addresses in the "full pages" output here are valid page boundaries: the start addresses all end with 12 zero bits (three hexadecimal zeros) while the end addresses all end with 12 one bits (three hexadecimal f digits). Continue in a similar way, following the steps described in the TODO notes, to build your complete implementation, pausing as necessary for testing ...
Once this portion of your code is complete you will have set the physStart and physEnd parameters to span a contiguous block of pages that (a) fits within the available physical memory, and (b) does not conflict with any region of memory in which other code has been loaded at boot time.
The output that I got at this stage of the process was:
Will allocate from region [403000-1ffdfff], 29339648 bytes
Yours should look similar, but don't worry if it's not exactly the same: differences in the way we've coded our solutions, or in the QEMU command line that we use, could cause small differences in the output that you see here. Note that we've found a block of 29+MB here, which should keep us going for some time :-)
Step 4: Allocation of pages
Now that you've identified a region of available pages, it might be time to turn your attention to the allocPage() function that appears near the top of kernel/kernel.c. The TODO text should give you enough information to complete this task, but make sure that you are using physical and virtual addresses properly, converting between them using the toPhys() and fromPhys() macros as necessary. For example, if you wanted to write an unsigned zero value in to memory at the physical address in physStart, then you could do something like the following:
// Find the virtual address for memory at physStart: unsigned* page = fromPhys(unsigned*, physStart); // And then write a zero at that location: *page = 0; // (We need to use virtual addresses for memory accesses // because we are running with the MMU turned on ...)Note also that, if physStart holds the physical address of an unused page of memory on entry to allocPage(), then the next available page of memory will begin at the physical address given by nextPage(physStart).
For testing, you could always try inserting a few calls to allocPage() at an appropriate point in the kernel() function and then printing out the results. You should see output addresses that are (1) within the selected region of memory, (2) properly aligned to page boundaries (i.e., the least significant twelve bits should be zero), and (3) increasing with each subsequent call to allocPage().
Step 5: Adding kernel space mappings to new page directories
Once you've completed allocPage(), you should be able to get past the lines of code in kernel/kernel.c that try to allocate and print out the new page directory, newpdir. But don't expect to see any entries in that new page directory just yet; we'll need to add these ourselves. This would be a good time to turn your attention to the TODO item in the allocPdir() function in kernel/paging.c. As hinted earlier, the C code that you'll need to write here has some strong similarities to (some parts of) the assembly code at the start of kernel/init.S ...
Once this is done, you should get some output that looks something like the following:
Page directory at c0406000 300: [c0000000-c03fffff] => [0-3fffff], superpage 301: [c0400000-c07fffff] => [400000-7fffff], superpage 302: [c0800000-c0bfffff] => [800000-bfffff], superpage 303: [c0c00000-c0ffffff] => [c00000-ffffff], superpage 304: [c1000000-c13fffff] => [1000000-13fffff], superpage 305: [c1400000-c17fffff] => [1400000-17fffff], superpage 306: [c1800000-c1bfffff] => [1800000-1bfffff], superpage 307: [c1c00000-c1ffffff] => [1c00000-1ffffff], superpage
Of course, this looks a lot like the initial page directory shown previously, albeit without the mappings at low addresses. Don't worry if your output looks ever so slightly different: this might just be the result of minor differences in the way we've written our respective implementations. That said, it is important that this new page directory does not include any mappings for memory at addresses below c0000000: this is necessary to avoid conflicts with any future user space mappings that we might want to make.
Step 6: Are you feeling lucky?
Head back in to kernel/kernel.c, and enable the code that will switch to your newly created page directory. Unfortunately, even if you've done everything correctly so far, you'll find that things don't work quite as we might hope :-( But all is not lost. To help figure out what is causing the problem, try inserting a call to halt() immediately after the setPdir() call and then use QEMU's monitor to check on the status of the CPU ("info cpus", for example). Then try moving the halt after the printf() call and try again. What does this tell us???
We're obviously running in to a problem here ... but what might that be??? Clearly it's something that happens in between those two points where you inserted calls to halt. But the only thing there is a straightforward printf() call, and we know that worked just fine previously ...
Think carefully about the purpose of printf and how it communicates information to the user of a computer system. In general terms, what memory locations would you expect it to be accessing??? And what physical addresses are those locations mapped to by the new page directory???
If you're still puzzled, one way to get a little extra insight into what is happening is to find the line in kernel/init.S that defines handler num=14,... and then change the associated func value from nohandler to pfhandler. And if that still leaves you mistified, don't forget that you can always ask us for a little extra help! [Aside: If you do edit the code to use pfhandler, you might want to ask yourself: Why does the resulting output show a T character immediately before the word Page???]
Once you've figured out what's going on, be sure that you don't say it out loud: you don't want to spoil the opportunity for the folks around you to figure this out on their own, no matter how strongly they might claim to disagree with that statement! :-)
Step 7: Mapping single pages
Although we've used 4MB super page mappings for the kernel space region of our page directories, we'll restrict ourselves to using 4KB mappings within the user space region. In particular, this means that we'll need to start using some page table structures in addition to the top level page directory.
Head on back to kernel/paging.c, and work on the TODO items in the mapPage() function. You might find it useful to look over the code in the showPdir() function at this point for an example of "walking" over page directory and page table structures. (Hint: showPdir is just a few lines below mapPage.) The code that you need here isn't particularly difficult, but you will need to think through the individual steps very carefully. For example, you'll need to start by identifying an appropriate page directory slot, then looking for the page table in that slot (or allocating a new one if necessary), and then find the appropriate page table slot to fill. Use your bit-twiddling skills (and/or some of the macros from kernel/memory.h) to help with this, and again remember that you can insert printf() calls along the way to check that your code is performing correctly. Last, but not least, remember to distinguish between physical and virtual addresses. For example, the addresses that are stored in page directory entries, for example, are physical addresses, and you will need to calculate a corresponding virtual address before you can actually read the data they are pointing to. Once again, the fromPhys() and toPhys() macros will be useful here ...
You might also benefit from looking at the Intel diagram that shows the bit-level representations that are used for PDEs and PTEs; remember that there's a snapshot of that diagram at the start of this document.
Step 8: The end, for this week
Once you've reached this step, you should have all the tools you need to fix that pesky problem from Step 6 — a single call to mapPage() should be enough! When you see a message indicating that "The kernel will now halt!", you can confirm that it is accurate (e.g., using QEMU's info cpus), and then declare victory!
[Or you could start working to further configure the new page directory so that you can run a user level program within its own protected address space. (And why stop at just one running user program?) I'm not going to try to stop you, but I should probably tell you that this is exactly what we'll be doing in next week's lab, so you can expect some more detailed instructions for that soon ... ]