OPERATING SYSTEMS I: Junfeng Yang

Homework 1 is due Wed 2/1 at 12:01 AM ET.

Submit your solutions electronically via Courseworks. Refer to the homework submission page on the class web site for additional submission instructions.

Written Assignment (40 pts)

Exercise numbers X.Y refer to the exercise problem Y in Chapter X in the course textbook, Modern Operating Systems. Each problem is worth 5 points. Make your answers concise. You'll lose points for verbosity.

1.3
1.7
1.13
1.18
1.25
The issue of resource utilization shows up in different forms in different types of operating systems. List what resources must be managed carefully in the following settings: a. Mainframe or minicomputer systems. b. Workstations connected to servers. c. Handheld computers
Write a C function that prints the return addresses of all functions currently on the stack.
Read the xv6 source code and construct a state diagram to represent the relevant states that a process may be in at any given time. For each state, give the exact name used for the state in the source code.

Programming Assignment: Booting a PC (60 pts)

This programming assignment is split into three parts. Part A concentrates on getting familiarized with x86 assembly language, the QEMU x86 emulator, and the PC's power-on bootstrap procedure. Part B examines the boot loader (bootasm.S and bootmain.c) for the xv6 kernel. Finally, Part C delves into the xv6 kernel itself.

One purpose of this assignment is to test whether you meet the prerequisites of this course and to warm you up for future kernel programming assignments.

You do not need to turn in any answers except code for the exercises in this assignment, but you must go through them for your own understanding and preparation for the exams.

Software Setup

The source files needed for this assignment are distributed using the Git distributed version control software. To learn about Git, take a look at the Git user's manual, or, if you are already familiar with other version control systems, you may find this Git overview useful.

Follow these instructions to install QEMU.

To get the xv6 source code for this assignment, run the following command

$ git clone http://debug.cs.columbia.edu/xv6.git xv6 -b hw1

This command clones the course Git repository onto your local machine (the "clone" command), and switch to the "hw1" branch (the "checkout" command). Then you can modify the source files, and Git automatically tracks your changes. If you add a new file, you should tell Git to track this file by running

$ git add <new file>

To generate an intermediate checkpoint of the source files, you can commit your changes to your local repository by running

$ git commit -am 'whatever comments you have for this commit'
[hw2 9bad18e] whatever comments you have for this commit
 1 files changed, 1 insertions(+), 1 deletions(-)

You can also run git diff HEAD to see what you have changed since your last commit. To see what you've changed since you cloned the repository, run git diff origin/hw1.

Part A: PC Bootstrap

Getting Started with x86 assembly

If you are not already familiar with x86 assembly language, you will quickly become familiar with it during this course! The PC Assembly Language Book is an excellent place to start. Hopefully, the book contains mixture of new and old material for you.

Warning: Unfortunately the examples in the book are written for the NASM assembler, whereas we will be using the GNU assembler. NASM uses the so-called Intel syntax while GNU uses the AT&T syntax. While semantically equivalent, an assembly file will differ quite a lot, at least superficially, depending on which syntax is used. Luckily the conversion between the two is pretty simple, and is covered in Brennan's Guide to Inline Assembly.

Exercise 1. Familiarize yourself with the assembly language materials available on the course resources page. You don't have to read them now, but you'll almost certainly want to refer to some of this material when reading and writing x86 assembly. We do recommend reading the section "The Syntax" in this book. It gives a good (and quite brief) description of the AT&T assembly syntax we'll be using with the GNU assembler in xv6.

Certainly the definitive reference for x86 assembly language programming is Intel's instruction set architecture reference, which you can find on in two flavors: an HTML edition of 80386 Programmer's Reference Manual from MIT folks; and the full, latest and greatest IA-32 Intel Architecture Software Developer's Manuals from Intel, covering all the features of the most recent processors that we won't need in class but you may be interested in learning about.

You should skim the recommended chapters of the PC Assembly book, and "The Syntax" section in Brennan's Guide now. Save the Intel/AMD architecture manuals for later or use them for reference when you want to look up the definitive explanation of a particular processor feature or instruction.

Simulating the x86

Instead of developing the operating system on a real, physical personal computer (PC), we use QEMU Emulator which faithfully emulates a complete PC: the code you write for QEMU will boot on a real PC too. Using an emulator instead of a real PC simplifies debugging; you can, for example, set break points inside of the emulated x86, which is difficult to do with the silicon version of an x86.

While QEMU's built-in monitor provides only limited debugging support, QEMU can act as a remote debugging target for the GNU debugger (GDB), which we'll use in this assignment to step through the early boot process.

To get started, clone our xv6 repository as described above in "Setup", then type make in the xv6 directory to build the boot loader and kernel you will start with.

$ cd xv6
$ make
...
dd if=kernel of=xv6.img seek=1 conv=notrunc
281+1 records in
281+1 records out
144356 bytes transferred in 0.061452 secs (2349080 bytes/sec)
rm wc.o grep.o mkdir.o rm.o ln.o stressfs.o kill.o echo.o init.o usertests.o
zombie.o cat.o sh.o ls.o

Now you're ready to run QEMU, supplying the file xv6.img and fs.img created above. xv6.img serves as the emulated PC's "virtual hard disk", which contains both our boot loader (bootmain) and our kernel (kernel). The second virtual disk fs.img contains a file system you will later examine after the system boots up.

To run xv6 under QEMU, type make qemu:

$ make qemu
qemu -serial mon:stdio -hdb fs.img xv6.img -smp 2
xv6...

cpu0: starting xv6

lapicinit: 1 0xfee00000
cpu1: starting
init: starting sh
cpu0: starting
Spurious IDE interrupt.

A separate window should appear containing the display of the virtual machine. After a few seconds, QEMU's virtual BIOS will load xv6's boot loader from a virtual hard drive image contained in the file xv6.img, and the boot loader will in turn load and run the xv6 kernel.

After everything is loaded, you should get a '$' prompt in the xv6 display window and be able to enter commands into the rudimentary but functional xv6 shell. For example, try:

$ ls
.              1 1 512
..             1 1 512
README         2 2 1926
cat            2 3 12491
...
$ echo Hello
Hello
$ cat README
xv6 is a re-implementation of Dennis Ritchie's and Ken Thompson's Unix
Version 6 (v6).  xv6 loosely follows the structure and style of v6,
but is implemented for a modern x86-based multiprocessor using ANSI C.
...

Now close this QEMU session, destroying the state of the xv6 virtual machine. You can do so either by closing the QEMU window or by pressing CTRL-C in the terminal where you typed make qemu.

The PC's Physical Address Space

We will now dive into a bit more detail about how a PC starts up. A PC's physical address space is hard-wired to have the following general layout:

 
+------------------+  <- 0xFFFFFFFF (4GB)
|      32-bit      |
|  memory mapped   |
|     devices      |
|                  |
/\/\/\/\/\/\/\/\/\/\

/\/\/\/\/\/\/\/\/\/\
|                  |
|      Unused      |
|                  |
+------------------+  <- depends on amount of RAM
|                  |
|                  |
| Extended Memory  |
|                  |
|                  |
+------------------+  <- 0x00100000 (1MB)
|     BIOS ROM     |
+------------------+  <- 0x000F0000 (960KB)
|  16-bit devices, |
|  expansion ROMs  |
+------------------+  <- 0x000C0000 (768KB)
|   VGA Display    |
+------------------+  <- 0x000A0000 (640KB)
|                  |
|    Low Memory    |
|                  |
+------------------+  <- 0x00000000

The first PCs, which were based on the 16-bit Intel 8088 processor, were only capable of addressing 1MB of physical memory. The physical address space of an early PC would therefore start at 0x00000000 but end at 0x000FFFFF instead of 0xFFFFFFFF. The 640KB area marked "Low Memory" was the only random-access memory (RAM) that an early PC could use; in fact the very earliest PCs only could be configured with 16KB, 32KB, or 64KB of RAM!

The 384KB area from 0x000A0000 through 0x000FFFFF was reserved by the hardware for special uses such as video display buffers and firmware held in non-volatile memory. The most important part of this reserved area is the Basic Input/Output System (BIOS), which occupies the 64KB region from 0x000F0000 through 0x000FFFFF. In early PCs the BIOS was held in true read-only memory (ROM), but current PCs store the BIOS in updateable flash memory. The BIOS is responsible for performing basic system initialization such as activating the video card and checking the amount of memory installed. After performing this initialization, the BIOS loads the operating system from some appropriate location such as floppy disk, hard disk, CD-ROM, or the network, and passes control of the machine to the operating system.

When Intel finally "broke the one megabyte barrier" with the 80286 and 80386 processors, which supported 16MB and 4GB physical address spaces respectively, the PC architects nevertheless preserved the original layout for the low 1MB of physical address space in order to ensure backward compatibility with existing software. Modern PCs therefore have a "hole" in physical memory from 0x000A0000 to 0x00100000, dividing RAM into "low" or "conventional memory" (the first 640KB) and "extended memory" (everything else). In addition, some space at the very top of the PC's 32-bit physical address space, above all physical RAM, is now commonly reserved by the BIOS for use by 32-bit PCI devices.

Recent x86 processors can support more than 4GB of physical RAM, so RAM can extend further above 0xFFFFFFFF. In this case the BIOS must arrange to leave a second hole in the system's RAM at the top of the 32-bit addressable region, to leave room for these 32-bit devices to be mapped. For now we will pretend that all PCs have "only" a 32-bit physical address space. But dealing with complicated physical address spaces and other aspects of hardware organization that evolved over many years is one of the important practical challenges of OS development.

The ROM BIOS

In this portion of the lab, you'll use QEMU's debugging facilities to investigate how an IA-32 compatible computer boots.

Open two terminal windows. In one, enter make qemu-gdb (or make qemu-nox-gdb). This starts up QEMU, but QEMU stops just before the processor executes the first instruction and waits for a debugging connection from GDB. In the second terminal, from the same directory you ran make, run gdb. You should see something like this:

$ gdb
...
+ target remote localhost:25501
The target architecture is assumed to be i8086
[f000:fff0]    0xffff0: ljmp   $0xf000,$0xe05b
0x0000fff0 in ?? ()
+ symbol-file kernel
(gdb)

The following line:

 
[f000:fff0] 0xffff0:  ljmp   $0xf000,$0xe05b

is GDB's disassembly of the first instruction to be executed. From this output you can conclude a few things:

The IBM PC starts executing at physical address 0x000ffff0, which is at the very top of the 64KB area reserved for the ROM BIOS.
The PC starts executing with CS = 0xf000 and IP = 0xfff0.
The first instruction to be executed is a jmp instruction, which jumps to the segmented address CS = 0xf000 and IP = 0xe05b.

Why does QEMU start like this? This is how Intel designed the 8088 processor, which IBM used in their original PC. Because the BIOS in a PC is "hard-wired" to the physical address range 0x000f0000-0x000fffff, this design ensures that the BIOS always gets control of the machine first after power-up or any system restart - which is crucial because on power-up there is no other software anywhere in the machine's RAM that the processor could execute. The QEMU emulator comes with its own BIOS, which it places at this location in the processor's simulated physical address space. On processor reset, the (simulated) processor enters real mode and sets CS to 0xf000 and the IP to 0xfff0, so that execution begins at that (CS:IP) segment address. How does the segmented address 0xf000:fff0 turn into a physical address?

To answer that we need to know a bit about real mode addressing. In real mode (the mode that PC starts off in), address translation works according to the formula: physical address = 16 * segment + offset. So, when the PC sets CS to 0xf000 and IP to 0xfff0, the physical address referenced is:

 
   16 * 0xf000 + 0xfff0   # in hex multiplication by 16 is
   = 0xf0000 + 0xfff0     # easy--just append a 0.
   = 0xffff0

0xffff0 is 16 bytes before the end of the BIOS (0x100000). Therefore we shouldn't be surprised that the first thing that the BIOS does is jmp backwards to an earlier location in the BIOS; after all how much could it accomplish in just 16 bytes?

When the BIOS runs, it sets up an interrupt descriptor table and initializes various devices such as the VGA display. This is where the "Starting SeaBIOS" message you see in the QEMU window comes from.

Exercise 2. Use GDB's si (Step Instruction) command to trace into the ROM BIOS for a few more instructions, and try to guess what it might be doing. You might want to look at Phil Storrs I/O Ports Description. No need to figure out all the details - just the general idea of what the BIOS is doing first.

After initializing the PCI bus and all the important devices the BIOS knows about, it searches for a bootable device such as a floppy, hard drive, or CD-ROM. Eventually, when it finds a bootable disk, the BIOS reads the boot loader from the disk and transfers control to it.

Part B: The Boot Loader

Floppy and hard disks for PCs are divided into 512 byte regions called sectors. A sector is the disk's minimum transfer granularity: each read or write operation must be one or more sectors in size and aligned on a sector boundary. If the disk is bootable, the first sector is called the boot sector, since this is where the boot loader code resides. When the BIOS finds a bootable floppy or hard disk, it loads the 512-byte boot sector into memory at physical addresses 0x7c00 through 0x7dff, and then uses a jmp instruction to set the CS:IP to 0000:7c00, passing control to the boot loader. Like the BIOS load address, these addresses are fairly arbitrary - but they are fixed and standardized for PCs.

The ability to boot from a CD-ROM came much later during the evolution of the PC, and as a result the PC architects took the opportunity to rethink the boot process slightly. As a result, the way a modern BIOS boots from a CD-ROM is a bit more complicated (and more powerful). CD-ROMs use a sector size of 2048 bytes instead of 512, and the BIOS can load a much larger boot image from the disk into memory (not just one sector) before transferring control to it. For more information, see the "El Torito" Bootable CD-ROM Format Specification.

Xv6 uses the conventional hard drive boot mechanism, which means that its boot loader must fit into a measly 512 bytes. The boot loader consists of one assembly language source file bootasm.S, and one C source file, bootmain.c. Look through these source files carefully and make sure you understand what's going on. The boot loader must perform two main functions:

The boot loader switches the processor from real mode to 32-bit protected mode, because it is only in this mode that software can access all the memory above 1MB in the processor's physical address space. Protected mode is described briefly in the Bootstrap section in the xv6 book, and in great detail in the Intel architecture manuals. At this point you only have to understand that translation of segmented addresses (segment:offset pairs) into physical addresses happens differently in protected mode, and that after the transition offsets are 32 bits instead of 16.
The boot loader reads the kernel from the hard disk by directly accessing the IDE disk device registers via the x86's special I/O instructions. You will not need to learn much about programming specific devices in this class: writing device drivers is in practice a very important part of OS development, but from a conceptual or architectural viewpoint it is also one of the least interesting.

After you understand the boot loader source code, look at the file bootblock.asm. This file is a disassembly of the boot loader that the Makefile creates after compiling the boot loader. This disassembly file makes it easy to see exactly where in physical memory all of the boot loader's code resides, and makes it easier to track what's happening while stepping through the boot loader in GDB. Likewise, kernel.asm contains a disassembly of the xv6 kernel, which can often be useful for debugging.

You can set address breakpoints in GDB with the b command. For example, b *0x7c00 sets a breakpoint at address 0x7C00; b _start sets a breakpoint at function _start, the entry of the xv6 kernel. Once at a breakpoint, you can continue execution using the c and si commands: c causes QEMU to continue execution until the next breakpoint (or until you press Ctrl-C in GDB), and si N steps through the instructions N at a time.

To examine instructions in memory (besides the immediate next one to be executed, which GDB prints automatically), you use the x/i command. This command has the syntax x/Ni ADDR, where N is the number of consecutive instructions to disassemble and ADDR is the memory address at which to start disassembling.

Exercise 3.Take a look at the lab tools guide, especially the section on GDB commands. Even if you're familiar with GDB, this includes some esoteric GDB commands that are useful for OS work.

Set a breakpoint at address 0x7c00, which is where the boot sector will be loaded. Continue execution until that breakpoint. Trace through the code in bootasm.S, using the source code and the disassembly file bootblock.asm to keep track of where you are. Also use the x/i command in GDB to disassemble sequences of instructions in the boot loader, and compare the original boot loader source code with both the disassembly in bootblock.asm and GDB.

Trace into bootmain() in bootmain.c, and then into readsect(). Identify the exact assembly instructions that correspond to each of the statements in readsect(). Trace through the rest of readsect() and back out into bootmain(), and identify the begin and end of the for loop that reads the remaining sectors of the kernel from the disk. Find out what code will run when the loop is finished, set a breakpoint there, and continue to that breakpoint. Then step through the remainder of the boot loader.

Be able to answer the following questions:

At what point does the processor start executing 32-bit code? What exactly causes the switch from 16- to 32-bit mode?
What is the last instruction of the boot loader executed, and what is the first instruction of the kernel it just loaded?
Where is the first instruction of the kernel?
How does the boot loader decide how many sectors it must read in order to fetch the entire kernel from disk? Where does it find this information?

Loading the Kernel

We will now look in further detail at the C language portion of the boot loader, in bootmain.c. But before doing so, this is a good time to stop and review some of the basics of C programming.

Exercise 4. Read about programming with pointers in C. The best reference for the C language is The C Programming Language by Brian Kernighan and Dennis Ritchie (known as 'K&R'). We recommend that students purchase this book (here is an Amazon Link).

Read 5.1 (Pointers and Addresses) through 5.5 (Character Pointers and Functions) in K&R. Then download the code for pointers.c, run it, and make sure you understand where all of the printed values come from. In particular, make sure you understand where the pointer addresses in lines 1 and 6 come from, how all the values in lines 2 through 4 get there, and why the values printed in line 5 are seemingly corrupted.

There are other references on pointers in C, though not as strongly recommended. A tutorial by Ted Jensen that cites K&R heavily is available in the course readings.

Warning: Unless you are already thoroughly versed in C, do not skip or even skim this reading exercise. If you do not really understand pointers in C, you will suffer untold pain and misery in subsequent labs, and then eventually come to understand them the hard way. Trust us; you don't want to find out what "the hard way" is.

To make sense out of bootmain.c you'll need to know what an ELF binary is. When you compile and link a C program such as the xv6 kernel, the compiler transforms each C source ('.c') file into an object ('.o') file containing assembly language instructions encoded in the binary format expected by the hardware. The linker then combines all of the compiled object files into a single binary image such as kernel, which in this case is a binary in the ELF format, which stands for "Executable and Linkable Format".

Full information about this format is available in the ELF specification, but you will not need to delve very deeply into the details of this format in this class. Although as a whole the format is quite powerful and complex, most of the complex parts are for supporting dynamic loading of shared libraries, which we will not do in this class.

For the purposes of this course, you can consider an ELF executable to be a header with loading information, followed by several program sections, each of which is a contiguous chunk of code or data intended to be loaded into memory at a specified address. The boot loader does not modify the code or data; it loads it into memory and starts executing it.

An ELF binary starts with a fixed-length ELF header, followed by a variable-length program header listing each of the program sections to be loaded. The C definitions for these ELF headers are in elf.h. The program sections we're interested in are:

.text: The program's executable instructions.
.rodata: Read-only data, such as ASCII string constants produced by the C compiler. (We will not bother setting up the hardware to prohibit writing, however.)
.data: The data section holds the program's initialized data, such as global variables declared with initializers like int x = 5;.

When the linker computes the memory layout of a program, it reserves space for uninitialized global variables, such as int x;, in a section called .bss that immediately follows .data in memory. C requires that "uninitialized" global variables start with a value of zero. Thus there is no need to store contents for .bss in the ELF binary; instead, the linker records just the address and size of the .bss section. The loader or the program itself must arrange to zero the .bss section.

You can display a full list of the names, sizes, and link addresses of all the sections in the kernel executable by typing:

$ objdump -h kernel

You will see many more sections than the ones we listed above, but the others are not important for our purposes. Most of the others are to hold debugging information, which is typically included in the program's executable file but not loaded into memory by the program loader.

Take particular note of the "VMA" (or link address) and the "LMA" (or load address) of the .text section. The load address of a binary is the memory address at which a binary is actually loaded. For example, the BIOS is loaded by the PC hardware at address 0xf0000. So this is the BIOS's load address. Similarly, the BIOS loads the boot sector at address 0x7c00. So this is the boot sector's load address. In the ELF object, this is stored in the ph->p_pa field (in this case, it really is a physical address, though the ELF specification is vague on the actual meaning of this field).

The link address of a binary is the memory address for which the binary is linked. Linking a binary for a given link address prepares it to be loaded at that address. The linker encodes the link address in the binary in various ways, for example when the code needs the address of a global variable, with the result that a binary usually won't work if it is executing from an address that it is not linked for. (It is possible to generate position-independent code that does not contain any such absolute addresses. This is used extensively by modern shared libraries, but it has performance and complexity costs, so we won't be using it in this course.)

In one sentence: the link address is the location where a binary assumes it is going to be loaded, while the load address is the location where a binary is loaded. It's up to us to make sure that they turn out to be the same.

Typically, the link and load addresses are the same. For example, look at the .text section of the boot loader:

$ objdump -h bootblock.o

The BIOS loads the boot sector into memory starting at address 0x7c00, so this is the boot sector's load address. This is also where the boot sector executes from, so this is also its link address. We set the link address by passing -Ttext 0x7C00 to the linker in Makefile, so the linker will produce the correct memory addresses in the generated code.

Exercise 5. Trace through the first few instructions of the boot loader again and identify the first instruction that would "break" or otherwise do the wrong thing if you were to get the boot loader's link address wrong. Then change the link address in Makefile to something wrong, run make clean, recompile the lab with make, and trace into the boot loader again to see what happens. Don't forget to change the link address back and make clean again afterward!

Look back at the load and link addresses for the kernel. Unlike the boot loader, these two addresses aren't the same: the kernel is telling the boot loader to load it into memory at a low address (0x100000, which is 1 megabyte), but it expects to execute from a high address. We'll dig in to how we make this work in the next section.

Besides the section information, there is one more field in the ELF header that is important to us, named e_entry. This field holds the link address of the entry point in the program: the memory address in the program's text section at which the program should begin executing. You can see the entry point:

 
$ objdump -f kernel

You should now be able to understand the minimal ELF loader in bootmain.c. It reads each section of the kernel from disk into memory at the section's load address and then jumps to the kernel's entry point.

Exercise 6. We can examine memory using GDB's x command. The GDB manual has full details. For now, it is enough to know that the recipe x/Nx ADDR prints N words of memory at ADDR. (Note that both 'x's in the command are lowercase.)

Warning: The size of a word is not a universal standard. In GNU assembly, a word is two bytes (the 'w' in xorw, which stands for word, means 2 bytes).

Reset the machine (exit QEMU/GDB and start them again). Examine the 8 words of memory at 0x00100000 at the point the BIOS enters the boot loader, and then again at the point the boot loader enters the kernel. Why are they different? What is there at the second breakpoint? (You do not really need to use QEMU to answer this question. Just think.)

Part C: The Kernel

When you inspected the boot loader's link and load addresses above, they matched perfectly, but there was a (rather large) disparity between the kernel's link address (as printed by objdump) and its load address. Go back and check both and make sure you can see what we're talking about. (Linking the kernel is more complicated than the boot loader, so the link and load addresses are at the top of kernel.ld.)

Operating system kernels often like to be linked and run at very high virtual address, such as 0x80100000, in order to leave the lower part of the processor's virtual address space for user programs to use. The reason for this arrangement will become clearer when we talk about processes and address spaces.

Many machines don't have any physical memory at address 0x80100000, so we can't count on being able to store the kernel there. Instead, we will use the processor's memory management hardware to map virtual address 0x80100000 (the link address at which the kernel code expects to run) to physical address 0x00100000 (where the boot loader loaded the kernel into physical memory). This way, although the kernel's virtual address is high enough to leave plenty of address space for user processes, it will be loaded in physical memory at the 1MB point in the PC's RAM, just above the BIOS ROM. This approach requires that the PC have at least a few megabytes of physical memory (so that physical address 0x00100000 works), but this is likely to be true of any PC built after about 1990.

For now, we'll just map the first 4MB of physical memory, which will be enough to get us up and running. We do this using the hand-written, statically-initialized page directory and page table entrypgdir in main.c. For now, you don't have to understand the details of how this works, just the effect that it accomplishes. Up until entry.S sets the CR0_PG flag, memory references are treated as physical addresses (strictly speaking, they're linear addresses, but bootasm.S set up an identity mapping from linear addresses to physical addresses and we're never going to change that). Once CR0_PG is set, memory references are virtual addresses that get translated by the virtual memory hardware to physical addresses. entry_pgdir translates virtual addresses in the range 0x80000000 through 0x80400000 to physical addresses 0x00000000 through 0x00400000, as well as virtual addresses 0x00000000 through 0x00400000 to physical addresses 0x00000000 through 0x00400000. Any virtual address that is not in one of these two ranges will cause a hardware exception which, since we haven't set up interrupt handling yet, will cause QEMU to dump the machine state and exit (or endlessly reboot if you aren't using the patched version of QEMU).

Exercise 7. Use QEMU and GDB to trace into the xv6 kernel and stop at the movl %eax, %cr0 in entry.S. Examine memory at 0x00100000 and at 0x80100000. Now, single step over that instruction using the stepi GDB command. Again, examine memory at 0x00100000 and at 0x80100000. Make sure you understand what just happened.

What is the first instruction after the new mapping is established that would fail to work properly if the mapping weren't in place? Comment out the movl %eax, %cr0 in entry.S, trace into it, and see if you were right.

Formatted Printing to the Console

Most people take functions like printf() for granted, sometimes even thinking of them as "primitives" of the C language. But in an OS kernel, we have to implement all I/O ourselves.

Exercise 8. Read through console.c. The xv6 cprintf implementation does not handle printing of octal numbers using patterns of the form "%o". Implement this feature.

Be able to answer the following questions:

Explain the code of function cgaputc() in console.c.
For the following questions you might wish to consult the notes for Lecture 3. These notes cover GCC's calling convention on the x86.
Trace the execution of the following code step-by-step:
```
int x = 1, y = 3, z = 4;
cprintf("x %d, y %x, z %d\n", x, y, z);
```
- In the call to cprintf(), to what does fmt point? To what does argp point?
- List (in order of execution) each call to cons_putc and its argument.
Run the following code.
```
    unsigned int i = 0x00646c72;
    cprintf("H%x Wo%s", 57616, &i);
```
What is the output? Explain how this output is arrived at in the step-by-step manner of the previous exercise. Here's an ASCII table that maps bytes to characters.
The output depends on that fact that the x86 is little-endian. If the x86 were instead big-endian what would you set i to in order to yield the same output? Would you need to change 57616 to a different value?

Here's a description of little- and big-endian and a more whimsical description.
In the following code, what is going to be printed after 'y='? (note: the answer is not a specific value.) Why does this happen?
```
    
cprintf("x=%d y=%d", 3);
```
Let's say that GCC changed its calling convention so that it pushed arguments on the stack in declaration order, so that the last argument is pushed last. How would you have to change cprintf or its interface so that it would still be possible to pass it a variable number of arguments?

The Stack

In the final exercise of this lab, we will explore in more detail the way the C language uses the stack on the x86, and in the process write a useful that prints a backtrace of the stack: a list of the saved Instruction Pointer (IP) values from the nested call instructions that led to the current point of execution.

Exercise 9. Determine where the kernel initializes its stack, and exactly where in memory its stack is located. How does the kernel reserve space for its stack? And at which "end" of this reserved area is the stack pointer initialized to point to?

The x86 stack pointer (esp register) points to the lowest location on the stack that is currently in use. Everything below that location in the region reserved for the stack is free. Pushing a value onto the stack involves decreasing the stack pointer and then writing the value to the place the stack pointer points to. Popping a value from the stack involves reading the value the stack pointer points to and then increasing the stack pointer. In 32-bit mode, the stack can only hold 32-bit values, and esp is always divisible by four. Various x86 instructions, such as call, are "hard-wired" to use the stack pointer register.

The ebp (base pointer) register, in contrast, is associated with the stack primarily by software convention. On entry to a C function, the function's prologue code normally saves the previous function's base pointer by pushing it onto the stack, and then copies the current esp value into ebp for the duration of the function. If all the functions in a program obey this convention, then at any given point during the program's execution, it is possible to trace back through the stack by following the chain of saved ebp pointers and determining exactly what nested sequence of function calls caused this particular point in the program to be reached. This capability can be particularly useful, for example, when a particular function causes an assert failure or panic because bad arguments were passed to it, but you aren't sure who passed the bad arguments. A stack backtrace lets you find the offending function.

Exercise 10. To become familiar with the C calling conventions on the x86, find the address of the test_backtrace function in kernel.asm, set a breakpoint there, and examine what happens each time it gets called after the kernel starts. How many 32-bit words does each recursive nesting level of test_backtrace push on the stack, and what are those words?

The above exercise should give you the information you need to implement a stack backtrace function, which you should call backtrace(). A prototype for this function is already waiting for you in backtrace.c. You can do it entirely in C, but you may find the read_ebp() function in x86.h useful.

The backtrace function should display a listing of function call frames in the following format:

Stack backtrace:
  ebp 8010c598 eip 801002b8 args 8010c5b8 80104db7 8010c5c8 460 7bf8
  ebp 8010c5f8 eip 801002c3 args 0 1 8010c624 80102c36 e000000
  ...

The first line printed reflects the currently executing function, namely backtrace() itself, the second line reflects the function that called backtrace(), the third line reflects the function that called that one, and so on. You should print all the outstanding stack frames. By studying entry.S you'll find that there is an easy way to tell when to stop.

Within each line, the ebp value indicates the base pointer into the stack used by that function: i.e., the position of the stack pointer just after the function was entered and the function prologue code set up the base pointer. The listed eip value is the function's return instruction pointer: the instruction address to which control will return when the function returns. The return instruction pointer typically points to the instruction after the call instruction (why?). Finally, the five hex values listed after args are the first five arguments to the function in question, which would have been pushed on the stack just before the function was called. If the function was called with fewer than five arguments, of course, then not all five of these values will be useful. (Why can't the backtrace code detect how many arguments there actually are? How could this limitation be fixed?)

Here are a few specific points you read about in K&R Chapter 5 that are worth remembering for the following exercise and for future labs.

If int *p = (int*)100, then (int)p + 1 and (int)(p + 1) are different numbers: the first is 101 but the second is 104. When adding an integer to a pointer, as in the second case, the integer is implicitly multiplied by the size of the object the pointer points to.
p[i] is defined to be the same as *(p+i), referring to the i'th object in the memory pointed to by p. The above rule for addition helps this definition work when the objects are larger than one byte.
&p[i] is the same as (p+i), yielding the address of the i'th object in the memory pointed to by p.

Although most C programs never need to cast between pointers and integers, operating systems frequently do. Whenever you see an addition involving a memory address, ask yourself whether it is an integer addition or pointer addition and make sure the value being added is appropriately multiplied or not.

Exercise 11. Implement the backtrace function as specified above. Use the same format as in the example, since otherwise the grading script will be confused. When you think you have it working right, run make grade to see if its output conforms to what our grading script expects, and fix it if it doesn't.

At this point, your backtrace function should give you the addresses of the function callers on the stack that lead to backtrace() being executed. However, in practice you often want to know the function names corresponding to those addresses. For instance, you may want to know which functions could contain a bug that's causing your kernel to crash. You can use the addr2line utility to translate an address into a file name and line number.

If you enable the -O2 flag in your Makefile, you may find that the some functions are missing from the backtrace. This is because the compiler inlines some function calls. Other optimizations may cause you to see unexpected line numbers. If you get rid of the -O2 flag, the backtraces may make more sense (but your kernel will run more slowly).

Submission instructions

When you are ready to hand in your solution, put your UNI in conf.mk, commit all your changes, and run make handin in the xv6 directory. This will generate hw1-<your uni>-code.tar.gz for you, which includes a snapshot of all xv6 source code and a patch generated using git diff origin/hw1 hw1. This will be the programming part of your Courseworks submission. Be sure to commit all your changes (including the new files you add) before running make handin!

We will be grading your solutions with a grading program. You can run make grade to test your solutions with the grading program.

W4118 OPERATING SYSTEMS I

Spring 2012 -- Junfeng Yang