Homework 2 is due Wednesday 2/15 at 12:01 AM ET.
The written problems are to be done individually. The programming problems are also to be done individually. All homework submissions are to be made via Courseworks. Refer to the homework policy section on the class web site for further details.
Written Assignment (40 pts)
Exercise numbers refer to the course textbook, Modern Operating Systems. Each problem is worth 5 points. Make your answers concise. You'll lose points for verbosity.
When an interrupt or a system call transfers control to the operating system, a kernel stack area separate from the stack of the interrupted process is generally used. Why? If a multithreaded process forks, a problem occurs if the child gets copies of all the parent's threads. Suppose that one of the original threads was waiting for keyboard input. Now two threads are waiting for keyboard input, one in each process. Does this problem ever occur in single-threaded processes? When a computer is being developed, it is usually first simulated by a program that runs one instruction at a time. Even multiprocessors are simulated strictly sequentially like this. Is it possible for a race condition to occur when there are no simultaneous events like this? Consider the following C program: Int X[N}; Int step = M; //M is some predefined constant For (int i=0; i<N; i+=step) X[i] = X[i] + 1;(a) If this program is run on a machine with a 4-KB page size and 64-entry TLB, what values of M and N will cause a TLB miss for every execution of the inner loop? (b) Would your answer in part (a) be different if the loop were repeated many times? Explain. Suppose that a machine has 48-bit virtual addresses and 32-bit physical addresses. |
Programming Assignment (60 pts)
This programming assignment has five parts. The purpose of part A-D is to introduce the concept of x86 interrupts and interrupt descriptor table, figure out x86 protection, and understand xv6's system call mechanism. In the last part, part E, you'll write a system call recorder.
You will not have to write any code or turn in any answers for part A-D of the lab, but you should go through them anyway for your own understanding and be prepared to answer the questions posed below.
Getting Started
The files needed for this assignment are distributed using the Git distributed version control software. To learn about Git, take a look at the Git user's manual, or, if you are already familiar with other version control systems, you may find this Git overview useful.
To get the xv6 source code for this assignment, run the following command
$ git clone http://debug.cs.columbia.edu/xv6.git xv6 -b hw2
This command clones the course Git repository onto your local machine (the "clone" command), and switch to the "hw2" branch (the "checkout" command). Then you can modify the source files, and Git automatically tracks your changes. If you add a new file, you should tell Git to track this file by running
$ git add <new file>
To generate an intermediate checkpoint of the source files, you can commit your changes to your local repository by running
$ git commit -am 'whatever comments you have for this commit' [hw2 9bad18e] whatever comments you have for this commit 1 files changed, 1 insertions(+), 1 deletions(-)
You can also run git diff HEAD to see what you have changed since your last commit. To see what you've changed since you cloned the repository, run git diff origin/hw2.
Submission instructions
When you are ready to hand in your solution, put your UNI in conf.mk, commit all your changes, and run make handin in the xv6 directory. This will generate hw2-programming-<your uni>.tar.gz for you, which you can then upload via courseworks. This tar ball includes a snapshot of all xv6 source code and a patch generated using git diff origin/hw2 hw2. Thus, be sure you commit all your changes (including the new files you add) before running make handin.
We will be grading your solutions with a grading program. You can run make grade to test your solutions with the grading program.
Overview
When a user process issues a system call to request OS service or a device generates an interrupt, the hardware must switch from user-mode to kernel mode so that the kernel can handle this system call or interrupt. On the x86, this switch is done by a hardware mechanism called "interrupt" or "trap". An interrupt pauses the execution of the current user process, saves its context, switches from the user mode to the kernel mode, and executes a piece of code in the kernel called an interrupt handler.
Part A: Interrupts
Because system call is implemented with interrupt mechanism, the first thing to do is to be familiar with interrupts. x86 supports interrupts up to 256, and each interrupt is indentified by an interrupt number from 0 to 255. The first 32 interrupts are reserved by Intel. An interrupt pauses the execution of the current user process, saves its context, switches from the user mode to the kernel mode, and executes a piece of code in the kernel called an interrupt handler. In x86, interrupt handlers are defined with a special format called gate descriptor.
The above figure describes the structure of gate descriptors. x86 has three different types of gate descriptor. You only need to pay attention to the interrupt gate descriptor here. Gate descriptor is a 2-words (64 bits) data structure that includes a code segment selector, an offset of the handler, presence flag (P), descriptor privilege level (DPL), and other attributes. When written in C, the gate descriptor can be represented as a structure. In xv6, gate descriptors are defined as struct gatedesc.
Part B: Interrupt Descriptor Table
Each interrupt gate descriptor is stored in an array called the interrupt descriptor table. The interrupt descriptor table can be placed anywhere in the main memory, and the address of the table should be set into the interrupt descriptor table register (IDTR) with the x86 instruction LIDT. xv6 puts the interrupt descriptor table into struct gatedesc idt[] in trap.c, and initializes it on start up. After this, xv6 calls lidt() in x86.h to set the address of idt to IDTR.
Part C: x86 Protection
From the previous exercise, you should already know that xv6 initializes system call interrupt differently from other interrupts. For the system call interrupt, xv6 enables trap bit (40th bit in Figure 9-3) of the gate descriptor. The trap bit tells CPU whether the desciptor is for a trap or an interrupt. When a gate is called with the trap bit set, IF flag of the flag register is not cleared, so it does not block other interrupts while processing the interrupt handler. xv6 also sets the privilege field (DPL) to DPL_USER, where the value of DPL_USER is 3, and for interrupts, DPL is set to 0. DPL_USER allows the gate to be called from user programs. When a gate descriptor is called, x86 compares DPL of the descriptor with the current privilege level (CPL), and transfers control only when CPL is equal or less than DPL. Otherwise, this operation causes an exception.
Part D: System Call Dispatch
In xv6, each interrupt handler sets up a trap frame in struct trapframe containing the processor registers at the time of the interrupt. It then calls the C function trap defined in trap.c. This function looks at the hardware trap number in the trap frame to decide why it has been called and what needs to be done. If the trap number is T_SYSCALL, trap calls the system call handler syscall.
Each system call has a system call number to distinguish it from others. In xv6, the system call number is stored in register %eax before issuing the system call. Function syscall reads the system call number from the trap frame, looks up the table syscalls and decides which sys_* routine to call.
Xv6 does not copy the arguments of a system call to the kernel stack. Therefore, it fetches these arguments from the user stack using helper functions argint, argptr, and argstr. These helper functions reads the user stack register %esp from the trap frame, and locates the wanted argument.
Part E: Implement a System Call Recorder (60 pts)
Developers often want to trace the system calls a program issues for program understanding and debugging. In this part of the assignment, you will modify the xv6 kernel to record system calls a process issues. For each system call, you should record the system call number, the return value, and the arguments if any. There are three types of system call arguments in xv6:
-
Integer argument: you should record its numeric value.
-
Pointer argument: you should record the value of the pointer instead of the value pointed to by the pointer. That is, given a pointer argument p, record p instead of *p.
-
String argument: record up to 19 characters of the string. We limit the number of characters to record to save memory.
In order to control the recording and retrieve the results, you need to implement the following three system calls:
- int startrecording()
- int stoprecording()
- int fetchrecords(struct record *records, int num_records)
A process is either in the recording mode or the normal mode. Initially, a process is in the normal mode. startrecording puts the calling process into the recording mode, and stoprecording switches it back to normal. These two system calls should return 0 on success, or return -1 if the current process is already in the targeting mode. When a process forks, the child process inherits the mode of the parent.
All the system calls issued by a process, except the above three system calls, should be recorded when this process is in the recording mode. You should store the records in a process's PCB, e.g. a linked list associated with struct proc. These records should be removed when the process exits. When a process forks, the child process starts without any record.
fetchrecords retrieves all the system call records of the current process.
-
If the first argument records is NULL, the system call returns the number of records, and the second argument num_records is simply ignored.
-
If records is not NULL, it points to a pre-allocated array used to store the records. In this case, num_records indicates the size of the array (the number of struct records). fetchrecords retrieves at most num_records records and return the number of records actually retrieved.
For the purpose of grading, please use the struct record defined in record.h to record system calls. This struct is shown below
enum recordtype { SYSCALL_NO, ARG_INTEGER, ARG_POINTER, ARG_STRING, RET_VALUE }; #define MAX_STR_LEN (20) struct record { enum recordtype type; union recordvalue { int intval; void *ptrval; char strval[MAX_STR_LEN]; } value; };
A record may be a system call number, an argument, or a return value. The field type indicates the type of the record, and value stores the corresponding value.
One system call invocation may generate multiple records. They should appear in the following order: system call number, argument 0, argument 1, ... , return value. For example, a system call open("README", 0) = 3 generates four records:
- SYSCALL_NO: SYS_open
- ARG_STRING: "README"
- ARG_INTEGER: 0
- RET_VALUE: 3