OS Organization with Virtualization

Scribes: Peter Chang and Ruolin Fan

Hard Modularity

"Don't trust other modules" because of...

There are two techniques to implement hard modularity:

  1. Client-Service
        // Sample code that implements a factorial call

        // Client code:
        send(fact_port, {"!", 6});    // example: we want to compute 6!
        receive(fact_port, response);
        if (response.opcode == "ok")
            print(response.val);
        else
            print("error %d", response.errorcode);


        // Server code:
        for (;;) {    // loop forever, waiting for requests from the client
            receive(fact_port, request);
            if (request.opcode == "!") {
                n = request.val;
                result = 1;    // accumulate separately so the loop bound n is not modified
                for (int i = 2; i <= n; i++)
                    result *= i;
                response = {"ok", result};
            } else
                response = {"ng", 29};    // error opcode "ng" and error code 29
            send(fact_port, response);
        }
    

    Some pluses and minuses for this kind of technique:

    + Limits error propagation
      • No shared state
      • An infinite loop in the client will not compromise the server, and vice versa.
    - Uses more resources
      • Requires multiple machines (or virtual machines)
      • Interpreting messages (marshalling) slows down the main computation
    - Less secure
      • Messages can be intercepted, or faulty messages can be sent.
      • Example: Kaminsky DNS design flaw
    - Harder to deploy
      • More complex

    Overall: although this technique solves the problem at hand, its resource cost and complexity make it impractical for small tasks such as computing a factorial.
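The pseudocode above can be turned into a compilable sketch. The message layout, the enum names, and the functions fact_service() and fact_client() below are assumptions made for illustration, not part of the lecture. The "transport" here is an ordinary function call standing in for send/receive, so this version shows only the message discipline (no shared state, everything travels in the request and response); real hard modularity would put the two sides in separate machines or processes.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical message format; field names mirror the pseudocode. */
enum opcode { OP_FACT, OP_OK, OP_NG };

struct message {
    enum opcode opcode;
    long long val;      /* argument on requests, result on OP_OK replies */
    int errorcode;      /* meaningful only on OP_NG replies */
};

/* Service side: handles one request and builds the reply.
   It reads nothing but the request and writes nothing but the reply. */
struct message fact_service(struct message request)
{
    struct message response = { OP_NG, 0, 29 };   /* error code 29, as in the notes */

    /* 20! is the largest factorial that fits in a signed 64-bit integer. */
    if (request.opcode == OP_FACT && request.val >= 0 && request.val <= 20) {
        long long result = 1;
        for (long long i = 2; i <= request.val; i++)
            result *= i;
        response.opcode = OP_OK;
        response.val = result;
        response.errorcode = 0;
    }
    return response;
}

/* Client side: builds a request, "sends" it, and inspects the reply.
   Returns the factorial, or -1 if the service replied with an error. */
long long fact_client(long long n)
{
    struct message request = { OP_FACT, n, 0 };
    struct message response = fact_service(request);  /* stands in for send + receive */

    if (response.opcode == OP_OK)
        return response.val;
    fprintf(stderr, "error %d\n", response.errorcode);
    return -1;
}
```

Calling fact_client(6) yields 720; an out-of-range request such as fact_client(21) comes back as an error reply rather than a wrong answer.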

  2. Virtualization:

    To implement OS virtualization, the OS gives a "pretend machine" to the application. This way, the application cannot inadvertently modify sensitive system data directly. The computer in which the application runs is virtualized into components such as virtual memory and a virtualized CPU. Any action by the application that requires modifying system data must go through a "middle man" such as the system kernel. One simple implementation is an x86 emulator: the OS runs the application inside the emulator, which checks every memory reference and I/O instruction.
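To make the emulator idea concrete, here is a minimal sketch of an interpreter for a made-up four-instruction machine (not x86). The instruction set, struct names, and emulate() are all hypothetical; the point is only that every memory reference passes through a check before it touches the pretend memory.

```c
#include <assert.h>
#include <stddef.h>

#define MEM_WORDS 16            /* size of the application's pretend memory */

enum op { OP_LOAD, OP_STORE, OP_ADD, OP_HALT };

struct insn {
    enum op op;
    int operand;                /* address for LOAD/STORE, constant for ADD */
};

struct machine {
    int acc;                    /* single accumulator register */
    int mem[MEM_WORDS];
};

/* Runs the program. Returns 0 on a clean HALT, or -1 if the emulator
   stopped the application (bad address, or it ran off the end). */
int emulate(struct machine *m, const struct insn *prog, size_t len)
{
    for (size_t pc = 0; pc < len; pc++) {
        const struct insn *in = &prog[pc];
        switch (in->op) {
        case OP_LOAD:
        case OP_STORE:
            /* The check made on every memory reference. */
            if (in->operand < 0 || in->operand >= MEM_WORDS)
                return -1;
            if (in->op == OP_LOAD)
                m->acc = m->mem[in->operand];
            else
                m->mem[in->operand] = m->acc;
            break;
        case OP_ADD:
            m->acc += in->operand;
            break;
        case OP_HALT:
            return 0;
        }
    }
    return -1;
}
```

Interpreting each instruction in software like this is also where the slowdown comes from: every emulated instruction costs many real ones.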

    Some pluses and minuses:

    + No direct access to I/O or sensitive devices
    + Can catch infinite loops inside the application
      • The virtual machine can identify application loops and switch control to another application.
    - Slower
      • Traditionally by a factor of approximately 10

    Since virtualization using an emulator is very slow, we try to achieve better performance through hardware-level control structures such as the virtualizable processor.

    [Figure: virtualizable processor]

    There are two ways to call the kernel:

    1. Ordinary function calls
      • Fast and well supported, but unsafe
      • Very popular in embedded applications
    2. Protected transfer of control
      • When an unsafe instruction is executed, the hardware traps, and the kernel takes control (the kernel can run any instructions)

    But what is a kernel? The kernel is the core of the operating system: the part that is allowed to execute any instruction, including the unsafe ones.

Hardware Trap

Possible causes:

The kernel keeps an interrupt vector like the following, made up of 256 words, each of which is a pointer to the kernel code that runs when the corresponding trap occurs. That code may execute privileged instructions.

[Figure: interrupt vector]

The trap executes as follows:

  1. Push the following onto the stack (note: the kernel stack, not the application stack):

    ss          Stack Segment (identifies the stack)
    esp         Extended Stack Pointer
    eflags      Flags register
    cs          Code Segment
    eip         Instruction Pointer (return address)
    error code  More details of the trap

  2. eip = iv[trap#]

  3. A RETI instruction at the end of the kernel's handler pops the saved state off the kernel stack and "returns" to the program that made the syscall.

Figure 3: The standard protection system, or hierarchy of privileges.
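The three trap steps can be mimicked in ordinary C. This is only a simulation under stated assumptions: real hardware performs these steps itself, and the names trap_frame, trap(), install_handlers(), and last_trap are invented for the sketch. The table iv plays the role of the interrupt vector.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_TRAPS 256

/* The state saved on the kernel stack, in push order. */
struct trap_frame {
    uint32_t ss, esp, eflags, cs, eip;
};

typedef void (*trap_handler)(struct trap_frame *);

trap_handler iv[NUM_TRAPS];   /* the interrupt vector: one handler pointer per trap */
int last_trap = -1;           /* hypothetical bookkeeping so the demo is observable */

void syscall_handler(struct trap_frame *tf)
{
    (void)tf;                 /* a real handler would inspect the saved state */
    last_trap = 0x80;
}

void install_handlers(void)
{
    iv[0x80] = syscall_handler;
}

/* Simulated trap. Returns 0 if a handler ran, -1 if none is installed. */
int trap(int trapno, struct trap_frame *user_state)
{
    if (trapno < 0 || trapno >= NUM_TRAPS || iv[trapno] == NULL)
        return -1;

    struct trap_frame saved = *user_state;   /* step 1: push state on the kernel stack */
    iv[trapno](&saved);                      /* step 2: eip = iv[trap#] */
    *user_state = saved;                     /* step 3: RETI restores the saved state */
    return 0;
}
```

After install_handlers(), trap(0x80, &tf) runs the handler and leaves tf as the handler left it, which is how a program resumes where it stopped.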

So how do we do syscalls?

One solution: while(1); or for(;;)

Another solution: *(char*)0 = 'x';

The proper way to do syscalls in x86:

But how are a, b, and c passed?

  %ecx c    // This is the assembly code for "read"
  %ebx b    // ... read.s
  %eax a
  INT 0x80
  Result is returned in %eax
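The register convention in the listing can be sketched as a simulation in C. Everything here is hypothetical: struct regs stands in for the real registers, int_0x80() for the kernel's handler, my_read() for the stub in read.s, and device_data for whatever the kernel would actually read. The register assignment follows the listing above rather than any particular real ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for the three registers; wide enough to hold a pointer. */
struct regs { uintptr_t eax, ebx, ecx; };

static const char device_data[] = "hello";   /* hypothetical bytes behind descriptor 0 */

/* Kernel side of the simulated INT 0x80: pull the arguments out of the
   registers, do the work, and leave the result in eax. */
void int_0x80(struct regs *r)
{
    int fd = (int)r->eax;
    char *buf = (char *)r->ebx;
    size_t n = (size_t)r->ecx;

    if (fd != 0 || buf == NULL) {
        r->eax = (uintptr_t)-1;              /* error result */
        return;
    }
    size_t avail = sizeof device_data - 1;
    if (n > avail)
        n = avail;
    memcpy(buf, device_data, n);
    r->eax = n;                              /* number of bytes read */
}

/* User-side stub: load the registers, trap, read the result back. */
long my_read(int a, char *b, size_t c)
{
    struct regs r;
    r.ecx = c;
    r.ebx = (uintptr_t)b;
    r.eax = (uintptr_t)a;
    int_0x80(&r);
    return (long)(intptr_t)r.eax;
}
```

The caller sees what looks like a normal function, my_read(a, b, c), while the arguments and the result actually cross the boundary only through the registers.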

Overall, syscall is like a function call except:

Components of the machine that may need virtualization

  1. ALU
  2. Registers
  3. Cache
  4. Primary Memory (RAM)
  5. I/O Devices

What can go wrong?

Infinite Application Loops

An application can enter an infinite loop for any number of reasons. To guard against this, the kernel typically arranges for a forced (timer) interrupt every 10 ms or so, at which point it can decide to stop the running application and transfer CPU resources to another one. This prevents an infinite loop in a user application from taking down the whole system.
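A user-level analogy, assuming a POSIX system: an interval timer delivers SIGALRM every 10 ms, and the signal handler plays the part of the kernel's timer interrupt. The function run_until_preempted() and the tick counter are made up for this sketch; a real kernel would switch to another process instead of just counting.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>

static volatile sig_atomic_t ticks;      /* timer interrupts seen so far */

static void on_tick(int signo)
{
    (void)signo;
    ticks++;
}

/* Spins in a loop that only a timer interrupt can end.
   Returns the number of ticks observed, or -1 if setup failed. */
int run_until_preempted(int max_ticks)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_tick;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    /* First expiry after 10 ms, then every 10 ms. */
    struct itimerval every_10ms = { {0, 10000}, {0, 10000} };
    ticks = 0;
    if (setitimer(ITIMER_REAL, &every_10ms, NULL) != 0)
        return -1;

    while (ticks < max_ticks)
        ;                                /* the runaway loop */

    struct itimerval off = { {0, 0}, {0, 0} };
    setitimer(ITIMER_REAL, &off, NULL);  /* disarm the timer */
    return (int)ticks;
}
```

Without the timer the loop would never exit, since nothing inside it changes ticks.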

Illegal Application Memory Access

An application can refer to illegal memory locations. Systems employ memory management and protection mechanisms in the kernel to prevent applications from actually doing damage at those addresses.

Infinite Kernel Loops

A kernel that encounters an infinite loop will have no way to resume any other process in the computer. This results in a complete system failure since there is nothing to "interrupt" the kernel itself.

Illegal Kernel Memory Access

A kernel that tries to access an invalid memory address will never be prevented from doing so. This almost always results in a system failure, as errors resulting from the illegal access are propagated throughout the system.

Simultaneous register access

Applications all run on the same physical registers, so one application can end up overwriting register values that another depends on. To prevent this, we context-switch between applications: an application's registers are saved when it stops running, so that they can be overwritten and reused by others, and restored when it resumes.

Context Switching

Context switching is the act of suspending one process and resuming another. This is what schedule() does in WeensyOS 1. It is done by saving the suspended process's registers (such as eax, ebp, and esp) and restoring those of the process being resumed. The saved registers are kept in the process's "process descriptor". Each process has a process descriptor, which is stored in the OS's "process descriptor table".

However, as applications and hardware get more complex, each process may have more registers to save. To keep context switches cheap, we split process registers into "common" (eax, ebp, eip) and "uncommon" (floating-point registers) parts, saving only the common registers and specified uncommon registers and letting the other registers be reused.

Figure 4: Process Description Tables.
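A rough sketch of the save-and-restore step, with a process descriptor table held in an array. This is not the WeensyOS code: the struct layouts, the round-robin policy, and the cpu variable standing in for the real registers are all assumptions for illustration, and only a few "common" registers are modeled.

```c
#include <assert.h>
#include <stdint.h>

#define NPROCS 3

struct registers { uint32_t eax, ebp, esp, eip; };

enum state { P_EMPTY, P_RUNNABLE };

struct process_descriptor {
    enum state state;
    struct registers regs;    /* saved copy, valid while the process is not running */
};

struct process_descriptor proc_table[NPROCS];   /* the process descriptor table */
struct registers cpu;                            /* stands in for the real CPU registers */
int current = 0;                                 /* index of the running process */

/* Saves the running process's registers, picks the next runnable process
   in round-robin order, and restores that process's registers.
   Returns the index of the process now running. */
int schedule(void)
{
    proc_table[current].regs = cpu;              /* suspend: save registers */

    for (int step = 1; step <= NPROCS; step++) {
        int next = (current + step) % NPROCS;
        if (proc_table[next].state == P_RUNNABLE) {
            current = next;
            break;
        }
    }

    cpu = proc_table[current].regs;              /* resume: restore registers */
    return current;
}
```

Because each process's values are parked in its own descriptor while it is suspended, two processes can both "use eax" without seeing each other's data.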

Virtual Memory Addressing

Virtual memory allocation is implemented with the aid of a Virtual Memory Manager (VMM). The VMM is a kernel process that lets each user process believe it has been given one neatly allocated block of RAM in which to run. In reality, the VMM translates the memory addresses used by a process into actual physical memory addresses that, unlike what Figure 5 shows, may be scattered and fragmented throughout physical RAM. Although this translation step adds a layer of complexity and may slow down the overall system, it resolves the important issue of programs accessing "forbidden" memory addresses. In effect, the VMM can completely hide the memory used by, say, the kernel itself.

Figure 5: Virtual Memory Management.
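One common way to do the translation is a per-process page table; the notes do not say which scheme the VMM uses, so treat the following as an assumed example. The page size, table size, and the names address_space and translate() are chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096u
#define NUM_PAGES 4u
#define NO_FRAME  UINT32_MAX     /* marks a virtual page with no mapping */

/* page_table[v] is the physical frame number backing virtual page v. */
struct address_space {
    uint32_t page_table[NUM_PAGES];
};

/* Translates a virtual address. Returns 0 and stores the physical
   address in *paddr, or returns -1 if the access is forbidden. */
int translate(const struct address_space *as, uint32_t vaddr, uint32_t *paddr)
{
    uint32_t page = vaddr / PAGE_SIZE;
    uint32_t offset = vaddr % PAGE_SIZE;

    if (page >= NUM_PAGES || as->page_table[page] == NO_FRAME)
        return -1;

    *paddr = as->page_table[page] * PAGE_SIZE + offset;
    return 0;
}
```

With a table such as { 7, 2, NO_FRAME, 9 }, consecutive virtual pages land in scattered physical frames, and any address the table does not map (kernel memory, for instance) is simply unreachable from the process.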

Device Access

Robustness is a very important issue when dealing with hardware device access. Every system is built from many different devices, which vary widely and often have "weird" features of their own. This variation calls for a robust interface that protects devices from each other and from the user. We do not want situations in which a piece of code may accidentally or intentionally set something on fire. More realistically, we do not want code to modify sensitive data on a hard drive unknowingly.

As programmers, we want a clean interface for interacting with devices, one that conforms to our standards of abstraction and modularity. We don't want to deal with the low-level details of how to read from and write to a device, let alone the differences between devices and how reading and writing apply to each of them.

Two Classes of Devices
  1. Asynchronous (Streaming)
    • Network
    • Mouse
    • Keyboard
  2. Synchronous (Random Access)
    • Disk Storage
    • Memory

Each class of devices has its own set of access operations that make sense for the devices in that class. For example, it would not make sense for a program to use lseek() on an asynchronous device such as a mouse. However, a read() operation makes sense for both the asynchronous and the synchronous devices listed above.
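One way to express "same interface, different supported operations" is a table of function pointers per device, with a missing entry for operations that do not apply. The sketch below is hypothetical throughout (device_ops, dev_read(), dev_lseek(), and the in-memory "devices" are invented); it is not how any particular kernel lays out its drivers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct device;

/* Operations a device may support; lseek is NULL for stream devices. */
struct device_ops {
    long (*read)(struct device *dev, char *buf, size_t n);
    long (*lseek)(struct device *dev, long offset);
};

struct device {
    const struct device_ops *ops;
    const char *data;            /* hypothetical backing bytes */
    size_t size;
    size_t pos;
};

/* Reads up to n bytes from the current position and advances it. */
static long read_impl(struct device *dev, char *buf, size_t n)
{
    size_t left = dev->size - dev->pos;
    if (n > left)
        n = left;
    memcpy(buf, dev->data + dev->pos, n);
    dev->pos += n;
    return (long)n;
}

/* Moves the position of a random-access device. */
static long disk_lseek_impl(struct device *dev, long offset)
{
    if (offset < 0 || (size_t)offset > dev->size)
        return -1;
    dev->pos = (size_t)offset;
    return offset;
}

const struct device_ops disk_ops = { read_impl, disk_lseek_impl };
const struct device_ops keyboard_ops = { read_impl, NULL };   /* stream: no lseek */

long dev_read(struct device *dev, char *buf, size_t n)
{
    return dev->ops->read(dev, buf, n);
}

long dev_lseek(struct device *dev, long offset)
{
    if (dev->ops->lseek == NULL)
        return -1;               /* operation makes no sense for this class */
    return dev->ops->lseek(dev, offset);
}
```

Callers use dev_read() on any device, while dev_lseek() fails cleanly on a keyboard-like device instead of doing something undefined.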