Class 11
CS 202
10 March 2015

On the board
------------

1. Last time
2. Finish discussion of device drivers
3. Blocking vs. non-blocking I/O
4. Virtual memory intro
5. Segmentation

---------------------------------------------------------------------------

1. Last time

    I/O: architecture, CPU/device interaction, demo

    keyboard repeat rate: seems to be either BIOS or keyboard driver

2. Device drivers

    Device drivers solve a software engineering problem ...

    ... but also *create* problems

    So we have to worry about potentially sketchy drivers ...

    ... but we also have to worry about potentially sketchy devices.

        a buggy network card can scribble all over memory 
        (solution: use IOMMU; advanced topic)

        plug in your USB stick: claims to be a keyboard; starts issuing
        commands. (IOMMU doesn't help you with this one.)

        plug in a USB stick: if it's carrying a virus (aka malware),
        your computer can now be infected. (Iranian uranium enrichment
        facilities are thought to have been attacked this way.
        Unfortunately for us, the same attacks could work against our
        power plants, etc.)
    
    Stuxnet example

3. Software architecture: blocking vs. non-blocking I/O

    (also known as synchronous vs asynchronous I/O)

    This concept exists everywhere (JavaScript vs. browser, kernel vs.
    devices, network client vs. network server, group work among human
    beings, etc.). 
    
    It is probably easiest to approach it by examining the interface
    between *user-level programs* and the *kernel*. 

    [draw a picture of function call. process put to sleep or not.
    if not, could get status code back, rather than data in buffers.]

    non-blocking I/O (also known as "asynchronous I/O") means:
    
        """
        the request returns immediately. to be sure that the logical
        operation (getting data in the case of reads, sending data in
        the case of writes) is complete, your program has to be prepared
        to: 

             * make another call later, or

             * register for interest in a *callback*, which will
             notify your code that the data was actually sent or
             received.
        """
        
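    a minimal sketch of both options in Python, assuming a Unix-like
    system where os.set_blocking works on a pipe. the pipe stands in for
    a device or socket; try_read, demo_poll_again, demo_callback, and
    on_readable are names made up for illustration.

```python
import os
import selectors


def try_read(fd, nbytes=1024):
    """Non-blocking read: return bytes, or None if no data is ready yet."""
    try:
        return os.read(fd, nbytes)
    except BlockingIOError:
        # The call returned immediately with a status instead of data.
        return None


def demo_poll_again():
    """Option 1: the program makes another call later."""
    r, w = os.pipe()
    os.set_blocking(r, False)
    first = try_read(r)        # nothing has been written yet
    os.write(w, b"hello")
    second = try_read(r)       # the retry now finds the data
    os.close(r)
    os.close(w)
    return first, second


def demo_callback():
    """Option 2: register interest and get notified through a callback."""
    r, w = os.pipe()
    os.set_blocking(r, False)
    received = []

    def on_readable(fd):
        received.append(os.read(fd, 1024))

    sel = selectors.DefaultSelector()
    sel.register(r, selectors.EVENT_READ, data=on_readable)
    os.write(w, b"world")
    # The selector reports which registered descriptors are ready;
    # we then invoke the callback stored with each one.
    for key, _events in sel.select(timeout=1):
        key.data(key.fd)
    sel.close()
    os.close(r)
    os.close(w)
    return received
```

    in the first demo the program retries the read itself; in the
    second, it registers interest and is told when a read will succeed.
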
    QUESTION: which of the kernel's interfaces to the hardware are
    synchronous, and which employ asynchronous techniques? (this is sort
    of a trick question).
 
    ANSWER: essentially, the kernel as a whole never really blocks (even
    though isolated threads of control within the kernel can block). if
    the kernel as a whole blocked, that would prevent it from doing
    other things. thus: 
    
        *interrupts*, *polling*, *DMA*: these are techniques to allow
        the kernel to do other things while it's waiting for a device to
        finish its job. They are asynchronous techniques.

        *Busy waiting* and *programmed I/O* are not quite like blocking,
        because blocking generally means, "Please deschedule me until
        this call returns," whereas the kernel (mostly) has no concept
        of being descheduled (we can imagine it in an idle loop when no
        processes are running). Busy waiting and programmed I/O are akin
        to async calls that are continually rechecked.
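
        a toy illustration of that last point, in Python. FakeDevice is
        a made-up stand-in for a device with a status register; nothing
        here talks to real hardware. each status check returns
        immediately (it is "async"), and busy waiting is just a loop
        around it.

```python
class FakeDevice:
    """Hypothetical device that becomes ready after a few status checks."""

    def __init__(self, checks_until_ready):
        self._remaining = checks_until_ready

    def status_ready(self):
        # Returns immediately, like reading a status register.
        if self._remaining > 0:
            self._remaining -= 1
            return False
        return True

    def read_data(self):
        return 0x41


def busy_wait_read(dev):
    """Spin on the status check until the device is ready, then read."""
    spins = 0
    while not dev.status_ready():
        spins += 1          # the CPU does no other useful work here
    return dev.read_data(), spins
```

        the caller is never descheduled; it simply keeps asking.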

    blocking-vs-nonblocking is a fundamental interface choice that comes
    up everywhere.
    
    need to understand this distinction in order to understand:
    
        --> how one can implement threads in user space.
    
        --> event-driven programming

        (topic of later classes.)

4. Virtual memory intro

    --very important idea in computer systems

    --setup

        draw picture:

            * program:

             0x500     movl 0x200000, %eax    # ????
             0x504     incl %eax
             0x508     movl %eax, 0x300000    # ????

            [CPU ---> translation box --> physical addresses]

            how many virtual memory translations happen when the lines
            above are executed?  (answer: 5.
                    3 for the instructions
                    1 for the load
                    1 for the store)
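
            the count can be reproduced with a small tally in Python.
            this sketch assumes one translation per instruction fetch
            (i.e., no instruction straddles a translation boundary) and
            one per memory operand; PROGRAM and count_translations are
            illustrative names.

```python
# (instruction text, number of data memory accesses it performs)
PROGRAM = [
    ("movl 0x200000, %eax", 1),   # load from 0x200000
    ("incl %eax", 0),             # register only
    ("movl %eax, 0x300000", 1),   # store to 0x300000
]


def count_translations(program):
    """One translation to fetch each instruction, plus one per data access."""
    return sum(1 + data_accesses for _text, data_accesses in program)


total = count_translations(PROGRAM)   # 3 fetches + 1 load + 1 store
```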
                

    --"to virtualize" means "to lie" or "to fool". we'll say how this is
    implemented in a few moments. for now, let's look at the benefits of
    being interposed on memory accesses.

    --benefits:

        --programmability (book calls this "transparency"):
            
            --programs use addresses like 0, 0x200000, etc. (see example
            above)

            --three benefits, at least:

            (a) program *thinks* it has lots of memory, organized in a
            contiguous space

            (b) programs can use "easy-to-use" addresses like 0,
            0x20000, whatever. compiler and linker don't have to worry
            about where the program actually lives in memory when it
            executes.

            (c) multiple instances of the same program foo are each
            loaded; each thinks it's using memory addresses like 0x50000,
            whatever, but of course they're not using the same physical
            cells in RAM

        --protection:

            --processes cannot read or write each other's memory

            --this protection is at the heart of the isolation among
            processes that is provided by the OS

                --prevents bug in one process from corrupting another
                process.

                --don't even want a process to observe another
                process's memory (like if that process has secret or
                sensitive data)

            --the idea is that if you cannot name something, you cannot
            use it. this is a deep idea.

        --effective use of resources:

            --programmers don't have to worry that the sum of the memory
            consumed by all active processes is larger than physical
            memory.

        --sharing:

            --processes share memory under controlled circumstances,
            but that physical memory may show up at very different
            virtual addresses

            --that is, two processes have different ways to refer
            to the same physical memory cells


        --other things too (we may study later, depending on time)
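
        a toy model of the protection and sharing points, in Python.
        this is only a sketch: the per-process dictionary, the fixed
        0x1000-byte region size, and the names Process, translate,
        load, and store are all assumptions made for illustration, not
        how real hardware stores its mappings.

```python
REGION = 0x1000                 # assumed granularity of a mapping
PHYS = bytearray(0x8000)        # pretend physical memory


class Process:
    def __init__(self, mapping):
        # virtual region base -> physical region base (per process)
        self.mapping = mapping

    def translate(self, vaddr):
        base = vaddr & ~(REGION - 1)
        if base not in self.mapping:
            # The process has no name for this memory, so it cannot use it.
            raise MemoryError(f"unmapped virtual address {vaddr:#x}")
        return self.mapping[base] + (vaddr - base)

    def store(self, vaddr, value):
        PHYS[self.translate(vaddr)] = value

    def load(self, vaddr):
        return PHYS[self.translate(vaddr)]


# Same virtual address 0x0000 in both, backed by different physical cells;
# physical region 0x6000 is shared, but at different virtual addresses.
proc_a = Process({0x0000: 0x2000, 0x5000: 0x6000})
proc_b = Process({0x0000: 0x3000, 0x1000: 0x6000})

proc_a.store(0x0010, 7)
proc_b.store(0x0010, 9)         # does not disturb proc_a's value
proc_a.store(0x5004, 42)        # visible to proc_b at 0x1004
```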

    --how is this translation implemented?

        --in modern systems, hardware does it. this hardware is
        configured by the OS.

        --this hardware is called the MMU, for memory management unit,
        and is part of the CPU

        --why doesn't the OS just do the translation itself, in
        software? similar to asking why we don't execute programs by
        running them on an emulation of a processor (too slow)

    --thing to remember in what follows:
    
        --OS is going to be setting up data structures that the hardware
        sees
    
        --these data structures are *per-process* 
        

5. Segmentation

    A. segmentation in general

        segmentation means:

            memory addresses treated like offsets into a contiguous
            region.

    
        consider 14-bit address:

            first two bits are the segment number (this is in the
            first hex digit)

            next 12 bits (next three hex digits) give offset


        seg     base       limit         rw
        -----------------------------------
        0      0x4000      0x46ff        10
        1      0x0000      0x04ff        11
        2      0x3000      0x3fff        11

        (in this table, "limit" is the last valid physical address of
        the segment, inclusive)


    the above table results in the mapping below. convince yourself of
    this!!!!!!

            virtual                physical
            -------               ---------

        [0x0000, 0x0700)  -->  [0x4000, 0x4700)
        [0x1000, 0x1500)  -->  [0x0000, 0x0500)
        [0x2000, 0x3000)  -->  [0x3000, 0x4000)
        [0x3000, 0x4000)  --> not mapped


        where is

            0x0240?       [4240]

            0x1108        [0108]

            0x265c        [365c]

            0x3002        [???]

            0x1600        [illegal]
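
        the lookup can be sketched in Python. assumptions, labeled
        here and in the comments: "limit" is read as the last valid
        physical address (consistent with the mapping above), and the
        "rw" column is read as a read bit followed by a write bit.
        SEGMENT_TABLE, SegmentationFault, and translate are
        illustrative names.

```python
# seg -> (base, last valid physical address, readable, writable)
SEGMENT_TABLE = {
    0: (0x4000, 0x46FF, True, False),   # rw = 10 (assumed: read-only)
    1: (0x0000, 0x04FF, True, True),    # rw = 11
    2: (0x3000, 0x3FFF, True, True),    # rw = 11
}


class SegmentationFault(Exception):
    pass


def translate(vaddr, write=False):
    """Translate a 14-bit virtual address: 2-bit segment, 12-bit offset."""
    seg = (vaddr >> 12) & 0x3
    offset = vaddr & 0xFFF
    if seg not in SEGMENT_TABLE:
        raise SegmentationFault(f"segment {seg} not mapped")
    base, last, readable, writable = SEGMENT_TABLE[seg]
    paddr = base + offset
    if paddr > last:
        raise SegmentationFault(f"offset {offset:#x} past end of segment {seg}")
    if write and not writable:
        raise SegmentationFault(f"segment {seg} is not writable")
    if not write and not readable:
        raise SegmentationFault(f"segment {seg} is not readable")
    return paddr


if __name__ == "__main__":
    for va in (0x0240, 0x1108, 0x265C, 0x1600):
        try:
            print(f"{va:#06x} -> {translate(va):#06x}")
        except SegmentationFault as fault:
            print(f"{va:#06x} -> illegal ({fault})")
```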
   

    This allows sharing: how?


    Disadvantages:

        --program may need to know about segments (not in the example
        above but happens on the x86; see below)

        --contiguous bytes required

        --fragmentation

    External vs. internal fragmentation


    B. segmentation on the x86

    next time....