Class 11
CS 202
10 March 2015

On the board
------------

1. Last time

2. Finish discussion of device drivers

3. Blocking vs. non-blocking I/O

4. Virtual memory intro

5. Segmentation

---------------------------------------------------------------------------

1. Last time

    I/O: architecture, CPU/device interaction, demo

    keyboard repeat rate: seems to be either BIOS or keyboard driver

2. Device drivers

    Device drivers solve a software engineering problem ...

    ... but also *create* problems

    So we have to worry about potentially sketchy drivers ...

    ... but we also have to worry about potentially sketchy devices.

        a buggy network card can scribble all over memory (solution:
        use IOMMU; advanced topic)

        plug in your USB stick: claims to be a keyboard; starts
        issuing commands. (IOMMU doesn't help you with this one.)

        plug in a USB stick: if it's carrying a virus (aka malware),
        your computer can now be infected. (Iranian uranium-enrichment
        centrifuges are thought to have been attacked this way.
        Unfortunately for us, the same attacks could work against our
        power plants, etc.)

        Stuxnet example

3. Software architecture: blocking vs. non-blocking I/O
   (also known as synchronous vs. asynchronous I/O)

    This concept exists everywhere (JavaScript vs. browser, kernel vs.
    devices, network client vs. network server, group work among human
    beings, etc.). It is probably easiest to approach it by examining
    the interface between *user-level programs* and the *kernel*.

    [draw a picture of function call. process put to sleep or not. if
    not, could get status code back, rather than data in buffers.]

    non-blocking I/O (also known as "asynchronous I/O") means:

    """
    the request returns immediately. to be sure that the logical
    operation (getting data in the case of reads, sending data in the
    case of writes) is complete, your program has to be prepared to:

        * make another call later, or

        * register interest in a *callback*, which will notify your
          code that the data was actually sent or received.
    """

    QUESTION: which of the kernel's interfaces to the hardware are
    synchronous, and which employ asynchronous techniques? (this is
    sort of a trick question).

    ANSWER: essentially, the kernel as a whole never really blocks
    (even though isolated threads of control within the kernel can
    block). if the kernel as a whole blocked, that would prevent it
    from doing other things. thus:

        *interrupts*, *polling*, *DMA*: these are techniques to allow
        the kernel to do other things while it's waiting for a device
        to finish its job. They are asynchronous techniques.

        *Busy waiting* and *programmed I/O* are not quite like
        blocking, because blocking generally means, "Please deschedule
        me until this call returns," whereas the kernel (mostly) has
        no concept of being descheduled (we can imagine it in an idle
        loop when no processes are running). Busy waiting and
        programmed I/O are akin to async calls that are continually
        rechecked.

    blocking-vs-nonblocking is a fundamental interface choice that
    comes up everywhere.

    need to understand this distinction in order to understand:
        --> how one can implement threads in user space.
        --> event-driven programming (topic of later classes.)

4. Virtual memory intro

    --very important idea in computer systems

    --setup

        draw picture:

            * program:

                0x500 movl 0x200000, %eax       # ????
                0x504 incl %eax
                0x508 movl %eax, 0x300000       # ????

            [CPU ---> translation box --> physical addresses]

            how many virtual memory translations happen when the
            lines above are executed?

                (answer: 5.
                    3 for the instructions
                    1 for the load
                    1 for the store)

    --"to virtualize" means "to lie" or "to fool". we'll say how this
    is implemented in a few moments. for now, let's look at the
    benefits of being interposed on memory accesses.

    --benefits:

        --programmability (book calls this "transparency"):

            --programs use addresses like 0, 0x200000, etc.
              (see example above)

            --three benefits, at least:

                (a) program *thinks* it has lots of memory, organized
                in a contiguous space

                (b) programs can use "easy-to-use" addresses like 0,
                0x20000, whatever. compiler and linker don't have to
                worry about where the program actually lives in
                memory when it executes.

                (c) multiple instances of same program foo are each
                loaded, each thinks it's using memory addresses like
                0x50000, whatever, but of course they're not using
                the same physical cells in RAM

        --protection:

            --processes cannot read or write each other's memory

            --this protection is at the heart of the isolation among
            processes that is provided by the OS

            --prevents bug in one process from corrupting another
            process.

            --don't even want a process to observe another process's
            memory (like if that process has secret or sensitive data)

            --the idea is that if you cannot name something, you
            cannot use it. this is a deep idea.

        --effective use of resources:

            --programmers don't have to worry that the sum of the
            memory consumed by all active processes is larger than
            physical memory.

        --sharing:

            --processes share memory under controlled circumstances,
            but that physical memory may show up at very different
            virtual addresses

            --that is, two processes have a different way to refer to
            the same physical memory cells

        --other things too (we may study later, depending on time)

    --how is this translation implemented?

        --in modern systems, hardware does it. this hardware is
        configured by the OS.

        --this hardware is called the MMU, for memory management
        unit, and is part of the CPU

        --why doesn't OS just translate itself? similar to asking why
        we don't execute programs by running them on an emulation of
        a processor (too slow)

    --thing to remember in what follows:

        --OS is going to be setting up data structures that the
        hardware sees

        --these data structures are *per-process*

5. Segmentation

    A. segmentation in general

        segmentation means: memory addresses treated like offsets
        into a contiguous region.
        consider 14-bit address:

            first two bits are the segment number (this is in the
            first hex digit)
            next 12 bits (next three hex digits) give offset

            seg     base        limit       rw
            -----------------------------------
            0       0x4000      0x46ff      10
            1       0x0000      0x04ff      11
            2       0x3000      0x3fff      11

            the above table results in the mapping below. convince
            yourself of this!!!!!!

            virtual                     physical
            -------                     ---------
            [0x0000, 0x0700)    -->     [0x4000, 0x4700)
            [0x1000, 0x1500)    -->     [0x0000, 0x0500)
            [0x2000, 0x3000)    -->     [0x3000, 0x4000)
            [0x3000, 0x4000)    -->     not mapped

            where is 0x0240? [4240]
                     0x1108  [0108]
                     0x265c  [365c]
                     0x3002  [???]
                     0x1600  [illegal]

        This allows sharing: how?

        Disadvantages:

            --program may need to know about segments (not in the
            example above but happens on the x86; see below)

            --contiguous bytes required

            --fragmentation

        External vs. internal fragmentation

    B. segmentation on the x86

        next time....
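
    Aside: the lookup from part A, as code

        A minimal sketch (not how any real MMU is built) of the
        translation that the segment table in part A describes. The
        names (struct segment, seg_table, translate, the XLATE_* codes)
        are made up for illustration. Two assumptions: "limit" is read
        as the last valid physical address of the segment, which is
        what the mapping in part A implies, and the rw bits are stored
        but not checked, since the notes don't say which bit is which.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One row of the segment table from part A (hypothetical layout).
 * Assumption: "limit" is the last valid PHYSICAL address of the
 * segment, consistent with the virtual-to-physical mapping shown. */
struct segment {
    int      valid;  /* 0 means "not mapped" */
    uint16_t base;   /* physical address where the segment starts */
    uint16_t limit;  /* last valid physical address in the segment */
    unsigned rw;     /* the two rw bits, stored but not interpreted here */
};

static const struct segment seg_table[4] = {
    { 1, 0x4000, 0x46ff, 0x2 },  /* seg 0, rw = 10 */
    { 1, 0x0000, 0x04ff, 0x3 },  /* seg 1, rw = 11 */
    { 1, 0x3000, 0x3fff, 0x3 },  /* seg 2, rw = 11 */
    { 0, 0,      0,      0   },  /* seg 3: no entry, so not mapped */
};

enum { XLATE_OK = 0, XLATE_UNMAPPED = -1, XLATE_OUT_OF_BOUNDS = -2 };

/* Translate a 14-bit virtual address. On success, store the physical
 * address in *pa and return XLATE_OK; otherwise leave *pa untouched. */
int translate(uint16_t va, uint16_t *pa)
{
    assert(pa != NULL);

    unsigned seg    = (va >> 12) & 0x3;  /* top 2 of the 14 bits */
    unsigned offset = va & 0x0fff;       /* low 12 bits */
    const struct segment *s = &seg_table[seg];

    if (!s->valid)
        return XLATE_UNMAPPED;
    if ((unsigned)s->base + offset > s->limit)
        return XLATE_OUT_OF_BOUNDS;      /* past the end of the segment */

    *pa = (uint16_t)(s->base + offset);
    return XLATE_OK;
}
```

        With this table, translate(0x0240, &pa) stores 0x4240,
        translate(0x1108, &pa) stores 0x0108, and translate(0x265c, &pa)
        stores 0x365c. 0x3002 comes back XLATE_UNMAPPED (segment 3 has
        no entry) and 0x1600 comes back XLATE_OUT_OF_BOUNDS (offset
        0x600 is past the end of segment 1).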