Class 25 (last one!)
CS 202
7 May 2015

On the board
------------

1. Last time
2. Trusting trust
3. Further thoughts on trust
4. Wrap-up

---------------------------------------------------------------------------

1. Last time

    Unix protection model
    Attacks and problems
    Other thoughts on security

2. Trusting trust

    --first of all, the word "trust" is a bad thing in computer security
    (this is an unfortunate linguistic fact). to "trust" something means
    to "assume it correct", which in turn means "to be in trouble if the
    assumption is false". so "removing trust" is a good thing. so is
    making things "trust*worthy*" (that is, worthy of being assumed
    correct), but it is in general hard to make any given component truly
    trustworthy.

    --you'll notice that the "trusted computing" initiatives from various
    powerful interests subvert this word. who exactly is being trusted,
    and who exactly isn't? "trusted computing" sounds great
    linguistically, but "trusted computing platforms" do not actually
    mean what they sound like.

    A. background on this paper by Thompson:

        Thompson gave this lecture/paper after winning the Turing Award,
        which is considered by many to be the Nobel Prize of Computer
        Science. The paper is stunning but takes patience and a few
        readings to understand. We're going to reproduce most of what
        Thompson did but will follow the ideas in an order different from
        the one in the paper.

    B. adding a feature to a language

        What if we wanted to add a feature to Java? Say that the Java
        compiler is written in C, in a file called java.c. We modify
        java.c and rerun the C compiler on java.c, producing a new Java
        compiler that understands the new feature of Java.

        Now what if we wanted to add a feature to the C programming
        language? Well, for all practical purposes, the C compiler is
        itself written in C; let's assume that the entire C compiler is
        implemented in a file called "cc.c". To add a feature to the C
        programming language, we modify cc.c and run the *old* C compiler
        on the new file. At that point, we have a new C compiler that
        understands a new feature of the language.

    C. Context

        As still sometimes happens today, earlier versions of Unix were
        distributed with a full set of binaries and the source for those
        binaries. This source included the source for the compiler, the
        OS, the program 'login', etc. Because the system was quite small,
        it was common for people to make a change in one source file and
        then to recompile all of their programs. So program recompilation
        happened a lot.

    D. In this environment, how could someone as clever as Thompson add a
       bug to the login program without leaving a trace in the source
       files?

        **GOAL: have no source files hint at the bug, and meanwhile, have
        the bug persist across all recompilations**

        [DRAW PICTURES]

    E. How can we write a self-reproducing program in pseudocode?

            X = "Output 'X'. Output '='. Output quote mark. Output X.
                 Output quote mark. Output X."
            Output 'X'. Output '='. Output quote mark. Output X. Output
            quote mark. Output X.

        Run that, and you get itself.

        Here's a version that includes other instructions:

            X = "[execute whatever.] Output 'X'. Output '='. Output quote
                 mark. Output X. Output quote mark. Output X."
            [execute whatever.] Output 'X'. Output '='. Output quote
            mark. Output X. Output quote mark. Output X.

        Here is a simpler version:

            Print this followed by its quotation:
            "Print this followed by its quotation".

        [BTW, the GNU General Public License works like this. It's a
        self-replicating license! the license specifies that to make a
        copy of the code, you have to release the source **with the
        license itself included**. the license talks about itself, just
        as a self-replicating program must.]

        Here's a self-replicating program in Scheme:

            ((lambda (x) `(,x ',x)) '(lambda (x) `(,x ',x)))
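        For concreteness, here is the same trick in C, the language the
        rest of the story is about. This is a sketch, not code from
        Thompson's paper: the program keeps a template of its own text in
        the string s and prints that template twice, once quoted
        (reproducing the definition of s) and once expanded (reproducing
        the rest of the program). The 10s and 34s are the ASCII codes for
        newline and double quote, which can't appear literally inside the
        template without ending it; there are no comments because the
        program must reproduce its source byte for byte. (Shown indented
        to fit these notes; it reproduces its own unindented source.)

            #include <stdio.h>
            const char *s = "#include <stdio.h>%cconst char *s = %c%s%c;%cint main(void) { printf(s, 10, 34, s, 34, 10, 10); return 0; }%c";
            int main(void) { printf(s, 10, 34, s, 34, 10, 10); return 0; }

        Compile it and run it, and the output is the source file itself.
        Thompson's compiler hack is this pattern with real work spliced
        in where "[execute whatever.]" goes.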
    F. Result: some well-known string in the C compiler source now
       compiles to binary that does the following:

        <<
          (1) if compiling "login", insert a bug

          (2) if you see the well-known string in the C compiler itself,
              replace it with everything between << >>
        >>

    G. What's the moral of the story? (Thompson's own answer: you can't
       trust code that you did not totally create yourself.)

        What if you disassembled the binaries? Would the attack be
        visible there? (Depends on whether the disassembler was bugged.)

    H. Postscript

        Russ Cox reports:

            "The original hack, by the way, did not work perfectly. It
            made the compiler just a little bigger each time it compiled
            itself. Eventually someone discovered this and tried to
            figure out why, and they compiled via an assembly listing
            (cc -S x.c; as x.s), and the hack disappeared. (It was not
            enabled when printing an assembly listing with -S.)"

        [follow-up: Ken Thompson reports: "it was a '\0' added to a
        string every time."]

3. Further thoughts on trust

    Question: what do you have to trust to be sure that no one is aware
    of, nor can ever be aware of, your IM conversation with someone else?

        no eavesdropping
        no dumping of data to be analyzed later
        IM binary isn't bugged

    Question: what if the hardware itself is buggy? What do we do then?
    (People are worried about this.)

4. Mandatory access control

    [not covered; leaving here for general interest]

    historically, the military was interested in a different security
    model: what if the enemy gains access to secret data? the enemy could
    email it somewhere.

    mandatory access control vs. discretionary access control

        Unix is discretionary, because a process can specify the security
        policy:
        --change permissions on files it writes to, give its privileges
          to others

        the military goal was to disallow such discretion:
        --mandatory policy set by a security officer; processes have no
          choice

    typical model: Bell-LaPadula, concerned with secrecy

        the system has subjects (e.g., processes) and objects (e.g.,
        files)

        each subject and object is assigned a security level (c, s)
            c = classification (unclassified, secret, top-secret, ...)
            s = category set (nuclear, crypto, NATO, ...)

        "dominates" relation:
            (c1, s1) dominates (c2, s2) iff c2 <= c1 and s2 is a subset
            of s1

        use the dominates relation for access control; when S does an
        operation on O:
            if the operation will allow S to observe O, check that L(S)
            dominates L(O)
            if the operation will allow S to modify O, check that L(O)
            dominates L(S)

        the dominates relation is transitive, so we get nice properties:
        a malicious process cannot copy a top-secret file to an
        unclassified file (see the sketch after this section)

    hard to retrofit to Unix

    hard to use in practice:
        (i) very specialized security level scheme; fits the military but
            not many others
        (ii) hard to figure out when to declassify data -- everything
             becomes secret
        (iii) unclear interaction with traditional access control (Unix
              users, groups) -- yet we still need both mechanisms

    but SELinux is getting some interest and use, and it incorporates
    some of these ideas

    issue: covert channels
        --information leaks out. e.g., say a secret process gets
          compromised. sure, it can't leak data by writing to a file, but
          it can still signal bits to an observer by, say, consuming CPU
          or not.
        --historically, very difficult to defend against this.
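    To make the access checks concrete, here is a minimal sketch, not
    from the lecture, of the Bell-LaPadula "dominates" relation and the
    two checks above. It assumes classifications are ordered integers and
    category sets are bitmasks; the struct and function names (label,
    dominates, may_observe, may_modify) are made up for illustration.

        #include <stdbool.h>
        #include <stdint.h>
        #include <stdio.h>

        struct label {
            int      classification;  /* e.g., 0=unclassified, 1=secret, 2=top-secret */
            uint32_t categories;      /* bitmask, e.g., bit 0 = nuclear, bit 1 = crypto */
        };

        /* l1 dominates l2 iff l1's classification is at least l2's and
           l1's category set is a superset of l2's. */
        static bool dominates(struct label l1, struct label l2)
        {
            return l1.classification >= l2.classification &&
                   (l1.categories & l2.categories) == l2.categories;
        }

        /* "no read up": S may observe O only if L(S) dominates L(O). */
        static bool may_observe(struct label subj, struct label obj)
        {
            return dominates(subj, obj);
        }

        /* "no write down": S may modify O only if L(O) dominates L(S). */
        static bool may_modify(struct label subj, struct label obj)
        {
            return dominates(obj, subj);
        }

        int main(void)
        {
            struct label secret_nuclear = { 1, 0x1 };  /* (secret, {nuclear}) */
            struct label unclassified   = { 0, 0x0 };  /* (unclassified, {}) */

            /* A secret process can read unclassified data but cannot
               write it, so it cannot copy secret data into an
               unclassified file. */
            printf("observe: %d\n", may_observe(secret_nuclear, unclassified)); /* 1 */
            printf("modify:  %d\n", may_modify(secret_nuclear, unclassified));  /* 0 */
            return 0;
        }

    Note how both checks reduce to the single dominates relation; that is
    what makes the "no copying down" property fall out of transitivity.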
5. In closing....

    --You have learned about the x86; processes, syscalls, fork/exec, and
      shells; concurrency, synchronization, and deadlock, and
      alternatives to shared-memory multithreading; software safety (the
      Therac-25); scheduling; I/O, devices, and disks; virtual memory;
      file systems, logging, crash recovery, and transactions;
      distributed systems; stack smashing, Unix's protection model, and
      other security topics; and software development.

    --You have learned to write and debug low-level code.

    --I think you now know a lot more than you did at the beginning. (For
      instance, I wager that if you were to go back now and do lab 1 or
      lab 2 again from scratch, you would not find it as difficult.)

    --Congratulations on having learned all of these things! I hope you
      enjoyed it!