Class 27
CS 372H
1 May 2012

On the board
------------

1. Last time

2. Reflections on Trusting trust

3. Unix security model

---------------------------------------------------------------------------

1. Last time

    --finished VMware ESX

    --stack smashing

2. Trusting trust

    --first of all, the word "trust" is a bad thing in computer security
    (this is an unfortunate linguistic fact). to "trust" something means
    to "assume it correct", which in turn means "to be in trouble if the
    assumption is false". so "removing trust" is a good thing. so is
    making things "trust*worthy*" (that is, worthy of being assumed
    correct), but it is in general hard to make any given component
    truly trustworthy.

    --you'll notice that the "trusted computing" initiatives from
    various powerful interests subvert this word. who exactly is being
    trusted, and who exactly isn't being trusted? "trusted computing"
    sounds great linguistically, but "trusted computing platforms" do
    not actually mean what they sound like.

    A. background on this paper by Thompson:

        Thompson gave this lecture/paper after winning the Turing Award,
        which is considered by many to be the Nobel prize of Computer
        Science.

        The paper is stunning but takes patience and a few readings to
        understand.

        We're going to reproduce most of what Thompson did but will
        follow the ideas in an order different from the one in the
        paper.

    B. adding a feature to a language

        What if we wanted to add a feature to Java? Say that the Java
        compiler is written in C, in a file called java.c. So we modify
        java.c, and rerun the C compiler on java.c, producing a new Java
        compiler that understands a new feature of Java.

        Now what if we wanted to add a feature to the C programming
        language? Well, for all practical purposes, the C compiler is
        also written in C, and let's assume that the entire C compiler
        is implemented in a file called "cc.c". To add a feature to the
        C programming language, we need to modify cc.c, and run the old
        C compiler on the new file.
        At this point, we have a new C compiler that understands a new
        feature of the language.

    C. Context

        As sometimes happens today, earlier versions of Unix were
        distributed with a full set of binaries and source for those
        binaries. This source included source for the compiler, the OS,
        the program 'login', etc.

        Because the system was quite small, it was common for people to
        make a change in one source file and then to recompile all of
        their programs. So program recompilation happened a lot.

    D. In this environment, how could someone as clever as Thompson add
    a bug to the login program without leaving a trace in the source
    files?

        **GOAL: have no source files hint at the bug, and meanwhile, the
        bug will persist across all recompilations

        [DRAW PICTURES]

    E. How can we write a self-reproducing program in pseudocode?

        X = "Output 'X'. Output '='. Output quote mark. Output X. Output quote mark. Output X."
        Output 'X'. Output '='. Output quote mark. Output X. Output quote mark. Output X.

        Run that, and you get itself.

        Here is a simpler version:

            Print this followed by its quotation: "Print this followed
            by its quotation".

        [BTW, the GNU Public License works like this. It's a
        self-replicating license! the license specifies that to make a
        copy of the code, you have to release the source **with the
        license itself included**. the license talks about itself, just
        as a self-replicating program must.]

        Here's a self-replicating program in Scheme:

            ((lambda (x) `(,x ',x)) '(lambda (x) `(,x ',x)))

    F. Result: some well-known string in the C compiler source now
    compiles to binary that does the following:

        <<
        (1) if compiling "login", insert a bug
        (2) if you see the well-known string in the C compiler itself,
            replace it with everything between << >>
        >>

    G. What's the moral of the story?
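
    Aside on E (self-reproducing programs): the pseudocode with X can be
    written in a real language. What follows is a minimal sketch in
    Python, not anything from Thompson's paper. The names QUINE and
    run_and_capture are made up for this illustration. QUINE holds a
    two-line program as a string; as in the pseudocode, that program
    stores its own text as data (s) and then prints the data twice, once
    quoted and once as code. The harness runs the program and checks
    that its output equals its source.

```python
import contextlib
import io

# The self-reproducing program, held as a string so it can be run and
# compared against its own output. Written out, it is these two lines:
#     s = 's = %r\nprint(s %% s)'
#     print(s % s)
# The %r slot prints s quoted (as data); the text around it is the code.
QUINE = "s = 's = %r\\nprint(s %% s)'\nprint(s % s)\n"


def run_and_capture(source: str) -> str:
    """Execute `source` as a Python program and return what it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()


if __name__ == "__main__":
    output = run_and_capture(QUINE)
    print(output, end="")
    print("output equals source:", output == QUINE)
```

    The compiler attack in F has the same shape: the "well-known string"
    plays the role of s, and the inserted code both acts (bugs login)
    and reproduces itself whenever the compiler is recompiled.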
---------------------------------------------------------------------------

Admin notes

    --short class; course evaluation forms passed out today

    --final exam in a week and a half: start reviewing

    --final projects due not long after: start working on them now

    --review session Monday of the exam

---------------------------------------------------------------------------

3. Protection and security in Unix

    A. Intro

        --why security in the OS?

            managing resources for different applications

            must protect different users from one another
                file system
                memory
                processes

        --access control matrix (conceptual construct)

                      File 1   File 2   File 3  ....
            User 1    r/w
            User 2    r
            User 3    w

        --don't maintain matrix manually or entirely

        --use tools such as groups or role-based access control

            individuals     roles     resources
                x            r1          a
                y            r2          b
                z                        c

            [lots of diagonal lines between but not across columns]

    B. The Unix protection model

        --designed for specific purpose: multiple users time-sharing a
        Unix system.

        here's the security model:

            (i) UIDs and GIDs
            (ii) access control on files, per UID and per GID
            (iii) special user: root (UID=0) to which access control
                  doesn't apply
            (iv) privileged operations only root can do
            (v) some implicit privileges

        (i) process has a user ID and one or more group IDs

        (ii) access control on files

            --system stores with each file
                --user who owns the file and group that file is in
                --permissions for user, anyone in the file's group, and
                  other

            --can see this by doing "ls -l":

                rw- rw- r-- ....

            basic operations: read, write, execute [rwx]

            --which permissions apply?
                --if process's UID matches the file's owner, then user
                  permissions
                --if process has GID matching the file's group, then
                  group permissions
                --otherwise, 'other'.

            --directory has permissions too
                --"read" means, roughly, "can list files in this
                  directory"
                --"execute" means, roughly, "can use pathnames in this
                  directory"

        (iii) uid 0, called root, treated specially by the kernel as
        administrator

            --uid 0 has all permissions

            --how do uid's get set?
                setuid() call
                uid=0 can change to any other uid
                other uid's cannot invoke setuid(), to a first
                approximation

            --Unix login runs as root
                checks username, password against /etc/shadow
                calls setuid(user), runs user's shell

            --Here's more detail on login

                --Unix users typically stored in files in /etc
                    --key files include "passwd", "group", and, often,
                      "shadow" or "master.passwd"

                    --purpose of shadow file is to separate the
                      material that needs to be user-visible (list of
                      users and UIDs on the system) from the
                      cryptographic material. passwords are never
                      stored in the clear, but you still don't want to
                      expose "shadow" to users: they could then easily
                      conduct an offline dictionary attack. making the
                      shadow file non-world-readable addresses that.

                --for each user, the files contain:
                    --the textual username (for example, "mwalfish" or
                      "root")
                    --numeric user ID and group IDs
                    --one-way hash of user's password:
                      (salt, H(salt, password))
                      [salt makes it harder to break many passwords at
                      once: attacker who is working on the password
                      file offline cannot just compare the hashes to
                      pre-computed values; the attacker must, for each
                      entry, pump every password through the hash
                      function.]
                    --other information, including user's full name,
                      login shell, etc.

                --/usr/bin/login runs as root
                    --Reads username and password from terminal
                    --Looks up username in /etc/passwd, etc.
                    --Computes H(salt, typed password) and checks that
                      it matches the hash in the "shadow" or "passwd"
                      file
                    --If matches, sets group ID and user ID
                      corresponding to username
                    --Executes user's shell with execve system call

            --Unfortunately, this security model (where uid 0 can do
            anything) leads to lots of privileged code, which in turn
            means that bugs are particularly dangerous.

            --Here's an example that uses login. Consider that on some
            systems, rlogind (remote login) runs as root.

                --The "login" command takes a flag, "-f", which means
                "don't ask for password".
                If the flag is supplied, the login succeeds only if the
                requested username is the current user, or if the
                current user is root.

                Unfortunately, rlogind runs "login username".

                Attack: at the login screen, pass user "-froot". So it
                looks like this:

                    login: -froot
                    [no password]
                    #

                Why this worked on that buggy system: login sees "-f"
                flag and asks, "Is the requesting user the same?"
                (Answer: yes. Both login and rlogind are running as
                root here.)

        (iv) there are certain operations that only root can do

            Examples:
                --binding to ports less than 1024
                --change current process's user or group ID
                --mount or unmount file systems
                --opening raw sockets (so you can do something like
                  ping remote machines, for example)
                --set clock
                --halt or reboot machine
                --change UIDs (so login program needs to run as root)

            [Problem: you need to have all of root's permission to do
            *any* of these things (yes, can drop privileges, but we'll
            see that's easier said than done). That is a *lot* of
            privilege to do any one action. That is problematic for
            reasons we'll see.]

        (v) some implicit privileges

            --file descriptors, etc.
            --fork() gives parent ability to control child, etc.
            --can ptrace() processes at same UID (sort of)

        result: everything is a bit incoherent

    C. setuid

        --next time
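
    Looking back at B(ii) and B(iii): the rule for which permission bits
    apply to a process can be sketched in a few lines. This is an
    illustrative Python model, not kernel code. The function names
    (applicable_bits, may_access) and the example UIDs and GIDs are
    invented for this sketch, and treating uid 0 as passing every check
    is the simplification used in these notes (real kernels have a few
    exceptions). Note that the first matching category wins: an owner
    is judged by the user bits alone, even if the group bits would
    allow more.

```python
# Bits within a single rwx triple.
R, W, X = 4, 2, 1


def applicable_bits(mode, owner_uid, owner_gid, proc_uid, proc_gids):
    """Pick the rwx triple (0-7) that governs this process for this file."""
    if proc_uid == owner_uid:
        return (mode >> 6) & 0o7      # user permissions
    if owner_gid in proc_gids:
        return (mode >> 3) & 0o7      # group permissions
    return mode & 0o7                 # 'other' permissions


def may_access(mode, owner_uid, owner_gid, proc_uid, proc_gids, wanted):
    """Return True if the process may perform all operations in `wanted`."""
    if proc_uid == 0:
        return True                   # simplification: root bypasses checks
    bits = applicable_bits(mode, owner_uid, owner_gid, proc_uid, proc_gids)
    return (bits & wanted) == wanted


if __name__ == "__main__":
    mode = 0o664                      # rw- rw- r--
    # Hypothetical file owned by uid 1000, in group 50.
    print(may_access(mode, 1000, 50, 1000, {1000}, W))   # owner writing
    print(may_access(mode, 1000, 50, 2000, {50}, W))     # group member writing
    print(may_access(mode, 1000, 50, 3000, {3000}, W))   # everyone else writing
```

    With mode rw- rw- r--, the first two calls print True and the third
    prints False: only the owner and members of the file's group may
    write.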