Class 17 (make-up, video-taped)
CS 480-008
1 April 2016

On the board
------------

1. Last time

2. Public key crypto concepts, continued
    Public key encryption
    Textbook RSA
    Digital signatures

3. Web security intro

---------------------------------------------------------------------------

1. Last time

    --defending against untrusted OSes: Haven
    --public key crypto intro

2. Some crypto concepts

    We're looking at a few primitives:

        [last time] key exchange
        public-key encryption
        digital signatures

    Public key encryption

        Each party has a public-key pair: (pk, sk)
            pk is public key; broadcast to the world
            sk is secret key; guard it!

        Setup: two algorithms: Enc, Dec
        (and a key generation algorithm, which we will mostly neglect)

        Interface:
            (pk,sk)    <-- Gen(security_parameter)
            ciphertext <-- Enc(public key, message)
            plaintext  <-- Dec(secret key, ciphertext)

        Want/need:
            ** Dec(sk, Enc(pk, m)) = m
            ** Without sk, eavesdropping adversary (who sees ciphertext)
               cannot guess message m

        NOTICE: if the encryption algorithm is deterministic, the scheme
        is insecure (why???)

    RSA

        Three starting facts/definitions from group theory:

        (1) Two integers x,y are _relatively prime_ if they have no
            common factors besides 1. This is often written as
            gcd(x,y)=1, meaning that the greatest common divisor is 1.

            Ex: 6,10 are not relatively prime (they have a common
                factor of 2)
            Ex: 8,9 are relatively prime because gcd(8,9)=1.

        (2) Any integer N>1 induces a multiplicative group consisting of
            all integers that are relatively prime to N. In notation:

                ZN* = { a \in {1,....,N-1} | gcd(a, N) = 1}

            The number of elements in ZN* is written phi(N), known as
            "Euler's phi function." In other words, phi(N) is the number
            of numbers less than N that are relatively prime to N.

        (3) For all a in ZN*, a^{phi(N)} = 1 mod N

            This is because for any group G and any element g in G,
            g^{|G|} = e, where e is the group identity (1 in the case
            above), and that follows from basic group theory.
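The three facts above can be checked numerically for a small N. What follows is a minimal Python sketch, not part of the lecture: the helper names `units` and `phi` are illustrative assumptions, and the brute-force enumeration is only sensible for tiny N.

```python
from math import gcd


def units(n):
    """Elements of ZN*: integers in 1..n-1 that are relatively prime to n."""
    return [a for a in range(1, n) if gcd(a, n) == 1]


def phi(n):
    """Euler's phi function, computed by brute force (small n only)."""
    return len(units(n))


# Fact (1): the two examples from the notes.
assert gcd(6, 10) == 2   # not relatively prime
assert gcd(8, 9) == 1    # relatively prime

# Fact (2): the group ZN* and its size, for N = 15 = 3 * 5.
N = 15
print(units(N))  # [1, 2, 4, 7, 8, 11, 13, 14]
print(phi(N))    # 8, which equals (3-1)*(5-1)

# Fact (3): every element of ZN* raised to phi(N) is 1 mod N.
assert all(pow(a, phi(N), N) == 1 for a in units(N))
```

Python's three-argument `pow(a, k, N)` does modular exponentiation directly, so the check of fact (3) never builds the full power a^k.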
"Textbook RSA" Here is an INSECURE variant of RSA, the so-called "Textbook RSA." Do not use this in your systems! Key generation: -- generate N = p q , for two primes p, q [fact: for N of this form, phi(N) = (p-1)(q-1)] -- identify e such that e and phi(N) are relatively prime. -- identify d such that d*e = 1 mod phi(N) -- return N, e, d -- public key: (N,e); private key: (N,d) Encrypt(pubkey (N,e), msg m): regard m as an element in the group ZN*. c = m^e mod N Decrypt(secretkey (N, d), ciphertext c)  m = c^d mod N This scheme is insecure, so we won't and can't show the second "want/need" above. But we can show the first: (m^e)^d mod N = m^(ed) mod N = m^(k*phi(N) + 1) mod N, [b/c ed=1 mod phi(N)] = m^(k*phi(N))*m mod N, [rewriting] = m mod N [fact (3)] = m Digital signatures Each party has a public-key pair: (pk, sk) pk is public key; broadcast to the world sk is secret key; guard it! Setup: two algorithms: Sign, Verify (and a key generation algorithm, which we will mostly neglect) sig <-- Sign(sk, msg) {0,1} <-- Verify(pk, msg, sig) Want/need: ** For all msg: Verify(pk, msg, Sign(sk, msg)) = 1 ** Without sk, an adversary cannot forge signatures, i.e., if adversary does not have sk but tries to produce msg, sig pairs, then Verify(pk, msg, sig) = 0 One can implement an insecure version of digital signatures with Textbook RSA. Key generation: Same as earlier Sign(secretkey (N,d), msg m): s = m^d mod N Verify(pubkey (N,e), msg m, sig s): if (s^e mod N == m) return 1 else return 0 For this reason, some people say that digital signatures are the "inverse" of encryption, but this view obscures more than it illuminates, so we encourage you to keep the two concepts separate. Also, the scheme above is ridiculously insecure. Why? adversary chooses arbitrary s' in ZN*. adversary sets the message m' to s'^e. adversary ouputs (m', s'). This is a forgery: it will pass verification, but the owner of sk never created this (msg, sig) pair. Other attacks possible too. 
        A modification is to hash the message first with a
        collision-resistant hash function, but one again needs to be
        careful.

        Use of signing + Diffie-Hellman key exchange:

            S-->C: (g, p, g^x)_[signed]
            C-->S: g^y

3. Web security

    A. Intro

        We are switching gears. Now we are going to study isolation
        between *sites* in a *client-side* Web browser.

        Overall plan is called the "same-origin policy" (SOP)
            --The SOP is described in _The Tangled Web_
            --New mechanisms have come out since _The Tangled Web_
              But mostly adding to the design, rather than replacing it

        What's the top-level motivation for the SOP?

            Browser visits site foo.com. Assume: foo.com malicious.
            foo.com delivers adversarial JavaScript to the browser.
            What could go wrong? Assume no SOP.

            Ex.1. User is also visiting site bar.com

                --The JavaScript in foo.com manipulates the DOM
                  (JavaScript's representation of the page) for bar.com.
                --Can deface bar.com
                --Can also rewrite links in bar.com so that when the
                  user clicks on 'Enter password', the password is
                  delivered to foo.com

            Ex.2. User isn't visiting another site.

                --But the JavaScript from foo initiates Web requests to
                  moo.com, where the user is authorized (perhaps moo.com
                  stores private data for the user). The authority might
                  be encapsulated in the cookie that the browser
                  presents to moo.com.
                --Now the restricted data is living in program objects
                  (the DOM, JavaScript variables) that foo.com's code
                  can manipulate. This is a problem, because:
                --The adversarial JavaScript can then issue Web requests
                  to foo.com to exfiltrate the content of the restricted
                  data:
                    http://www.foo.com/heres-the-users-data-for-moo-com?abc.....

        The SOP prevents these and other issues by regarding each site
        as a separate *security principal*, or origin, and imposing the
        rule that JavaScript from one origin may not modify the DOM of a
        page from a different origin. This addresses Ex1.
        The SOP also stipulates that JavaScript from one origin may not
        issue HTTP requests to a different origin and read the
        responses, which addresses Ex2.

        The SOP is big and complicated.... let's take a step back

        How did the browser security plan come about?

            Origin: Netscape browser introduced SOP when adding support
            for JavaScript

            Incremental design/development: no single coherent design.
                No one expected web browsers to be used in the ways they
                are today.
                Security issues patched as they were discovered, with
                extra rules/checks.

            Browser vendors competed (and to some extent still compete)
            on functionality.
                Adding new features (or even security mechanisms) before
                standards.
                Historically, W3C has largely been documenting what
                browsers already do, instead of proposing new standards
                that browsers will then implement.

            Browsers didn't always agree on overall plan, or the
            implementation details.
                Browser vendors do something that roughly resembles the
                specs
                Many quirks. See quirksmode.org.
                As a result, many inconsistent corner cases that can be
                exploited.

            Now, there's quite a bit of collaboration "behind the
            scenes".
                Developers of Chrome, Firefox, IE talk to each other a
                fair amount.

            Important issues get fixed slowly over time.
                Compatibility is a huge constraint, hard to break old
                sites.
                    (Users will stop using your web browser!)
                Some of the fixes take place in the browser and
                JavaScript libraries (jQuery, etc).
                    When possible, just a compatibility layer on top of
                    raw browser APIs.
                Some of the improvements come through new headers
                    E.g., Content-Security-Policy
                Many of the attacks we will talk about in class are now
                more difficult to pull off
                    E.g., most of the attacks we will see in lab5 don't
                    work with Chrome

    B. Background, threat model, setting

        What is the Web, really?

        In the old days, it was like in lab1: a simple client/server
        architecture (client was your web browser, server was a machine
        on the network that could deliver static text and images to your
        browser).
        The web has changed: now the browser is very complicated.

            --JavaScript: Allows a page to execute client-side code.
            --DOM model: Provides a JavaScript interface to the page's
              HTML, allowing the page to add/remove tags, change their
              styling, etc.
            --Cookies: storage in browser, used for e.g. user
              authentication
            --XMLHttpRequests (AJAX): Asynchronous HTTP requests.
            --Web sockets: Full-duplex client-server communication over
              TCP.
            --Web workers: Multi-threading support.
            --Multimedia support: