Class 2
CS 480-008
28 January 2016

On the board
------------

1. Last time

2. DNS

3. Network layer (IP)

4. Transport layer (UDP, TCP)

5. Application layer (anything)
    --sockets API

---------------------------------------------------------------------------

1. Last time

    Intro to course

    Intro to networking unit

    Computer networks are interesting

        --end-points highly programmable, middle kind of boring (only
        kind of).

        --can program all of the nodes!

        --extremely easy to innovate and develop new uses of the network
        (the Web was not designed by computer scientists or network
        architects! the Web was an application of the network that
        required zero buy-in from network engineers)

        --contrast: telephone network: end-points ridiculously simple,
        middle has complexity.

            --worse, can't program most phones, need FCC approval for
            new devices, no visibility, etc.

        [Aside: if you're interested in this stuff, take classes in
        networking! Or program away! Or read the RFCs (short for
        "Request For Comments" but despite the name, many of them are
        standards).
        ** Few things are as open and well-documented as the various
        protocols that form, and run over, the Internet.]

    Layered picture [redraw it]

    Today: cover IP, transport, app

    Going to simplify what is happening below IP

2. DNS: how do names turn into addresses

    DNS = Domain Name System. One of the most successful distributed
    systems in history.

    type:

        $ dig www.cims.nyu.edu

        ask "." for the name server (NS) for .edu. (".edu" is known as a
        TLD, or top-level domain)

        ask that NS for the NS for .nyu.edu.

        ask that NS for the NS for cims.nyu.edu.

    where do names come from?

        ICANN has authority over the name space.

        registry holds the names.

        registrars assign names within the given domain; resulting
        records go in the registry

3.
IP

    Internet Protocol (IP): classic technology (took over the world,
    almost literally)

        --IP used to connect multiple networks
        --Runs over a variety of physical networks
        --Most computers today speak IP
        --We will focus on the classic version of IP: IPv4.

    Fundamentals:

    --Every host has a unique 4-byte (32-bit) IP address

        --for example:
            access.cims.nyu.edu is 128.122.49.15
            linserv.cims.nyu.edu is 128.122.49.125

        (Notice: IPv4 permits 2^{32} addresses. And portions of the
        space are not usable. Not enough for all devices that want to
        connect! Middleboxes, private networks, NAT, etc. help deal with
        the shortage, at the cost of complexity. IPv6 also deals with
        the shortage: IPv6 has 16-byte (128-bit) addresses. With 2^{128}
        addresses, it seems hard to run out. But IPv6 is still not
        deployed everywhere.)

    --Where do addresses come from?

        --The top-level assignment is by IANA, who delegates to ARIN
        (for North America), who assigns to either NYU or NYU's
        providers.

        --For example, NYU gets:
            128.122.0.0 - 128.122.255.255
            192.76.177.0 - 192.76.177.255
            192.86.139.0 - 192.86.139.255
            216.165.0.0 - 216.165.127.255

            see http://whois.arin.net/rest/org/NYU/nets

        --This is a different name space from Domain Names. For example:
            access.cims.nyu.edu is 128.122.49.15
            fox.geekny.com is 128.122.140.111
          could have:
            foo.cims.nyu.edu being 5.17.35.6

    --How do packets get where they're going?

        *Forwarding*: router sees a packet with a destination, looks up
        the destination, decides which link to send it out on.

            DEMO: $ traceroute .....

        *Routing* solves the problem of knowing where all of the hosts
        are attached, and how to reach them

            --Dijkstra's algorithm, Link state, path vector, etc., etc.

        --Address space structured to make routing practical at global
        scale (because of the hierarchy and aggregation)

        --Result: number of routing entries across the Internet vastly
        smaller than the number of addresses

            --this was hugely important for scaling.
            still is, though becoming less so (as memory gets cheaper)

            DEMO: $ netstat -arn

    --How do hosts get IP addresses? two possibilities:

        --manual configuration

            --BTW, even edge routers get this thing configured manually.
            A third-tier ISP is told: "here's the IP address of the
            other end of this link."

            --If you have a cable modem, it does this

        --DHCP

    --Commercial providers exchange prefixes, using BGP. Lots of
    complexity and considerations (technical concerns and business ones
    interact)

    --Internally, use other protocols: IS-IS, OSPF, etc.

    TRANSITION

    --we do not yet have a way to indicate what application or process
    on the destination computer gets the packet

    --we also don't cleanly handle things like failure, congestion in
    the network, etc.

4. Transport layer

    DRAW PICTURE:

        layer                             role

        TCP  UDP  ICMP("ping")            {flow control, port space}
        IP                                {forwarding}
        Ethernet                          {framing}
        radio  copper_wires  fiber        {signal propagation}

    --Onboard:
        * Motivation
        * TCP vs UDP
        * port space
        * Congestion control

    --Motivation: failure, demultiplexing, flow control, etc.

        Several types of error can affect packet delivery
            --Bit errors (e.g., electrical interference, cosmic rays)
            --Packet loss (packets dropped when queues fill on overload)
            --Link and node failure

        In addition, properly delivered frames can be delayed,
        reordered, even duplicated

        How much should the OS (or the networking modules) expose to
        the application?
            --Some failures cannot be masked (e.g., server dead)
            --Others can be (e.g., retransmit lost packet)
            --But masking errors may be wrong for some applications
            (e.g., old audio packet no longer interesting if too late to
            play)

    UDP and TCP most popular protocols on IP
        --Both use 16-bit _port_ number as well as 32-bit IP address
        --Applications _bind_ to a port and receive traffic to that port
        (discuss later what the interface is)

    UDP: User Datagram Protocol
        --Exposes packet-switched nature of Internet
        --Sent packets may be dropped, reordered, even duplicated (but
        generally not corrupted).
        Application's problem to deal with these errors

    TCP: Transmission Control Protocol
        --Provides illusion of a reliable "pipe" between two processes
        on two different machines
        --Masks lost and reordered packets so apps don't have to worry
        --Handles congestion and flow control

    Uses of TCP
        --Most applications use TCP
            --Easier interface to program to (reliability)
            --Automatically avoids congestion (don't need to worry about
            taking down network)
        --Example: Interacting with www.cs.nyu.edu
            --Browser resolves IP address of www.cs.nyu.edu
            --Browser connects to TCP port 80 on that IP address
            --Over TCP connection, browser requests and gets home page

5. Application layer

    Servers typically listen on well-known ports
        SSH: 22
        Email: 25
        Finger: 79
        Web / HTTP: 80

    What is the interface to the networking stack?

    --Application programmer classically sees *sockets*. (Inspired by
    pipes)

        Write data on one machine, read it on another

    *sockets* can represent many different network protocols, but:
        --classically an interface to TCP/IP and UDP
        --sometimes an interface to IP or Ethernet (raw sockets)

    --sockets API: DEMO

        /* senders and receivers */

        int sockfd = socket(AF_INET, SOCK_STREAM /* or SOCK_DGRAM */, 0);

            [note: with AF_INET in the first position, the setting of
            SOCK_STREAM vs SOCK_DGRAM controls whether the app's data is
            going to go over TCP or UDP].

            [with UDP sockets, send atomic messages that may be
            reordered or lost]

            [with TCP sockets, bytes written on one end are read on the
            other, provided no failures. but no guarantees that reads
            will return the full amount requested ... or that the data
            will be packetized according to the number of times the
            sender called send(). With TCP, you *must* sit there in a
            loop and keep reading.
            You know you're done because either (a) the
            application-level protocol is expected to understand where
            message boundaries begin and end or (b) the sending machine
            closed its end of the connection]

        int rc = close(sockfd);

        select();  /* for asynchronous network I/O; we won't use this in
                      lab1 */

        /* simplified view; the real sin_addr is a struct in_addr that
           wraps this 32-bit value */
        struct sockaddr_in {
            short sin_family;
            unsigned short sin_port;   /* network byte order */
            uint32_t sin_addr;         /* network byte order */
            char sin_zero[8];
        };

        /* senders */
        int rc = connect(sockfd, (struct sockaddr *)&addr, addrlen);
        int rc = send(sockfd, buf, len, 0);
        int rc = sendto(sockfd, buf, len, 0, (struct sockaddr *)&addr,
                        addrlen);

        /* receivers */
        int rc = bind(sockfd, (struct sockaddr *)&addr, addrlen);
        int rc = listen(sockfd, backlog_len);
        /* accept() returns a new socket for the accepted connection */
        int connfd = accept(sockfd, (struct sockaddr *)&addr, &addrlen);
        int rc = recv(sockfd, buf, len, 0);
        int rc = recvfrom(sockfd, buf, len, 0, (struct sockaddr *)&addr,
                          &addrlen);

    NOTES:

        * connections are named by 5 components: protocol (TCP), local
          IP address, local port, remote IP address, remote port

        * UDP does not require connected sockets

        * OS tracks all of this state in a PCB (protocol control block).

-------------------------------------------------------------------

Reference: poke around with some tools:

    --"ifconfig -a" (Unix)
    --"netstat -arn" (Unix)
    --"dig [hostname]" (Unix)
    --"dig -x [IP address]" (Unix)
    --"ipconfig /all" (Windows)
    --"route print" (Windows?)
    --"arp -a" (Unix)
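
Example: the TCP read loop

    The sockets notes in section 5 say that a TCP receiver must sit in
    a loop and keep reading, because one recv() may return fewer bytes
    than requested and the sender's send() calls do not mark message
    boundaries. The sketch below is one way to write that loop, assuming
    a POSIX system. read_full() and loopback_demo() are hypothetical
    helper names made up for this illustration; they are not part of the
    sockets API or of any lab code. The demo talks to itself over
    127.0.0.1 on a port chosen by the OS, sends the message with two
    send() calls, and relies on case (b) from the notes (the sender
    closes) to end the read.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep calling recv() until len bytes have arrived,
 * the peer closes the connection (recv returns 0), or an error occurs.
 * Returns the number of bytes read, or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t len)
{
    size_t got = 0;
    while (got < len) {
        ssize_t n = recv(fd, (char *)buf + got, len - got, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;
        }
        if (n == 0)
            break;                 /* peer closed its end */
        got += (size_t)n;
    }
    return (ssize_t)got;
}

/* Hypothetical demo: send msg to ourselves over TCP on the loopback
 * interface using two send() calls, then read it back with read_full().
 * Returns the number of bytes received, or -1 on error. */
ssize_t loopback_demo(const char *msg, char *out, size_t outlen)
{
    struct sockaddr_in addr;
    socklen_t addrlen = sizeof(addr);
    size_t len, half;
    ssize_t n = -1;
    int lfd = -1, cfd = -1, sfd = -1;

    assert(msg != NULL && out != NULL);
    len = strlen(msg);
    half = len / 2;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);      /* 0 = let the OS pick a free port */

    lfd = socket(AF_INET, SOCK_STREAM, 0);   /* listening socket */
    cfd = socket(AF_INET, SOCK_STREAM, 0);   /* sender's socket */
    if (lfd < 0 || cfd < 0)
        goto done;
    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(lfd, 1) < 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &addrlen) < 0)
        goto done;                 /* getsockname tells us the port */
    if (connect(cfd, (struct sockaddr *)&addr, addrlen) < 0)
        goto done;
    sfd = accept(lfd, NULL, NULL); /* new socket for this connection */
    if (sfd < 0)
        goto done;

    /* Two send() calls; the receiver just sees one stream of bytes.
     * (A careful sender would also loop on short sends.) */
    if (send(cfd, msg, half, 0) < 0 ||
        send(cfd, msg + half, len - half, 0) < 0)
        goto done;
    close(cfd);                    /* closing signals end of data */
    cfd = -1;

    n = read_full(sfd, out, outlen);
done:
    if (cfd >= 0) close(cfd);
    if (sfd >= 0) close(sfd);
    if (lfd >= 0) close(lfd);
    return n;
}
```

    Calling loopback_demo("hello, world", buf, sizeof(buf)) with a large
    enough buf should hand back the whole message even though it was
    sent in two pieces. A protocol that keeps the connection open would
    instead need case (a): some application-level framing, such as a
    length prefix, to know when a message ends.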