Packets, routers, and reliability
Spotify engineer Lynn Root and Vint Cerf, Father of the Internet, explain how information on the Internet is broken down into packets, and how to ensure that information is reliably received.
- When something is being streamed, does that mean that not all the information is checked by TCP at one time but rather whenever the packets arrive? Also, when the streamed product begins to buffer, does that mean that there is a lag in information received or a buildup? Is there any way to stop that from happening?(21 votes)
- TCP checks packets as they arrive, whether you are downloading a file or streaming. It makes sure that each packet received is uncorrupted and in order, then sends an acknowledgement naming the next segment it expects. These acknowledgements are cumulative: each one says that everything up to that segment number arrived okay.
A TCP application's buffer is like a pool with a garden hose and a drain. The water is the data. When you use the data, you open up the drain of the pool. If you receive data from TCP, you add water to the pool with the garden hose. If you run out of water, you have no data to watch your movie etc. When the water level in the pool gets too low you need to buffer. All buffering does is close the drain for a while, so that the water level in the pool can go back up to the top. It only buys you time before the water level gets low again and you have to do it again. The way to stop it is to use a smaller drain (use up less data by choosing a smaller frame rate, lower resolution, etc.) or get a bigger hose (better internet connection).
Hope this makes sense(80 votes)
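For anyone who wants to play with the pool analogy, here is a rough Python sketch of it. The function name, the tick-based timing, and the `start_threshold` parameter are all made up for illustration; real players use more elaborate rules.

```python
def simulate_playback(arrival_rates, drain_rate, start_threshold):
    """Return one state per tick: "playing" or "buffering".

    arrival_rates: units of data received on each tick (the hose)
    drain_rate: units of data the player uses per tick while playing (the drain)
    start_threshold: buffered units required before playback (re)starts
    """
    buffered = 0
    playing = False
    states = []
    for arrived in arrival_rates:
        buffered += arrived                      # the hose adds water
        if not playing and buffered >= start_threshold:
            playing = True                       # pool is full enough: open the drain
        if playing and buffered < drain_rate:
            playing = False                      # pool too low: close the drain
        if playing:
            buffered -= drain_rate               # the drain removes water
        states.append("playing" if playing else "buffering")
    return states


# Hose slower than the drain: playback starts, then stalls again.
print(simulate_playback([2, 2, 2, 2], drain_rate=3, start_threshold=4))
# Smaller drain (lower quality): playback keeps going once it starts.
print(simulate_playback([2, 2, 2, 2], drain_rate=2, start_threshold=4))
```

The first call stalls on the last tick because data is used faster than it arrives. The second call never stalls after starting, which is the "smaller drain" fix from the answer.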
- Does the TCP/IP protocol exist in my browser or in my computer? In what language is it written?(18 votes)
- TCP and IP are almost always implemented in your operating system.
Programs that use the Internet, such as your browser, Skype, or your antivirus, normally don't contain their own copy. They ask the operating system to send and receive data for them.
The implementations can be written in many different languages.(26 votes)
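As a rough illustration of a program leaning on the operating system for TCP, here is a small Python sketch using the standard `socket` module. It assumes loopback networking (127.0.0.1) is available on your machine, and `loopback_echo` is just a name invented for this example.

```python
import socket


def loopback_echo(message: bytes) -> bytes:
    """Send message over a TCP connection on this machine and return what arrived."""
    # create_server and create_connection ask the operating system for TCP
    # sockets; the OS handles segments, acknowledgements, and resending.
    with socket.create_server(("127.0.0.1", 0)) as server:   # port 0 = any free port
        host, port = server.getsockname()
        with socket.create_connection((host, port)) as client:
            conn, _addr = server.accept()
            with conn:
                client.sendall(message)
                client.shutdown(socket.SHUT_WR)   # tell the other side we are done sending
                received = b""
                while chunk := conn.recv(4096):   # recv returns b"" once the sender is done
                    received += chunk
                return received


print(loopback_echo(b"hello from the application"))
```

Notice that the program never mentions sequence numbers or acknowledgements. It only hands bytes to the socket and reads bytes back.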
- Does sending information in packets make it easier or harder for hackers to see what is being sent?(9 votes)
- Each packet can be encrypted but if the packets are sent "in the clear" and a hacker is using a packet sniffer, yes, a hacker can intercept, acquire, and reassemble a message.(18 votes)
- If some data is lost during transmission, where will it go?(8 votes)
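- The data is simply discarded. An overloaded router drops packets it has no room to queue, and a packet that arrives corrupted is thrown away. Nothing is stored anywhere, so if the data still matters, the sender has to send a fresh copy.(16 votes)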
- So does it mean billions of other people are using my router?(5 votes)
- More like you are able to communicate with billions of other people via your router. Your router is just an access point for sending/receiving data.(11 votes)
- If TCP is responsible to check the completeness of received data, where does it get the check list from? What if the check list is incomplete, too? Maybe TCP doesn't work in this way, someone please explain.(1 vote)
- It doesn't really use a check list.
What really happens is each data segment has a sequence number.
When the receiver receives the segment they then send back an acknowledgement number which indicates the start of next data segment they should receive.
Here's what happens when things go as planned:
Alice sends Bob segment 79 containing 10 bytes
Bob receives it okay and responds with acknowledgement number 89 (79+10)
Here's what happens when segments are received out of order:
Alice sends Bob segment 89 containing 20 bytes
It gets lost
Alice sends Bob segment 109 containing 15 bytes
Bob receives this segment okay, but is still expecting segment 89, so Bob sends back acknowledgement number 89
Here's what happens when no acknowledgement is received:
Alice sends segment 89 again.
It somehow gets lost too.
Alice waits for an acknowledgment from Bob, but doesn't receive one.
Alice then sends segment 89 again.
Bob receives this segment and sends back an acknowledgement number 124 (109+15)
(Because Bob already received the segment starting at 109)
Hope this makes sense(10 votes)
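The Alice and Bob walkthrough above can be sketched in Python. This is a toy model with a made-up `Receiver` class, not how a real TCP stack is written, but it reproduces the same acknowledgement numbers.

```python
class Receiver:
    """Toy TCP-style receiver that tracks bytes by sequence number."""

    def __init__(self, next_expected):
        self.next_expected = next_expected   # first byte number not yet received in order
        self.out_of_order = {}               # seq -> length, for segments that arrived early

    def receive(self, seq, length):
        """Accept a segment and return the cumulative acknowledgement number."""
        if seq == self.next_expected:
            self.next_expected += length
            # Absorb any early segments that are now contiguous.
            while self.next_expected in self.out_of_order:
                self.next_expected += self.out_of_order.pop(self.next_expected)
        elif seq > self.next_expected:
            self.out_of_order[seq] = length  # hold it until the gap is filled
        # A seq below next_expected is a duplicate; just re-acknowledge.
        return self.next_expected


bob = Receiver(next_expected=79)
print(bob.receive(79, 10))    # segment 79 arrives in order
print(bob.receive(109, 15))   # segment 89 was lost, so 109 arrives early
print(bob.receive(89, 20))    # the resent segment 89 fills the gap
```

The three calls return 89, 89, and 124, matching the example: the last acknowledgement jumps straight to 124 because Bob was already holding the segment that starts at 109.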
- If all data can be broken up into smaller packets then why are there limits on file sizes that can be emailed?(2 votes)
- The data has to be stored on a server that belongs to your email provider, and servers cost money! So email providers limit file sizes so they don't run out of storage space. Bandwidth is also expensive.(7 votes)
- Say Computer #1 requests some data, and Computer #2 requests data at the exact same time. Is it possible for a packet to hold data for more than one computer (like FedEx delivers for tons of different people)?(1 vote)
- Typically, a packet only has a destination address for a single computer, which tells the routers where to send the data. However, one can use a broadcast address (https://en.wikipedia.org/wiki/Broadcast_address) which asks the routers to send to everyone within a certain part of the network. Broadcast addresses are typically only used by the network to configure and update itself, so they are not usually used to send data to computers requesting data.
An example of where broadcast addresses are used for evil purposes is a Smurf attack (https://en.wikipedia.org/wiki/Smurf_attack).
Hope this makes sense(7 votes)
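If you are curious what a broadcast address looks like, Python's standard `ipaddress` module can compute it for an IPv4 network. The helper name `broadcast_for` and the example networks are just picked for illustration.

```python
import ipaddress


def broadcast_for(cidr: str) -> str:
    """Return the broadcast address of an IPv4 network written in CIDR notation."""
    return str(ipaddress.ip_network(cidr).broadcast_address)


print(broadcast_for("192.168.1.0/24"))   # 192.168.1.255
print(broadcast_for("10.0.0.0/8"))       # 10.255.255.255
```

The broadcast address is the last address in the network's range, which is why it depends on how large that part of the network is.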
- Is HTTP another protocol than TCP, and are they doing different things to check the packets?(2 votes)
- HTTP and TCP are different protocols, at different levels of the network. HTTP is an application protocol, while TCP is a transport protocol. What this means is that they are both used on the same packet, for different things. TCP (Transmission Control Protocol) is used to move the packet from one end system to another (for instance, from KhanAcademy to your computer). HTTP (HyperText Transfer Protocol) is used to translate the data received through TCP into the website you actually see.(4 votes)
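To make the split between the two layers a bit more concrete, here is a small Python sketch. It assumes TCP has already delivered the payloads in order, and the function name and the sample response are invented for this example. The HTTP part only starts once the bytes are joined back together.

```python
def parse_http_response(segments):
    """Join in-order TCP payloads and split the result into HTTP parts."""
    stream = b"".join(segments)                     # what TCP hands to the application
    head, _, body = stream.partition(b"\r\n\r\n")   # a blank line ends the headers
    status_line, *header_lines = head.decode("ascii").split("\r\n")
    headers = dict(line.split(": ", 1) for line in header_lines)
    return status_line, headers, body


# The segment boundaries fall in arbitrary places; HTTP never sees them.
segments = [
    b"HTTP/1.1 200 OK\r\nContent-",
    b"Type: text/html\r\n\r\n<h1>Hi",
    b"</h1>",
]
status, headers, body = parse_http_response(segments)
print(status)    # HTTP/1.1 200 OK
print(headers)   # {'Content-Type': 'text/html'}
print(body)      # b'<h1>Hi</h1>'
```

TCP's job ends at producing the joined byte stream. Interpreting the status line, the headers, and the page content is HTTP's job.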
- Wouldn't streaming a song use UDP and not TCP? I was under the impression streaming uses UDP.(2 votes)
- You can use either. The advantage of streaming music over TCP is that you don't have to worry about lost packets; the disadvantage is the lower speed.
If you want something 100% perfect, TCP is generally the way to go; UDP is for cases where you don't mind an occasional loss of quality.(3 votes)
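Here is a very simplified Python sketch of that trade-off. It is a toy model rather than real TCP or UDP: the `deliver` function is made up, the lost packets are chosen by hand, and a resent packet is assumed to arrive on the second try.

```python
def deliver(packets, lost_on_first_try, retransmit):
    """Return (what the listener ends up with, number of transmissions).

    lost_on_first_try: set of valid packet indexes dropped on their first transmission
    retransmit: True mimics TCP (resend what is missing), False mimics UDP
    """
    received = {}
    transmissions = len(packets)
    for index, packet in enumerate(packets):
        if index not in lost_on_first_try:
            received[index] = packet
    if retransmit:
        for index in lost_on_first_try:
            received[index] = packets[index]   # second attempt assumed to succeed
            transmissions += 1
    # None marks a gap the listener would hear as a glitch.
    return [received.get(index) for index in range(len(packets))], transmissions


print(deliver(["a", "b", "c"], {1}, retransmit=False))  # (['a', None, 'c'], 3)
print(deliver(["a", "b", "c"], {1}, retransmit=True))   # (['a', 'b', 'c'], 4)
```

Without resending, there is a gap but no extra work. With resending, everything arrives at the cost of an extra transmission and the wait that goes with it.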
- [Voiceover] Seven, six, five, four, three, two, one. (funky music)
- Hi, my name is Lynn Root. I am a Software Engineer here at Spotify, and I will be the first to admit that I often take for granted the reliability of the Internet. The sheer amount of information zooming around the Internet is astonishing. But how's it possible for every piece of data to be delivered to you reliably? Say you want to play a song from Spotify. It seems like your computer connects directly to Spotify's servers, and Spotify sends you a song on a direct, dedicated line. But actually, that's not how the Internet works. If the Internet were made of direct, dedicated connections, it would be impossible to keep things working as millions of users join, especially since there is no guarantee that every wire and computer is working all the time. Instead, data travels on the Internet in a much less direct fashion.
- Many, many years ago, in the early 1970s, my partner, Bob Kahn, and I began working on the design of what we now call the Internet. Bob and I had the responsibility and the opportunity to design the Internet's protocols and its architecture. So, we persisted in participating in the Internet's growth and evolution for all of this time, up to, and including, the present. The way information gets transferred from one computer to another is pretty interesting. It need not follow a fixed path. In fact, your path may change in the midst of a computer-to-computer conversation. Information on the Internet goes from one computer to another in what we call a packet of information, and a packet travels from one place to another on the Internet a lot like how you might get from one place to another in a car. Depending on traffic congestion or road conditions, you might choose or be forced to take a different route to get to the same place each time you travel.
And just as you can transport all sorts of stuff inside a car, many kinds of digital information can be sent with IP packets, but there are some limits. What if, for example, you need to move a space shuttle from where it was built to where it will be launched? A shuttle won't fit in one truck, so it needs to be broken down into pieces, transported using a fleet of trucks. They could all take different routes, and might get to the destination at different times, but once all the pieces are there, you can reassemble the pieces into the complete shuttle, and it'll be ready for launch. On the Internet, the details work similarly. If you have a very large image that you want to send to a friend or upload to a website, that image might be made up of tens of billions of bits of ones and zeroes, too many to send along in one packet. Since it's data on a computer, the computer sending the image can quickly break it into hundreds or even thousands of smaller parts called packets. Unlike cars or trucks, these packets don't have drivers, and they don't choose their route. Each packet has the internet address of where it came from and where it's going. Special computers on the Internet, called routers, act like traffic managers to keep the packets moving through the networks smoothly. If one route is congested, individual packets may travel different routes through the Internet, and they may arrive at the destination at slightly different times, or even out of order.
- So, let's talk about how this works. As part of the Internet Protocol, every router keeps track of multiple paths for sending packets, and it chooses the cheapest available path for each piece of data, based on destination IP address for the packet. "Cheapest," in this case, doesn't mean cost, but time and non-technical factors such as politics and relationships between companies. Often the best route for data to travel isn't necessarily the most direct.
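One way to picture "choosing the cheapest available path" is the Python sketch below. It uses Dijkstra's shortest-path algorithm, which is swapped in here purely for illustration; the video does not say which algorithm routers use, and the router names and link costs are invented.

```python
import heapq


def cheapest_path(links, source, destination):
    """Return (total_cost, path) for the lowest-cost route, or None if unreachable.

    links: dict mapping each router to a dict of {neighbor: cost}
    """
    queue = [(0, source, [source])]      # (cost so far, current router, path taken)
    visited = set()
    while queue:
        cost, router, path = heapq.heappop(queue)   # always expand the cheapest option
        if router == destination:
            return cost, path
        if router in visited:
            continue
        visited.add(router)
        for neighbor, link_cost in links.get(router, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + link_cost, neighbor, path + [neighbor]))
    return None


links = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 1, "D": 7},
    "C": {"A": 5, "B": 1, "D": 1},
    "D": {"B": 7, "C": 1},
}
print(cheapest_path(links, "A", "D"))   # (3, ['A', 'B', 'C', 'D'])

# If router C fails, another route still exists, just at a higher cost.
without_c = {"A": {"B": 1}, "B": {"A": 1, "D": 7}, "D": {"B": 7}}
print(cheapest_path(without_c, "A", "D"))   # (8, ['A', 'B', 'D'])
```

The cheapest route goes through three links rather than the two-link route via B alone, which matches the point that the best route isn't necessarily the most direct.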
Having options for paths makes the network fault tolerant, which means the network can keep sending packets, even if something goes horribly, horribly wrong. This is the basis for a key principle of the Internet, reliability. Now, what if you want to request some data and not everything is delivered? Say you want to listen to a song. How can you be 100% sure all the data will be delivered so the song plays perfectly? Introducing your new best friend, TCP, Transmission Control Protocol. TCP manages the sending and receiving of all your data as packets. Think of it like a guaranteed mail service. When you request a song on your device, Spotify sends the song broken up into many packets. When your packets arrive, TCP does a full inventory and sends back acknowledgements of each packet received. If all packets are there, TCP signs for your delivery, and you're done. (upbeat music) If TCP finds some packets are missing, it won't sign. Otherwise, your song wouldn't sound as good, or portions of the song could be missing. For each missing or incomplete packet, Spotify will resend them. Once TCP verifies the delivery of many packets for that one song request, your song will start to play. (upbeat music) What's great about the TCP and router systems is they're scalable. They can work with eight devices or 8,000,000,000 devices. In fact, because of these principles of fault tolerance and redundancy, the more routers we add, the more reliable the Internet becomes. What's also great is we can grow and scale the Internet without interrupting service for anybody using it.
- The Internet is made of hundreds of thousands of networks and billions of computers and devices connected physically. These different systems that make up the Internet connect to each other, communicate with each other, and work together because of agreed-upon standards for how data is sent around on the Internet.
Computing devices, or routers along the Internet, help all the packets make their way to the destination where they're reassembled, if necessary, in order. This happens billions of times a day, whether you and others are sending an email, visiting a webpage, doing a video chat, using a mobile app, or when sensors or devices on the Internet talk to each other.
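The splitting and reassembly described in the transcript can be sketched in a few lines of Python. This is a simplified model with invented function names: real IP packets also carry headers with source and destination addresses, which are left out here.

```python
import random


def to_packets(data: bytes, size: int):
    """Split data into (sequence_number, chunk) pairs of at most `size` bytes."""
    return [(offset, data[offset:offset + size]) for offset in range(0, len(data), size)]


def reassemble(packets):
    """Put packets back in order by sequence number and join their chunks."""
    return b"".join(chunk for _seq, chunk in sorted(packets))


song = b"la la la, this stands in for a song"
packets = to_packets(song, size=8)
random.shuffle(packets)                 # packets may arrive out of order
print(reassemble(packets) == song)      # True
```

Because each packet carries its own position, the order of arrival does not matter as long as every packet eventually shows up.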