User Datagram Protocol (UDP)

AP.CSP: CSN‑1 (EU), CSN‑1.C (LO), CSN‑1.C.1 (EK), CSN‑1.C.2 (EK), CSN‑1.C.4 (EK)
The User Datagram Protocol (UDP) is a lightweight data transport protocol that works on top of IP.
UDP provides a mechanism to detect corrupt data in packets, but it does not attempt to solve other problems that arise with packets, such as lost or out-of-order packets. That's why UDP is sometimes known as the Unreliable Data Protocol.
UDP is simple but fast, at least in comparison to other protocols that work over IP. It's often used for time-sensitive applications (such as real-time video streaming) where speed is more important than accuracy.
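As a rough illustration of how little setup UDP needs, the sketch below sends a single datagram between two sockets on the same machine using Python's standard `socket` module. The loopback address, the OS-chosen port, the two-second timeout, and the function name `send_one_datagram` are all assumptions made for this example.

```python
import socket

def send_one_datagram(message: bytes) -> bytes:
    """Send one UDP datagram over loopback and return what the receiver got."""
    # SOCK_DGRAM selects UDP; no connection is set up before sending.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        receiver.settimeout(2.0)          # UDP does not guarantee delivery, so don't wait forever
        sender.sendto(message, receiver.getsockname())
        data, _addr = receiver.recvfrom(65535)
        return data

if __name__ == "__main__":
    print(send_one_datagram(b"Hola"))
```

Notice that the sender never checks whether the datagram arrived. With UDP, anything like that is left to the application.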

Packet format

When sending packets using UDP over IP, the data portion of each IP packet is formatted as a UDP segment.
Diagram of a UDP segment within an IP packet. The IP packet contains header and data sections. The IP data section is the UDP segment, which itself contains header and data sections.
Each UDP segment contains an 8-byte header and variable length data.
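The 8-byte header holds four 2-byte fields (source port, destination port, length, and checksum), which the sections below describe one by one. Here is a minimal sketch of splitting a segment into those fields with Python's `struct` module. The function name `parse_udp_header` and the example bytes are made up for illustration.

```python
import struct

def parse_udp_header(segment: bytes) -> dict:
    """Split a UDP segment into its four 16-bit header fields and its data."""
    if len(segment) < 8:
        raise ValueError("a UDP segment needs at least an 8-byte header")
    # "!" = network (big-endian) byte order, "H" = unsigned 16-bit integer
    src_port, dst_port, length, checksum = struct.unpack("!HHHH", segment[:8])
    return {
        "source_port": src_port,
        "destination_port": dst_port,
        "length": length,
        "checksum": checksum,
        "data": segment[8:],
    }

# Hypothetical segment: 8 header bytes followed by the 4 data bytes "Hola"
example = bytes([0x15, 0x09, 0x15, 0x09, 0x00, 0x0C, 0xB4, 0xD0]) + b"Hola"
fields = parse_udp_header(example)
print(fields)
```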

Port numbers

The first four bytes of the UDP header store the port numbers for the source and destination.
A networked device can receive messages on different virtual ports, similar to how an ocean harbor can receive boats on different ports. The different ports help distinguish different types of network traffic.
Here's a listing of some ports in use by UDP on my laptop:
A command line terminal with the command "sudo lsof -i -n -P | grep UDP". The command outputs the following table:
Process         Process ID   Type   Port
launchd         1            IPv4   UDP *:137
launchd         1            IPv4   UDP *:138
syslogd         45           IPv4   UDP *:54465
mDNSResponder   186          IPv4   UDP *:5353
mDNSResponder   186          IPv6   UDP *:5353
mDNSResponder   186          IPv4   UDP *:65327
mDNSResponder   186          IPv6   UDP *:65327
mDNSResponder   186          IPv4   UDP *:55657
mDNSResponder   186          IPv6   UDP *:55657
Google          12306        IPv6   UDP *:5353
Each row starts with the name of the process that's using the port and ends with the protocol and port number.
🔍 What sort of network traffic do those processes handle? If you search the web for the process name plus the port number, you can probably figure it out. You could even try it on the computer you're using now.

Segment Length

The next two bytes of the UDP header store the length (in bytes) of the segment (including the header).
Two bytes is 16 bits, so the length can be as high as this binary number:
1111111111111111
In decimal, that's $(2^{16} - 1)$ or 65,535. Thus, the maximum length of a UDP segment is 65,535 bytes.
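The arithmetic can be checked in a couple of lines of Python. The names `HEADER_BYTES`, `max_segment`, and `max_data` are just labels chosen for this sketch.

```python
HEADER_BYTES = 8                        # fixed size of the UDP header

max_segment = int("1" * 16, 2)          # sixteen 1 bits read as a binary number
assert max_segment == 2**16 - 1 == 65535

# The length counts the header too, so the data can be at most 8 bytes shorter.
max_data = max_segment - HEADER_BYTES
print(max_segment, max_data)
```

So as far as the length field is concerned, a single UDP segment can carry at most 65,527 bytes of data.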

Checksum

The final two bytes of the UDP header store the checksum, a field that's used by the sender and receiver to check for data corruption.
Before sending off the segment, the sender:
  1. Computes the checksum based on the data in the segment.
  2. Stores the computed checksum in the field.
Upon receiving the segment, the recipient:
  1. Computes the checksum based on the received segment.
  2. Compares the computed checksum with the one stored in the field. If the checksums aren't equal, it knows the data was corrupted.
To understand how a checksum can detect corrupted data, let's follow the process to compute a checksum for a very short string of data: "Hola".
First, the sender would encode "Hola" into binary somehow. The following encoding uses the ASCII/UTF-8 encoding:
H       o       l       a
01001000011011110110110001100001
That encoding gives these 4 bytes:
01001000 01101111 01101100 01100001
Next, the sender segments the bytes into 2-byte (16-bit) binary numbers:
0100100001101111
0110110001100001
To compute the checksum, the sender adds up the 16-bit binary numbers:
$$
\begin{aligned}
0100100001101111& \\
\underline{+0110110001100001}& \\
1011010011010000&
\end{aligned}
$$
The computer can now send a UDP segment with the encoded "Hola" as the data and 1011010011010000 as the checksum.
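The simplified checksum above can be sketched in Python as follows. The function name `simple_checksum`, the zero-byte padding for odd-length data, and the masking to 16 bits are assumptions added for this example. The walkthrough itself only covers the 4-byte case, where neither comes into play.

```python
def simple_checksum(data: bytes) -> int:
    """Add the data up as 16-bit numbers (the simplified scheme from this article)."""
    if len(data) % 2:
        data += b"\x00"        # assumption: pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        # Read two bytes at a time as one big-endian 16-bit number
        total += int.from_bytes(data[i:i + 2], "big")
    return total & 0xFFFF      # assumption: keep the low 16 bits so the result fits the field

checksum = simple_checksum("Hola".encode("ascii"))
print(format(checksum, "016b"))  # 1011010011010000
```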
The entire UDP segment could look like this:
Field                     Value
Source port number        00010101 00001001
Destination port number   00010101 00001001
Length                    00000000 00001100
Checksum                  10110100 11010000
Data                      01001000 01101111 01101100 01100001
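To see how the fields line up byte by byte, here is a minimal sketch that assembles the example segment with Python's `struct` module. The helper name `build_segment` is hypothetical. The port 5385 is the decimal reading of 00010101 00001001, and the length is computed as the 8-byte header plus the data, following the definition in the Segment Length section.

```python
import struct

def build_segment(src_port: int, dst_port: int, checksum: int, data: bytes) -> bytes:
    """Pack an 8-byte UDP-style header in front of the data."""
    length = 8 + len(data)   # the length field counts header and data together
    # "!HHHH" = four unsigned 16-bit integers in network (big-endian) byte order
    header = struct.pack("!HHHH", src_port, dst_port, length, checksum)
    return header + data

segment = build_segment(5385, 5385, 0b1011010011010000, b"Hola")

# Print the whole segment one byte at a time, in binary
print(" ".join(format(byte, "08b") for byte in segment))
```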
What if the data got corrupted from "Hola" to "Mola" on the way?
First let's see what the corrupted data would look like in binary.
"Mola" encoded into binary...
M       o       l       a
01001101011011110110110001100001
...and then segmented into 16-bit numbers:
0100110101101111
0110110001100001
Now let's see what checksum the recipient would compute:
$$
\begin{aligned}
0100110101101111& \\
\underline{+0110110001100001}& \\
1011100111010000&
\end{aligned}
$$
The recipient can now programmatically compare the checksum they received in the UDP segment with the checksum they just computed:
  • Received: 1011010011010000
  • Computed: 1011100111010000
Do you see the difference?
When the recipient discovers that the two checksums are different, it knows that the data was corrupted somehow along the way. Unfortunately, the recipient cannot use the computed checksum to reconstruct the original data, so it will likely just discard the packet entirely.
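The recipient's comparison can be sketched like this, again using the simplified checksum from this article. The names `simple_checksum` and `is_corrupted` are assumptions for this example, as is the zero-byte padding for odd-length data.

```python
def simple_checksum(data: bytes) -> int:
    """Sum the data as 16-bit big-endian numbers, keeping the low 16 bits."""
    if len(data) % 2:
        data += b"\x00"   # assumption: pad odd-length data with a zero byte
    words = (int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2))
    return sum(words) & 0xFFFF

def is_corrupted(received_data: bytes, received_checksum: int) -> bool:
    """Recompute the checksum on the received data and compare it to the one sent."""
    return simple_checksum(received_data) != received_checksum

sent_checksum = simple_checksum(b"Hola")          # what the sender stored in the header

print(is_corrupted(b"Hola", sent_checksum))       # False: data arrived intact
print(is_corrupted(b"Mola", sent_checksum))       # True: checksums differ
print(is_corrupted(b"laHo", sent_checksum))       # False: swapped 16-bit words add up the same
```

The last line shows a limit of this kind of check: some changes, such as two 16-bit words trading places, leave the sum unchanged and go unnoticed.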
The actual UDP checksum computation process includes a few more steps than shown here, but this is the general process of how we can use checksums to detect corrupted data.
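For the curious, here is a hedged sketch of one of those extra steps. As I understand RFC 768 and RFC 1071, the 16-bit numbers are added using ones' complement arithmetic (any carry out of the top bit is wrapped back around), and the final sum is inverted. The real computation also covers the UDP header and a "pseudo-header" of IP fields, which this sketch leaves out. The function name `ones_complement_checksum` is made up for this example.

```python
def ones_complement_checksum(data: bytes) -> int:
    """Inverted ones' complement sum of the data, taken 16 bits at a time."""
    if len(data) % 2:
        data += b"\x00"                              # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += int.from_bytes(data[i:i + 2], "big")
        total = (total & 0xFFFF) + (total >> 16)     # wrap any carry back into the low 16 bits
    return ~total & 0xFFFF                           # flip all 16 bits

print(format(ones_complement_checksum(b"Hola"), "016b"))  # 0100101100101111
```

With this scheme, adding the checksum to the sum of the data gives sixteen 1 bits, which is what a recipient can check for.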

🙋🏽🙋🏻‍♀️🙋🏿‍♂️Do you have any questions about this topic? We'd love to answer—just ask in the questions area below!

Want to join the conversation?

  • Yizuhi Galaviz
    UDP doesn't do anything about packets arriving out of order, right? Is that why sometimes, in live streaming, the audio and video are not synchronized? Or why the live streams sometimes lag?
    (35 votes)
    • Martin
      Yes, UDP does without handshakes. That means the information received is somewhat unreliable when it comes to ordering, duplicates and packets arriving at all.
      And you correctly identified the problems that arise with that :)
      (33 votes)
  • layaz7717
    What might cause data to become corrupted? Also, when you get the notification that a file is corrupted, does it have the same meaning? The data has gotten messed up somehow?
    (8 votes)
    • Shane McGookey
      It might be helpful to consider what "data" is. In this context, we are talking about data that is transmitted over a network. If you are using a personal computer, then the process of transmitting data over a network involves transitioning data from your HDD or SSD to your computer's RAM via internal buses, and then to the NIC (Network Interface Card) to communicate it across a network. The data (which is represented using binary) must then traverse the connection between your device and the device you are sending data to. This traversal could entail moving the data over a WiFi network, over Ethernet cables, etc.

      This is an extensive process, and we certainly take for granted its complexity when we interact with the Internet each and every day. If during this process any bits of the data (again, data is being represented as binary; 1s and 0s) were to flip (go from 0 to 1, or 1 to 0) or were to be lost then the data received by the receiving machine wouldn't be the same as what was originally sent. In this case, the machine may not be able to interpret the data anymore - as it has lost its meaning - and so you end up with data corruption.

      To your latter question, yes. When information saved to the machine's non-volatile storage has been saved incorrectly and has thus lost its meaning, then the computer will notify you that the data has been corrupted.
      (17 votes)
  • layaz7717
    What exactly happens when videos start to glitch? For example, in Zoom meetings, sometimes people tend to look all "blocky" and the details in their video are not defined. Is that something to do with data corruption?
    (6 votes)
    • Martin
      Yes, that generally indicates issues with packets, a lot of packets actually. A single dropped packet can easily be dealt with because of error-correcting encodings; generally it will just cause a little blurriness, wobbly sound, or something similarly barely noticeable.
      (7 votes)
  • green_ninja
    Hi!

    I'm having trouble with adding binary numbers. I watched a few of Khan Academy's videos on YouTube, but I don't understand the mathematical reasoning behind it. Can someone please explain?

    Thanks!
    (4 votes)
    • Aryan Metha
      In the decimal number system, each place has a value in powers of 10. In binary, each place has a value in powers of 2.

      What I do while adding binary numbers is convert them to decimal and then perform operations on them. If the number is too large, use a computer program to do the converting.
      (0 votes)
  • Neev Badu
    (2 votes)
    • pamela ❤
      Good question! The 4 bytes is the width of the header. Together, the source port number and destination port number in the first row take up 4 bytes. Since they're shown equal sized, each of them takes up 2 bytes (16 bits). Similarly, the segment length and checksum together take up 4 bytes, and each takes up 2 bytes.
      (7 votes)
  • Dhairya Patel
    Is it possible for the data in the checksum (the two bytes) to be corrupt?
    (2 votes)
  • Aland Soran
    Are the terms package, packet and segment referring to the same thing? If not, could you please define each one? Thank you.
    (1 vote)
    • KLaudano
      "Package" is an informal term that people seem to use in place of "packet".

      Segments, packets, and frames are created at different layers in the OSI model and each adds its own header to the data with more information. Segments are created at the transport layer and include port numbers. Packets are created at the network layer from segments and have IP addresses. Frames are created at the data link layer from packets and have MAC addresses.

      (I know you didn't ask about frames, but for the sake of completeness I felt it should be added.)
      (6 votes)
  • John Schur
    I'm sure this has come up before, but doesn't it seem possible to reconstruct a packet that is deemed corrupt by reverse-resolving for the missing or corrupted segment of the packet?
    (0 votes)
    • Abhishek Shah
      It is possible! The relevant term is error correction. There's generally a tradeoff between size of message (increase the space resource) and "reverse-resolving" (increase the time resource). So instead of reverse-resolving, it's better to resend or vice-versa. TCP usually does error detection (not correction), but computer memory often uses ECC (error-correcting codes) for this exact purpose.

      For instance,
      " he d g barks" could be error corrected to "the dog barks". Our brains perform this process frequently.

      Hope this helps!
      (7 votes)
  • Andrei
    If the message can be corrupted can't the checksum number also be? Then the message would be discarded even though the message might be completely fine.
    (3 votes)
  • layaz7717
    How long does it take for data to transport? I know that this will obviously depend on distance, but is there a general time frame? It's kind of hard to believe that we can watch live streams in which the video and audio are only a second or two behind the actual happenings of the game, etc.
    (2 votes)
    • Martin
      As incredible as it sounds it's milliseconds. I don't know if you heard about ping (playing online computer games without worrying about ping used to be almost unimaginable, but things might have changed with faster internet), but playing a computer game with a ping of 300 (0.3 seconds for communication between server and computer) is almost impossible when playing something that requires you to respond quickly to your environment.
      You can also open a terminal on your computer (on Windows, right-click Start and click Windows PowerShell; on Linux, press CTRL+ALT+T) and type something like

      ping google.com //or any other location really, but don't overdo that

      to see how long it takes packets to travel to a target location.
      (2 votes)