Phylogenetic trees

What a phylogenetic tree is. How to read phylogenetic trees and determine which species are most related.

Key points:

  • A phylogenetic tree is a diagram that represents evolutionary relationships among organisms. Phylogenetic trees are hypotheses, not definitive facts.
  • The pattern of branching in a phylogenetic tree reflects how species or other groups evolved from a series of common ancestors.
  • In trees, two species are more related if they have a more recent common ancestor and less related if they have a less recent common ancestor.
  • Phylogenetic trees can be drawn in various equivalent styles. Rotating a tree about its branch points doesn't change the information it carries.

Introduction

Humans as a group are big on organizing things. Not necessarily things like closets or rooms; I personally score low on the organization front for both of those things. Instead, people often like to group and order the things they see in the world around them. Starting with the Greek philosopher Aristotle, this desire to classify has extended to the many and diverse living things of Earth.
Most modern systems of classification are based on evolutionary relationships among organisms – that is, on the organisms’ phylogeny. Classification systems based on phylogeny organize species or other groups in ways that reflect our understanding of how they evolved from their common ancestors.
In this article, we'll take a look at phylogenetic trees, diagrams that represent evolutionary relationships among organisms. We'll see exactly what we can (and can't!) infer from a phylogenetic tree, as well as what it means for organisms to be more or less related in the context of these trees.

Anatomy of a phylogenetic tree

When we draw a phylogenetic tree, we are representing our best hypothesis about how a set of species (or other groups) evolved from a common ancestor [1]. As we'll explore further in the article on building trees, this hypothesis is based on information we’ve collected about our set of species – things like their physical features and the DNA sequences of their genes.
In a phylogenetic tree, the species or groups of interest are found at the tips of lines referred to as the tree's branches. For example, the phylogenetic tree below represents relationships between five species, A, B, C, D, and E, which are positioned at the ends of the branches:
A horizontal phylogenetic tree. Ancestors are on the left with present-day species on the right. Branches are designated as a straight line, and branch points are shown as a vertical line connecting two branches. Five species of interest are designated by letters A, B, C, D, and E labeled branches on the right side of the tree. The tree begins with one line on the left, which branches (at a branch point) to form species E and another line that branches into a pair of double branches. From there, the first pair of double branches forms species A and B, while the second pair of double branches forms species C and D.
Image modified from Taxonomy and phylogeny: Figure 2 by Robert Bear et al., CC BY 4.0
The pattern in which the branches connect represents our understanding of how the species in the tree evolved from a series of common ancestors. Each branch point (also called an internal node) represents a divergence event, or splitting apart of a single group into two descendant groups.
At each branch point lies the most recent common ancestor of all the groups descended from that branch point. For instance, at the branch point giving rise to species A and B, we would find the most recent common ancestor of those two species. At the branch point right above the root of the tree, we would find the most recent common ancestor of all the species in the tree (A, B, C, D, E).
A horizontal phylogenetic tree. Ancestors are on the left with present-day species on the right. Branches are designated as a straight line, and branch points are shown as a vertical line connecting two branches. Five species of interest are designated by letters A, B, C, D, and E labeled branches on the right side of the tree. The tree begins with one line on the left, labeled root, which branches (at a branch point) to form species E and another line that branches into a pair of double branches. From there, the first pair of double branches forms species A and B, while the second pair of double branches forms species C and D. The first branch point in the tree is labeled Most recent common ancestor of A, B, C, D, and E. The point where the tree branches into species A and B is labeled Most recent common ancestor of A and B.
Image modified from Taxonomy and phylogeny: Figure 2 by Robert Bear et al., CC BY 4.0
Each horizontal line in our tree represents a series of ancestors, leading up to the species at its end. For instance, the line leading up to species E represents the species' ancestors since it diverged from the other species in the tree. Similarly, the root represents a series of ancestors leading up to the most recent common ancestor of all the species in the tree.
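To make the idea of a branch point concrete, here is a minimal sketch of the five-species tree as a data structure. The encoding is an assumption made for illustration (a tip is a string, a branch point is a tuple of the lineages that descend from it), and the helper names `tips` and `mrca` are invented here rather than taken from any library.

```python
# Hypothetical encoding of the five-species tree described above:
# a tip is a string, a branch point is a tuple of its descendant lineages.
TREE = ((("A", "B"), ("C", "D")), "E")


def tips(node):
    """Return the set of species found at the tips below this node."""
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= tips(child)
    return result


def mrca(node, species):
    """Return the deepest node whose descendants include all of `species`.

    Returns None if some of the species are not below `node`.
    """
    species = set(species)
    if not species <= tips(node):
        return None
    if isinstance(node, str):
        return node
    for child in node:
        found = mrca(child, species)
        if found is not None:
            return found  # a deeper branch point already covers everything
    return node  # no single child covers all species, so this is the MRCA


print(mrca(TREE, {"A", "B"}))                   # ('A', 'B')
print(mrca(TREE, {"A", "B", "C", "D", "E"}) == TREE)  # True
```

The branch point returned for A and B is the small one that joins just those two tips, while asking for all five species returns the first branch point of the whole tree, matching the labels in the figure.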

Which species are more related?

In a phylogenetic tree, the relatedness of two species has a very specific meaning. Two species are more related if they have a more recent common ancestor, and less related if they have a less recent common ancestor.
We can use a pretty straightforward method to find the most recent common ancestor of any pair or group of species. In this method, we start at the branch ends carrying the two species of interest and “walk backwards” in the tree until we find the point where the species’ lines converge.
For instance, suppose that we wanted to say whether A and B or B and C are more closely related. To do so, we would follow the lines of both pairs of species backward in the tree. Since A and B converge at a common ancestor first as we move backwards, and B only converges with C after its junction point with A, we can say that A and B are more related than B and C.
A horizontal phylogenetic tree. Ancestors are on the left with present-day species on the right. Branches are designated as a straight line, and branch points are shown as a vertical line connecting two branches. Five species of interest are designated by letters A, B, C, D, and E labeled branches on the right side of the tree. The tree begins with one line on the left, which branches (at a branch point) to form species E and another line that branches into a pair of double branches. From there, the first pair of double branches forms species A and B, while the second pair of double branches forms species C and D. The point where the tree branches into species A and B is labeled Most recent common ancestor of A and B. A pink line traces from species A back to this point, while a red line traces from species B back to this point. The point where the tree branches into a pair of double branches is labeled Most recent common ancestor of B and C. A red line traces from species B back to this point, while a lavender line traces from species C back to this point.
Image modified from Taxonomy and phylogeny: Figure 2 by Robert Bear et al., CC BY 4.0
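The “walk backwards” method can be sketched in a few lines of code. This is only an illustration under some assumptions: the tree is stored as a map from each node to its parent, and the branch point names ("AB", "CD", "ABCD", "ABCDE") are labels invented for this sketch.

```python
# Hypothetical parent map for the five-species tree. Internal node names
# are made up for this sketch; "ABCDE" is the first branch point of the tree.
PARENT = {
    "A": "AB", "B": "AB",
    "C": "CD", "D": "CD",
    "AB": "ABCD", "CD": "ABCD",
    "ABCD": "ABCDE", "E": "ABCDE",
}


def path_to_root(node):
    """Walk backwards from a node, listing every node passed on the way."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path


def mrca(x, y):
    """Return the first node where the backward paths of x and y converge."""
    ancestors_of_x = set(path_to_root(x))
    for node in path_to_root(y):
        if node in ancestors_of_x:
            return node
    return None  # only reached if x and y are not in the same tree


print(path_to_root("B"))  # ['B', 'AB', 'ABCD', 'ABCDE']
print(mrca("A", "B"))     # AB
print(mrca("B", "C"))     # ABCD
```

Walking back from B, we pass "AB" before we reach "ABCD". That ordering along B's own line is what lets us say A and B are more related than B and C.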
Importantly, there are some species whose relatedness we can't compare using this method. For instance, we can't say whether A and B are more closely related than C and D. That’s because, by default, the horizontal axis of the tree doesn't represent time in a direct way. So, we can only compare the timing of branching events that occur on the same lineage (same direct line from the root of the tree), and not those that occur on different lineages.

Some tips for reading phylogenetic trees

You may see phylogenetic trees drawn in many different formats. Some are blocky, like the tree at left below. Others use diagonal lines, like the tree at right below. You may also see trees of either kind oriented vertically or flipped on their sides, as shown for the blocky tree.
Three examples of phylogenetic trees. The first tree is horizontal with a root on the left and branches forming 90 degree angles before forming species on the right hand side. The second tree is similar to the first tree, but it is vertical, with the root at the bottom and the species listed at the top. The third tree is similar to the second tree, but the branches are diagonal.
Image modified from Taxonomy and phylogeny: Figure 2 by Robert Bear et al., CC BY 4.0
The three trees above represent identical relationships among species A, B, C, D, and E. You may want to take a moment to convince yourself that this is really the case – that is, that the branching patterns and the relative recency of common ancestors are the same in all three trees. The identical information in these different-looking trees reminds us that it's the branching pattern (and not the lengths of branches) that's meaningful in a typical tree.
Another critical point about these trees is that if you rotate the structures, using one of the branch points as a pivot, you don’t change the relationships. So, just like the three trees above, which show the same relationships even though they are formatted differently, all of the trees below show the same relationships among four species:
A series of vertical phylogenetic trees showing that phylogenetic trees can be rotated at a branch, resulting in the same tree. Numbers will be assigned to the branches for description, but do not appear in the diagram. The first tree begins with a root at the bottom, which splits into branches 1 and 2. Branch 1 becomes species W. Branch 2 splits into branches 3 and 4. Branch 3 becomes species X. Branch 4 splits into branches 5 and 6, which becomes species Y and Z respectively. The species are listed from left to right as W, X, Y, and Z. If branch 2 is rotated horizontally 180 degrees it results in the same tree, although the species are listed from left to right as W, Z, Y, and X. At the bottom of the diagram four equal trees are shown, but with rotations occurring at different branches.
Image modified from Taxonomy and phylogeny: Figure 3 by Robert Bear et al., CC BY 4.0
If you don’t see right away how that is true (and I didn’t, on first read!), just concentrate on the relationships and the branch points rather than on the ordering of species (W, X, Y, and Z) across the tops of the diagrams. That ordering actually doesn’t give us useful information. Instead, it’s the branch structure of each diagram that tells us what we need to understand the tree.
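If it helps, here is a small sketch that checks this with code. It assumes the same nested-tuple encoding used for illustration (a tip is a string, a branch point is a tuple), and it treats two rooted trees as equivalent when they contain exactly the same groups of descendants, ignoring the left-to-right order of the tips. The function names are invented for this sketch.

```python
def tip_set(node):
    """Return the species below a node as an order-free frozenset."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(tip_set(child) for child in node))


def clades(node):
    """Return the set of descendant groups, one for every node in the tree."""
    result = {tip_set(node)}
    if not isinstance(node, str):
        for child in node:
            result |= clades(child)
    return result


def same_relationships(tree1, tree2):
    """True if two rooted trees have identical branching patterns."""
    return clades(tree1) == clades(tree2)


original = ("W", ("X", ("Y", "Z")))   # tips read W, X, Y, Z
rotated = ("W", (("Z", "Y"), "X"))    # tips read W, Z, Y, X
different = ("W", ("Y", ("X", "Z")))  # a genuinely different branching pattern

print(same_relationships(original, rotated))    # True
print(same_relationships(original, different))  # False
```

The rotated tree lists its tips in a different order, but every branch point still has the same descendants, so the comparison reports the two trees as equivalent.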
So far, all the trees we've looked at have had nice, clean branching patterns, with just two lineages (lines of descent) emerging from each branch point. However, you may see trees with a polytomy (poly, many; tomy, cuts), meaning a branch point that has three or more different species coming off of it [2]. In general, a polytomy shows where we don't have enough information to determine branching order.
A horizontal phylogenetic tree. The root branches into a lower branch that becomes species U and an upper branch that further splits into two branches. The bottom branch splits into two branches, forming species S and T. The top branch splits into 3 branches, forming species P, Q, and R. These three species are highlighted and labeled Polytomy: we don't have enough info to determine branching order!
Image modified from Taxonomy and phylogeny: Figure 2 by Robert Bear et al., CC BY 4.0
If we later get more information about the species in a tree, we may be able to resolve a polytomy using the new information.
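In a data structure, a polytomy is simply a branch point with three or more lineages coming off of it. The sketch below encodes the tree from the figure as nested tuples (an assumption made for illustration, with invented helper names) and lists every polytomy it finds.

```python
# Hypothetical encoding of the tree in the figure: P, Q, and R share one
# unresolved branch point, S and T form a pair, and U branches off first.
TREE = ((("P", "Q", "R"), ("S", "T")), "U")


def tips(node):
    """Return the set of species found at the tips below this node."""
    if isinstance(node, str):
        return {node}
    return set().union(*(tips(child) for child in node))


def polytomies(node):
    """Return the sorted tips under each branch point with 3+ lineages."""
    if isinstance(node, str):
        return []
    found = [sorted(tips(node))] if len(node) >= 3 else []
    for child in node:
        found += polytomies(child)
    return found


print(polytomies(TREE))  # [['P', 'Q', 'R']]
```

Resolving the polytomy would mean replacing the three-way tuple with nested two-way tuples once new data supports a particular branching order.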

Where do these trees come from?

To generate a phylogenetic tree, scientists often compare and analyze many characteristics of the species or other groups involved. These characteristics can include external morphology (shape/appearance), internal anatomy, behaviors, biochemical pathways, DNA and protein sequences, and even the characteristics of fossils.
To build accurate, meaningful trees, biologists will often use many different characteristics (reducing the chances of any one imperfect piece of data leading to a wrong tree). Still, phylogenetic trees are hypotheses, not definitive answers, and they can only be as good as the data available when they're made. Trees are revised and updated over time as new data becomes available and can be added to the analysis. This is particularly true today, as DNA sequencing increases our ability to compare genes between species.
In the next article on building a tree, we’ll see concrete examples of how different types of data are used to organize species into phylogenetic trees.
