Introduction: Graph Algorithms
A lot of problems in real life are modeled as graphs, and we need to be able to represent those graphs in our code. In this (short) tutorial, we're going to go over graphs, representing those graphs as adjacency lists/matrices, and then we'll look at using Breadth First Search (BFS) and Depth First Search (DFS) to traverse a graph. We'll be writing our code in C++, although the concepts can be extended to any programming language.
This tutorial is supposed to focus on the code, so the drawings are going to be ugly, and the definitions are going to be quick reminders instead of in-depth explanations.
To put it simply, graphs are a set of edges that connect some vertices. To say it "less mathy", there are some points, and some lines that connect those points.
In the graph above, we have four vertices (4, 1, 2, 5) that are connected by some edges. In this case, the edges are bidirectional, meaning 1 is connected to 4 and 4 is connected to 1. Some graphs have directed edges, meaning travel is only allowed in one direction.
Now someone at node 4 can get to node 1, but node 1 can't directly travel to node 4.
Finally, sometimes edges have weights on them.
A reason edges would have weights are to represent distances, as an example. For instance, one road could be 10 miles long, and another is 13.
So how do we represent these graphs in our code? We create what are called adjacency matrices and adjacency lists.
An adjacency matrix is basically a table/spreadsheet, where each row and column represent a vertex. If the vertex 1 and vertex 4 are connected, then cell (1, 4) is a 1. If they aren't connected, then the cell is a 0.
If there's a certain edge weight, then you can represent that edge weight instead of 1 in the cell.
Adjacency matrices take up O(N^2) space. To find if a connection between two vertices exist takes O(1) time, since it's a quick lookup in the table. To add a vertex takes O(N) time, since you need to add a whole row for the new vertex (which also means initialization of the graph takes O(N^2) time).
An adjacency list is another way to represent the edges of a graph. You create a list/array/vector that represents each vertex, and inside of each index for that vertex exists another list/vector/array for all of the vertices that the owner is connected too.
An adjacency list takes up less space than an adjacency list. The actual space it takes up varies based on the graph, but in the worse case, it could take up O(N^2) space if all vertices are connected to each other, which makes it MUCH worse than an adjacency matrix.
To look up a connection takes up O(N) time, as you have to search through a vertex's friend list to see if the friend exists. To add a new vertex is simple, though, and takes O(1) time, since you're just adding it to the end of a list.
Traversing The Graph
So we can represent graphs in our code. (Well, you'll see the coding representations soon.) But how do we traverse through a graph? Specifically, if we're at vertex 1, can we get to vertex 3? What path do we take?
To traverse through a graph, we use BFS and DFS. For the sake of our examples, we're going to traverse through the following graph:
Breadth First Search (BFS)
BFS is a technique for traversing through a graph where we start at a certain vertex, visit all of that vertex's neighbors, then visit the neighbor's neighbors, then visit the neighbor's neighbor's neighbors, etc. (In that order.)
Read through the comments to see what's going on.
Depth First Search (DFS)
Similar to BFS, DFS is a way to traverse a graph. However, instead of using a visiting all of a vertices neighbors before visiting the neighbor's neighbors, DFS just keeps visiting each new node it sees, meaning that it will usually go down a long path, and then come back to visit what it missed. As opposed to a queue, DFS works using recursion.
Read the comments to see how it works.
We went over Graphs, Adjacency Lists, Adjacency Matrices, BFS, and DFS. These data structures and traversal techniques are vital to using graphs.
The next step is to learn about calculating the shortest paths in graphs. See the next blog post to do this.
Then, take a look at maximum flow in this blog post..