Graph Data Structure


Graphs are one of the most powerful and flexible data structures used in computer science. They are used to represent relationships or connections between various entities and are incredibly useful in applications like social networks, web page linking, route mapping, and more. A Graph is made up of vertices (or nodes) and edges that connect pairs of vertices. This simple yet powerful structure allows for a wide variety of problems and algorithms, such as shortest path, network flow, and graph traversal.


What is a Graph?

A Graph is a collection of nodes (also called vertices) and edges (also called arcs) that connect pairs of nodes. It is used to model pairwise relationships between objects in various fields, such as computer science, engineering, and social sciences.

Basic Graph Terminology

  1. Vertex (or Node): A basic unit of a graph that represents an entity.
  2. Edge (or Arc): A connection or relationship between two vertices.
  3. Adjacency: Two vertices are adjacent if they are connected by an edge.
  4. Degree: The number of edges incident to a vertex.
    • In-degree: Number of edges coming into the vertex (in directed graphs).
    • Out-degree: Number of edges going out of the vertex (in directed graphs).
  5. Path: A sequence of vertices connected by edges.
  6. Cycle: A path where the first and last vertices are the same.
  7. Connected Graph: A graph in which there is a path between any two vertices.
  8. Disconnected Graph: A graph where some vertices are not reachable from others.

Types of Graphs

Graphs can be classified in several ways, based on the directionality of the edges, the presence of cycles, and other characteristics.

1. Directed Graph (Digraph)

In a directed graph, the edges have a direction. Each edge points from one vertex to another. For example, in a social network, an edge may represent a "follow" relationship, which is one-way.

  • Example: If there is an edge from vertex A to vertex B, it means there is a directed relationship from A to B but not necessarily from B to A.

2. Undirected Graph

In an undirected graph, the edges do not have a direction. An edge between two vertices represents a bidirectional relationship.

  • Example: In a friendship graph, if A and B are friends, the edge connecting them is undirected because friendship is mutual.

3. Weighted Graph

A weighted graph is a graph where each edge has a weight or cost associated with it. This is useful in problems like finding the shortest path or optimal routing, where the edges represent distances or costs.

  • Example: In a road network, the weight of an edge could represent the distance or time to travel between two cities.

4. Unweighted Graph

In an unweighted graph, edges do not carry any weight. This type of graph is simpler and is often used in basic graph traversal algorithms.

5. Cyclic Graph

A cyclic graph contains at least one cycle. A cycle is a path that starts and ends at the same vertex.

  • Example: A graph representing roads in a city can contain cycles (e.g., a roundabout).

6. Acyclic Graph

An acyclic graph is a graph that does not contain any cycles. These are often used in scheduling, dependency resolution, etc.

  • Example: A Directed Acyclic Graph (DAG) is used to represent tasks with dependencies, where tasks cannot have circular dependencies.

7. Bipartite Graph

A bipartite graph is a graph whose vertices can be divided into two distinct sets such that no two vertices within the same set are adjacent.

  • Example: In a job assignment problem, the vertices in one set represent employees, and the vertices in the other set represent jobs. An edge between an employee and a job indicates that the employee is qualified for the job.

Graph Representation

Graphs can be represented in multiple ways, depending on the type of graph and the algorithm to be used. The two most common methods of representing graphs are:

1. Adjacency Matrix

An adjacency matrix is a 2D array (matrix) used to represent a graph. If there are n vertices, the matrix is of size n x n. Each element of the matrix represents an edge between two vertices.

  • For an undirected graph, the matrix is symmetric (i.e., matrix[i][j] == matrix[j][i]).
  • For a directed graph, the matrix may not be symmetric.

In a weighted graph, the matrix stores the weights of the edges. For an unweighted graph, it typically stores 1 for the presence of an edge and 0 for its absence.

Example of an Adjacency Matrix (undirected, unweighted graph):

   A  B  C  D
A [0, 1, 1, 0]
B [1, 0, 1, 1]
C [1, 1, 0, 1]
D [0, 1, 1, 0]

2. Adjacency List

An adjacency list is a collection of lists or arrays. Each vertex in the graph has a list that contains all of its adjacent vertices (i.e., all vertices connected by an edge).

  • For an undirected graph, if there is an edge between vertices u and v, then both u's list and v's list will include each other.

Example of an Adjacency List (undirected graph):

A -> B -> C
B -> A -> C -> D
C -> A -> B -> D
D -> B -> C

3. Edge List

An edge list is a collection of all the edges in the graph. Each edge is represented as a pair of vertices (for an undirected graph) or an ordered pair (for a directed graph).

Example of an Edge List (undirected graph):

[ (A, B), (A, C), (B, C), (B, D), (C, D) ]

Graph Traversal Algorithms

Graph traversal is the process of visiting all the vertices of a graph in a systematic way. The two most commonly used graph traversal algorithms are Depth-First Search (DFS) and Breadth-First Search (BFS).

1. Depth-First Search (DFS)

DFS explores as far down a branch as possible before backtracking. It is typically implemented using recursion or a stack.

  • Time Complexity: O(V + E), where V is the number of vertices and E is the number of edges.
  • Use Cases: Pathfinding, topological sorting, and solving puzzles like mazes.

DFS Algorithm:

  • Start at a vertex and explore as far as possible along each branch before backtracking.
  • Mark the visited vertices.
  • If all vertices are visited, the traversal ends.

2. Breadth-First Search (BFS)

BFS explores all the neighbors of a vertex before moving on to their neighbors. It uses a queue to keep track of the vertices to visit next.

  • Time Complexity: O(V + E).
  • Use Cases: Finding the shortest path in an unweighted graph, web crawlers, and social networks.

BFS Algorithm:

  • Start at a vertex, visit all its neighbors, and add them to the queue.
  • Then, visit all the neighbors of the neighbors, and so on.
  • Mark the visited vertices.

Applications of Graphs

Graphs have wide applications in various fields. Some of the key applications include:

  1. Social Networks: Graphs are used to model social connections. Vertices represent people, and edges represent relationships (e.g., friendships or follows).
  2. Routing Algorithms: In networks (like the internet or transport systems), graphs are used to find optimal paths between nodes, such as the shortest route in GPS systems.
  3. Dependency Resolution: In software development, graphs (particularly Directed Acyclic Graphs, or DAGs) are used to represent task dependencies in build systems.
  4. Recommendation Systems: Graphs are used to model relationships between users and products, where products are connected based on user preferences or similarities.
  5. Web Crawling: Graphs are used to represent web pages and hyperlinks between them. Algorithms can be used to crawl the web and index pages.
  6. Game AI: In video games, graphs represent the possible moves or states of the game, and algorithms like DFS or BFS can be used to find optimal strategies.