The Web Graph: A Global Perspective
- The web graph is a massive directed graph of interconnected web pages that models the link structure of the World Wide Web.
- Within this vast structure, we can identify smaller sub-graphs that focus on specific topics or themes.
Directed graph
A collection of vertices (nodes) connected by edges (links) that have a specific direction.
Sub-graph
A smaller part of a larger graph: a subset of its nodes together with edges of the original graph that connect those nodes. When every original edge between the chosen nodes is kept, it is called an induced sub-graph.
- The web graph is a directed graph where:
  - Nodes represent web pages.
  - Edges represent hyperlinks, each pointing from the page that contains the link to the page it targets.
- It is massive, with billions of nodes and edges, making it one of the largest graphs studied in practice.
- The web graph is dynamic, constantly evolving as new pages are added and old ones are removed.
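To make the node and edge picture concrete, here is a minimal sketch of a tiny web graph stored as an adjacency list. The page names and the helper functions `out_links` and `in_links` are hypothetical and chosen only for illustration; a real web graph would be far too large for an in-memory dictionary like this.

```python
# Hypothetical pages: each key is a page, each value lists the pages it links to.
web_graph = {
    "a.example/home": ["a.example/about", "b.example/post"],
    "a.example/about": ["a.example/home"],
    "b.example/post": ["c.example/paper"],
    "c.example/paper": [],
}


def out_links(graph, page):
    """Return the pages that `page` links to (outgoing edges)."""
    return graph.get(page, [])


def in_links(graph, page):
    """Return the pages that link to `page` (incoming edges)."""
    return [src for src, targets in graph.items() if page in targets]


# Nodes are pages; edges are individual hyperlinks.
num_nodes = len(web_graph)
num_edges = sum(len(targets) for targets in web_graph.values())

print(num_nodes, num_edges)
print(out_links(web_graph, "a.example/home"))
print(in_links(web_graph, "a.example/home"))
```

Because edges are directed, a link from the home page to the post does not imply a link back: the post appears in the home page's outgoing links, but the home page does not appear in the post's.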
Sub-Graphs: Focused Networks
- A sub-graph is a smaller, more focused part of the larger web graph.
- It contains a subset of nodes and edges from the original graph, preserving the connections between those nodes.
- Sub-graphs are often used to analyze specific topics or communities within the web.
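One way to extract such a sub-graph is to pick a set of nodes and keep every edge whose two endpoints are both in that set (an induced sub-graph). The sketch below assumes the graph is stored as a dictionary from page to list of linked pages; the function name `induced_subgraph` and the single-letter page names are hypothetical.

```python
def induced_subgraph(graph, keep):
    """Keep only the nodes in `keep` and the edges that run between them."""
    keep = set(keep)
    return {
        page: [target for target in targets if target in keep]
        for page, targets in graph.items()
        if page in keep
    }


# Hypothetical four-page graph.
graph = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A", "D"],
    "D": [],
}

sub = induced_subgraph(graph, {"A", "B", "C"})
print(sub)
```

The edge from C to D is dropped because D is outside the chosen set, while every edge among A, B, and C is preserved.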
Consider a sub-graph focused on machine learning:
- It includes web pages like research articles, tutorials, and blogs related to machine learning.
- The edges represent hyperlinks between these pages.
- This sub-graph helps researchers analyze how information flows within the machine learning community.
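As a rough illustration of this kind of analysis, the sketch below counts how many links each page receives from other pages inside a toy machine learning sub-graph. The three pages and their links are invented for the example, and in-link counting is only one simple proxy for how attention flows through a community.

```python
from collections import Counter

# Hypothetical machine learning sub-graph: page -> pages it links to.
ml_subgraph = {
    "tutorial": ["paper", "blog"],
    "blog": ["paper", "tutorial"],
    "paper": [],
}

# Start every page at zero so pages with no incoming links still appear.
in_degree = Counter({page: 0 for page in ml_subgraph})
for targets in ml_subgraph.values():
    in_degree.update(targets)

# The page with the most incoming links within the sub-graph.
most_linked_page, link_count = in_degree.most_common(1)[0]
print(most_linked_page, link_count)
```

Here the paper receives links from both the tutorial and the blog, which suggests it acts as a shared reference point in this small community.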