Leiden algorithm
Part of a series on | ||||
Network science | ||||
---|---|---|---|---|
Network types | ||||
Graphs | ||||
|
||||
Models | ||||
|
||||
| ||||
The Leiden algorithm is a community detection algorithm developed by Traag et al [1] at Leiden University. It was developed as a modification of the Louvain method to address the issues with disconnected communities.
Graph components
[edit]Before defining the Leiden algorithm, it will be helpful to define some of the components of a graph.
Vertices and edges
[edit]A graph is composed of vertices (nodes) and edges. Each edge is connected to two vertices, and each vertex may be connected to zero or more edges. Edges are typically represented by straight lines, while nodes are represented by circles or points. In set notation, let be the set of vertices, and be the set of edges:
where is the directed edge from vertex to vertex . We can also write this as an ordered pair:
Community
[edit]A community is a unique set of nodes:
and the union of all communities must be the total set of vertices:
Partition
[edit]A partition is the set of all communities:
Quality
[edit]Similar to modularity, the quality function is used to assess how well the communities have been allocated. The Leiden algorithm uses the Constant Potts Model (CPM):[2]
Algorithm
[edit]The Leiden algorithm is similar to that of the Louvain method, with some important modifications.
Step 1: First, each node in the network is assigned to its own community.
Step 2: Next, we decide which communities to move the nodes into and update the partition .
queue = V(G) # create a queue from the nodes while queue != empty: node = queue.next() # get the next node delta_H = 0 for C in communities: # compute the change in quality for each community if delta_H(node, C) > delta_H: delta_H = delta_H(node, C) community = C if delta_H > 0: move node to community outside_nodes = { node_i | (node, node_i) are edges and node_i is not in community } # find the nodes which are connected to the node but not in the community queue.add(outside_nodes not already in queue)
Step 3: Assign each node in the graph to its own community in a new partition called .
Step 4: The goal of this step is to separate poorly-connected communities:
for C in communities of P: # find the nodes in the community which have lots of edges within the community well_connected_nodes = { node | node is in C, |E(node, C\node)| >= gamma ||node||(||C|| - ||node||) } for node in well_connected_nodes: if node is singleton under P_refined: well_connected_communities = { C_i | C_i is in P_refined, C_i is a subset of C, |E(C_i, C\C_i)| >= gamma*||C_i||(||C|| - ||C_i||) for C_i in well_connected_communities: compute probability P(C_i) # 0 if assignment of node to C_i decreases quality of P_refined, greater weights for greater quality increases assign node to C_i by sampling P(C_i) distribution
Step 5: Use the refined partition to aggregate the graph. Each community in becomes a node in the new graph .
Example: Suppose that we have:
Then our new set of nodes will be:
Step 6: Update the partition using the aggregated graph. We keep the communities from partition , but the communities can be separated into multiple nodes from the refined partition :
Example: Suppose that is a poorly-connected community from the partition :
Then suppose during the refinement step, it was separated into two communities, and :
When we aggregate the graph, the new nodes will be:
but we will keep the old partition:
Step 7: Repeat Steps 2 - 6 until each community consists of only one node.
References
[edit]- ^ Traag, Vincent A; Waltman, Ludo; van Eck, Nees Jan (26 March 2019). "From Louvain to Leiden: guaranteeing well-connected communities". Scientific Reports. 9 (1): 5233. arXiv:1810.08473. Bibcode:2019NatSR...9.5233T. doi:10.1038/s41598-019-41695-z. PMC 6435756. PMID 30914743.
- ^ Traag, Vincent A; Van Dooren, Paul; Nesterov, Yurii (29 July 2011). "Narrow scope for resolution-limit-free community detection". Physical Review E. 84 (1): 016114. arXiv:1104.3083. Bibcode:2011PhRvE..84a6114T. doi:10.1103/PhysRevE.84.016114. PMID 21867264.