What is Network Analysis? A brief introduction with examples | by Mengsay Loem | Towards Data Science
What is Network Analysis?
A brief introduction with examples
If you use any social media application, you have probably come across its friend or follower suggestion features. Have you ever wondered how these features work? One common technology used in these cases is Network Analysis.
What is a Network?
A network refers to a structure representing a group of objects/people and relationships between them. It is also known as a graph in mathematics. A network structure consists of nodes and edges. Here, nodes represent objects we are going to analyze while edges represent the relationships between those objects.
For example, if we are studying a social relationship between Facebook users, nodes are target users and edges are relationships such as friendships between users or group memberships. In Twitter, edges can be following/follower relationships.
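As a minimal sketch of this structure, a network can be stored as a mapping from each node to the set of its neighbors. The user names and friendships below are made up for illustration; they are not taken from any real data.

```python
# Toy friendship network; the names and edges are made-up illustration data.
edges = [("alice", "bob"), ("alice", "carol"), ("bob", "dave"), ("carol", "dave")]

# Adjacency mapping: each node (user) points to the set of its neighbors (friends).
network = {}
for u, v in edges:
    # Friendship is mutual, so each edge is stored in both directions (undirected).
    network.setdefault(u, set()).add(v)
    network.setdefault(v, set()).add(u)

for user in sorted(network):
    print(user, "->", sorted(network[user]))
```

A following/follower relationship is one-directional, so in that case each edge would be stored in one direction only (a directed network).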
Image by Author
Why Network Analysis?
Network Analysis is useful in many real-world applications. It helps us understand in depth the structure of relationships in social networks, the structure or process of change in natural phenomena, and even the biological systems of organisms.
Again, let’s use the network of social media users as an example. Analyzing this network helps in
- Identifying the most influential person/people in a group
- Defining the characteristics of groups of users
- Predicting suitable items for users
- Identifying CM (advertising) targets, etc.
Other easy-to-understand examples are the Friend Suggestion function in Facebook or Follow Suggestion function in Twitter.
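To give a rough idea of how such a suggestion could be computed, here is a minimal sketch of one simple heuristic: rank the people a user is not yet connected to by the number of friends they have in common. This is only an assumed illustration, not a description of how Facebook or Twitter actually work; the `friends` data and the `suggest_friends` helper are hypothetical.

```python
from collections import Counter

# Made-up friendship data, for illustration only.
friends = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave", "erin"},
    "carol": {"alice", "dave"},
    "dave": {"bob", "carol", "erin"},
    "erin": {"bob", "dave"},
}

def suggest_friends(network, user):
    """Rank non-friends of `user` by the number of friends they share."""
    counts = Counter()
    for friend in network[user]:
        for candidate in network[friend]:
            # Skip the user and anyone who is already a friend.
            if candidate != user and candidate not in network[user]:
                counts[candidate] += 1
    # Most common friends first; ties are broken alphabetically.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

print(suggest_friends(friends, "alice"))  # [('dave', 2), ('erin', 1)]
```

Here dave shares two friends with alice (bob and carol), so he is ranked above erin, who shares only one.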
Who is the Important Person?
A crucial application of network analysis is identifying the important node in a network. This task is called Measuring Network Centrality. In social network analysis, it can refer to the task of identifying the most influential member, or the representative of the group.
Image by Author
For example, which node do you think is the most important one in the above figure?
Of course, to define the most important node, we need a specific definition of the Important Node. There are several indicators used to measure the centrality of a node.
- Degree centrality: a node with a higher degree (more edges attached to it) has higher centrality.
- Eigenvector centrality: in addition to the degree of a node, the centralities of its neighbor nodes are considered. As a result, the eigenvector corresponding to the largest eigenvalue of the adjacency matrix represents the centralities of the nodes in the network.
- Betweenness centrality: the number of shortest paths between pairs of other nodes that pass through the i-th node is considered as the i-th node’s betweenness centrality.
- Closeness centrality: based on the lengths of the shortest paths from the i-th node to all other nodes in the network; the shorter these paths are in total, the higher the i-th node’s closeness centrality. With this definition, for example, this centrality can be applied to the task of choosing a suitable evacuation site in a city.
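To make two of these definitions concrete, here is a from-scratch sketch of degree and closeness centrality in plain Python, using the same edge list as the networkx example in this article. The helper names are my own, and the normalization (dividing by n - 1) is one common convention that I am assuming here; the closeness function also assumes the network is connected.

```python
from collections import deque

# Same edge list as the networkx example in this article.
edges = [("A", "C"), ("B", "C"), ("C", "D"), ("D", "E"), ("D", "G"), ("A", "G"),
         ("F", "H"), ("G", "H"), ("H", "I"), ("I", "J"), ("I", "K")]

# Undirected adjacency mapping: node -> set of neighbors.
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

def degree_centrality(adj):
    """Degree of each node divided by the maximum possible degree, n - 1."""
    return {node: len(nbrs) / (len(adj) - 1) for node, nbrs in adj.items()}

def shortest_path_lengths(adj, source):
    """Breadth-first search: number of edges on the shortest path to each node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist

def closeness_centrality(adj):
    """(n - 1) divided by the sum of shortest-path lengths (connected graph assumed)."""
    result = {}
    for node in adj:
        dist = shortest_path_lengths(adj, node)
        result[node] = (len(adj) - 1) / sum(dist.values())
    return result

degree = degree_centrality(adj)
closeness = closeness_centrality(adj)
print("degree of C:", degree["C"])        # 3 neighbors out of 10 possible
print("closeness of G:", closeness["G"])  # 10 / 20 = 0.5
```

In this network, G has the highest closeness: its distances to the ten other nodes add up to 20, the smallest total of any node.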
import networkx as nx
import numpy as np
import matplotlib.pyplot as plt

# Build the example network: 11 nodes connected by 11 undirected edges
G = nx.Graph()
G.add_nodes_from(["A","B","C","D","E","F","G","H","I","J","K"])
G.add_edges_from([("A","C"),("B","C"),("C","D"),("D","E"),
                  ("D","G"),("A","G"),("F","H"),("G","H"),("H","I"),
                  ("I","J"),("I","K")])

# Draw the network with node labels
nx.draw(G, node_size=400, node_color='red', with_labels=True, font_weight='bold')
plt.show()

# For each measure, print the nodes sorted from highest to lowest centrality
print("degree centrality:")
for k, v in sorted(nx.degree_centrality(G).items(), key=lambda x: -x[1]):
    print(str(k)+":"+"{:.3}".format(v)+" ", end="")
print("\n")

print("eigenvector centrality:")
for k, v in sorted(nx.eigenvector_centrality(G).items(), key=lambda x: -x[1]):
    print(str(k)+":"+"{:.3}".format(v)+" ", end="")
print("\n")

print("betweenness centrality:")
for k, v in sorted(nx.betweenness_centrality(G).items(), key=lambda x: -x[1]):
    print(str(k)+":"+"{:.3}".format(v)+" ", end="")
print("\n")

print("closeness centrality:")
for k, v in sorted(nx.closeness_centrality(G).items(), key=lambda x: -x[1]):
    print(str(k)+":"+"{:.3}".format(v)+" ", end="")
print("\n")
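The statement that eigenvector centrality is the eigenvector of the largest eigenvalue of the adjacency matrix can be checked with a short power-iteration sketch in plain Python. This is an illustrative implementation under my own assumptions, not the code networkx runs. It iterates with A + I instead of A: this example network is bipartite (its only cycle, A-C-D-G, has even length), so plain iteration with A could oscillate, while adding the identity shifts the eigenvalues without changing the eigenvectors.

```python
import math

# Same edge list as the networkx example above.
edges = [("A", "C"), ("B", "C"), ("C", "D"), ("D", "E"), ("D", "G"), ("A", "G"),
         ("F", "H"), ("G", "H"), ("H", "I"), ("I", "J"), ("I", "K")]

adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

def eigenvector_centrality(adj, iterations=2000):
    """Power iteration x <- (A + I) x, rescaled to unit Euclidean length each step."""
    x = {node: 1.0 for node in adj}
    for _ in range(iterations):
        # (A + I) x: a node's own score plus the scores of its neighbors.
        new_x = {node: x[node] + sum(x[nbr] for nbr in adj[node]) for node in adj}
        norm = math.sqrt(sum(value * value for value in new_x.values()))
        x = {node: value / norm for node, value in new_x.items()}
    return x

centrality = eigenvector_centrality(adj)

# Verify the eigenvector property: A x should equal eigenvalue * x.
ax = {node: sum(centrality[nbr] for nbr in adj[node]) for node in adj}
eigenvalue = sum(ax[node] * centrality[node] for node in adj)  # Rayleigh quotient, |x| = 1
residual = max(abs(ax[node] - eigenvalue * centrality[node]) for node in adj)

for node, value in sorted(centrality.items(), key=lambda item: -item[1]):
    print(f"{node}:{value:.3} ", end="")
print()
print("largest eigenvalue:", round(eigenvalue, 3), "residual:", residual)
```

A residual close to zero means the scores satisfy "my score is proportional to the sum of my neighbors' scores", which is exactly the idea behind this centrality.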
Image by Author
Who are we?
Another application of network analysis is the Community Detection task. The purpose of this task is to divide a network into groups of nodes that are similar with respect to some specific feature. Examples include identifying groups of users on social networking services (SNS) who share common interests/opinions, finding groups of customers to whom specific items should be advertised, and building recommendation systems for online shopping.
Many researchers are working on algorithms that solve community detection problems effectively. Some well-known algorithms/methods for this task are the Kernighan-Lin algorithm, Spectral Clustering, Label Propagation, Modularity Optimization, etc.
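As a small illustration of what Modularity Optimization methods try to maximize, the sketch below computes the modularity score of a given division of the example network from the centrality section. It only scores a partition; it does not search for the best one. The `modularity` helper and the hand-picked partition are my own assumptions for illustration.

```python
# Same edge list as the centrality example in this article.
edges = [("A", "C"), ("B", "C"), ("C", "D"), ("D", "E"), ("D", "G"), ("A", "G"),
         ("F", "H"), ("G", "H"), ("H", "I"), ("I", "J"), ("I", "K")]

adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

def modularity(adj, communities):
    """Sum over communities of (internal edge fraction) - (degree fraction)^2."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2  # total number of edges
    q = 0.0
    for community in communities:
        # Each internal edge is counted from both endpoints, hence the division by 2.
        internal_edges = sum(1 for u in community for v in adj[u] if v in community) / 2
        degree_sum = sum(len(adj[u]) for u in community)
        q += internal_edges / m - (degree_sum / (2 * m)) ** 2
    return q

# Hand-picked split that cuts only the G-H edge (an assumption, not an algorithm's output).
two_groups = [{"A", "B", "C", "D", "E", "G"}, {"F", "H", "I", "J", "K"}]
one_group = [set(adj)]

print(modularity(adj, two_groups))  # about 0.393
print(modularity(adj, one_group))   # 0.0
```

A higher score means more edges fall inside the groups than would be expected if edges were placed at random with the same node degrees, so the two-group split describes the structure better than putting every node in one group.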
Image by Author
What Else?
Besides these applications, network analysis also plays an important role in time series analysis, natural language processing, telecommunication network analysis, etc. Recently, Machine Learning (Deep Learning) has also been applied to network analysis; here, Graph Embedding and Graph Neural Networks are interesting research topics.
For more details, I recommend the following sites and textbooks.
- Network Science (http://networksciencebook.com/)
- Networks: A Very Short Introduction (http://www.veryshortintroductions.com/view/10.1093/actrade/9780199588077.001.0001/actrade-9780199588077)
- Networks, Crowds, and Markets (https://www.cs.cornell.edu/home/kleinber/networks-book/)