Social Network Analysis

A hidden connection is stronger than an obvious one.

Heraclitus

This chapter will review some basic social network analysis fundamentals. Understanding the essential theories, concepts, and terminology is critical for further discussion of information modeling and control within the scope of online social networks. The network concepts of density, structural holes, strength of ties, centrality, and distance are briefly explained with visual examples and some mathematical representations. Small world networks and polarization are discussed and are of particular interest when examining online social media groups. Using a simple three-node network group, the relationship between a network configuration and its adjacency matrix is examined. To conclude the chapter, an example of a directional sociogram and its accompanying adjacency matrix for a sports club is given to pave the way toward practical social network analysis applications.

Density and Structural Holes

Network density is defined as the number of actual direct connections divided by the number of potential direct connections in a network. A “potential connection” is one that could exist between any two nodes, whether or not those nodes are actually connected. An “actual connection” is one that does exist [39]. Equation 5.1 gives the mathematical calculation for network density, where n is the number of nodes in the network; an undirected network of n nodes has n(n − 1)/2 potential connections.

$$\text{Density} = \frac{\text{actual connections}}{n(n-1)/2} \tag{5.1}$$

Figure 5.1 visually compares a sample sparse and dense network.

FIGURE 5.1: Comparison of sparse and dense networks.
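As a rough illustration of the density calculation, the following minimal Python sketch counts actual ties against potential ties. It assumes an undirected network, so that n nodes allow n(n − 1)/2 potential connections; the node names and the two sample edge lists are hypothetical and are not taken from Figure 5.1.

```python
from itertools import combinations


def density(nodes, edges):
    """Density of an undirected network: actual ties / potential ties."""
    n = len(nodes)
    if n < 2:
        return 0.0
    potential = n * (n - 1) // 2  # every unordered pair of nodes
    # Store each tie as a frozenset so (a, b) and (b, a) are counted once.
    actual = len({frozenset(edge) for edge in edges})
    return actual / potential


# Hypothetical five-person group.
nodes = ["A", "B", "C", "D", "E"]
sparse_edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]  # a chain
dense_edges = list(combinations(nodes, 2))  # everyone knows everyone

print(density(nodes, sparse_edges))  # 0.4 (4 of 10 potential ties)
print(density(nodes, dense_edges))   # 1.0 (10 of 10 potential ties)
```

For a directed network, in which each pair can be connected in two directions, the denominator would instead be n(n − 1).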

A real-life group such as a class or club would typically be reasonably dense because each individual is usually acquainted with (or directly connected to) their classmates or group members. Similarly, online groups with high levels of direct communication such as family social media groups or online game “guilds” of sufficiently small size will be relatively dense. Higher levels of density often come paired with an increase of information spread and a sense of community along with the resultant inter-group social support structures. By their nature, small networks tend to be denser than large social networks. It’s easy to know everyone in a class of twenty individuals, but knowing everyone in an entire school becomes increasingly unfeasible.

In direct contrast to the concept of density is what Burt refers to as “structural holes” [40]. Imagine two dense networks, each composed of individuals who mostly know one another, with a single individual who belongs to both groups and is their only common connection. If the two networks are combined into a single, larger grouping, a structural hole exists within the new network, centered on the cluster-bridging individual. Figure 5.2 illustrates a single connecting individual bridging two clusters within the same social network. One may naturally wonder why these groups are treated as one network rather than two, but real-world examples of structural holes are common. Politically different groups within the same country, rival teams within a sports league, and college courses taught by a single professor at two different schools are all good examples of social network structural holes.

FIGURE 5.2: Illustration of a structural hole.
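One simple way to locate a cluster-bridging individual is to remove each node in turn and check whether the rest of the network stays connected. The sketch below takes this brute-force approach, which is adequate only for small networks. The two three-person clusters and the bridging node "x" are hypothetical, and the helper names are chosen for illustration only.

```python
def build_adjacency(edges):
    """Turn a list of undirected ties into a node -> neighbors mapping."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def is_connected(adj, skip=None):
    """True if all nodes except `skip` can still reach one another."""
    nodes = [v for v in adj if v != skip]
    if not nodes:
        return True
    seen = {nodes[0]}
    stack = [nodes[0]]
    while stack:
        v = stack.pop()
        for w in adj[v]:
            if w != skip and w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(nodes)


def bridging_nodes(adj):
    """Nodes whose removal splits the network into separate pieces."""
    return sorted(v for v in adj if not is_connected(adj, skip=v))


# Hypothetical network: two tight clusters joined only through "x".
edges = [
    ("a1", "a2"), ("a2", "a3"), ("a1", "a3"),   # cluster A
    ("b1", "b2"), ("b2", "b3"), ("b1", "b3"),   # cluster B
    ("x", "a1"), ("x", "a2"), ("x", "b1"), ("x", "b2"),
]
adj = build_adjacency(edges)
print(bridging_nodes(adj))  # ['x']
```

Removing any ordinary cluster member leaves everyone else reachable, whereas removing "x" separates the two clusters.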

Weak and Strong Ties

The concept of weak ties is closely related to that of a structural hole, in that weakly tied social networks are linked by a few bridging individuals, such that two or more distinct group clusters can be readily identified. Practically speaking, weak ties help prevent large networks from being completely fragmented by facilitating the spread of information between segments. However, other factors also help define tie strength, such as the length of time individuals have been acquainted, their level of interaction, and how close in friendship or acquaintance they subjectively feel toward one another [32]. Pairs with strong ties might be friends or family members, while those with weak ties are more likely simple acquaintances, coworkers, or neighbors. Especially in online social networks, weak ties can play a critical role in information diffusion. Strong ties work in the opposite fashion. They facilitate reinforcement of group values and tend to feed the same ideas and culture back into the group. An example of a network with both strong and weak ties is shown in Figure 5.3, where solid lines represent strong ties and dashed lines signify weak ties.

FIGURE 5.3: Strong and weak ties.

Centrality and Distance

In simplest terms, centrality describes how connected a node is to the network [23]. A high-centrality node is well connected to several other important nodes and hence has easier access to a number of network members than a low-centrality node. As there are many ways to define the importance of a node based on its connectivity, there are multiple methods used to define centrality quantitatively. Popular centrality measurements include degree centrality, closeness centrality, betweenness centrality, eigenvector centrality, and Katz centrality, to name a few. In Figure 5.4, nodes of high centrality are readily apparent by their high level of connectivity and importance to the network structure. The removal of these nodes would considerably change the network structure, while the removal of outlier nodes with few connections would leave the basic structure of the network intact.

FIGURE 5.4: A random network demonstrating centrality and distance.

Degree centrality can be thought of as a node’s immediate risk of catching whatever is flowing through the network (information, in this context). It is represented mathematically as follows:

$$C_D(v) = \deg(v)$$

where v is the node of interest. Additionally, degree centrality can be extended to the entire network to measure network centralization, that is, the degree to which the network is centralized:

$$C_D(G) = \frac{\sum_{i=1}^{|V|} \left[ C_D(v^*) - C_D(v_i) \right]}{H}$$

where v* is the highest degree node of network graph G and the sum runs over all |V| nodes of G. H is defined as:

$$H = \sum_{j=1}^{|Y|} \left[ C_D(y^*) - C_D(y_j) \right]$$

with y* as the node with the highest degree centrality in the network Y that maximizes H. Mathematically, closeness centrality (the most intuitive measure) is calculated as:

$$C(x) = \frac{N}{\sum_{y} d(y,x)}$$

which is the reciprocal of “farness” (the sum of the distances from x to every other node) scaled by N, where N is the total number of nodes in the network and d(y,x) is the distance between the x and y vertices [41].
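The degree-based measures described above can be sketched in a few lines of Python. This is an illustrative sketch rather than a definitive implementation: it assumes a simple undirected network with at least three nodes, uses raw (unnormalized) degree, and takes the maximizing network Y to be a star, for which H evaluates to (n − 1)(n − 2). The two sample networks are hypothetical.

```python
def degree_centrality(adj, v):
    """Raw degree centrality: the number of direct ties of node v."""
    return len(adj[v])


def degree_centralization(adj):
    """Degree centralization of a simple undirected network (3+ nodes)."""
    n = len(adj)
    degrees = [degree_centrality(adj, v) for v in adj]
    top = max(degrees)  # degree of the most connected node
    numerator = sum(top - d for d in degrees)
    # Assumption: H is the same sum evaluated on a star network, the most
    # centralized simple graph on n nodes (one hub of degree n - 1 and
    # n - 1 leaves of degree 1).
    h = (n - 1) * (n - 2)
    return numerator / h


# Hypothetical five-node networks: a star and a ring.
star = {
    "hub": {"a", "b", "c", "d"},
    "a": {"hub"}, "b": {"hub"}, "c": {"hub"}, "d": {"hub"},
}
ring = {
    "a": {"b", "e"}, "b": {"a", "c"}, "c": {"b", "d"},
    "d": {"c", "e"}, "e": {"d", "a"},
}

print(degree_centrality(star, "hub"))  # 4
print(degree_centralization(star))     # 1.0 (fully centralized)
print(degree_centralization(ring))     # 0.0 (no node stands out)
```

The star and the ring mark the two extremes: all ties pass through one node in the former, while every node is equally connected in the latter.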

Related to centrality is the idea of “distance” between nodes of a network. Also known as geodesic distance, the distance between any two nodes is formally defined as the length of the shortest path between them along the edges, or binary connections, of the network [42]. Typically, distance is calculated using breadth-first traversal [43] or Dijkstra’s algorithm [44]. Consider Figure 5.4 again and note that the highly connected central nodes can reach nearly any other node in only a few steps by following their connection lines. Direct neighbors will be reachable in only a single step, while individual outer nodes will take two or three steps. In contrast, less centralized nodes will need, at minimum, several steps (perhaps passing through these centralized nodes) to reach other outlying network members. The centrality and distance concepts become increasingly important when discussing large populations such as cities or political groups. In the context of online social media, these principles remain true. An entrepreneurial or political leader will be a centralized individual within a country when making an online post or announcement, just as they would be if their information were to spread in traditional media outlets such as television and newspapers. On social networking sites such as Facebook, “friends of friends” are a greater network distance from an individual than their core friend group, so receiving and relaying information becomes slower and more difficult.
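To make the distance calculation concrete, the sketch below uses a breadth-first traversal to find the geodesic distance from one node to every other node in an unweighted network, and then derives a closeness score from those distances. The closeness function assumes the form N divided by the sum of distances and a connected network; other texts use 1 or N − 1 in the numerator. The small network and its node names are hypothetical.

```python
from collections import deque


def geodesic_distances(adj, source):
    """Shortest-path length (in steps) from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:  # first visit is along a shortest path
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def closeness(adj, x):
    """Assumed form: N / (sum of distances from x); needs a connected network."""
    farness = sum(geodesic_distances(adj, x).values())
    return len(adj) / farness


# Hypothetical network: central node "h", with outlier "d" hanging off "c".
adj = {
    "h": ["a", "b", "c"],
    "a": ["h"],
    "b": ["h"],
    "c": ["h", "d"],
    "d": ["c"],
}

print(geodesic_distances(adj, "d")["a"])  # 3 steps: d -> c -> h -> a
print(closeness(adj, "h"))                # 1.0
print(closeness(adj, "d"))                # 5/9, about 0.556
```

The central node reaches everyone in at most two steps and scores highest, while the outlier must pass through the center to reach most of the network.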

Small World Networks

Consider a network in which there is no overlap between each individual’s personal networks if taken as a series of simple nodes and their direct neighbors. In this scenario, each new individual added to a network brings in an entire group of new network members that they alone have acquaintance with. It’s easy to see that networks organized in this fashion can attain extensive reach by adding only a few members. However, such groupings are not common in typical real-world networks, particularly when discussing online social networks. Friends have common friends (or friends of friends) that all know each other from the same college or club. Coworkers know many others in the office and do not befriend each other in entirely isolated groups. There are usually several non-unique individuals with relationships from overlapping sources (via a combination of strong and weak ties). These types of networks are known as “small world” networks [45].

Small world networks are perhaps the most commonly discussed and analyzed due to both their limited scope and realistic inter-connectivity. Such networks can be imagined as several sets of highly connected teams with some connectivity between members of other teams. A sample small-world network is shown in Figure 5.5.

FIGURE 5.5: Sample small world network group.

There are several advantages to small-world networks. In fact, they are generally regarded as highly robust. For example, any one random node will have a reasonably short path to another node. If something happens to one subgroup or it is somehow cut off from the others, it is not entirely isolated (though it may require additional steps to reach). Due to the high connectedness of the smaller groups, they can communicate and work quickly and efficiently, while still having a connection with other groups. It’s easy to see how a small-world network configuration would be beneficial in a business or production environment. In biological groups, it helps reduce the potential damage a virus or genetic mutation might have on a population, as infected subgroups are less connected to other groups. Examples of small-world networks include power grids, social network influencers, and voting groups within a political party.

With all of the advantages of small-world networks, there are also some key potential disadvantages, especially in relation to social media and information spread. Small network groups tend to resist change since members of their subgroup have a strong influence over each member (regardless of what true or false information might be coming from other subgroups). This can allow misinformation to be widely believed within a smaller and closer group and makes it very difficult to change each member’s beliefs. Social media “echo chambers” based on culture, socio-economic class, and political ideologies are allowed to thrive in such a network environment. This is why many social media users will continuously encounter the same talking points and further reinforce existing beliefs. How often have we asked ourselves why our friend groups and colleagues seem so reasonable, but others are posting nonsense?

Clusters, Cohesion, and Polarization

The idea of social network clusters is closely linked to Charles Cooley’s concept of primary groups:

By primary groups, I mean those characterized by intimate face-to-face association and cooperation. They are primary in several senses, but chiefly in that they are fundamental in forming the social nature and ideals of the individual. The result of intimate association, psychologically, is a certain fusion of individualities in a common whole, so that one’s very self, for many purposes at least, is the common life and purpose of the group. Perhaps the simplest way of describing this wholeness is by saying that it is a “we”; it involves the sort of sympathy and mutual identification for which “we” is the natural expression. One lives in the feeling of the whole and finds the chief aims of his will in that feeling [46].

In many ways, clusters are similar to Cooley’s primary groups, but they do not overlap. Under cluster categorizations, one cannot be a member of multiple clusters at once. Sometimes there exist hierarchies and organization by which members identify themselves, but oftentimes (especially in vast social networks), formalized categorizations can get messy and blurred even if they technically exist [32]. Based on datasets created by Lada Adamic in 2004, Figure 5.6 shows two groups of politically oriented blogs, liberal and conservative, forming two distinct (highly polarized) clusters within the online bloggers’ social network [47].

FIGURE 5.6: Polarization of the U.S. Political Blogosphere. Courtesy of [48].

Cohesion is a measure of connectivity within a social group. It is defined as the minimum number of individuals that must be removed from the group to cause it to dissociate. Ideally, each member of a highly cohesive group is connected to several other members of the same cluster, so that severing a few individuals from the group does not cause the cluster to break apart to any substantial degree. Cohesive primary groups within a larger social network are often casually referred to as “cliques”. The strength or cohesion of cliques can be measured by their ability to pull together as a group to resist disruptive forces directed toward the clustered network group [49]. For example, if someone challenges the beliefs and norms of a cohesive cluster, it will join together to reinforce those beliefs and norms.
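The removal-based definition of cohesion can be tested directly on small groups by trying every possible set of removed members, smallest sets first. The following sketch does exactly that. It is a brute-force illustration under stated assumptions: the group is undirected, the search is only practical for a handful of nodes, and a group that cannot be disconnected at all (everyone tied to everyone) is assigned n − 1 by convention. The three sample groups are hypothetical.

```python
from itertools import combinations


def is_connected(adj, removed=frozenset()):
    """True if the members left after removal can all reach one another."""
    nodes = [v for v in adj if v not in removed]
    if len(nodes) <= 1:
        return True
    seen = {nodes[0]}
    stack = [nodes[0]]
    while stack:
        v = stack.pop()
        for w in adj[v]:
            if w not in removed and w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(nodes)


def cohesion(adj):
    """Smallest number of members whose removal disconnects the group."""
    nodes = list(adj)
    for k in range(len(nodes) - 1):  # try 0, 1, ..., n - 2 removals
        for removed in combinations(nodes, k):
            if not is_connected(adj, frozenset(removed)):
                return k
    return len(nodes) - 1  # convention for a fully connected group


members = ["a", "b", "c", "d"]
clique = {v: set(members) - {v} for v in members}  # everyone tied to everyone
chain = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
ring = {
    "a": {"b", "e"}, "b": {"a", "c"}, "c": {"b", "d"},
    "d": {"c", "e"}, "e": {"d", "a"},
}

print(cohesion(clique))  # 3
print(cohesion(chain))   # 1 (removing "b" or "c" splits the group)
print(cohesion(ring))    # 2
```

The clique survives the loss of any two members, while the chain falls apart as soon as a single interior member leaves.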

In modern social network commentary, cluster polarization is a hot topic. Figure 5.6 exemplifies a highly polarized political community in which the vast majority of online social network members blog with strong ideological tendencies, usually in direct opposition to another strongly cohesive cluster. Most members are either firmly liberal or conservative, with only a small section of the network acting as moderate bloggers. The concepts of homophily and filter bubbles discussed earlier come into play in scenarios where a network is polarized, as members surround themselves with information with which they already agree. Concerns have been expressed over the dangers of this trend, especially with the advent of online social media sites where members of a cohesive clique can easily fall into their own bubbles of personalized news feeds, search recommendations, and YouTube video programs [50]. Modeling and attempts at controlling highly polarized groups will be addressed in later parts in detail.

The Adjacency Matrix

In the previous sections, some network relationships were examined on a high level, but there was no mention of how to represent them mathematically. One way to describe a network and its interrelationships is the adjacency matrix.

Again, let us consider the three-node sociogram from the previous chapter, as shown in Figure 5.7. Notice that each relationship between a pair of nodes has two elements of interest: the presence of a connection and its direction. An adjacency matrix can be formed from the simple social network structure to show the mathematical relationships between each pair of the networked group. In the sample adjacency matrix presented in Table 5.1, 0 represents no connection between the paired nodes and 1 represents a connection. It should be noted that the connection is directional. While one person may be connected to an adjacent individual, that second individual may not have a connection to the initial person. Unidirectional connections are common in social media, where one user might follow a celebrity or social media “influencer” but not be followed in return.

FIGURE 5.7: Revisiting the three-node relationship map.

TABLE 5.1: Adjacency matrix of a sample 3-node network.

     1    2    3
1    0    1    1
2    1    0    0
3    1    1    0

The tabular matrix can be rewritten as a standard matrix for future mathematical manipulation:

$$A = \begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 0 \\ 1 & 1 & 0 \end{bmatrix}$$
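As a small illustration of why the matrix form is convenient, the sketch below stores the values of Table 5.1 as a nested list and reads a few simple quantities from it. It assumes that rows represent the source node and columns the target node of each directed connection, a convention not stated in the table itself; the variable names are chosen for illustration.

```python
labels = [1, 2, 3]

# Values from Table 5.1; assumed convention: A[i][j] == 1 means that
# node labels[i] has a connection directed toward node labels[j].
A = [
    [0, 1, 1],
    [1, 0, 0],
    [1, 1, 0],
]
n = len(A)

# Out-degree: connections a node sends (row sums).
out_degree = {labels[i]: sum(A[i]) for i in range(n)}

# In-degree: connections a node receives (column sums).
in_degree = {labels[j]: sum(A[i][j] for i in range(n)) for j in range(n)}

# One-way connections: present in one direction but not reciprocated.
one_way = [
    (labels[i], labels[j])
    for i in range(n)
    for j in range(n)
    if A[i][j] == 1 and A[j][i] == 0
]

print(out_degree)  # {1: 2, 2: 1, 3: 2}
print(in_degree)   # {1: 2, 2: 2, 3: 1}
print(one_way)     # [(3, 2)]
```

Under the assumed convention, the only unreciprocated connection runs from node 3 to node 2, which is the kind of one-directional "follow" relationship described above.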

Example: A Fencing Club Sociogram

A typical fencing club is broken up into three subgroups based on the club member’s primary weapon: foil, epee, and saber. Naturally, members of each weapon subgroup know each other. During the course of club practice days, friendships are developed between members that are independent of weapon type based on common interests, personality type, etc. Additionally, there is a central figure: the fencing coach. The coach knows all of the members and interacts with them regularly during practice times. An overview of the fencing club’s connectivity is shown through the sociogram in Figure 5.8.

FIGURE 5.8: Sociogram of a fencing club.

Notice that all of the relationships are bidirectional. This makes sense in the example’s context, as the in-person nature of the interactions would imply that if one person is acquainted with another, the other person would know them back. Also, notice that Scott, the fencing coach, is the central figure connecting the three subgroups. Though not shown, strong and weak ties could also be assigned between individuals for added complexity. The adjacency matrix can easily be determined by visually tracing the connections between members, as shown in Table 5.2.

TABLE 5.2: Adjacency matrix of a fencing club.

         Annie  David  Jamie  Jason  Lili   Marc   Mike   Sam    Sarah  Scott  Stacy  Ted
Annie    0      0      0      0      0      1      0      1      0      1      0      0
David    0      0      0      0      1      0      1      1      0      1      0      0
Jamie    0      0      0      1      0      0      0      0      1      1      1      0
Jason    0      0      1      0      0      0      0      0      1      1      1      0
Lili     0      1      0      0      0      0      1      0      1      1      0      0
Marc     1      0      0      0      0      0      0      1      0      1      0      1
Mike     0      1      0      0      1      0      0      1      0      1      0      0
Sam      1      1      0      0      0      0      1      0      1      1      0      1
Sarah    0      0      1      1      1      0      0      1      0      1      1      0
Scott    1      1      1      1      1      1      1      1      1      0      1      1
Stacy    0      0      1      1      0      0      0      0      1      1      0      0
Ted      1      0      0      0      0      1      0      1      0      1      1      0
Tiffany  1      1      0      0      1      0      1      0      0      1      0      0

Exercises

  1. Discuss whether a sparse or a dense network is more suitable for information spread. What about misinformation spread?
  2. Give an example of a real-life social media network having both strong and weak ties.
  3. Create a random map with 50 nodes using Social Network Visualizer (SocNetV) or similar software. Compute the centrality and distance of this network.
  4. Draw a sociogram of two or more intersecting groups in your daily life (friends, clubs, family, etc.). Explain your sociogram and how the groups are connected through one or more nodes. Create an adjacency matrix based on your sociogram. Who are the most and least connected individuals?

Part II