Network analysis of multivariate data in psychological science | Nature Reviews Methods Primers

Although network approaches as discussed here draw on insights from statistics and network theory, the specific combination of techniques discussed in this paper has its roots in psychometric modelling in psychological contexts. This section discusses three areas in which this approach has been particularly successful. The first is personality research, where network models have been applied to describe the interaction between stable behavioural patterns that characterize an individual. The second is attitude research, in which networks have been designed to model the interaction between attitude elements (feelings, thoughts and behaviours) to explain phenomena such as polarization. The third is mental health research, where networks have been used to represent disorders as systems of interacting symptoms and to represent key concepts such as vulnerability and resilience.

Personality research

Personality researchers are interested in examining the processes characterizing personality traits69. One type of these processes is motivational: research shows that traits such as conscientiousness or extraversion can be considered as means to achieve specific goals, for example getting tasks done and having fun, which have been identified as goals relevant for conscientiousness and extraversion, respectively70. Psychometric network analysis of personality traits and motivational goals combined offers a novel way to explore relations among relatively stable dispositions. Personality networks can represent personality at different levels of abstraction, from higher-order traits to facets to specific items. One could wonder which abstraction level should be preferred. The answer requires balancing simplicity and accuracy of predictions and of explanations. Focusing on a level that is too abstract might result in losing important details, whereas adding more elements than necessary could result in noisy estimates and, thus, faulty conclusions. One approach that can help is to compare the levels on out-of-sample predictability71. We illustrate this by reanalysing data from Costantini et al.41 (Study 3) that include 9 goals identified as relevant for conscientiousness and 30 items from an adjective-based measure of conscientiousness that assess three main facets: industriousness, impulse control and orderliness44.

Data and analysis

In this sample (N = 432) we explored how well we could predict goals using a tenfold cross-validation approach72. The networks depicted in Fig. 4 represent Gaussian graphical models estimated with the qgraph R package15, using graphical lasso regularization. The lambda parameter for graphical lasso was selected through the extended Bayesian information criterion (γ = 0.5 (ref.33)). We varied the level of representation of the personality dimensions from general (single trait) to specific (3 facets) to molecular (30 items) and explored the relationships between personality and 9 goal scores.
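The logic of tenfold cross-validation can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the analysis reported here: it replaces the network models with a single-predictor least-squares regression, runs on simulated data, and all names (`fit_line`, `kfold_r2`, `trait`, `goal`) are hypothetical.

```python
import random
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def kfold_r2(xs, ys, k=10, seed=1):
    """Out-of-sample R^2: every case is predicted by a model fitted without it."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal folds
    ss_res = ss_tot = 0.0
    for test in folds:
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        # Baseline prediction is the training mean, so R^2 can be negative
        train_mean = mean(ys[i] for i in train)
        for i in test:
            ss_res += (ys[i] - (a + b * xs[i])) ** 2
            ss_tot += (ys[i] - train_mean) ** 2
    return 1 - ss_res / ss_tot

# Simulated (not real) data: one informative and one uninformative predictor
rng = random.Random(0)
trait = [rng.gauss(0, 1) for _ in range(400)]
goal = [0.6 * t + rng.gauss(0, 0.8) for t in trait]
noise = [rng.gauss(0, 1) for _ in range(400)]

r2_signal = kfold_r2(trait, goal)
r2_noise = kfold_r2(noise, goal)
```

In an analysis like the one described here, the same comparison would be repeated for each goal with trait-level, facet-level and item-level predictors, and the level with the highest out-of-sample R2 would be preferred for prediction.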

Fig. 4: Strength centrality estimates for all nodes in three networks of personality research data.

Network of relationships between motivational goals (yellow) and conscientiousness at the level of the trait (panel a), its facets (panel b) and items (panel c). Blue edges represent positive connections and red edges represent negative connections; thicker edges represent stronger relationships. Relationships between personality and goals are emphasized with saturated colours. *Items reverse-scored before entering network estimation. d | Strength centrality for each goal in each network.


The results depicted in Fig. 4a suggest that some goals are positively associated and some negatively associated with an overall conscientiousness score. Two goals, personal realization (node 3) and be safe (node 7), do not show direct connections to the trait. However, this network does not consider several ways in which one can be conscientious. Some people can be more organized, others can be more controlled and yet others can be more industrious43. The facet-level network (Fig. 4b) shows that most goals are related to a specific subset of one or two of the three facets, thus characterizing more clearly specific portions of the trait. At this level, personal realization (node 3) is positively related to industriousness but negatively connected to the remaining facets, something that would not have been apparent had we considered the trait level exclusively. At the item level (Fig. 4c), connections appear generally consistent with those emerging at the facet level, albeit with some exceptions. For example, avoid or manage things you do not care about (node 6) shows relations with items of orderliness, whereas no such connection emerged at the facet level.

Results

Figure 4d shows strength centrality estimates for all nodes in the three networks. Irrespective of the abstraction level considered, the most central goal was do something well, avoid mistakes (node 4). The centrality of node 4 is due to connections to other goals, rather than to its connections to conscientiousness. Such connections suggest that node 4 might serve as a means for several other goals. For example, one could speculate that doing things well might be important in the pursuit of more abstract goals, such as personal realization (node 3) or having control (node 2) (see ref.72 for a discussion of the abstractness of these goals).
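Strength centrality is the sum of the absolute weights of the edges attached to a node, and it can be split by the type of neighbour to see where a node's centrality comes from. The sketch below uses a made-up four-node weight matrix (node 0 standing in for a trait, nodes 1 to 3 for goals); the values and the helper names `strength` and `strength_to` are illustrative assumptions, not estimates from Fig. 4.

```python
def strength(weights):
    """Strength centrality: sum of absolute edge weights attached to each node."""
    n = len(weights)
    return [sum(abs(weights[i][j]) for j in range(n) if j != i) for i in range(n)]

def strength_to(weights, node, targets):
    """Part of a node's strength that comes from edges to a set of target nodes."""
    return sum(abs(weights[node][j]) for j in targets if j != node)

# Hypothetical symmetric matrix of edge weights (for example, partial correlations)
W = [
    [0.00, 0.30, -0.20, 0.00],
    [0.30, 0.00, 0.25, 0.10],
    [-0.20, 0.25, 0.00, 0.00],
    [0.00, 0.10, 0.00, 0.00],
]

centrality = strength(W)
most_central = max(range(len(W)), key=lambda i: centrality[i])

# Decompose the strength of node 1 into goal-goal and goal-trait parts
to_goals = strength_to(W, 1, [1, 2, 3])
to_trait = strength_to(W, 1, [0])
```

Here node 1 is the most central node, and slightly more of its strength stems from edges to other goals than from its edge to the trait, which is the kind of decomposition behind the statement about node 4 above.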

Results show that the trait level is never the best level for prediction and that some goals are best predicted at the item level and others at the facet level (Table 1), albeit in one case (goal 16) the trait level performed better than the item level. In general, specific levels might be useful if one is mainly interested in examining which elements of the personality system drive the association with a criterion73 or if one is purely interested in prediction. In our example, the item level performed, on average, slightly better than the facet level in terms of prediction, although this was not the case for all goals (see also ref.74). A preference for more abstract levels sometimes amounts to sacrificing a small portion of prediction in exchange for a noticeable gain in theoretical simplicity. Furthermore, using abstract predictors can sometimes assuage multicollinearity. At the same time, abstracting too much can lump together concepts that are better understood separately. There is no ultimate answer to the selection of the best abstraction level in personality as it heavily depends on the questions being asked and the data available. In general, the facet level might often provide a good balance between specificity and simplicity75,76.

Table 1 Out-of-sample predictive accuracy (R²) of goals in networks at different abstraction levels


Attitude research

Social psychologists are interested in how beliefs and attitudes can change over time. We illustrate the use of networks to improve our understanding of these processes with a study of attitudes towards Bill Clinton in the United States in the early 1990s. Based on the network theory of attitudes (Box 2) one expects that temperature should decrease throughout the years, because Bill Clinton was probably more on individuals’ minds when he was president than before he was president. We investigate changes in the network structure of these attitudes in the years before and during his presidency and whether the temperature of the attitude network changes. In this example, we estimate temperature using variations in how strongly correlated the attitude elements are at the different time points. Temperature of attitude networks can, however, also be measured by several proxies, such as how much attention individuals direct towards a given issue and how important they judge the issue.

Data and analysis

We use data from the open access repository of the American National Election Studies (ANES) between 1992 and 1996, including beliefs and emotions towards Bill Clinton. For this example, the presented data have been previously reported77,78. Beliefs were assessed using a four-point scale ranging from describes Bill Clinton extremely well to not at all. Emotions were assessed using a dichotomous scale with answer options of yes, have felt and no, never felt. After dichotomizing the belief questions, we fit Ising models with increasingly strict constraints, each representing one of our hypotheses, to this longitudinal assessment of beliefs and emotions in the American electorate. We investigate the impact on the fit of the model of constraining edges between nodes to be equal across time points, constraining the external fields to be equal across time points and constraining the temperature (the entropy of the system) to be equal across time points. Additionally, we tested whether a dense network (all nodes are connected) or a sparse network (at least some edges are absent) fits the data best. After estimating the network, we applied the walktrap algorithm to the network to detect different communities, that is, sets of highly interconnected nodes68,79. The walktrap algorithm makes use of random walks to detect communities. If random walks between two nodes are sufficiently short, these two nodes are assigned to the same community.

Results

The results show a sparse network with a stable network structure, where edges do not differ between time points (Fig. 5). The model with varying external fields and temperature fitted the data best. Figure 5a shows the estimated network at the four time points. The attitude network is connected: every attitude element is at least indirectly connected to every other attitude element. As can be seen, negative emotions of feeling afraid and angry are strongly connected to each other, as are positive emotions of feeling hope and pride. Within the beliefs, believing that Bill Clinton gets things done and provides strong leadership are closely connected. The belief that he cares about people is closely connected to the positive emotions. The walktrap algorithm detected two communities: one large community that contains all beliefs and the positive emotions; and one smaller community that contains the negative emotions. This indicates that positive emotions are more closely related to (positive) beliefs than positive and negative emotions are related to each other.

Fig. 5: Illustration of an estimated attitude network from panel data.

a | Estimated attitude network towards Bill Clinton. Colour of nodes corresponds to communities detected by the walktrap algorithm. Blue edges indicate positive connections between attitude elements and red edges indicate negative connections; width of the edges corresponds to strength of connection. b | Change in temperature throughout time. c | Histograms for overall attitude towards Bill Clinton in each year.


Figure 5b shows changes in temperature throughout the years. As can be expected from the network theory of attitudes (Box 2), the temperature of the attitude network generally decreased throughout the years, with the sharpest drop before the election in 1996 revealing an increase in the specificity of respondents’ attitudes towards Clinton. This implies that attitude elements became more consistent over time, resulting in more polarized attitudes. The increase in temperature between 1993 and 1994, however, is somewhat surprising.

Figure 5c shows the distribution of the overall attitude, separately measured on a scale ranging from 0 to 100, with higher numbers indicating more favourable attitudes. Based on the decreasing temperature of the attitude networks, a corresponding increase in the extremity of these distributions is to be expected. This is broadly what was found: the variance of the distributions increased as the temperature of the attitude network decreased. The only exception was the period between 1993 and 1994, when the variance increased even though the temperature also rose.
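The expected link between temperature and the spread of overall attitudes can be illustrated with a toy Ising model. The sketch below is not the model fitted to the ANES data: it assumes a small, fully connected network with −1/+1 coded elements, equal positive edge weights and no external fields, and it computes the exact variance of the sum score by enumerating all states. The function name and parameter values are hypothetical.

```python
from itertools import product
from math import exp

def sum_score_variance(n_nodes, edge_weight, temperature):
    """Variance of the sum score under a fully connected Ising model
    with -1/+1 states, equal positive edges and no external fields."""
    beta = 1.0 / temperature  # inverse temperature
    total = first = second = 0.0
    for state in product((-1, 1), repeat=n_nodes):
        # Sum of x_i * x_j over all pairs i < j
        pair_sum = sum(state[i] * state[j]
                       for i in range(n_nodes)
                       for j in range(i + 1, n_nodes))
        p = exp(beta * edge_weight * pair_sum)  # unnormalized probability
        s = sum(state)
        total += p
        first += p * s
        second += p * s * s
    first /= total
    second /= total
    return second - first ** 2

# Lower temperature -> elements align more strongly -> more extreme sum scores
variances = {t: sum_score_variance(6, 0.2, t) for t in (2.0, 1.0, 0.5)}
```

Under these assumptions the variance rises from just above the independence value (6, the number of nodes) towards its maximum (36) as temperature falls, mirroring the pattern of decreasing temperature and increasingly extreme attitudes described above.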

Mental health research

Mental health research and practice rest on reportable symptoms and observable signs. Therapists interviewing patients will ask questions about subjective symptoms as well as assess signs of behavioural distress (such as agitated hand-wringing and crying). The challenge for both mental health researchers and therapists is to determine the cause of the person’s constellation of signs and symptoms. Therapists, moreover, have the additional charge of using this information to devise an appropriate course of treatment. The network theory of psychopathology80,81 suggests that mental disorders are best understood as clusters of symptoms sufficiently unified by causal relations among those symptoms that support induction, explanation, prediction and control82,83 (Box 3). Signs and symptoms are constitutive of disorder, not the result of an unobservable common cause. We illustrate this with an example study of social interaction and its relations to mental health variables in a student sample during the COVID-19 pandemic.

Box 3 Disease models versus network structures in mental health

Symptoms and signs associated with mental illness do not co-occur randomly. For example, recurrent obsessive thoughts about potential contamination co-occur more often with compulsive handwashing than with paranoid delusions. The tendency for some symptoms to co-occur may be owing to a common underlying cause. For example, consider a patient complaining of fatigue, pain upon swallowing, a fever and white patches in the throat. A physician may posit the Streptococcus bacterium as the common cause of the co-occurrence of the patient’s signs and symptoms86,87, and can eliminate the patient’s illness by therapeutically targeting the bacteria rather than the resulting symptoms. This bacterial model of disease became firmly entrenched early in psychiatry’s history, shaping the field’s methods and motivating researchers to identify the common underlying cause of regularly co-occurring signs and symptoms81 (see the figure, part a). Despite the widespread and often implicit influence of the bacterial model of disease, failures to discover biomarkers of putative underlying entities have continued to mount during the past century146. The network theory of psychopathology provides an alternative account of why some symptoms tend to co-occur37,80. Rather than being the independent, functionally unrelated consequences of an underlying common cause, the network theory of psychopathology posits that symptoms co-occur owing to causal interactions among the signs and symptoms themselves81,147 (see the figure, part b). Indeed, the Diagnostic and Statistical Manual of Mental Disorders criteria often specify functional relations among symptoms. For example, compulsive rituals diminish the distress provoked by obsessions and avoidance behaviour in panic disorder arises as a consequence of recurrent panic attacks. 
This simple idea forms the foundation of the network approach to psychopathology and motivates the effort to investigate the structure of relationships among symptoms using psychometric network analysis.

Data and analysis

Researchers have devised an ecological momentary assessment study following 80 students (mean age = 20.38 years, standard deviation = 3.68, range = 18–48 years; n = 60 female, n = 19 male, n = 1 other) from Leiden University for 2 weeks in their daily lives50. With 19 different nationalities represented, this sample is highly international. Most students are single (n = 50), one-third of the students are currently employed and about 1 in 5 students report prior mental health problems. In this study, participants are asked about the extent of their worry, sadness, irritability and other subjective phenomenological experiences four times per day via a smartphone application. We use multilevel vector autoregressive modelling to assess the contemporaneous and temporal associations among problems related to generalized anxiety and depression. As a reminder, the contemporaneous network covers relations within the same 3-h assessment window, and the temporal network covers lag-1 relations between one 3-h window and the next.
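The lag-1 structure behind the temporal network can be made concrete with a small data-preparation sketch. It is a simplified, single-variable stand-in rather than a multilevel vector autoregressive model, and it assumes (this is not stated in the study description) that lags are formed only between consecutive prompts on the same day, so that overnight gaps and missed prompts are not treated as one 3-h lag. The record layout and function names are hypothetical.

```python
def lag1_pairs(records):
    """Pair each observation with the previous one from the same person and day.

    `records` is a list of (person, day, beep, value) tuples. Pairs are
    formed only for consecutive beeps, so missed prompts and overnight
    gaps do not count as a single lag.
    """
    by_key = {}
    for person, day, beep, value in records:
        by_key[(person, day, beep)] = value
    pairs = []
    for (person, day, beep), value in sorted(by_key.items()):
        previous = by_key.get((person, day, beep - 1))
        if previous is not None:
            pairs.append((previous, value))
    return pairs

def lag1_slope(pairs):
    """Least-squares slope of the value at t on the value at t-1."""
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

# Toy data: one person, two days, four beeps per day; beep 3 is missing on day 2
records = [
    ("p1", 1, 1, 2.0), ("p1", 1, 2, 3.0), ("p1", 1, 3, 4.0), ("p1", 1, 4, 5.0),
    ("p1", 2, 1, 1.0), ("p1", 2, 2, 2.0), ("p1", 2, 4, 4.0),
]
pairs = lag1_pairs(records)
slope = lag1_slope(pairs)
```

In the full multilevel model, each variable at one window is regressed on all variables at the previous window, with coefficients allowed to vary across persons; the slope here corresponds to a single autoregressive entry of such a temporal network.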

Results

The resulting networks can be used to inform our understanding of how the modelled variables evolve over time (Fig. 6). In this application, the model suggests that the cognitive symptom worry and the affective symptom nervous exhibit a strong contemporaneous association but do not exhibit a conditional dependence relation in temporal analyses, indicating that the relation between these items may be limited to a 3-h time interval. Similarly, we can clarify the paths by which external factors, such as social interaction, predict and are predicted by mental health. For example, the contemporaneous association between offline social interaction (node 8) and worry (node 3) occurs via feelings of loneliness (node 7), information that could be used in the generation of hypotheses about the causal relationships among these symptoms. It is also notable that different types of social interaction are differentially associated with loneliness. Offline social interaction is conditionally associated with lower levels of loneliness, whereas online social interaction is associated with higher levels of loneliness. The temporal associations further inform our understanding of these relationships. Difficulty envisioning the future and difficulty relaxing predict online social interaction, and online social interaction predicts subsequent difficulty relaxing. This illustrates how psychometric network analysis of time series naturally leads to more detailed hypotheses about the system under study; do note that this use of network analysis is exploratory and that generated hypotheses require independent testing, ideally through research that utilizes experimental interventions.

Fig. 6: Time-series networks.

Contemporaneous network (left) of conditional associations between variables obtained after controlling for temporal effects in the temporal network (right); the latter represents carry-over effects from one time point to the next. Blue edges indicate positive connections and red edges indicate negative connections; width of edges corresponds to strength of connection.


Network analyses not only equip researchers to investigate the associations among symptoms but also provide a novel framework for conceptualizing treatment. There are at least two potential ways one can intervene on a system, such as that depicted in Fig. 6. First, we can lower the mean level of a node by diminishing its frequency or severity. For example, we could intervene on the online social interaction node, hoping, based on the contemporaneous relations, that it might promote offline social interaction, alleviate loneliness and, in turn, foster less worry, more optimism and greater interest and pleasure. However, even if initially successful, merely intervening on a node may be insufficient, leaving the person vulnerable to relapse, as the structure of the network remains intact. If pessimism and an inability to relax are, indeed, encouraging online social interaction, then when our intervention on this node ceases, the problem may return, erasing our treatment gains. Accordingly, instead of targeting a specific node (or symptom), we may target the link between symptoms, thereby changing the structure of the network. For example, rather than aiming to reduce online social interaction in general, we could specifically target the tendency to engage in online social interaction when the person experiences pessimism or difficulty relaxing, thereby eliminating the temporal association between these symptoms and online social interaction and disrupting the network.