Individual differences among deep neural network models | Nature Communications

Mục Lục

Individual differences emerge in deeper network layers

We here investigate the extent to which deep neural networks exhibit individual differences. We approach this question by training multiple instances of the All-CNN-C network architecture11 and a custom architecture (All-CNN-7) on an object classification task (CIFAR-1012), followed by an in-depth analysis of resulting network-internal representations. Network instances varied only in the initial random assignment of weights, while all other aspects of network training were kept identical. All networks performed similarly in terms of classification accuracy (ranging between 84.4–85.9% and 77.6–78.95% top-1 accuracy for All-CNN-C, and All-CNN-7, respectively).

To study and compare network-internal representations, we extracted network activation patterns for 1000 test images (100 for each of the CIFAR-10 categories, Fig. 1a) and characterized the underlying representations in terms of pairwise distances in the high-dimensional activation space (Fig. 1b). The reasoning of this approach is that if two images are processed similarly in a given layer, then the distance between their activation vectors will be low, whereas images that elicit distinct patterns will have a large activation distance. The matrix of all pairwise distances (size 1000 × 1000) thereby describes the representational geometry of the test images, i.e., how exemplars of various object categories are grouped and separated by the units of a given layer (see below for a detailed depiction of the RDM structure).

To visualize the representational geometries of different network instances and layers, we projected the data into 2D using multidimensional scaling (MDS, metric stress). As can be seen in Fig. 2 for two exemplary cases of All-CNN-C, subsequent network layers increasingly separate out the different image categories, in line with the training objective (see Supplementary Fig. 1 for point-wise stress estimates).

Fig. 2: Representational geometries at different network depths of two DNN instances.

The internal representations of two network instances were characterized based on their representational geometries. We computed the pairwise distances (correlation distance) between activity patterns in response to 1000 test stimuli from 10 visual categories and visualized them in 2D via multidimensional scaling (MDS; metric stress criterion; categories shown in different colors). With increasing depth, networks exhibit increased category clustering.

Full size image

Moving closer to the question of individual differences in network representations, we next compared the arrangement of activation vectors across network layers and instances (2nd level RSA, see “Methods” section). That is, we again computed pairwise distances, but this time not based on original activation patterns, but rather based on the extracted network RDMs. This 2nd level comparison has multiple benefits. For one, focus on pattern distances offers a characterization of network-internal representations that is largely invariant to rotations of the underlying high-dimensional space, including a random shuffle of network units (see Supporting Information for more details). Secondly, representational spaces of varying dimensionality can be directly compared, as the dimensionality of the RDM is fixed by the number of test images used.

We computed this second-level distance measure (i.e., the dissimilarity between RDMs rather than activation vectors) for all network layers and instances. Visualizing the respective distances in 2D (MDS, metric stress), we observe that representations diverge substantially with increasing network depth (Fig. 3). While different network instances are highly similar in layer 1, indicating agreement in the underlying representations, subsequent layers diverge gradually with increasing network depth. Note that for later layers, the blue stripes parallel to the main diagonal indicate higher similarity across layers within a given network instance compared to the similarities across instances for a given network layer (Supplementary Fig. 2).

Fig. 3: Network individual differences emerge with increasing network depth.

a We compare the representational geometries across all network instances (10) and layers (9 convolutional) for All-CNN-C by computing all pairwise distances between the corresponding RDMs. b We projected the data points in a (one for each layer and instance) into 2D via MDS. Layers of individual network instances are connected via gray lines. While early representational geometries are highly similar, individual differences emerge gradually with increasing network depth.

Full size image

Representational consistency: quantifying DNN differences

Following this initial qualitative assessment, we performed quantitative analyses for each network layer by testing how well the distribution of representational distances generalizes across network instances. This was accomplished by computing representational consistency, defined as the shared variance between the upper triangle of the respective RDMs (Fig. 1c, each triangle contains 499,500 distance estimates, results are obtained from 45 pairwise network comparisons for each respective layer and network architecture as 10 network instances are trained for each architecture, analysis pseudocode provided in the “Methods” section). This measure of consistency is based on all pairwise distances between category exemplars (100 exemplars for 10 categories each). We, therefore, refer to this as exemplar-based consistency (see “Methods” section for further details).

Representational consistency is based on comparing network RDMs. To compute these RDMs, we used correlation distance as a dissimilarity measure, as it is currently the most frequently used distance measure in systems and computational neuroscience (later sections will investigate further distance measures). As shown in Fig. 4, representational consistency drops substantially with increasing network depth for both network architectures. For All-CNN-C, consistency (i.e., shared variance in distance estimates), drops to 44%, for All-CNN-7, consistency drops to 71%.

Fig. 4: Representational consistency decreases with increasing network depth.

Average representational consistency for each network layer computed across all pairwise comparisons of network instances (45 comparisons for 10 instances, computed separately for two network architectures). Error bars indicate 95% confidence intervals (CI, bootstrapped).

Full size image

To get better insights into the size of this effect, additional networks were trained (i) based on different images originating from the same categories, and (ii) based on different categories (see “Methods” section for details). The observed drops in consistency for different weight initializations are comparable to training the networks with the same distribution of categories but completely separate image data sets (Supplementary Fig. 3, blue vs. orange).

To ensure that the effects observed are not specific to correlation distance used in computing the RDMs, additional analyses were performed based on cosine, (unit length pattern-based) Euclidean distance and norm difference (measuring the absolute difference in the norm activation vectors, Fig. 5). In all cases, representational consistency was observed to drop considerably with increasing network depth. These results demonstrate that while different network instances reach very similar classification performance, they do so via distinct internal representations in the intermediate and higher network layers.

Fig. 5: Representational consistency decreases irrespective of distance measure.

Representational consistency decreases with increasing layer depth for both tested DNN architectures, and across multiple ways to measure distances in multivariate population responses (cosine (a), Euclidean distance and unit length pattern-based Euclidean distance (b), and differences in vector norm (c)). Average representational consistency shown for each layer, computed across all pairwise comparisons of network instances (45 comparisons for 10 instances), together with a 95% bootstrapped confidence interval.

Full size image

The above results represent an important existence proof for substantial DNN individual differences that can occur in computational neuroscience analysis pipelines. To expand our experiments to network architectures commonly used to predict brain data3,4,13,14,15, we trained and tested 10 network instances of a recent version of AlexNet16 on a large-scale object classification data set ILSVRC 201217. As AlexNet requires larger input images than the previously used CIFAR-10 (width/height of 224px vs. 32px), we sampled a new test set that nevertheless reflects the categorical structure of CIFAR-10: 100 images from each of the 10 CIFAR-10 classes were used to compute network RDMs. Replicating our previous results, consistency was also found to decrease with increasing network depth for AlexNet. The strongest individual differences were observed in fully connected layer fc6 (62% explained variance). We observe consistency levels of 84% in the penultimate layer (Fig. 6).

Fig. 6: Representational consistency decreases in AlexNet.

We repeated our above analyses of representational consistency on a set of AlexNet instances trained on large-scale object classification data set ILSVRC 2012. Again, we only vary the initial random seed of the network weights. In line with our previous results, we observe a decrease in representational consistency from early to late network layers. The minimal average consistency is observed in layer fc6, which exhibits 62% of the shared variance across network RDMs. As AlexNet requires the input of size 224 × 224, which is significantly larger than the 32 × 32 image size of CIFAR-10 used earlier, we created an independent set of larger images from the same 10 categories while following the same data set structure (100 images per CIFAR-10 category). Ten network instances correspond to 45 pairwise distance estimates per network layer, average representational consistency shown here with 95% confidence intervals (bootstrapped).

Full size image

Causes of decreasing representational consistency

We have demonstrated above that different network instances can exhibit substantial individual differences in their internal representations. Next, we investigate potential mechanisms that may contribute to this effect.

Our first analyses are based on the hypothesis that the training goal of maximal category separability does not put a strong constraint on the relative positions of categories and category exemplars in high-dimensional activation space. To investigate this possibility, for the 10 network instances of All-CNN-C used in the previous section, we computed a category clustering index (CCI) for each network layer using the network responses to the set of 1000 test images (drawn from 10 categories). CCI is defined as the normalized difference in average distances for stimulus pairs from different categories (across) and stimulus pairs from the same category (within): CCI = (across − within)/(across + within) (see “Methods” section). CCI can be regarded as a multivariate extension to a previously introduced category tuning index18. It approaches zero with no categorical organization and is positive if stimuli from the same category cluster together (maximum possible CCI = 1). We find a negative relationship between CCI and representational consistency (Pearson r = −0.92, p = 0.001; robust correlation19, see Supplementary Fig. 4). This indicates that network layers that separate categories better exhibit stronger individual differences, as measured via nonlinear representational consistency. These results are in line with previous findings demonstrating that linear class-separability increases with network depth20, and observations of decreasing network similarities with increasing layer depth6,7,8,21.

A negative relationship between category clustering and representational consistency is compatible with two possible scenarios: first, networks could exhibit a different arrangement of the overall category clusters. While linear class-separability in the penultimate network layer is required for successful task completion, this does not necessarily imply centroid consistency. That is, we cannot exclude a scenario in which a pair of networks show a similar level of class-separability, albeit a different overall arrangement of class centroids. In this case, class-separability would be high in both cases, but centroid consistency would be low. Second, focusing on distances within-category clusters, different arrangements of individual exemplars within the clusters could be the source of individual differences. Both, overall category and category exemplar placement are not constrained by the categorization training objective.

To investigate the variability in general cluster placement, we computed representational consistency based on the ten category centroids (RDMs computed from the pairwise distances of average response patterns for each category). This analysis revealed that centroid consistency is considerably higher than the previous exemplar-based consistency (Fig. 7a, μcentroid-based = 0.8801, CI95 = [0.8700, 0.8905] vs. μexemplar-based = 0.4429, CI95 = [0.4291, 0.4551] for correlation distance; μcentroid-based = 0.9515, CI95 = [0.9450, 0.9571] vs. μexemplar_based = 0.7384, CI95 = [0.7312, 0.7466] for Euclidean distance, all computed for the final layer of All-CNN-C). This finding cannot be explained by the lower number of pairwise comparisons (45 vs. 499,500 for centroid and stimulus RDMs, respectively) or the operation of averaging large numbers of activation patterns (each centroid is computed based on 100 activation patterns), as computing centroids from random stimulus assignments yielded significantly lower centroid-based representational consistency (95% CI of centroid-based consistency based on random class assignment [0.14, 0.81], Fig. 7b). Together, these results suggest that category centroids are located in similar geometric arrangements in-network instances trained off of different seeds, rendering overall category placement a less likely source of the observed individual differences.

Fig. 7: Category centroids are highly consistent across network instances.

a Centroid-based representational consistency (green) remains comparably high throughout, whereas the consistency of within-category distances decreases significantly with increasing network depth (error bars indicate 95% confidence intervals, average data shown, computed from 45 network comparisons across 10 network instances). This indicates that differences in the arrangement of individual category exemplars, rather than large-scale differences between class centroids are the main contributor to the observed individual differences. b High centroid-based representational consistency cannot be explained by the smaller RDMs or the averaging of multiple response patterns, as centroids of randomly sampled classes show a significantly lower mean consistency (95% CI in the light gray background).

Full size image

The reliable arrangement of category centroids suggests that a main source of the observed individual differences lies in the arrangement of category exemplars within the category clusters themselves. This view was corroborated by computing consistency, not on the whole exemplar-based RDM that contains all pairwise distances, but only on the dissimilarities of exemplars of the same categories (within-category consistency, see “Methods” section). Focusing on within-category distances, we observe a drop in consistency that is largely comparable to the original decrease for exemplar-based consistency computed based on the whole RDM (Fig. 7a).

In addition to an individual placement of category centroids and category exemplars, properties of the used dissimilarity measures could be a source for lower representational consistency, especially in cases of a rotated representational space. Many commonly used DNNs use rectified linear units (ReLUs) as a nonlinear operation, resulting in unit activations ≥ 0. If different network instances learned different projections that are equivalent to a rotation in this all-positive space, then this change will not affect classification performance. However, it can affect estimates of correlation and cosine distances (Supplementary Fig. 5). As shown in Supplementary Fig. 6, rotations around the origin have additional effects on correlation distances but not on cosine distances.

To test the magnitude of this effect, we subtracted the mean activation pattern across all test images from the units of a given layer (cocktail-blank normalization). This normalization led to increases in representational consistency for RDMs computed using correlation or cosine distance (see Supplementary Fig. 7 for details). While the size of the effect is comparably small, these results indicate that a cocktail-blank normalization can be of potential benefit when comparing correlation- or cosine-based RDMs of multiple DNNs or DNNs and brain data.

Bernoulli dropout affects representational consistency

An explanation of individual differences via missing constraints imposed by the training objective raises the possibility that explicit regularization during network training can provide the missing representational constraints22,23. We investigated this possibility experimentally by training networks at various levels of Bernoulli dropout. We trained 10 network instances of All-CNN-C for each of 9 dropout levels (Bernoulli dropout probability ranging from 0 to 0.8, a total of 90 network instances trained) and subsequently tested the resulting representations for their ability to classify input as well as for their representational consistency. To test for differences in task performance, we computed the top-1 categorization accuracy for the training- and test data. For the test data, we compare network performance with and without dropout at the time of inference. In line with the literature23, we find reduced training accuracy, but enhanced test accuracy at moderate dropout levels (Fig. 8a).

Fig. 8: Effects of Bernoulli dropout on task performance and representational consistency.

a Task performance, the average across all 10 network instances shown with 95% CI for the training set (blue), test set (orange), and when using dropout sampling at inference time for the test set (red, 1 sample). b Average representational consistency in the final convolutional layer of All-CNN-C as a function of dropout probability during training and test (dropout probability at test time set to equal dropout probability during training, consistency derived from 45 network pairs). When using dropout at test time, multiple samples can be drawn for each stimulus in the test set (creating multiple RDMs). Consistency for network pairs was computed for the respective average RDM for each instance. Consistency was observed to be highest when 10 samples were obtained from a DNN trained and tested at a dropout rate of 60%. c The clustering index for the penultimate layer of All-CNN-C increases with increasing Bernoulli dropout probability (10 network instances, error bars 95% CI).

Full size image

The effects of dropout training on representational consistency were investigated using layer 9 of All-CNN-C, which exhibited the lowest consistency levels in our original analyses. Focusing on the effects of using Bernoulli dropout during training, we observe that it reduced network individual differences. The highest consistency was found for a dropout probability of 0.6, which led to an average of 64.7% shared variance (rightmost column in Fig. 8b)—a marked increase compared to the 44% observed without dropout.

In analogy to our analyses of test accuracy in which we apply Bernoulli dropout at the time of inference, we investigated how far obtaining multiple test samples of the activation patterns affect representational consistency. For each network instance, we computed 10 RDM samples while keeping the dropout mask identical across network instances and the dropout rate identical to training. Like this, we obtained 10 RDM samples for each network instance and subsequently use the average RDM to compute representational consistency (see “Methods” section). We find that increasing the number of RDM samples led to increased representational consistency for all dropout levels (Fig. 8b). Maximum representational consistency was observed for 10 RDM samples at a dropout probability of 0.6, reaching an average of 67.8% shared variance. This suggests that dropout applied during training and test can increase the consistency of the representational distances across network instances.

To investigate a possible mechanism of how dropout may have positively affected consistency, we computed the category clustering index, as previously defined, for the penultimate layer of All-CNN-C trained at various dropout levels. The reasoning for this was that if category centroids are highly consistent, then stronger clustering of category exemplars around the centroids will at the same time yield higher overall representational consistency. As shown in Fig. 8c, we observe a positive relationship between dropout probability and category clustering, supporting our hypothesis. However, while clustering is increased further for dropout levels >0.6, representational consistency starts to decrease. We explain this effect by observing that centroid consistency is significantly decreased (μdropout=0.8 = 0.7422, CI95 = [0.6881, 0.7854]) compared to the no dropout case (μno_dropout = 0.8801, CI95 = [0.8700, 0.8905]) for the highest dropout level of 0.8. Thus, while denser clustering around centroids increases consistency in cases where the centroids themselves are consistent (here up to dropout levels of 0.6), high levels of dropout break the centroid consistency and therefore lead to an overall decrease in representational consistency.

Representational consistency across training trajectories

We observed above that representational consistency across network instances is remarkably stable for category centroids. This raises the question of whether this alignment is the result of task training, or whether category centroids are already well-aligned early during training. To investigate the effects of training, we computed representational consistency (exemplar-based and centroid-based) across network instances and training epochs for the final layer of All-CNN-C. These analyses indicate that networks exhibit high consistency after the first training epoch, which decreases from thereon (Fig. 9a, b). From about 50 epochs onwards, networks exhibit relatively stable representations with each network remaining on its own learning trajectory (Fig. 9a, multiple diagonal lines indicate stable representations across training compared to other network instances). These results indicate that task training increases individual differences, whereas learning trajectories of individual networks across time remain surprisingly robust. In line with this, representational consistency and task performance exhibit a strong negative relationship (Pearson r = −0.91, p < 0.001; robust correlation19, Fig. 9b–d). In line with our earlier results, category centroids are significantly more consistent, even for the earliest epochs, but otherwise exhibit similar training effects (Supplementary Fig. 8).

Fig. 9: Final-layer representational consistency (exemplar-based) across training epochs.

a Comparing representational consistency across early epochs [1 to 10] (left) and throughout all training epochs [1 to 350 in steps of 50] (right). Lines parallel to the main diagonal indicate that network instances remain on their distinct representational trajectory compared to other networks. Average consistency shown across 45 network pairs, derived from 10 network instances. b Representational consistency, computed and averaged across all network pairs (45 pairs total) for each training epoch, demonstrates increasing individual differences with training (shown with 95% CI). c Test performance across training (average top-1 accuracy across 10 network instances with 95% CI). d Representational consistency and test performance exhibit a strong negative relationship (Pearson r = −0.91, p < 0.001; robust correlation) indicating that task training enhances individual differences (dots represent network training epochs, error bar indicates 95% CI).

Full size image