Network data
This page contains links to some network data sets I’ve compiled over the
years. All of these are free for scientific use to the best of my
knowledge, meaning that the original authors have already made the data
freely available, or that I have consulted the authors and received
permission to post the data here, or that the data are mine. If you
make use of any of these data, please cite the original sources.
The data sets are in GML format. For a description of GML see here.
GML can be read by many network analysis packages, including Gephi and Cytoscape. I’ve written a simple
parser in C that will read the files into a data structure. It’s available
here. There are many features of GML not
supported by this parser, but it will read the files in this repository
just fine. There is a Python parser for GML available as part of the
NetworkX package here and
another in the igraph package,
which can be used from C, Python, or R. If you know of or develop other
software (Java, C++, Perl, R, Matlab, etc.) that reads GML, let me know.
Data sets
Other sources of network data
There are a number of other pages on the web from which you can download
network data. Here are a few that I am aware of:
- UCINet data sets: Social network data sets released with the UCINet
  software by Steve Borgatti et al.
- Pajek data sets: Example data sets released with the Pajek software by
  Vladimir Batagelj and Andrej Mrvar.
- Indiana University data sets: A set of very large data sets, including
  some non-network data sets, compiled by the School of Library and
  Information Science at Indiana University. Network data sets include the
  NBER data set of US patent citations and a data set of links between
  articles in the online encyclopedia Wikipedia.
- Duncan Watts’ data sets: Data compiled by Prof. Duncan Watts and
  collaborators at Columbia University, including data on the structure of
  the Western States Power Grid and the neural network of the worm
  C. elegans.
- Laszlo Barabasi’s data sets: Data compiled by Prof. Albert-Laszlo
  Barabasi and collaborators at the University of Notre Dame, including web
  data and biochemical networks.
- Alex Arenas’s data sets: Data compiled by Prof. Alexandre Arenas and
  collaborators at Universitat Rovira i Virgili, including metabolic
  network data and the network from their study of the collaboration
  patterns of jazz musicians.
- Stanford Large Network Dataset Collection: A substantial collection of
  data sets describing very large networks, including social networks,
  communications networks, and transportation networks.
Last modified: April 19, 2013