Statnet vs igraph

There are two main packages used for social network analysis in R: statnet and igraph. Statnet is actually a suite of packages with a focus on network modeling using Exponential Random Graph Models. In contrast, igraph is a single package with a broad range of network analysis tools but little focus on modeling graphs. In addition, igraph is also available in Python, Mathematica and C.

There are many other packages out there that deal with specific network problems but in my experience most of them integrate into the statnet or igraph framework (or both).

The purpose of this page is not to demonstrate everything you can do with igraph and statnet, or to show which one is better. Instead I want to show you basic operations in both so it is easier to translate code from one package to the other if necessary.

Loading a Network

Both packages have function(s) to convert dataframes or matrices into their own network object. The biggest difference here is that igraph has an array of functions to deal with a variety of common network types, while statnet uses only one function (network()).

In the below examples we use two datasets. The adjacency matrix is the Islamic State Group network available via UCINET. The edgelist is campaign donors in the 2016 Nevada State Senate election that is part of my own data on campaign donors available on the Harvard Dataverse.

If you want to follow along, the Nevada edgelist data is here and the metadata is here.

The network() function can be used to convert either an adjacency matrix or an edgelist. You can set the type of data using the matrix.type argument, but it will also guess. In general, if you provide a matrix it will assume it is an adjacency matrix, and if you provide a data.frame it will assume it is an edgelist.

In order to convert our dataframe into a matrix we need to use as.matrix(), but this can create a problem if some of our columns are not numbers. Often the first column in your adjacency csv holds the vertex names. To deal with this, when we read in the data using read.csv() we set row.names=1, which turns the first column of the csv into rownames instead of treating it as a regular column.

library(network)

adj_df <- read.csv("network_data/IS_BBC_61_IS_BBC-FRIENDS.csv", row.names=1)
adj_mat <- as.matrix(adj_df)
statnet_net <- network(adj_mat, directed=FALSE)

Reading in an edgelist is comparatively easy. Assuming that you have a csv with your first column as where the edge starts and your second column as where the edge ends then just put that dataframe directly into the network() call.

library(network)

edge_df <- read.csv("network_data/NV-2013-2016-Edges.csv")
statnet_net <- network(edge_df, directed=FALSE)

There are some additional arguments to worry about (there are more, but these should cover most cases):

  • directed is whether the network is directed (default is TRUE).
  • loops is whether an edge can connect a vertex to itself (default is FALSE).
  • multiple is whether more than one edge is allowed between the same pair of vertices (default is FALSE).
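A minimal sketch of how these arguments interact, using a small made-up edgelist (toy_edges and its contents are hypothetical, not part of the Nevada data). It assumes a version of network recent enough to accept a data frame edgelist, as in the example above. The edgelist deliberately contains a repeated edge and a self-loop, so both loops and multiple are switched on:

```r
library(network)

# Hypothetical toy edgelist: first column is where each edge starts,
# second column is where it ends
toy_edges <- data.frame(
  from = c("a", "a", "b", "c"),
  to   = c("b", "b", "c", "c"),  # a-b appears twice; c-c is a self-loop
  stringsAsFactors = FALSE
)

toy_net <- network(toy_edges, directed = FALSE, loops = TRUE, multiple = TRUE)

network.size(toy_net)       # number of vertices
network.edgecount(toy_net)  # number of edges, counting the repeated a-b edge twice
is.directed(toy_net)
```

With the defaults (loops=FALSE, multiple=FALSE) I would expect recent versions of network to reject this edgelist rather than quietly drop the extra edges, so it is worth checking your data for both cases before converting.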

igraph has a set of functions that can be used to convert different data types into a network. For an adjacency matrix we use graph_from_adjacency_matrix() and for an edgelist we use graph_from_data_frame(), but there is also read_graph(), which reads data directly from a file in a wide range of formats (pajek, graphml, gml, etc).

In order to convert our dataframe into a matrix we need to use as.matrix(), but this can create a problem if some of our columns are not numbers. Often the first column in your adjacency csv holds the vertex names. To deal with this, when we read in the data using read.csv() we set row.names=1, which turns the first column of the csv into rownames instead of treating it as a regular column.

library(igraph)

adj_df <- read.csv("network_data/IS_BBC_61_IS_BBC-FRIENDS.csv", row.names=1)
adj_mat <- as.matrix(adj_df)
igraph_net <- graph_from_adjacency_matrix(adj_mat, mode="undirected")

Reading in an edgelist is comparatively easy. Assuming that you have a csv with your first column as where the edge starts and your second column as where the edge ends then just put that dataframe directly into the graph_from_data_frame() call.

library(igraph)

edge_df <- read.csv("network_data/NV-2013-2016-Edges.csv")
igraph_net <- graph_from_data_frame(edge_df, directed=FALSE)

The additional arguments are a bit different across the two functions.

  • In graph_from_adjacency_matrix() you use mode to indicate the type of matrix, and there are a variety of options to deal with non-symmetric matrices that you want to treat as undirected (the default is "directed").
  • In graph_from_data_frame() the main additional argument is directed (the default is TRUE). There is also a vertices argument for vertex attributes, which is covered in the next section.
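To make the mode options more concrete, here is a small sketch with a made-up non-symmetric matrix (the matrix m and its vertex names are hypothetical). It compares treating the matrix as directed with two of the options for collapsing it to an undirected network, "max" and "min":

```r
library(igraph)

# Hypothetical 3x3 non-symmetric adjacency matrix:
# a -> b, b -> c, c -> a and c -> b (so only b and c have a mutual tie)
m <- matrix(c(0, 1, 0,
              0, 0, 1,
              1, 1, 0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

g_dir <- graph_from_adjacency_matrix(m, mode = "directed")
g_max <- graph_from_adjacency_matrix(m, mode = "max")  # edge if either direction is present
g_min <- graph_from_adjacency_matrix(m, mode = "min")  # edge only if both directions are present

ecount(g_dir)
ecount(g_max)
ecount(g_min)
```

Which option is appropriate depends on what a one-way tie means in your data, so this is a choice to make deliberately and not leave to the default.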

Loading Data with Vertex Attributes

Both igraph and statnet have ways to add vertex attributes to your network when you create it. In this example I am only using the edgelist data. If you are using adjacency data the same steps can be used for the statnet package, but for igraph I think you have to first get it into an edgelist format.

Your metadata or vertex attributes will be in an additional dataframe with the first column as the vertex identifier. This needs to be the same as the vertex identifier you have in your edgelist or adjacency matrix.

library(network)

edge_df <- read.csv("network_data/NV-2013-2016-Edges.csv")
meta_df <- read.csv("network_data/NV-2013-2016-Meta.csv")

statnet_net <- network(edge_df, vertices=meta_df, 
                directed=FALSE)

Your metadata or vertex attributes will be in an additional dataframe with the first column as the vertex identifier. This needs to be the same as the vertex identifier you have in your edgelist or adjacency matrix.

library(igraph)

edge_df <- read.csv("network_data/NV-2013-2016-Edges.csv")
meta_df <- read.csv("network_data/NV-2013-2016-Meta.csv")

igraph_net <- graph_from_data_frame(edge_df, vertices=meta_df, 
                directed=FALSE)
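For adjacency data in igraph, one alternative that I believe also works is to build the graph first and attach the attributes afterwards, matching on the vertex name. The sketch below uses a made-up matrix and metadata table (adj, toy_meta, and the columns id and group are all hypothetical) and assumes the matrix has row and column names that igraph stores as the name attribute:

```r
library(igraph)

# Hypothetical symmetric adjacency matrix with named rows and columns
adj <- matrix(c(0, 1, 1,
                1, 0, 0,
                1, 0, 0),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

# Hypothetical metadata, deliberately in a different order than the matrix
toy_meta <- data.frame(id = c("c", "a", "b"),
                       group = c("x", "y", "y"),
                       stringsAsFactors = FALSE)

g <- graph_from_adjacency_matrix(adj, mode = "undirected")

# match() lines the metadata rows up with the vertex order in the graph
V(g)$group <- toy_meta$group[match(V(g)$name, toy_meta$id)]

V(g)$name
V(g)$group
```

The match() step matters because the metadata rows are not guaranteed to be in the same order as the matrix rows.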

Accessing and Modifying Attributes

Both igraph and statnet allow you to modify vertex and edge attributes (and graph attributes). The main functions are very similar across the two packages, but each package also has its own shortcuts that have no direct equivalent in the other.

Warning

The biggest difference between the two is that statnet modifies the network object in place while igraph does not.
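A small side-by-side sketch of that difference, using a made-up two-edge network (toy_edges and the score attribute are hypothetical). The calls are prefixed with the package name so that neither package needs to be attached:

```r
# Hypothetical toy edgelist shared by both packages
toy_edges <- data.frame(from = c("a", "b"), to = c("b", "c"),
                        stringsAsFactors = FALSE)

s_net <- network::network(toy_edges, directed = FALSE)
i_net <- igraph::graph_from_data_frame(toy_edges, directed = FALSE)

# statnet: the call on its own changes s_net
network::set.vertex.attribute(s_net, "score", c(1, 2, 3))
statnet_scores <- network::get.vertex.attribute(s_net, "score")

# igraph: the result is thrown away here, so i_net is unchanged
invisible(igraph::set_vertex_attr(i_net, "score", value = c(1, 2, 3)))
igraph_before <- igraph::vertex_attr(i_net, "score")  # attribute does not exist yet

# igraph: assign the result back to keep the change
i_net <- igraph::set_vertex_attr(i_net, "score", value = c(1, 2, 3))
igraph_after <- igraph::vertex_attr(i_net, "score")

statnet_scores
igraph_before
igraph_after
```

Forgetting the assignment in igraph does not raise an error, which makes this an easy mistake to miss when translating code from statnet.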

In statnet, to access a vertex attribute you use get.vertex.attribute() with the network object and the name of the attribute you want to access.

You can set a vertex attribute using set.vertex.attribute() along with the network object, the name of the attribute you want to modify, the value(s) and which vertex (or vertices) you want to modify. When you call this it modifies the network in place, so you do not need to assign the result.

Each vertex has a numeric ID from 1 to the number of vertices in the network.

library(network)

total_donations <- get.vertex.attribute(statnet_net, "Total")
## Access the first 10 donation amounts
total_donations[1:10]
 [1]  87000.0 112000.0  80500.0  81000.0 142431.1 109000.0 133500.0  43000.0
 [9] 182500.0  30500.0
## Change the first vertex to have a "Total" of 100
set.vertex.attribute(statnet_net, "Total", value=100, 
    v=1)

modified_donations <- get.vertex.attribute(statnet_net, "Total")
modified_donations[1:10]
 [1]    100.0 112000.0  80500.0  81000.0 142431.1 109000.0 133500.0  43000.0
 [9] 182500.0  30500.0
## Convert the total amount for all vertices to be in $1,000s
set.vertex.attribute(statnet_net, "Total", total_donations/1000)


thousand_donations <- get.vertex.attribute(statnet_net, "Total")
## Access the first 10 donation amounts
thousand_donations[1:10]
 [1]  87.0000 112.0000  80.5000  81.0000 142.4311 109.0000 133.5000  43.0000
 [9] 182.5000  30.5000

You can also access and modify vertex attributes using %v% as an operator instead. This simplifies the above code, though it isn’t possible to modify an individual vertex.

## Access the attribute PerRep
percent_rep <- statnet_net %v% "PerRep"
percent_rep[1:10]
 [1]  37.35632  54.01786  47.82609  61.72840  44.53457  35.32110  31.08614
 [8]   0.00000  46.57534 100.00000
## Convert this to a proportion and store it

statnet_net %v% "Proportion_Rep" <- percent_rep/100
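If you prefer %v% but need to change a single vertex, one workaround is to pull out the whole vector, change the element you want, and write the whole vector back. A minimal sketch on a made-up network (toy_edges and the score attribute are hypothetical):

```r
library(network)

# Hypothetical toy network with three vertices
toy_edges <- data.frame(from = c("a", "b"), to = c("b", "c"),
                        stringsAsFactors = FALSE)
toy_net <- network(toy_edges, directed = FALSE)
toy_net %v% "score" <- c(10, 20, 30)

# Pull the full vector, change one element, then write the full vector back
scores <- toy_net %v% "score"
scores[1] <- 100
toy_net %v% "score" <- scores

toy_net %v% "score"
```

For a one-off change, set.vertex.attribute() with the v argument is more direct; this pattern is mainly useful when the rest of your code already uses %v%.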

Accessing and modifying edge attributes works in a similar way, but with get.edge.attribute(), set.edge.attribute() and %e%.
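As a rough illustration of the edge versions in statnet, here is a sketch on a made-up edgelist with a weight column (toy_edges, weight and log_weight are hypothetical names). It assumes, as in recent versions of network, that extra columns in a data frame edgelist are stored as edge attributes:

```r
library(network)

# Hypothetical toy edgelist; the third column becomes an edge attribute
toy_edges <- data.frame(from = c("a", "b"), to = c("b", "c"),
                        weight = c(5, 10),
                        stringsAsFactors = FALSE)
toy_net <- network(toy_edges, directed = FALSE)

# Access with the function form
get.edge.attribute(toy_net, "weight")

# Modify in place, just like the vertex version
set.edge.attribute(toy_net, "weight", value = c(50, 100))

# Access and create attributes with the %e% operator
toy_net %e% "weight"
toy_net %e% "log_weight" <- log(toy_net %e% "weight")

list.edge.attributes(toy_net)
```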

In igraph, to access a vertex attribute you use vertex_attr() with the network object and the name of the attribute you want to access.

You can set a vertex attribute using set_vertex_attr() along with the network object, the name of the attribute you want to modify, the value(s) and which vertex (or vertices) you want to modify. When you call this the modified network object is returned, so you need to assign it back to a variable to keep the change.

Each vertex has a numeric ID from 1 to the number of vertices in the network.

library(igraph)

total_donations <- vertex_attr(igraph_net, "Total")
## Access the first 10 donation amounts
total_donations[1:10]
 [1]  87000.0 112000.0  80500.0  81000.0 142431.1 109000.0 133500.0  43000.0
 [9] 182500.0  30500.0
## Change the first vertex to have a "Total" of 100
igraph_net <- set_vertex_attr(igraph_net, "Total", value=100, 
    index=1)

modified_donations <- vertex_attr(igraph_net, "Total")
modified_donations[1:10]
 [1]    100.0 112000.0  80500.0  81000.0 142431.1 109000.0 133500.0  43000.0
 [9] 182500.0  30500.0
## Convert the total amount for all vertices to be in $1,000s
igraph_net <- set_vertex_attr(igraph_net, "Total", value=total_donations/1000)


thousand_donations <- vertex_attr(igraph_net, "Total")
## Access the first 10 donation amounts
thousand_donations[1:10]
 [1]  87.0000 112.0000  80.5000  81.0000 142.4311 109.0000 133.5000  43.0000
 [9] 182.5000  30.5000

In addition to this there are two other ways to modify vertex attributes. The first is to access them using V()$; the second is to directly assign values using vertex_attr() <-.

## Access the attribute PerRep
percent_rep <- V(igraph_net)$PerRep
percent_rep[1:10]
 [1]  37.35632  54.01786  47.82609  61.72840  44.53457  35.32110  31.08614
 [8]   0.00000  46.57534 100.00000
## Convert this to a proportion and store it
V(igraph_net)$Proportion_Rep <- percent_rep/100

## The above and below are equivalent 
vertex_attr(igraph_net, "Proportion_Rep") <- percent_rep/100

One benefit of this feature is that it is able to access and modify attributes based on other attributes. For example, imagine I want to create a new variable that indicates whether groups gave only to Republicans or only to Democrats, or both:

V(igraph_net)$Type <- "Both"
vertex_attr(igraph_net, "Type", V(igraph_net)[PerDem == 100]) <- "Democrat Only"
vertex_attr(igraph_net, "Type", V(igraph_net)[PerRep == 100])  <- "Republican Only"

## Create a table to see the counts
table(V(igraph_net)$Type)

           Both   Democrat Only Republican Only 
            304             168             127 

Accessing and modifying edge attributes works in a similar way, but with edge_attr(), set_edge_attr() and E()$.
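A short sketch of the igraph edge functions on a made-up edgelist (toy_edges, weight and log_weight are hypothetical names). Extra columns in the data frame passed to graph_from_data_frame() are stored as edge attributes:

```r
library(igraph)

# Hypothetical toy edgelist; the third column becomes an edge attribute
toy_edges <- data.frame(from = c("a", "b"), to = c("b", "c"),
                        weight = c(5, 10),
                        stringsAsFactors = FALSE)
toy_g <- graph_from_data_frame(toy_edges, directed = FALSE)

# Access with the function form
edge_attr(toy_g, "weight")

# Modify: the new graph is returned, so assign it back
toy_g <- set_edge_attr(toy_g, "weight", value = c(50, 100))

# Access and create attributes with E()$
E(toy_g)$log_weight <- log(E(toy_g)$weight)

# Select edges based on an attribute
heavy_edges <- E(toy_g)[weight > 60]
heavy_edges
```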

Extracting Data from Networks

One common occurrence in network analysis is extracting nodal/vertex data out from your network. For example, you might want to use regression analysis to look at whether centrality of a node is related to other nodal characteristics. To do this you’ll load the data into a network, calculate the centrality scores, and then want to append this to your data. The easiest way (in my opinion) is to add these centrality statistics as a vertex attribute then convert the network into a vertex based dataframe.

For statnet this is accomplished using as.data.frame() and indicating whether you want to convert to a vertex or edge dataframe using the unit= argument. In the below example I calculate the degree and betweenness centrality of each node, then convert it to a dataframe and show the first 5 rows.

library(network)
library(sna)

statnet_net %v% "degree" <- degree(statnet_net, gmode="graph")
statnet_net %v% "between" <- betweenness(statnet_net, gmode="graph")

df <- as.data.frame(statnet_net, unit="vertices")

df[1:5,]
  vertex.names         ContributorName              CatCodeIndustry
1         1887          NEWMONT MINING                       Mining
2         2906            WYNN RESORTS           Gambling & Casinos
3          541             CENTURYLINK Telecom Services & Equipment
4          958 FARMERS INSURANCE GROUP                    Insurance
5          396             BOYD GAMING           Gambling & Casinos
                      CatCodeGroup                      CatCodeBusiness
1       Energy & Natural Resources            Metal mining & processing
2                 General Business       Casinos, racetracks & gambling
3     Communications & Electronics                   Telecommunications
4 Finance, Insurance & Real Estate Insurance agencies, brokers & agents
5                 General Business       Casinos, racetracks & gambling
    PerDem   PerRep    DemCol    RepCol    Total Proportion_Rep degree
1 62.64368 37.35632 #00003DAF #00003DAF  87.0000      0.3735632     82
2 45.98214 54.01786 #1A0000AF #1A0000AF 112.0000      0.5401786    225
3 52.17391 47.82609 #00000AAF #00000AAF  80.5000      0.4782609    163
4 38.27160 61.72840 #3D0000AF #3D0000AF  81.0000      0.6172840    169
5 55.46543 44.53457 #00001AAF #00001AAF 142.4311      0.4453457     86
    between
1  198.2822
2 3518.4604
3  393.7207
4  472.0289
5  536.6969

For igraph this is accomplished using as_data_frame() and indicating whether you want to convert to a vertex or edge dataframe using the what= argument. In the below example I calculate the degree and betweenness centrality of each node, then convert it to a dataframe and show the first 5 rows.

library(igraph)

V(igraph_net)$degree <- degree(igraph_net)
V(igraph_net)$between <- betweenness(igraph_net)

df <- as_data_frame(igraph_net, what="vertices")

df[1:5,]
     name         ContributorName              CatCodeIndustry
1887 1887          NEWMONT MINING                       Mining
2906 2906            WYNN RESORTS           Gambling & Casinos
541   541             CENTURYLINK Telecom Services & Equipment
958   958 FARMERS INSURANCE GROUP                    Insurance
396   396             BOYD GAMING           Gambling & Casinos
                         CatCodeGroup                      CatCodeBusiness
1887       Energy & Natural Resources            Metal mining & processing
2906                 General Business       Casinos, racetracks & gambling
541      Communications & Electronics                   Telecommunications
958  Finance, Insurance & Real Estate Insurance agencies, brokers & agents
396                  General Business       Casinos, racetracks & gambling
       PerDem   PerRep    DemCol    RepCol    Total Proportion_Rep Type degree
1887 62.64368 37.35632 #00003DAF #00003DAF  87.0000      0.3735632 Both     82
2906 45.98214 54.01786 #1A0000AF #1A0000AF 112.0000      0.5401786 Both    225
541  52.17391 47.82609 #00000AAF #00000AAF  80.5000      0.4782609 Both    163
958  38.27160 61.72840 #3D0000AF #3D0000AF  81.0000      0.6172840 Both    169
396  55.46543 44.53457 #00001AAF #00001AAF 142.4311      0.4453457 Both     86
       between
1887  198.2822
2906 3518.4604
541   393.7207
958   472.0289
396   536.6969
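To close the loop on the regression use case mentioned at the start of this section, here is a sketch of the full workflow on a built-in example graph. Everything here is illustrative: the graph is igraph's copy of Zachary's karate club, and covariate is a simulated variable standing in for a real nodal characteristic, so the fitted coefficients mean nothing substantively.

```r
library(igraph)

# Built-in example graph, standing in for real data
toy_g <- make_graph("Zachary")

# Simulated, purely illustrative vertex covariate
set.seed(1)
V(toy_g)$covariate <- rnorm(vcount(toy_g))

# Store the centrality score as a vertex attribute
V(toy_g)$degree <- degree(toy_g)

# Convert to a vertex-level data frame (prefixed in case another
# attached package also defines as_data_frame)
toy_df <- igraph::as_data_frame(toy_g, what = "vertices")

# Regress centrality on the covariate
fit <- lm(degree ~ covariate, data = toy_df)
coef(fit)
```

With your own network you would replace the simulated covariate with attributes loaded from your metadata.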

Using Both Packages

You need to be careful when you load both igraph and the statnet suite of packages. They have several functions with the exact same names, and the package loaded last masks the other. The example below shows what happens when you do this:

library(igraph)
library(sna)


degree(statnet_net)[1:10] #will work
 [1] 164 450 326 338 172 312 142 258 108 224
degree(igraph_net)[1:10] #won't work 
Error in FUN(X[[i]], ...): as.edgelist.sna input must be an adjacency matrix/array, edgelist matrix, network, or sparse matrix, or list thereof.

A solution to this is to prefix function calls with the package they come from:

sna::degree(statnet_net)[1:10] #will work
 [1] 164 450 326 338 172 312 142 258 108 224
igraph::degree(igraph_net)[1:10] #will work 
    1887     2906      541      958      396  9686991     3216  9688247 
      82      225      163      169       86      156       71      129 
    8044 24820735 
      54      112 

You can also unload packages using detach("package:igraph", unload=TRUE) or detach("package:sna", unload=TRUE).
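If you want to know in advance which function names clash, one way is to compare the exported names of the two packages. This sketch uses base R's getNamespaceExports() and assumes both igraph and sna are installed; it does not attach either package:

```r
# Names exported by both igraph and sna
shared <- intersect(getNamespaceExports("igraph"),
                    getNamespaceExports("sna"))

sort(shared)
```

Any function in this list is worth calling with the igraph:: or sna:: prefix whenever both packages are loaded.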