Introduction

Welcome

This vignette provides an introduction to using the backbone package to extract the backbone of a network. A backbone is a sparse and unweighted subgraph of a weighted or unweighted network that contains only the most ‘important’ or ‘significant’ edges A backbone can be useful when the original network is too dense, when edge weights are not needed, or when edge weights are difficult to interpret. For a complete example using empirical data, please see the Empirical Example of Backbone Extraction vignette.

The backbone package can be cited as:

Neal, Z. P. (2025). backbone: An R package to Extract Network Backbones. CRAN. https://doi.org/10.32614/CRAN.package.backbone

For additional resources on the backbone package, please see https://www.rbackbone.net/.

If you have questions about the backbone package or would like a backbone hex sticker, please contact the maintainer Zachary Neal by email (zpneal@msu.edu). Please report bugs in the backbone package at https://github.com/zpneal/backbone/issues.

Installing and loading the package

After the backbone package has been installed from CRAN using install.packages("backbone"), it can be loaded in the usual way:

library(backbone)
#>  ____   backbone v3.0.2
#> |  _ \  Cite: Neal, Z. P., (2025). backbone: An R package to extract network
#> |#|_) |       backbones. CRAN. https://doi.org/10.32614/CRAN.package.backbone
#> |# _ <
#> |#|_) | Help: type vignette("backbone"); email zpneal@msu.edu; github zpneal/backbone
#> |____/  Beta: type devtools::install_github("zpneal/backbone", ref = "devel")

Upon successful loading, a startup message will display that shows the version number, citation, ways to get help, and ways to contact me.

This installs and loads the current release of the package. However, the package is constantly being updated with new features and bug fixes. You can install the development version of the package using devtools::install_github("zpneal/backbone", ref = "devel", build_vignettes = TRUE). Although the development version has been tested, some new features may not yet work as intended.

For ease of illustration, this vignette also makes use of functions in the igraph package, which can be installed from CRAN using install.packages("igraph") and loaded:

library(igraph)
#> Warning: package 'igraph' was built under R version 4.5.1
#> 
#> Attaching package: 'igraph'
#> The following objects are masked from 'package:stats':
#> 
#>     decompose, spectrum
#> The following object is masked from 'package:base':
#> 
#>     union

Package overview

The primary use of the backbone package is backbone extraction, that is, identifying and preserving only the most important edges in a network. Different types of networks require different backbone extraction methods, however the basic workflow using the backbone package is the same for all methods:

You begin with some network data dat, which may take the form of an an adjacency/incidence matrix or Matrix, or an igraph object.
You can run backbone(), which will detect your type of network and extract the backbone using an appropriate model and reasonable defaults. Or, depending on the type of network contained in dat (e.g., weighted, bipartite, unweighted), you can run the applicable underlying backbone function (backbone_from_projection(), backbone_from_weighted(), or backbone_from_unweighted()). In all cases, these functions return a backbone of the same class as dat, so your workflow isn’t interrupted.
By default, backbone() displays narrative text describing the backbone extraction with the relevant citations. This narrative is also available optionally from the other functions using narrative = TRUE. It is suitable for use in manuscripts, and ensures complete and consistent reporting of the backbone extraction.

The sections below describe and illustrate the backbone functions applicable to different types of networks, including weighted networks, weighted projections, and unweighted networks. Additionally, the package contains some utility functions that are used by the backbone methods, but may also have independent applications.

New in Version 3.0.0 – In older versions, each backbone extraction model required a different function. For example, you’d use disparity() to extract a backbone using the disparity filter, and mlf() to extract it using the marginal likelihood filter. Now all models for a specific type of network are accessible in a single function. For example, you can use backbone_from_weighted(model = "disparity") to use the disparity filter, and backbone_from_weighted(model = "mlf") to use the marginal likelihood filter.

Supported data formats

The functions in the backbone package support three data formats that can be used to represent a network or graph in R:

matrix or Matrix - A bipartite network can be represented as an incidence matrix, and a unipartite network can be represented as an adjacency matrix. An incidence/adjacency matrix can be supplied to a backbone extraction function as a matrix object or as a Matrix object created by the Matrix package.
igraph - A network can be supplied to a backbone extraction function as an igraph object from the igraph package.

The backbone package does not support edgelist formats because these can be ambiguous when isolated nodes are present. It also does not support network objects from the network package (from the statnet family of packages) because these require loading many dependencies. You can convert a network bipartite graph into an incidence matrix using network::as.matrix.network(graph, type = "incidence"), and can convert a network unipartite graph into an adjacency matrix using network::as.matrix.network(graph, type = "adjacency").

back to Table of Contents

tl;dr (the `backbone()` wrapper)

OK, so you just want to extract the backbone from a network. This is possible using the backbone() wrapper.

my_network <- sample_gnp(100, .5)  #A random network
my_network
#> IGRAPH 4ab4276 U--- 100 2504 -- Erdos-Renyi (gnp) graph
#> + attr: name (g/c), type (g/c), loops (g/l), p (g/n)
#> + edges from 4ab4276:
#>  [1]  1-- 4  2-- 4  3-- 4  1-- 5  3-- 5  4-- 5  4-- 6  5-- 6  1-- 7  3-- 7
#> [11]  1-- 8  2-- 8  1-- 9  3-- 9  4-- 9  6-- 9  7-- 9  1--10  3--10  4--10
#> [21]  5--10  6--10  7--10  1--11  2--11  4--11  8--11  6--12 11--12  3--13
#> [31]  4--13  6--13  8--13  9--13 11--13  2--14  4--14  6--14  7--14  9--14
#> [41] 10--14 11--14 12--14  4--15  6--15  7--15  9--15 10--15  3--16  4--16
#> [51]  7--16  9--16 10--16 14--16 15--16  1--17  2--17  3--17  4--17  5--17
#> [61]  8--17  9--17 16--17  2--18  3--18  4--18  5--18 10--18 11--18 12--18
#> [71] 13--18 15--18  2--19  3--19  5--19  8--19  9--19 10--19 12--19 14--19
#> + ... omitted several edges
backbone <- backbone(my_network)  #Extract the backbone
#> The backbone package for R (v3.0.2; Neal, 2025) was used to extract the unweighted backbone of an unweighted network containing 100 nodes. Edges were selected for retention in the backbone using Satuluri et al's (2011) Local Sparsification backbone model (filtering parameter = 0.5), which reduced the number of edges by 80.07%.
#> 
#> Neal, Z. P. 2025. backbone: An R Package to Extract Network Backbones. CRAN. https://doi.org/10.32614/CRAN.package.backbone
#> 
#> Satuluri, V., Parthasarathy, S., & Ruan, Y. (2011, June). Local graph sparsification for scalable clustering. In Proceedings of the 2011 ACM SIGMOD International Conference on Management of data (pp. 721-732). https://doi.org/10.1145/1989323.1989399
backbone
#> IGRAPH 17f10dc U--- 100 499 -- lspar backbone of Erdos-Renyi (gnp) graph
#> + attr: name (g/c), type (g/c), loops (g/l), p (g/n), call (g/x),
#> | narrative (g/c)
#> + edges from 17f10dc:
#>  [1]  3-- 4  4-- 6  1--10  5--10  7--10  2--11  4--11  8--11  4--13 11--13
#> [11] 10--15  4--16 10--18 18--19  1--20  7--20  3--21  4--21  5--22 12--22
#> [21] 20--22 21--22 20--23  3--24 16--24 18--25  2--26 11--26 12--26 20--26
#> [31] 14--27 23--27  1--28 11--28  3--29  4--29  5--29  7--29 10--29 17--29
#> [41] 21--29 22--30 24--30  3--31  7--31 21--31  9--32  6--33 14--33 23--33
#> [51]  5--34 19--34 26--34 12--35 13--35 22--35  9--36 16--36 22--36 33--36
#> [61]  7--37 12--37 20--37 12--38 22--38 35--38 36--38  2--39 14--39 22--39
#> + ... omitted several edges

Given a network called my_network, running backbone() will extract the backbone using some reasonable defaults and will display a description of what it did.

Weighted networks

Background

A weighted unipartite network contains one type of node, and pairs of nodes are connected by edges that have weights. These weights might indicate the intensity or frequency of the relationship between them. A weighted unipartite network can be represented by an adjacency matrix W, where W_ij indicates the weight of the edge between node i and node j. Weighted networks can be undirected (i.e., W_ij = W_ji) or directed (i.e., W_ij != W_ji). The goal of backbone extraction from a weighted network is to identify edges in W that are strong and should be preserved in the backbone.

In the subsections below, we illustrate backbone extraction methods in an example weighted network W:

W <- matrix(c(0,10,10,10,10,75,0,0,0,0,  #Adjacency matrix of example network
              10,0,1,1,1,0,0,0,0,0,
              10,1,0,1,1,0,0,0,0,0,
              10,1,1,0,1,0,0,0,0,0,
              10,1,1,1,0,0,0,0,0,0,
              75,0,0,0,0,0,100,100,100,100,
              0,0,0,0,0,100,0,10,10,10,
              0,0,0,0,0,100,10,0,10,10,
              0,0,0,0,0,100,10,10,0,10,
              0,0,0,0,0,100,10,10,10,0),10)
W <- graph_from_adjacency_matrix(W, mode = "undirected", weighted = TRUE, diag = FALSE)  #Convert to an igraph object

This network has strongly heterogeneous edge weights. Some edges are quite weak (10) while others are quite strong (100), which gives the network a multi-scalar character. This network might represent the number of passengers flying between each of 10 airports, where there are two hub airports (one large, one small) that each serve their local areas.

Although it is possible to visualize this network by using thicker lines to represent stronger edges, in larger networks this often leads to cluttered visualizations. Additionally, many network analytic techniques (e.g., ERGM, SAOM) have not yet been extended for application to weighted. In such cases, it can be useful to focus on this network’s backbone.

plot(W, edge.width = sqrt(E(W)$weight), vertex.label = NA)

back to Table of Contents

Structural models

Structural models choose edges for retention based on edges’ structural properties, mainly the values of their edge weights. The backbone package implements one structural model, the Global Threshold, which simply preserves all edges whose weights are larger than a given threshold value.

We could extract a global threshold backbone using a threshold of 0 using the backbone_from_weighted() function and specifying model = "global" and parameter = 0. This preserves all the edges in the backbone and yields a simpler unweighted network, but one that fails to preserve the network’s true hub-and-spoke structure.

bb <- backbone_from_weighted(W, model = "global", parameter = 0)
plot(bb, vertex.label = NA)

We could choose the threshold value based on the edge weights. For example, we could use the average edge weight as the threshold to extract a backbone that preserves the stronger-than-average edges. However, this yields a network that is focused only on high-degree nodes (i.e., the larger hub airport), while ignoring the low-degree nodes (i.e., the smaller hub airport):

bb <- backbone_from_weighted(W, model = "global", parameter =  mean(E(W)$weight))
plot(bb, vertex.label = NA)

The choice of a global threshold is often arbitary, but can significantly impact the structure of the backbone. Moreover, backbones extracted using a global threshold fail to preserve network structure when edge weights are heterogeneous. For this reason, global threshold backbones should generally be avoided.

back to Table of Contents

Statistical models

Statistical models choose edges to retain in the backbone by testing whether their weights are statistically significantly stronger than expected under a null model, with the stringency of the statistical test controlled by the alpha parameter. Among the most widely used statistical backbone model is the disparity filter (Serrano, Boguná, and Vespignani 2009).

We can extract a disparity filter backbone at a conventional level of statistical significance using the backbone_from_weighted() function and specifying model = "disparity" and alpha = 0.05. Because this model evaluates the strength of edges from the local perspective of each node, it preserves the network’s true hub-and-spoke structure.

bb <- backbone_from_weighted(W, model = "disparity", alpha = 0.05)
plot(bb, vertex.label = NA)

Two additional statistical models are also available:

locally adaptive network sparsification using model = "lans" (Foti, Hughes, and Rockmore 2011)
marginal likelihood filter using model = "mlf" (Dianati 2016)

Despite differences in their null models, they operate similarly to the disparity filter. Currently there is limited guidance on model selection, however the disparity filter is the most widely known and thus is often the default choice unless there is a specific reason to choose a different model.

back to Table of Contents

Weighted projections

Background (Neal 2014)

The backbone models available using backbone_from_weighted() can be applied to any weighted network. However, specialized models exist for weighted networks that are the result of projecting a bipartite network or hypergraph.

A bipartite network contains two types of nodes, which we call agents and artifacts, and edges that exist only between nodes of different types (i.e., an edge may connect an agent to an artifact, but never an agent to another agent). For example, a bipartite network may represent authors (agents) connected to the papers (artifacts) they they wrote. A hypergraph contains nodes that we call agents. Unlike an ordinary network, edges in a hypergraph (i.e., hyperedges) can connect more than two agents simultaneously. For example, a single edge in a hypergraph (i.e., a hyperedge) can connect all the authors (agents) who wrote a given paper. Both bipartite networks and hypergraphs can be represented by an incidence matrix B, where B_ik = 1 if agent i is connected to artifact k or if agent i is connected to hyperedge k, and otherwise is 0 (e.g., author i wrote paper k).

A bipartite or hypergraph projection is a network of agents that are connected by the fact that they share artifacts or hyperedges in common. A projection can be represented by a square symmetric matrix P, which is formed as P = B * B’, where B’ is the transpose of B. In P, P_ij indicates the number of artifacts or hyperedges k that are shared by agents i and j, and can be viewed as the weight of the edge connecting them in the network (e.g., the number of papers co-authored by authors i and j). However, these weights can be noisy. The goal of backbone extraction from weighted projections is to identify edges in P whose weights are strong and should be preserved in the backbone.

In the subsections below, we illustrate backbone extraction methods in an example bipartite network B:

B <- rbind(cbind(matrix(rbinom(250,1,.8),10),  #Incidence matrix of example bipartite network
                 matrix(rbinom(250,1,.2),10),
                 matrix(rbinom(250,1,.2),10)),
           cbind(matrix(rbinom(250,1,.2),10),
                 matrix(rbinom(250,1,.8),10),
                 matrix(rbinom(250,1,.2),10)),
           cbind(matrix(rbinom(250,1,.2),10),
                 matrix(rbinom(250,1,.2),10),
                 matrix(rbinom(250,1,.8),10)))
B <- graph_from_biadjacency_matrix(B)  #Convert to an igraph object
plot(B, layout = layout_as_bipartite(B), vertex.label = NA, vertex.size = 1)

This network contains 30 agents and 75 artifacts, and might represent 30 authors’ participation in 75 papers across three overlapping disciplines.

We can construct this network’s weighted projection P and retain the shared number of artifacts between agents as edge weights. However, this yields a very dense network with no obvious structure:

P <- bipartite_projection(B, which = "true")
plot(P, vertex.label = NA, edge.width = sqrt(E(P)$weight), edge.color = rgb(0,0,0,.1))

back to Table of Contents

Statistical models

All of the backbone models implemented in the backbone package for weighted projections are statistical models, which choose edges to retain in the backbone by testing whether their weights are statistically significantly stronger than expected under a null model, with the stringency of the statistical test controlled by the alpha parameter. Among the fastest and most widely used backbone models is the Stochastic Degree Sequency Model (SDSM) (Neal, Domagalski, and Sagan 2021).

We can extract an SDSM backbone at a conventional level of statistical significance using the backbone_from_projection() function and specifying model = "sdsm" and alpha = 0.05. Importantly, weighted projection backbones are based on the original bipartite network or hypergraph, not on the weighted projection. Therefore, we must supply B and not P to the backbone_from_projection() function. Because this model uses information from the original network to identify strong edges, it preserves the network’s true three-community structure.

bb <- backbone_from_projection(B, model = "sdsm", alpha = 0.05)
plot(bb, vertex.label = NA)

Four additional statistical models are also available:

fixed degree sequence model (FDSM) using model = "fdsm" (Zweig and Kaufmann 2011)
fixed row model using model = "fixedrow" (Neal, Domagalski, and Sagan 2021)
fixed column model using model = "fixedcol" (Neal, Domagalski, and Sagan 2021)
fixed fill model using model = "fixedfill" (Neal, Domagalski, and Sagan 2021)

Each of these models uses different null models that constrain different features of B‘s incidence matrix when evaluating edges’ strength. The SDSM and FDSM are often most useful, but have trade-offs. The FDSM offers more statistical power, but requires computationally expensive monte carlo methods to estimate edges p-values. In contrast, the SDSM is less powerful (although this can be compensated by using a more liberal alpha), but can quickly compute edges’ exact p-values.

back to Table of Contents

Unweighted networks

Background

An unweighted unipartite network contains one type of node, and pairs of nodes are either connected or not. An unweighted unipartite network can be represented by an adjacency matrix U where U_ij = 1 if nodes i and j are connected, and otherwise is 0. Unweighted networks can be undirected (i.e., U_ij = U_ji) or directed (i.e., U_ij != U_ji), but currently the models implemented in the backbone package only allow undirected networks.

The goal of backbone extraction from an unweighted network is to identify edges in U that are important to the network’s structure, and thus should be preserved in the backbone. Because the network is already unweighted, and the goal is simply to reduce the number of edges, this process is sometimes also called sparsification. The challenge is that, unlike weighted networks and weighted projections, the edges in unweighted networks do not have weights that can be used to evaluate their importance. To overcome this challenge, backbone models designed for unweighted networks first assign each edge a score that is intended to capture its importance.

back to Table of Contents

Structural models

All of the backbone models implemented in the backbone package for unweighted networks are structural models. Different models are designed to preserve or uncover different structural features of the network, but all are characterized by four steps:

What metric is used for the edge score?
How is the edge score normalized?
How are edges filtered based on the (normalized) edge score?
Are edges in the union of minimum spanning trees (UMST) preserved to ensure connectivity?

Some models, such as the Local Sparsification (L-Spar) model (Satuluri, Parthasarathy, and Ruan 2011), are designed to uncover hidden clustering. This model involves assigning each edge a score using the Jaccard coefficient, normalizing these scores by ranking them from the perspective of each node, and filtering them based on the degree of each node. To illustrate, we can generate a dense unweighted network U that contains three hidden communities, then extract its backbone using the backbone_from_unweighted() function and specifying model = "lspar" and parameter = 0.5. The backbone reveals the network’s known three communities.

par(mfrow = c(1, 2), mar = c(0,0,2,0))  #Display plots side-by-side
U <- sample_sbm(60, matrix(c(.75,.25,.25,.25,.75,.25,.25,.25,.75),3,3), c(20,20,20))  #Example unweighted network
plot(U, main = "Original Network", vertex.label = NA)
bb <- backbone_from_unweighted(U, model = "lspar", parameter = 0.5)
plot(bb, main = "L-Spar Backbone", vertex.label = NA)  #Communities are clearly visible

Other models, such as the Local Degree model (Hamann et al. 2016), are designed to uncover hubs. This model involves assigning each edge a score using the degree of an endpoint node, normalizing these scores by ranking them from the perspective of each node, and filtering them based on the degree of each node. To illustrate, we can generate a dense network U that contains some hubs, then extract its backbone using the backbone_from_unweighted() function and specifying model = "degree" and parameter = 0.25. The backbone preserves the highest-degree nodes as hubs.

par(mfrow = c(1, 2), mar = c(0,0,2,0))  #Display plots side-by-side
U <- sample_pa(n = 60, m = 3, directed = FALSE)  #Example unweighted network
V(U)$name <- ""
V(U)$name[which(degree(U)>=sort(degree(U), decreasing = TRUE)[5])] <- LETTERS[1:5]  #Label five highest-degree nodes
#> Warning in V(U)$name[which(degree(U) >= sort(degree(U), decreasing = TRUE)[5])]
#> <- LETTERS[1:5]: number of items to replace is not a multiple of replacement
#> length
plot(U, vertex.size = degree(U), vertex.label.family = "sans", vertex.label.font = 2, main = "Original Network")
bb <- backbone_from_unweighted(U, model = "degree", parameter = 0.25)
plot(bb, vertex.size = degree(bb), vertex.label.family = "sans", vertex.label.font = 2, main = "Local Degree Backbone")

Eight additional structural models are also available:

Skeleton backbone using model = "skeleton" (Karger 1994)
Global sparsification using model = "gspar" (Satuluri, Parthasarathy, and Ruan 2011)
Simmelian backbone using model = "simmelian" (Nick et al. 2013)
Jaccard backbone using model = "jaccard" (Goldberg and Roth 2003)
Meetmin backbone using model = "meetmin" (Goldberg and Roth 2003)
Geometric backbone using model = "geometric" (Goldberg and Roth 2003)
Hypergeometric backbone using model = "hyper" (Goldberg and Roth 2003)
Quadrillateral Simmelian backbone using model = "quadrilateral" (Nocaj, Ortmann, and Brandes 2015).

Each of these models involve different ways of scoring, normalizing, and filtering edges. Currently there is limited guidance on model selection, however Hamann et al. (2016) compared some of these models to understand how they yield different backbone.

back to Table of Contents

Advanced functions

Reproducibility

By default, backbone_from_ functions return a backbone in the same class (matrix, Matrix, or igraph) as the original network. However, to facilitate transparency and reproducibility, specifying backbone_only = FALSE returns a backbone-class object that can be fully described using print().

For example:

B <- matrix(sample(c(0,1), 48, replace = TRUE), 6, 8)  #A simple incidence matrix
bb <- backbone(B, backbone_only = FALSE)  #Extract backbone, return backbone object
#> The backbone package for R (v3.0.2; Neal, 2025) was used to extract the unweighted backbone of the weighted projection of a bipartite network containing 6 agents and 8 artifacts. An edge was retained in the backbone if its weight was statistically significant (alpha = 0.05) using the stochastic degree sequence model (SDSM; Neal, Domagalski, and Sagan, 2021), which reduced the number of edges by 100%.
#> 
#> Neal, Z. P. 2025. backbone: An R Package to Extract Network Backbones. CRAN. https://doi.org/10.32614/CRAN.package.backbone
#> 
#> Neal, Z. P., Domagalski, R., and Sagan, B. (2021). Comparing Alternatives to the Fixed Degree Sequence Model for Extracting the Backbone of Bipartite Projections. Scientific Reports, 11, 23929. https://doi.org/10.1038/s41598-021-03238-3
bb
#> 
#> === SUMMARY OF BACKBONE OBJECT ===
#> ORIGINAL:   Bipartite with 6 agents, 8 artifacts, and 22 edges 
#> 
#> PROJECTION: Weighted with 6 nodes and 16 edges (weights: 1 - 6) 
#> 
#> BACKBONE:   Unweighted with 6 nodes and 0 edges 
#>    Model:   Stochastic Degree Sequence with alpha = 0.05 
#>     Call:   backbone_from_projection(B = N, narrative = TRUE, backbone_only = FALSE) 
#> 
#> DESCRIPTION -
#> The backbone package for R (v3.0.2; Neal, 2025) was used to extract the unweighted backbone of the weighted projection of a bipartite network containing 6 agents and 8 artifacts. An edge was retained in the backbone if its weight was statistically significant (alpha = 0.05) using the stochastic degree sequence model (SDSM; Neal, Domagalski, and Sagan, 2021), which reduced the number of edges by 100%.
#> 
#> Neal, Z. P. 2025. backbone: An R Package to Extract Network Backbones. CRAN. https://doi.org/10.32614/CRAN.package.backbone
#> 
#> Neal, Z. P., Domagalski, R., and Sagan, B. (2021). Comparing Alternatives to the Fixed Degree Sequence Model for Extracting the Backbone of Bipartite Projections. Scientific Reports, 11, 23929. https://doi.org/10.1038/s41598-021-03238-3 
#> ==================================

back to Table of Contents

Signed backbones

By default, statistical backbone models return a backbone that contains edges whose weights were statistically significantly stronger than expected under the corresponding null model using a one-tailed test. However, it is also possible to return a signed backbone that contains positive edges whose weights were statistically significantly stronger than expected, and negative edges whose weights were statistically significantly weaker than expected, using a two-tailed test. Signed backbones are obtained by setting signed = TRUE.

Returning to the example bipartite backbone extraction, we can compare the default non-signed backbone to a signed backbone. Signed backbone extraction relies on a two-tailed test for each edge weight, so to make the two backbones comparable in stringency, we double the alpha parameter (alpha = 0.1) when extracting the signed backbone. The example network is known to contain three communities, which are clearly visible in the non-signed backbone. In the signed backbone, these communities are connected by positive (green) edges, while negative (red) edges exist between communities. If the original bipartite network represents authors connected to papers they have written, then the signed backbone may represent relations of both collaboration (positive) and avoidance (negative).

par(mfrow = c(1, 2), mar = c(0,0,2,0))  #Display plots side-by-side
B <- rbind(cbind(matrix(rbinom(250,1,.8),10),  #Incidence matrix of example bipartite network
                 matrix(rbinom(250,1,.2),10),
                 matrix(rbinom(250,1,.2),10)),
           cbind(matrix(rbinom(250,1,.2),10),
                 matrix(rbinom(250,1,.8),10),
                 matrix(rbinom(250,1,.2),10)),
           cbind(matrix(rbinom(250,1,.2),10),
                 matrix(rbinom(250,1,.2),10),
                 matrix(rbinom(250,1,.8),10)))
B <- graph_from_biadjacency_matrix(B)  #Convert to an igraph object
bb1 <- backbone_from_projection(B, model = "sdsm", alpha = 0.05)  #Extract non-signed backbone
layout <- layout_nicely(bb1)
plot(bb1, vertex.labels = NA, layout = layout, main = "Non-signed")
bb2 <- backbone_from_projection(B, model = "sdsm", alpha = 0.1, signed = TRUE)  #Extract signed backbone
E(bb2)$color <- "green"
E(bb2)$color[which(E(bb2)$sign==-1)] <- "red"
plot(bb2, vertex.labels = NA, layout = layout, main = "Signed")

back to Table of Contents

Custom backbones

The model argument in the backbone_from_unweighted() function allows the extraction of backbones from unweighted networks using 10 different backbone models that have previously been described in the literature. However, it is also possible to specify custom backbone models with model = "custom" and by using different combinations of escore, normalize, filter, and umst. For example, consider a backbone defined by the fewest, strongest (weighted by Jaccard coefficient) edges necessary to ensure a connected network. Although such a backbone has not been described in the literature (and may not be meaningful), it can be extracted using backbone_from_unweighted() with model = "custom":

par(mfrow = c(1, 2), mar = c(0,0,2,0))  #Display plots side-by-side
U <- sample_gnp(100, .25)
plot(U, vertex.label = NA, main = "Original Network")
bb <- backbone_from_unweighted(U, model = "custom", escore = "jaccard", normalize = "none", filter = "proportion", parameter = 0, umst = TRUE)
plot(bb, vertex.label = NA, main = "Custom Backbone")

The model = "custom" argument specifies that a custom backbone model specification should be used. The escore = "jaccard" argument specifies that each unweighted edge should receive a score equal to the Jaccard coefficient of its two endpoints’ neighborhoods, and the normalize = "none" argument specifies that these scores should not be normalized. The filter = "proportion" and parameter = 0 arguments specify that 0% of these edges should be retained in the backbone, while the umst = TRUE argument specifies that the union of maximum spanning trees should be included in the backbone.

back to Table of Contents

Utility functions

Matrix randomization

Extracting a backbone from a bipartite projection using the fixed degree sequence model with backbone_from_projection(model = "fdsm") requires randomly sampling matrices from the space of all binary matrices with given row and column sums, which is performed using the fastball algorithm (Godard and Neal 2022). Given a matrix, fastball() returns a new matrix randomly sampled from the space of all binary matrices with the same row and column sums:

mat <- rbind(c(1,0,0), c(0,0,1), c(0,1,1))
mat
#>      [,1] [,2] [,3]
#> [1,]    1    0    0
#> [2,]    0    0    1
#> [3,]    0    1    1
fastball(mat)
#>      [,1] [,2] [,3]
#> [1,]    0    1    0
#> [2,]    0    0    1
#> [3,]    1    0    1

back to Table of Contents

Bipartite Configuration Model (BiCM)

Extracting a backbone from a bipartite projection using the stochastic degree sequence model with backbone_from_projection(model = "sdsm") requires determining the probability that B_ij = 1 in the space of all matrices with given row and column sums. For example, if the row sums of B are [1,1,2] and the column sums of B are [1,1,2], then in the space of all matrices with these row and column sums, the probability that B_ij = 1 is:

.2	.2	.6
.2	.2	.6
.6	.6	.8

These probabilities are usually unknown and must be estimated. The fastest and most accurate way to estimate them (Neal, Domagalski, and Sagan 2021) is using the Bipartite Configuration Model (Saracco et al. 2015). Given a matrix mat with specific row and column sums, bicm() estimates these probabilities:

mat <- rbind(c(1,0,0), c(0,0,1), c(0,1,1))
bicm(mat)
#>           [,1]      [,2]      [,3]
#> [1,] 0.2157793 0.2157793 0.5684414
#> [2,] 0.2157793 0.2157793 0.5684414
#> [3,] 0.5684414 0.5684414 0.8631171

back to Table of Contents

References

Dianati, N. 2016. “Unwinding the Hairball Graph: Pruning Algorithms for Weighted Complex Networks.” Physical Review E 93: 012304. https://doi.org/10.1103/PhysRevE.93.012304.

Foti, N. J., J. M. Hughes, and D. N. Rockmore. 2011. “Nonparametric Sparsification of Complex Multiscale Networks.” PLOS ONE 6: e16431. https://doi.org/10.1371/journal.pone.0016431.

Godard, Karl, and Zachary P Neal. 2022. “fastball: A Fast Algorithm to Sample Bipartite Graphs with Fixed Degree Sequences.” Journal of Complex Networks 10 (6): cnas049. https://doi.org/10.1093/comnet/cnac049.

Goldberg, Debra S, and Frederick P Roth. 2003. “Assessing Experimentally Derived Interactions in a Small World.” Proceedings of the National Academy of Sciences 100 (8): 4372–76. https://doi.org/10.1073/pnas.0735871100.

Hamann, Michael, Gerd Lindner, Henning Meyerhenke, Christian L Staudt, and Dorothea Wagner. 2016. “Structure-Preserving Sparsification Methods for Social Networks.” Social Network Analysis and Mining 6 (1): 22. https://doi.org/10.1007/s13278-016-0332-2.

Karger, David R. 1994. “Random Sampling in Cut, Flow, and Network Design Problems.” In Proceedings of the Twenty-Sixth Annual ACM Symposium on Theory of Computing, 648–57. https://doi.org/10.1145/195058.195422.

Neal, Zachary P. 2014. “The Backbone of Bipartite Projections: Inferring Relationships from Co-Authorship, Co-Sponsorship, Co-Attendance and Other Co-Behaviors.” Social Networks 39 (October): 84–97. https://doi.org/10.1016/j.socnet.2014.06.001.

Neal, Zachary P, Rachel Domagalski, and Bruce Sagan. 2021. “Comparing Alternatives to the Fixed Degree Sequence Model for Extracting the Backbone of Bipartite Projections.” Scientific Reports 11: 23929. https://doi.org/10.1038/s41598-021-03238-3.

Nick, Bobo, Conrad Lee, Pádraig Cunningham, and Ulrik Brandes. 2013. “Simmelian Backbones: Amplifying Hidden Homophily in Facebook Networks.” In Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, 525–32. https://doi.org/10.1145/2492517.2492569.

Nocaj, Arlind, Mark Ortmann, and Ulrik Brandes. 2015. “Untangling the Hairballs of Multi-Centered, Small-World Online Social Media Networks.” Journal of Graph Algorithms and Applications 19 (2): 595–618. https://doi.org/10.7155/jgaa.00370.

Saracco, Fabio, Riccardo Di Clemente, Andrea Gabrielli, and Tiziano Squartini. 2015. “Randomizing Bipartite Networks: The Case of the World Trade Web.” Scientific Reports 5 (1): 1–18. https://doi.org/10.1038/srep10595.

Satuluri, Venu, Srinivasan Parthasarathy, and Yiye Ruan. 2011. “Local Graph Sparsification for Scalable Clustering.” In Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data, 721–32. https://doi.org/10.1145/1989323.1989399.

Serrano, M Ángeles, Marián Boguná, and Alessandro Vespignani. 2009. “Extracting the Multiscale Backbone of Complex Weighted Networks.” Proceedings of the National Academy of Sciences 106 (16): 6483–88. https://doi.org/10.1073/pnas.0808904106.

Zweig, Katharina Anna, and Michael Kaufmann. 2011. “A Systematic Approach to the One-Mode Projection of Bipartite Graphs.” Social Network Analysis and Mining 1 (3): 187–218. https://doi.org/10.1007/s13278-011-0021-0.

Introduction to Backbone

Zachary Neal, Michigan State University, zpneal@msu.edu

Table of Contents

Introduction

Welcome

Installing and loading the package

Package overview

Supported data formats

tl;dr (the `backbone()` wrapper)

Weighted networks

Background

Structural models

Statistical models

Weighted projections

Background (Neal 2014)

Statistical models

Unweighted networks

Background

Structural models

Advanced functions

Reproducibility

Signed backbones

Custom backbones

Utility functions

Matrix randomization

Bipartite Configuration Model (BiCM)

References

Introduction to Backbone

Zachary Neal, Michigan State University, zpneal@msu.edu

Table of Contents

Introduction

Welcome

Installing and loading the package

Package overview

Supported data formats

tl;dr (the backbone() wrapper)

Weighted networks

Background

Structural models

Statistical models

Weighted projections

Background (Neal 2014)

Statistical models

Unweighted networks

Background

Structural models

Advanced functions

Reproducibility

Signed backbones

Custom backbones

Utility functions

Matrix randomization

Bipartite Configuration Model (BiCM)

References

tl;dr (the `backbone()` wrapper)