The rapid digitization of healthcare data, particularly through electronic health records (EHRs), has created unprecedented opportunities for biomedical research. EHRs contain rich, heterogeneous, and longitudinal data that, when analyzed at a systems level, can reveal complex patterns underlying disease progression, comorbidities, and patient trajectories. However, the high-dimensional and interdependent nature of these data poses significant analytical challenges, particularly when accounting for temporal dependencies and hierarchical structures inherent in longitudinal studies. Traditional methods, such as Gaussian Graphical Modeling and Vector Autoregression, often fall short in addressing these complexities due to strict assumptions of independence and stationarity, limiting their applicability to real-world EHR data. To overcome these limitations, we introduce MariNET, a novel methodology that leverages linear mixed models (LMM) to infer network relationships from longitudinal EHR data. By incorporating weights derived from LMMs, our method effectively handles correlated observations and provides a robust framework for analyzing dynamic interactions among clinical variables over time. This approach not only enhances the understanding of temporal dependencies in healthcare data but also offers a scalable and practical solution for uncovering clinically relevant insights.
MariNET 1.0.0
The MariNET package provides tools for analyzing longitudinal clinical data using linear mixed models (LMMs) and visualizing the results as networks. This vignette demonstrates how to use the package to perform longitudinal analysis and generate network plots.
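The purpose of this vignette is to showcase the functionality of the package, including loading the example dataset, building networks with lmm_analysis(), plotting them with qgraph, and comparing two networks with differentiation().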
You can install the MariNET package from CRAN using:
#install.packages("MariNET")
library("MariNET")
In this section, we load the dataset included in the package. The sample data come from a previous assessment study of the relationships between COVID-19 and clinical variables related to mental health and social contact (Fried, Papanikolaou, and Epskamp 2022).
# Load the dataset from the package
data(example_data)
# Display the first few rows
head(example_data)
#> id Relax Irritable Worry Nervous Future Anhedonia Tired Hungry Alone Angry
#> 1 1 1 1 2 1 1 1 2 3 1 2
#> 2 1 2 1 2 1 1 1 1 2 1 2
#> 3 1 1 1 3 1 1 1 2 3 1 1
#> 4 1 1 1 3 2 1 1 2 2 1 1
#> 5 1 1 1 2 1 1 1 2 3 1 1
#> 6 1 3 2 2 1 1 1 1 2 1 1
#> Social_offline Social_online Music Procrastinate Outdoors C19_occupied
#> 1 3 3 2 1 1 3
#> 2 5 3 2 2 2 2
#> 3 4 4 3 3 1 3
#> 4 5 3 1 3 1 3
#> 5 2 3 1 1 1 2
#> 6 4 4 2 2 2 2
#> C19_worry Home day beep conc
#> 1 2 5 1 0 1
#> 2 2 5 1 1 2
#> 3 1 5 1 2 3
#> 4 2 5 1 3 4
#> 5 1 5 2 0 5
#> 6 1 4 2 1 6
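The printed rows suggest a long format, with one row per measurement occasion identified by id, day, and beep. A minimal sketch of a structure check is given here; `first_rows` is a hypothetical data frame that only transcribes the id, day, beep, and conc columns of the six rows shown above, so it illustrates the check rather than the full dataset.

```r
# Transcription of the identifier columns from the six rows printed above
first_rows <- data.frame(
  id   = c(1, 1, 1, 1, 1, 1),
  day  = c(1, 1, 1, 1, 2, 2),
  beep = c(0, 1, 2, 3, 0, 1),
  conc = 1:6
)

# Count measurements per subject and day
obs_per_day <- table(id = first_rows$id, day = first_rows$day)
print(obs_per_day)

# Each (id, day, beep) combination should identify a single row
stopifnot(!anyDuplicated(first_rows[, c("id", "day", "beep")]))
```

The same two calls can be run on example_data to see how many measurements each participant contributes.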
The present package focuses on the use of linear mixed models for network construction. The methodology is not specific to clinical data: it can be applied in other fields, since the origin of the data makes no difference to the method's applicability (Bates et al. 2015).
For network construction, a separate linear mixed model is fitted for each clinical variable, taking that variable as the response and the remaining variables as predictors. The process is repeated iteratively for every variable, as in previous studies (Velden et al. 2018).
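The node-wise procedure can be sketched directly with lme4. This is an illustration under stated assumptions, not the internal code of lmm_analysis(): the data frame `toy`, the matrix `weights`, and the convention of storing the t-value of each predictor in the column of its outcome are all hypothetical choices made for the example.

```r
library(lme4)

# Simulated longitudinal data: 20 subjects with 15 observations each (assumption)
set.seed(1)
n_id  <- 20
n_obs <- 15
toy <- data.frame(id = factor(rep(seq_len(n_id), each = n_obs)))
subject_shift <- rep(rnorm(n_id), each = n_obs)   # subject-level intercepts
toy$A <- subject_shift + rnorm(nrow(toy))
toy$B <- 0.5 * toy$A + rnorm(nrow(toy))
toy$C <- -0.3 * toy$B + rnorm(nrow(toy))

nodes   <- c("A", "B", "C")
weights <- matrix(0, length(nodes), length(nodes),
                  dimnames = list(nodes, nodes))

# One mixed model per node: the node is the response, all others are predictors
for (outcome in nodes) {
  predictors <- setdiff(nodes, outcome)
  f <- as.formula(paste(outcome, "~",
                        paste(predictors, collapse = " + "),
                        "+ (1 | id)"))
  fit <- lmer(f, data = toy)
  # Keep the t-value of each fixed-effect predictor as the edge weight
  weights[predictors, outcome] <- summary(fit)$coefficients[predictors, "t value"]
}

print(round(weights, 2))
```

The diagonal stays at zero because a variable is never used to predict itself. lme4 may print a singular-fit message for some of the toy models; this does not stop the loop.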
# Extract column names from the dataset
# These represent all available variables in the dataset
varLabs <- colnames(example_data)
# Define a list of variables to be removed from the analysis
# These variables are not included as nodes in the network visualization
remove <- c("id", "day", "beep", "conc")
# Filter out the unwanted variables
# Keeps only the variables that are not in the "remove" list
varLabs <- varLabs[!varLabs %in% remove]
# Print the final list of selected variables to be used as nodes in the network
print(varLabs)
#> [1] "Relax" "Irritable" "Worry" "Nervous"
#> [5] "Future" "Anhedonia" "Tired" "Hungry"
#> [9] "Alone" "Angry" "Social_offline" "Social_online"
#> [13] "Music" "Procrastinate" "Outdoors" "C19_occupied"
#> [17] "C19_worry" "Home"
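The same selection can be written with setdiff(), and it is worth confirming that every remaining column is numeric before modeling. The sketch below uses a small hypothetical data frame `toy_df` in place of example_data.

```r
# Hypothetical stand-in for example_data with two symptoms and the index columns
toy_df <- data.frame(
  id    = c(1, 1, 2, 2),
  Relax = c(1, 2, 1, 3),
  Worry = c(2, 2, 3, 1),
  day   = c(1, 2, 1, 2),
  beep  = c(0, 0, 0, 0),
  conc  = c(1, 2, 1, 2)
)

remove_cols <- c("id", "day", "beep", "conc")

# Logical-index version, as used in the vignette
keep_a <- colnames(toy_df)[!colnames(toy_df) %in% remove_cols]

# Equivalent set-difference version
keep_b <- setdiff(colnames(toy_df), remove_cols)

# Check that every selected column is numeric
all_numeric <- all(vapply(toy_df[keep_b], is.numeric, logical(1)))

print(keep_b)
print(all_numeric)
```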
The function lmm_analysis() is the main tool of this package. It requires input data with the following conditions:
# Perform Linear Mixed Model (LMM) analysis on the dataset
# This function iterates over selected variables (varLabs) and models their relationships
# while accounting for individual-level variability using a random effect.
model <- lmm_analysis(
  example_data,              # Input dataset containing clinical/longitudinal data
  varLabs,                   # List of selected variables to be analyzed in the model
  random_effects = "(1|id)"  # Specifies a random intercept for each individual (id)
)
# Print the model results (optional, useful for debugging or reviewing output)
# print(model)
To color the plot by grouping factors, the variables must first be assigned to groups. Here, related symptoms are placed in the same group so that each group is drawn in its own color. Visualization relies on the qgraph package (Epskamp et al. 2023).
# Define the community structure for the variables
# Assigns labels to different groups based on symptoms or categories
community_structure <- c(
  rep("Stress", 8),   # First 8 variables belong to the "Stress" group
  rep("Social", 6),   # Next 6 variables belong to the "Social" group
  rep("Covid-19", 4)  # Last 4 variables belong to the "Covid-19" group
)
# Create a dataframe linking variable names to their assigned community group
structure <- data.frame(varLabs, community_structure)
# Define labels for the network plot (using variable names)
labels <- varLabs
# Load the qgraph package for network visualization
library(qgraph)
# Generate the network plot using qgraph
qgraph(
  model,                                        # Adjacency matrix or network model input
  groups = structure$community_structure,       # Assign colors based on community groups
  labels = labels,                              # Display variable names as node labels
  legend = TRUE,                                # Include a legend in the plot
  layout = "spring",                            # Use a force-directed "spring" layout for better visualization
  color = c("orange", "lightblue", "#008080"),  # Define colors for different groups
  legend.cex = 0.3                              # Adjust the size of the legend text
)
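Because the groups are assigned purely by position, it can help to list which variables end up in which group before plotting. The sketch below copies the variable names from the output printed earlier and the group sizes from the code above; the object `members` is introduced only for this illustration.

```r
# Variable names as printed by print(varLabs)
varLabs <- c("Relax", "Irritable", "Worry", "Nervous", "Future", "Anhedonia",
             "Tired", "Hungry", "Alone", "Angry", "Social_offline",
             "Social_online", "Music", "Procrastinate", "Outdoors",
             "C19_occupied", "C19_worry", "Home")

community_structure <- c(rep("Stress", 8), rep("Social", 6), rep("Covid-19", 4))

# The group vector must have exactly one entry per node
stopifnot(length(varLabs) == length(community_structure))

# List the members of each group, keeping the groups in order of appearance
members <- split(varLabs,
                 factor(community_structure,
                        levels = unique(community_structure)))
print(members)
```

If the column order of the dataset changes, this listing shows immediately whether a variable has moved into the wrong group.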
Because the weighted matrix is built from t-values, its entries are not confined to the interval from -1 to 1. The resulting networks are therefore not directly comparable with those from common network modeling methods, which rely on correlations and pairwise estimation. To make them comparable, the adjacency matrix is normalized by scaling its values by their range; the normalized weighted matrices are then subtracted to reveal the differences.
# Fit a second Linear Mixed Model (LMM) with a more complex random effects structure
# This model accounts for repeated measures within individuals (id) over different days (day)
# and also considers an additional random effect for the variable "conc" (context or condition)
model2 <- lmm_analysis(
  example_data,  # Input dataset containing clinical/longitudinal data
  varLabs,       # List of selected variables to be analyzed in the model
  random_effects = "(1|id/day) + (1|conc)"  # Random effects structure:
  # (1|id/day) -> Nested random effect for each day within an individual
  # (1|conc) -> Additional random effect for "conc" variable
)
#> boundary (singular) fit: see help('isSingular')
#> boundary (singular) fit: see help('isSingular')
#> boundary (singular) fit: see help('isSingular')
# Generate a network visualization from the second LMM model
qgraph(
  model2,                                       # Adjacency matrix or network model derived from LMM
  groups = structure$community_structure,       # Assign colors based on predefined symptom groups
  labels = labels,                              # Display variable names as node labels
  legend = TRUE,                                # Include a legend in the plot
  layout = "spring",                            # Use a force-directed "spring" layout for better visualization
  color = c("orange", "lightblue", "#008080"),  # Define colors for different variable groups
  legend.cex = 0.3                              # Adjust the legend text size to avoid oversized labels
)
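The nested term (1|id/day) is lme4 shorthand for (1|id) + (1|id:day), and the "boundary (singular) fit" messages printed during fitting are emitted by lme4 when a variance component is estimated at or near zero. The sketch below illustrates both points on simulated data; the data frame `toy` and its effect sizes are assumptions made for the example and are unrelated to example_data.

```r
library(lme4)

# Simulated data: 10 subjects, 5 days each, 4 beeps per day (assumption)
set.seed(42)
toy <- expand.grid(beep = 1:4, day = factor(1:5), id = factor(1:10))
id_effect  <- rnorm(10, sd = 1)[as.integer(toy$id)]
day_effect <- rnorm(50, sd = 0.7)[as.integer(interaction(toy$id, toy$day))]
toy$x <- rnorm(nrow(toy))
toy$y <- 0.4 * toy$x + id_effect + day_effect + rnorm(nrow(toy), sd = 0.5)

# Shorthand and explicit versions of the nested random intercepts
fit_nested   <- lmer(y ~ x + (1 | id/day), data = toy)
fit_explicit <- lmer(y ~ x + (1 | id) + (1 | id:day), data = toy)

# Both formulas describe the same model, so the log-likelihoods should agree
same_fit <- isTRUE(all.equal(as.numeric(logLik(fit_nested)),
                             as.numeric(logLik(fit_explicit)),
                             tolerance = 1e-5))
print(same_fit)

# isSingular() reports whether a fit lies on the boundary of the parameter space
print(isSingular(fit_nested))
```

A singular fit is a message rather than an error. It usually indicates that the random effects structure is richer than the data can support for that particular outcome.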
The two adjacency matrices are subtracted with the differentiation() function, which normalizes each of them to the range -1 to 1 internally. The function takes two adjacency matrices as input; both must have the same dimensions and node names.
# Compute the difference between the two Linear Mixed Model (LMM) networks
# This highlights changes in relationships when considering different random effect structures
difference <- differentiation(model, model2)
# Generate a network visualization of the differences between the two models
qgraph(
  difference,                                   # Adjacency matrix representing differences between model and model2
  groups = structure$community_structure,       # Assign colors based on predefined variable groups
  labels = labels,                              # Display variable names as node labels
  legend = TRUE,                                # Include a legend in the plot
  layout = "spring",                            # Use a force-directed "spring" layout for better visualization
  color = c("orange", "lightblue", "#008080"),  # Define colors for different variable groups
  legend.cex = 0.3                              # Adjust legend text size to keep it readable
)
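The idea of normalizing two networks and then subtracting them can be sketched in base R. This is one plausible reading of the comparison, not the actual implementation of differentiation(): the helper `scale_to_unit()` is hypothetical and divides by the largest absolute weight, which keeps zeros and signs intact, and the two small matrices are invented for the example.

```r
# Hypothetical helper: rescale a weight matrix so its entries lie in [-1, 1]
scale_to_unit <- function(m) {
  max_abs <- max(abs(m), na.rm = TRUE)
  if (max_abs == 0) return(m)  # nothing to scale in an empty network
  m / max_abs
}

nodes <- c("A", "B", "C")
net1 <- matrix(c( 0,  4, -2,
                  4,  0,  1,
                 -2,  1,  0),
               nrow = 3, byrow = TRUE, dimnames = list(nodes, nodes))
net2 <- matrix(c( 0, 10, -5,
                 10,  0,  5,
                 -5,  5,  0),
               nrow = 3, byrow = TRUE, dimnames = list(nodes, nodes))

# Both inputs must share dimensions and node names
stopifnot(identical(dimnames(net1), dimnames(net2)))

diff_net <- scale_to_unit(net1) - scale_to_unit(net2)
print(diff_net)
```

In this toy case the A-B and A-C edges cancel because they keep the same relative strength in both networks, while the B-C edge, which is relatively stronger in net2, remains in the difference.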
The R session information below lists the R version, the loaded packages, and system details.
sessionInfo()
#> R version 4.4.3 (2025-02-28)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 22.04.5 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=es_ES.UTF-8 LC_COLLATE=C
#> [5] LC_MONETARY=es_ES.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=es_ES.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: Europe/Madrid
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] qgraph_1.9.8 MariNET_1.0.0 BiocStyle_2.32.1
#>
#> loaded via a namespace (and not attached):
#> [1] gtable_0.3.6 xfun_0.51 bslib_0.9.0
#> [4] ggplot2_3.5.1 htmlwidgets_1.6.4 psych_2.4.12
#> [7] lattice_0.22-5 quadprog_1.5-8 vctrs_0.6.5
#> [10] tools_4.4.3 Rdpack_2.6.3 generics_0.1.3
#> [13] stats4_4.4.3 parallel_4.4.3 tibble_3.2.1
#> [16] cluster_2.1.8 pkgconfig_2.0.3 Matrix_1.7-2
#> [19] data.table_1.17.0 checkmate_2.3.2 lifecycle_1.0.4
#> [22] compiler_4.4.3 stringr_1.5.1 tinytex_0.56
#> [25] mnormt_2.1.1 munsell_0.5.1 glasso_1.11
#> [28] htmltools_0.5.8.1 sass_0.4.9 fdrtool_1.2.18
#> [31] yaml_2.3.10 htmlTable_2.4.3 Formula_1.2-5
#> [34] pillar_1.10.1 nloptr_2.2.1 jquerylib_0.1.4
#> [37] MASS_7.3-64 cachem_1.1.0 Hmisc_5.2-3
#> [40] reformulas_0.4.0 abind_1.4-8 rpart_4.1.24
#> [43] boot_1.3-31 nlme_3.1-167 lavaan_0.6-19
#> [46] gtools_3.9.5 tidyselect_1.2.1 digest_0.6.37
#> [49] stringi_1.8.4 dplyr_1.1.4 reshape2_1.4.4
#> [52] bookdown_0.42 splines_4.4.3 fastmap_1.2.0
#> [55] grid_4.4.3 colorspace_2.1-1 cli_3.6.4
#> [58] magrittr_2.0.3 base64enc_0.1-3 pbivnorm_0.6.0
#> [61] withr_3.0.2 foreign_0.8-88 corpcor_1.6.10
#> [64] scales_1.3.0 backports_1.5.0 rmarkdown_2.29
#> [67] jpeg_0.1-10 igraph_2.1.4 nnet_7.3-20
#> [70] lme4_1.1-36 gridExtra_2.3 png_0.1-8
#> [73] pbapply_1.7-2 evaluate_1.0.3 knitr_1.50
#> [76] rbibutils_2.3 rlang_1.1.5 Rcpp_1.0.14
#> [79] glue_1.8.0 BiocManager_1.30.25 rstudioapi_0.17.1
#> [82] minqa_1.2.8 jsonlite_1.9.1 R6_2.6.1
#> [85] plyr_1.8.9