MariNET - A Novel Framework for Inferring Dynamic Network Relationships from Longitudinal EHRs Using Linear Mixed Models

Marina Vargas-Fernández1,2*, Jordi Martorell-Marugán2,3** and Pedro Carmona-Sáez1,2***

1Department of Statistics and Operational Research. University of Granada
2GENYO, Centre for Genomics and Oncological Research
3Andalusian Foundation for Biomedical Research in Eastern Andalusia (FIBAO)

*marina.vargas@genyo.es
**jordi.martorell@genyo.es
***pcarmona@ugr.es

19 March 2025

Abstract

The rapid digitization of healthcare data, particularly through electronic health records (EHRs), has created unprecedented opportunities for biomedical research. EHRs contain rich, heterogeneous, and longitudinal data that, when analyzed at a systems level, can reveal complex patterns underlying disease progression, comorbidities, and patient trajectories. However, the high-dimensional and interdependent nature of these data poses significant analytical challenges, particularly when accounting for temporal dependencies and hierarchical structures inherent in longitudinal studies. Traditional methods, such as Gaussian Graphical Modeling and Vector Autoregression, often fall short in addressing these complexities due to strict assumptions of independence and stationarity, limiting their applicability to real-world EHR data. To overcome these limitations, we introduce MariNET, a novel methodology that leverages linear mixed models (LMM) to infer network relationships from longitudinal EHR data. By incorporating weights derived from LMMs, our method effectively handles correlated observations and provides a robust framework for analyzing dynamic interactions among clinical variables over time. This approach not only enhances the understanding of temporal dependencies in healthcare data but also offers a scalable and practical solution for uncovering clinically relevant insights.

Package

MariNET 1.0.0

1 Introduction to MariNET

The MariNET package provides tools for analyzing longitudinal clinical data using linear mixed models (LMM) and visualizing the results as networks. This vignette demonstrates how to use the package to perform longitudinal analysis and generate network plots.

The purpose of this vignette is to showcase the functionality of the package, including:

Fitting linear mixed models to clinical data
Visualizing the results as networks
Comparing different network structures

2 Installation

You can install the MariNET package from CRAN using:

#install.packages("MariNET")
library("MariNET")

3 Loading Data

In this section, we load the dataset included in the package. The sample data come from a previous ecological momentary assessment study of the relationships between COVID-19 and clinical variables related to mental health and social contact (Fried, Papanikolaou, and Epskamp 2022).

# Load the dataset from the package
data(example_data)

# Display the first few rows
head(example_data)
#>   id Relax Irritable Worry Nervous Future Anhedonia Tired Hungry Alone Angry
#> 1  1     1         1     2       1      1         1     2      3     1     2
#> 2  1     2         1     2       1      1         1     1      2     1     2
#> 3  1     1         1     3       1      1         1     2      3     1     1
#> 4  1     1         1     3       2      1         1     2      2     1     1
#> 5  1     1         1     2       1      1         1     2      3     1     1
#> 6  1     3         2     2       1      1         1     1      2     1     1
#>   Social_offline Social_online Music Procrastinate Outdoors C19_occupied
#> 1              3             3     2             1        1            3
#> 2              5             3     2             2        2            2
#> 3              4             4     3             3        1            3
#> 4              5             3     1             3        1            3
#> 5              2             3     1             1        1            2
#> 6              4             4     2             2        2            2
#>   C19_worry Home day beep conc
#> 1         2    5   1    0    1
#> 2         2    5   1    1    2
#> 3         1    5   1    2    3
#> 4         2    5   1    3    4
#> 5         1    5   2    0    5
#> 6         1    4   2    1    6
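Because the data are longitudinal, it can help to confirm how many repeated measurements each participant contributes before fitting any model. The following is a minimal sketch on a small stand-in data frame; `toy` and its values are hypothetical and only mimic the id, day and beep columns shown above.

```r
# Hypothetical stand-in for a longitudinal dataset (not the package data)
toy <- data.frame(
  id    = rep(1:3, times = c(4, 4, 2)),
  day   = c(1, 1, 2, 2, 1, 1, 2, 2, 1, 1),
  beep  = c(0, 1, 0, 1, 0, 1, 0, 1, 0, 1),
  Relax = c(1, 2, 1, 1, 3, 2, 2, 1, 1, 2)
)

# Number of rows (measurement occasions) per participant
obs_per_id <- table(toy$id)

# Number of distinct days on which each participant was measured
days_per_id <- tapply(toy$day, toy$id, function(d) length(unique(d)))

print(obs_per_id)
print(days_per_id)
```

The same two calls can be run on example_data by replacing `toy` with the loaded dataset.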

4 Linear Mixed-Effects Model network

This package uses linear mixed models (Bates et al. 2015) to construct networks. Although the examples here use clinical data, the methodology does not depend on the origin of the data and can be applied in other fields.

For network construction, a separate linear mixed model is fitted for each clinical variable, taking that variable as the response and the remaining variables as predictors. This process is repeated for every variable, as in previous studies (Velden et al. 2018).
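The per-variable fitting described above can be sketched directly with lme4. This is an illustration under stated assumptions, not the internals of the package: the simulated data frame `toy`, the variable names A, B and C, and the choice to store t-values with responses in rows and predictors in columns are all hypothetical.

```r
library(lme4)

set.seed(1)
n_id  <- 20   # participants
n_obs <- 10   # measurements per participant

# Simulated longitudinal data with a participant-level random shift
u   <- rep(rnorm(n_id), each = n_obs)
toy <- data.frame(id = factor(rep(seq_len(n_id), each = n_obs)))
toy$A <- u + rnorm(n_id * n_obs)
toy$B <- 0.6 * toy$A + rnorm(n_id * n_obs)
toy$C <- 0.5 * u + rnorm(n_id * n_obs)

vars <- c("A", "B", "C")

# Scale the variables so that coefficients are on a common footing
toy[vars] <- lapply(toy[vars], function(x) as.numeric(scale(x)))

# Empty weighted adjacency matrix (assumption: rows = response, columns = predictor)
adj <- matrix(0, length(vars), length(vars), dimnames = list(vars, vars))

for (resp in vars) {
  preds <- setdiff(vars, resp)
  # One model per variable: response ~ all other variables + random intercept
  form <- as.formula(
    paste(resp, "~", paste(preds, collapse = " + "), "+ (1 | id)")
  )
  fit <- lmer(form, data = toy)
  # Fixed-effect t-values become the edge weights for this response
  adj[resp, preds] <- coef(summary(fit))[preds, "t value"]
}

print(round(adj, 2))
```

The diagonal stays at zero because a variable is never used as its own predictor. Some fits may report a singular random-effects estimate on simulated data; that is a message, not an error.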

# Extract column names from the dataset
# These represent all available variables in the dataset
varLabs <- colnames(example_data)

# Define a list of variables to be removed from the analysis
# These variables are not included as nodes in the network visualization
remove <- c("id", "day", "beep", "conc")

# Filter out the unwanted variables
# Keeps only the variables that are not in the "remove" list
varLabs <- varLabs[!varLabs %in% remove]

# Print the final list of selected variables to be used as nodes in the network
print(varLabs)
#>  [1] "Relax"          "Irritable"      "Worry"          "Nervous"       
#>  [5] "Future"         "Anhedonia"      "Tired"          "Hungry"        
#>  [9] "Alone"          "Angry"          "Social_offline" "Social_online" 
#> [13] "Music"          "Procrastinate"  "Outdoors"       "C19_occupied"  
#> [17] "C19_worry"      "Home"

The function lmm_analysis() is the main tool of this package. It takes the following arguments:

clinical_data: Data frame containing clinical data and metadata for participants, including the participant identifier (participant_id). Make sure the identifier is the first column of the data frame.
variables_to_scale: Character vector with the names of the variables to analyze. These variables must be numeric, because they are scaled.
random_effects: A character string specifying the random effects formula (default: "(1 | participant_id)").
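The input conditions listed above can be checked before calling the function. The sketch below is a hypothetical helper, `check_lmm_input()`, which is not part of MariNET; it only tests the documented requirements (identifier in the first column, variables present and numeric) and returns a character vector of problems.

```r
# Hypothetical pre-flight check; not a MariNET function
check_lmm_input <- function(clinical_data, variables, id_col = "participant_id") {
  if (!is.data.frame(clinical_data)) {
    return("clinical_data is not a data frame")
  }
  problems <- character(0)

  # The identifier is expected in the first column
  if (!identical(colnames(clinical_data)[1], id_col)) {
    problems <- c(problems, sprintf("first column is not '%s'", id_col))
  }

  # Every requested variable must exist in the data
  missing_vars <- setdiff(variables, colnames(clinical_data))
  if (length(missing_vars) > 0) {
    problems <- c(problems,
                  paste("missing variables:", paste(missing_vars, collapse = ", ")))
  }

  # Variables are scaled, so they must be numeric
  present <- intersect(variables, colnames(clinical_data))
  non_numeric <- present[!vapply(clinical_data[present], is.numeric, logical(1))]
  if (length(non_numeric) > 0) {
    problems <- c(problems,
                  paste("non-numeric variables:", paste(non_numeric, collapse = ", ")))
  }
  problems
}

# Toy data: 'Mood' is deliberately non-numeric
toy <- data.frame(id = c(1, 1, 2), Relax = c(1, 2, 3), Mood = c("a", "b", "c"))

ok_check  <- check_lmm_input(toy, "Relax", id_col = "id")
bad_check <- check_lmm_input(toy, c("Relax", "Mood", "Sleep"), id_col = "id")
print(bad_check)
```

An empty result means the data satisfy the listed conditions; it does not guarantee that the model itself will converge.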

# Perform Linear Mixed Model (LMM) analysis on the dataset
# This function iterates over selected variables (varLabs) and models their relationships
# while accounting for individual-level variability using a random effect.

model <- lmm_analysis(
  example_data,   # Input dataset containing clinical/longitudinal data
  varLabs,        # List of selected variables to be analyzed in the model
  random_effects = "(1|id)"  # Specifies a random intercept for each individual (id)
)

# Print the model results (optional, useful for debugging or reviewing output)
# print(model)

5 Network visualization

To color the network by grouping factors, the nodes must first be assigned to groups. Here, each variable receives a community label so that related symptoms share a color. Visualization relies on the qgraph package (Epskamp et al. 2023).

# Define the community structure for the variables
# Assigns labels to different groups based on symptoms or categories
community_structure <- c(
  rep("Stress", 8),   # First 8 variables belong to the "Stress" group
  rep("Social", 6),   # Next 6 variables belong to the "Social" group
  rep("Covid-19", 4)  # Last 4 variables belong to the "Covid-19" group
)

# Create a dataframe linking variable names to their assigned community group
structure <- data.frame(varLabs, community_structure)

# Define labels for the network plot (using variable names)
labels <- varLabs

# Load the qgraph package for network visualization
library(qgraph)

# Generate the network plot using qgraph
qgraph(
  model,                                # Adjacency matrix or network model input
  groups = structure$community_structure, # Assign colors based on community groups
  labels = labels,                        # Display variable names as node labels
  legend = TRUE,                           # Include a legend in the plot
  layout = "spring",                       # Use a force-directed "spring" layout for better visualization
  color = c("orange", "lightblue", "#008080"), # Define colors for different groups
  legend.cex = 0.3                          # Adjust the size of the legend text
)
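The community structure above depends on the variables being in the right order and on one label existing per node. A minimal base R sketch of how the labels could be validated and turned into an explicit group object follows; `var_names`, `community` and `group_colors` are hypothetical stand-ins for a shortened variable list. The qgraph documentation describes the groups argument as accepting a factor or a list of node indices, so fixing the level order explicitly makes the group-to-color mapping easier to reason about.

```r
# Shortened, hypothetical variable list and matching community labels
var_names <- c("Relax", "Worry", "Alone", "Music", "C19_worry", "Home")
community <- c(rep("Stress", 2), rep("Social", 2), rep("Covid-19", 2))

# One label per node is required
stopifnot(length(var_names) == length(community))

# Keep groups in order of first appearance instead of alphabetical order
group_factor <- factor(community, levels = unique(community))

# Equivalent list form: node indices belonging to each group
group_list <- split(seq_along(var_names), group_factor)

# One color per group, named so the mapping is explicit
group_colors <- c("orange", "lightblue", "#008080")
names(group_colors) <- levels(group_factor)

print(group_list)
print(group_colors)
```

Without the explicit levels, factor() would sort the labels alphabetically, which would place "Covid-19" first and shift the colors.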

6 Comparison between models

Because the weighted matrix is built from t-values, its entries are not bounded between -1 and 1. It is therefore not directly comparable with common network modeling methods, which rely on correlations and pairwise estimation. To make networks comparable, the adjacency matrix is normalized by scaling its values by their range. The normalized weighted matrices are then subtracted to reveal the differences.

# Fit a second Linear Mixed Model (LMM) with a more complex random effects structure
# This model accounts for repeated measures within individuals (id) over different days (day)
# and also includes an additional random intercept for the variable "conc"

model2 <- lmm_analysis(
  example_data,    # Input dataset containing clinical/longitudinal data
  varLabs,         # List of selected variables to be analyzed in the model
  random_effects = "(1|id/day) + (1|conc)"  # Random effects structure:
                                            # (1|id/day) -> Nested random effect for each day within an individual
                                            # (1|conc) -> Additional random effect for "conc" variable
)
#> boundary (singular) fit: see help('isSingular')
#> boundary (singular) fit: see help('isSingular')
#> boundary (singular) fit: see help('isSingular')
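In lme4 formula syntax, a nested term such as (1|id/day) is shorthand for (1|id) + (1|id:day), that is, one random intercept per individual plus one per individual-day combination. The base R sketch below, on a hypothetical balanced design called `toy`, shows why the nesting matters: the day labels repeat across individuals, so day on its own has far fewer levels than the id-by-day grouping.

```r
# Hypothetical balanced design: 3 individuals, 2 days, 4 beeps per day
toy <- expand.grid(beep = 0:3, day = 1:2, id = 1:3)

# 'day' alone only distinguishes day 1 from day 2
n_day_levels <- nlevels(factor(toy$day))

# The nested grouping distinguishes every individual-day combination
toy$id_day <- interaction(toy$id, toy$day, drop = TRUE)
n_nested_levels <- nlevels(toy$id_day)

# Rows available to estimate each individual-day intercept
rows_per_group <- table(toy$id_day)

print(n_day_levels)
print(n_nested_levels)
print(rows_per_group)
```

When a grouping has few rows per level, or little variance at that level, the estimated variance can collapse to zero, which is the situation the singular fit messages above report.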

# Generate a network visualization from the second LMM model
qgraph(
  model2,                                # Adjacency matrix or network model derived from LMM
  groups = structure$community_structure, # Assign colors based on predefined symptom groups
  labels = labels,                        # Display variable names as node labels
  legend = TRUE,                           # Include a legend in the plot
  layout = "spring",                       # Use a force-directed "spring" layout for better visualization
  color = c("orange", "lightblue", "#008080"), # Define colors for different variable groups
  legend.cex = 0.3                          # Adjust the legend text size to avoid oversized labels
)

The two adjacency matrices are then subtracted. The differentiation() function normalizes each matrix to values between -1 and 1 before subtracting. It takes two adjacency matrices as input, which must have the same dimensions and node names.

# Compute the difference between the two Linear Mixed Model (LMM) networks
# This highlights changes in relationships when considering different random effect structures
difference <- differentiation(model, model2)

# Generate a network visualization of the differences between the two models
qgraph(
  difference,                            # Adjacency matrix representing differences between model1 and model2
  groups = structure$community_structure, # Assign colors based on predefined variable groups
  labels = labels,                        # Display variable names as node labels
  legend = TRUE,                           # Include a legend in the plot
  layout = "spring",                       # Use a force-directed "spring" layout for better visualization
  color = c("orange", "lightblue", "#008080"), # Define colors for different variable groups
  legend.cex = 0.3                          # Adjust legend text size to keep it readable
)
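The normalization and subtraction steps can be sketched in base R. The exact formula used inside differentiation() is not shown in this vignette, so the version below assumes a min-max rescaling to the interval from -1 to 1; `normalize_range()` and `diff_sketch()` are hypothetical helpers, and the two small matrices are made-up values.

```r
# Assumed normalization: min-max rescaling of all entries to [-1, 1]
normalize_range <- function(m) {
  rng <- range(m, na.rm = TRUE)
  if (rng[1] == rng[2]) {
    return(m * 0)  # constant matrix: nothing to rescale
  }
  2 * (m - rng[1]) / (rng[2] - rng[1]) - 1
}

# Normalize both matrices, then subtract them element by element
diff_sketch <- function(a, b) {
  stopifnot(identical(dim(a), dim(b)),
            identical(dimnames(a), dimnames(b)))
  normalize_range(a) - normalize_range(b)
}

nodes <- c("A", "B", "C")
m1 <- matrix(c(0, 4, -2,
               4, 0, 1,
              -2, 1, 0), nrow = 3, dimnames = list(nodes, nodes))
m2 <- matrix(c(0, 10, 5,
              10, 0, -5,
               5, -5, 0), nrow = 3, dimnames = list(nodes, nodes))

d <- diff_sketch(m1, m2)
print(round(d, 3))
```

After normalization both networks share the same scale, so an entry of zero in the difference means the edge has the same relative strength in both models. Under this assumed rescaling, a raw weight of zero does not generally map to zero, which is one reason to treat the sketch as an illustration only.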

7 Additional information

Use sessionInfo() to check your R session information, including the R version, loaded packages, and system details.

sessionInfo()
#> R version 4.4.3 (2025-02-28)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 22.04.5 LTS
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0 
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=es_ES.UTF-8        LC_COLLATE=C              
#>  [5] LC_MONETARY=es_ES.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=es_ES.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C       
#> 
#> time zone: Europe/Madrid
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] qgraph_1.9.8     MariNET_1.0.0    BiocStyle_2.32.1
#> 
#> loaded via a namespace (and not attached):
#>  [1] gtable_0.3.6        xfun_0.51           bslib_0.9.0        
#>  [4] ggplot2_3.5.1       htmlwidgets_1.6.4   psych_2.4.12       
#>  [7] lattice_0.22-5      quadprog_1.5-8      vctrs_0.6.5        
#> [10] tools_4.4.3         Rdpack_2.6.3        generics_0.1.3     
#> [13] stats4_4.4.3        parallel_4.4.3      tibble_3.2.1       
#> [16] cluster_2.1.8       pkgconfig_2.0.3     Matrix_1.7-2       
#> [19] data.table_1.17.0   checkmate_2.3.2     lifecycle_1.0.4    
#> [22] compiler_4.4.3      stringr_1.5.1       tinytex_0.56       
#> [25] mnormt_2.1.1        munsell_0.5.1       glasso_1.11        
#> [28] htmltools_0.5.8.1   sass_0.4.9          fdrtool_1.2.18     
#> [31] yaml_2.3.10         htmlTable_2.4.3     Formula_1.2-5      
#> [34] pillar_1.10.1       nloptr_2.2.1        jquerylib_0.1.4    
#> [37] MASS_7.3-64         cachem_1.1.0        Hmisc_5.2-3        
#> [40] reformulas_0.4.0    abind_1.4-8         rpart_4.1.24       
#> [43] boot_1.3-31         nlme_3.1-167        lavaan_0.6-19      
#> [46] gtools_3.9.5        tidyselect_1.2.1    digest_0.6.37      
#> [49] stringi_1.8.4       dplyr_1.1.4         reshape2_1.4.4     
#> [52] bookdown_0.42       splines_4.4.3       fastmap_1.2.0      
#> [55] grid_4.4.3          colorspace_2.1-1    cli_3.6.4          
#> [58] magrittr_2.0.3      base64enc_0.1-3     pbivnorm_0.6.0     
#> [61] withr_3.0.2         foreign_0.8-88      corpcor_1.6.10     
#> [64] scales_1.3.0        backports_1.5.0     rmarkdown_2.29     
#> [67] jpeg_0.1-10         igraph_2.1.4        nnet_7.3-20        
#> [70] lme4_1.1-36         gridExtra_2.3       png_0.1-8          
#> [73] pbapply_1.7-2       evaluate_1.0.3      knitr_1.50         
#> [76] rbibutils_2.3       rlang_1.1.5         Rcpp_1.0.14        
#> [79] glue_1.8.0          BiocManager_1.30.25 rstudioapi_0.17.1  
#> [82] minqa_1.2.8         jsonlite_1.9.1      R6_2.6.1           
#> [85] plyr_1.8.9

References

Bates, Douglas, Martin Mächler, Ben Bolker, and Steve Walker. 2015. “Fitting Linear Mixed-Effects Models Using lme4.” Journal of Statistical Software 67 (1): 1–48. https://doi.org/10.18637/jss.v067.i01.

Epskamp, Sacha, Giulio Costantini, Jonas Haslbeck, and Adela Isvoranu. 2023. qgraph: Graph Plotting Methods, Psychometric Data Visualization and Graphical Model Estimation. https://CRAN.R-project.org/package=qgraph.

Fried, Eiko I, Faidra Papanikolaou, and Sacha Epskamp. 2022. “Mental Health and Social Contact During the COVID-19 Pandemic: An Ecological Momentary Assessment Study.” Clinical Psychological Science 10 (2): 340–54.

Velden, Rachel MJ van der, Anne EP Mulders, Marjan Drukker, Mark L Kuijf, and Albert FG Leentjens. 2018. “Network Analysis of Symptoms in a Parkinson Patient Using Experience Sampling Data: An n = 1 Study.” Movement Disorders 33 (12): 1938–44.