Batch Effect Diagnostics

In this section, we will use a sample data included in this package to provide a thorough introduction to the key functionalities of the ComBatFamQC package for batch effect diagnostics.

Data


The sample dataset in the package is the longitudinal cortical thickness data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) study (Jack et al. (2008)), which con- tains cortical thickness measures of 62 regions from 663 unique participants with images collected longitudinally between 2-6 visits. The processed data can be downloaded from GitHub.

The key variables of interest are presented below:

  • Batch variable:
    • manufac: MRI manufacturer
  • Covariates
    • timedays: time related variable
    • Age
    • SEX
    • DIAGNOSIS: (AD, CN, LMCI)
    (We consider an interaction effect between timedays and DIAGNOSIS)
  • Features to harmonize
    • 62 cortical thickness regions
  • Random Effect
    • subid : Subject IDs.

Interactive Batch Effect Diagnostics


Users can launch the Shiny app using the following code:

features <- colnames(adni)[c(43:104)]
covariates <- c("timedays", "AGE", "SEX", "DIAGNOSIS")
interaction <- c("timedays,DIAGNOSIS")
batch <- "manufac"
result_lmer <- visual_prep(type = "lmer", features = features, batch = batch, covariates = covariates, interaction = interaction, smooth = NULL, random = "subid", df = adni)
comfam_shiny(result_lmer)

Alternatively, users can utilize the command-line interface via ComBatQC_CLI.R. The essential steps have been streamlined, significantly simplifying the workflow and making it easier to get started. The following parameters must be assigned values:

  • --diagnosis/-d: TRUE (by default)
  • --visualization/-v: TRUE (by default)
  • --features/-f: position of features/rois data (column numbers)
  • --covariates/-c: column position of the covariates whose effects should be preserved
  • --batch/-b: column position of the batch variable
  • --smooth/-s: column position of the smooth term (if a gam model is used)
  • --random/-r: column position of the random effect term (if a lmer model is used)
Rscript path/to/combatQC_CLI.R path/to/unharmonized_data.csv -d TRUE -v TRUE --features 43-104 --covariates 9,11,13-14 -b 16 -i 9*14 -r 3 -m lmer 

Data Overview

Figure 1: ComBatFamQC: Data Overview

Summary

Figure 2: ComBatFamQC: Summary

Residial Plot

Figure 3: ComBatFamQC: Residual Plot

Diagnosis of Global Batch Effect

Figure 4: ComBatFamQC: Diagnosis of Global Batch Effect

Diagnosis of Individual Batch Effect

Figure 5: ComBatFamQC: Diagnosis of Individual Batch Effect

Interactive Harmonization

Figure 6: ComBatFamQC: Interactive Harmonization

Export Batch Effect Diagnosis Result


Users can save the batch effect diagnosis results using the following code. There are two ways to export the results: 1) as a combined EXCEL file, or 2) as a Quarto report.

  • Export to a combined EXCEL file
diag_save(path = "path/to/dir", result = result_lmer, use_quarto = FALSE)
  • Export to a Quarto Report
library(quarto)
diag_save(path = "path/to/dir", result = result_lmer, use_quarto = TRUE)

Using the command-line interface requires users to set the following parameter:

  • --visualization/-v: FALSE
  • --outdir: Path to a directory for saving the batch effect diagnosis result
  • --quarto/-q: Whether to generate a Quarto report.
Rscript path/to/combatQC_CLI.R path/to/unharmonized_data.csv -d TRUE -v FALSE --features 43-104 --covariates 9,11,13-14 -b 16 -i 9*14 -r 3 -m lmer --outdir /path/to/dir --quarto TRUE