A Very Basic Intro to SLCMA

Joshua A. Goode

2024-04-17

99 Problems

In the words of Jay-Z, “We got 99 problems and data ain’t one.”

(Or something like that.)

What We Have
  • Several large panel studies
  • Many years of data
  • A plethora of data types
  • Really interesting questions
What We Need
  • A way to use our vast data to test hypotheses across the life course
    • Systematic to avoid false-positive results
    • Efficient to accommodate analysis of high-dimensional data
    • Easy to use

What Is SLCMA?

Structured
Life
Course
Modeling
Approach

Which Life Course Hypothesis Best Fits Our Data?

SLCMA R Package

  • R Package (slcma)
    • Created by Andrew Smith
    • Under review at CRAN
  • I’ll be talking in the broadest terms
    • No code
    • Reach out for more info

SLCMA Steps

  1. Fit a regression model for each single life course hypothesis of interest, as well as groups of hypotheses
    • Manually: Fit a model for each hypothesis, as well as compound hypotheses
    • slcma: Uses least angle regression (LARS) to fit each model
  2. Measure the goodness-of-fit of each model and select the best one
    • Manually: Create an elbow plot
    • slcma: Generates an elbow plot with a single command
  3. Calculate appropriate p-values for the selected model
    • Manually: Apply a Bonferroni correction
    • slcma: Uses fixed LASSO inference or max-|t| test to correct p-values
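
For readers who want to see the manual route concretely, here is a minimal sketch of the three steps in plain Python. It is an illustration only and not the slcma package's R interface. The hypothesis names, the simulated data, the effect size, and the helpers r_squared, encode_hypotheses, and bonferroni are all assumptions made up for this example. Comparing R² values directly stands in for reading an elbow plot, and the raw p-value passed to bonferroni is a placeholder rather than a computed result.

```python
import random


def r_squared(x, y):
    """R^2 from a simple linear regression of y on a single predictor x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


def encode_hypotheses(exposure):
    """Turn repeated binary exposures (one tuple per person) into one
    predictor per hypothetical life course hypothesis."""
    return {
        "sensitive_period_1": [e[0] for e in exposure],
        "sensitive_period_2": [e[1] for e in exposure],
        "sensitive_period_3": [e[2] for e in exposure],
        "accumulation": [sum(e) for e in exposure],
        "ever_exposed": [max(e) for e in exposure],
    }


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Simulated data (assumption): 500 people, exposure measured at 3 time points,
# and an outcome driven only by exposure at the second time point.
random.seed(0)
exposure = [tuple(random.randint(0, 1) for _ in range(3)) for _ in range(500)]
outcome = [2.0 * e[1] + random.gauss(0, 1) for e in exposure]

# Step 1: fit one model per hypothesis.
predictors = encode_hypotheses(exposure)
fits = {name: r_squared(x, outcome) for name, x in predictors.items()}

# Step 2: compare goodness-of-fit and select the best hypothesis.
best = max(fits, key=fits.get)

# Step 3: correct the selected model's p-value for the number of hypotheses.
# The raw p-value here is a placeholder, not computed from the data.
adjusted = bonferroni(0.004, len(fits))

for name, r2 in sorted(fits.items(), key=lambda item: -item[1]):
    print(f"{name:20s} R^2 = {r2:.3f}")
print("selected:", best, "| adjusted p:", adjusted)
```

This sketch fits single hypotheses only. Compound hypotheses would need models with more than one predictor, which is where the manual route gets tedious.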

High Throughput Methylation Data

  • A LOT OF MODELS!!
    • 850K CpGs × 5 Hypotheses = 4.25M Regression Models
  • SLCMA can select the best-fitting hypothesis for each probe
    • Compound hypotheses are difficult
  • SLCMA can typically find more associations than cross-sectional or ever-exposed EWAS models
    • Multiple-testing correction makes the significance threshold far more stringent
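
To put rough numbers on that scale, the arithmetic can be sketched as follows. The alpha of 0.05 and the use of a plain Bonferroni correction are assumptions for illustration. The talk does not specify which correction or threshold is applied at the epigenome-wide level.

```python
n_cpgs = 850_000       # probes, from the slide
n_hypotheses = 5       # hypotheses per probe, from the slide
alpha = 0.05           # assumed family-wise error rate

# Total number of regression models if every hypothesis is fit for every probe.
n_models = n_cpgs * n_hypotheses

# Bonferroni thresholds under two assumed ways of counting tests.
per_probe_threshold = alpha / n_cpgs    # one test per probe
per_model_threshold = alpha / n_models  # one test per probe-hypothesis pair

print(f"models: {n_models:,}")
print(f"threshold, one test per probe: {per_probe_threshold:.2e}")
print(f"threshold, one test per model: {per_model_threshold:.2e}")
```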