Tools to Generate Predicted Values of DNAm Surrogates in R
June 19, 2025
This package is still in development and not yet ready for general use.
Proceed with caution!
Simple set of user-friendly functions for generating predicted values from existing DNA methylation surrogates
MethylSurroGetR is available on GitHub and can be installed with remotes::install_github().
samp1 samp2 samp3 samp4 samp5
cg01 0.1028646 0.32037324 0.48290240 NA 0.36948887
cg02 0.2875775 NA NA 0.8830174 NA
cg04 0.4348927 0.18769112 0.89035022 0.1422943 0.98421920
cg05 0.9849570 0.78229430 0.91443819 0.5492847 0.15420230
cg07 0.8998250 0.24608773 NA 0.3279207 0.95450365
cg08 0.8895393 0.69280341 0.64050681 0.9942698 0.65570580
cg09 NA 0.09359499 0.60873498 0.9540912 NA
cg10 0.8864691 NA NA 0.5854834 0.14190691
cg12 0.1750527 NA 0.14709469 NA 0.69000710
cg13 0.9630242 0.90229905 0.69070528 0.7954674 0.02461368
cg14 0.1306957 NA 0.93529980 0.6478935 NA
cg16 0.6531019 0.33282354 0.30122890 0.3198206 0.89139412
cg17 0.1428000 0.41454634 0.41372433 0.3688455 0.15244475
cg19 0.3435165 0.48861303 0.06072057 0.3077200 NA
cg20 0.6567581 0.95447383 0.94772694 0.2197676 NA
wt_lin wt_prb wt_cnt
cg02 -0.009083377 0.16511519 0.050895032
cg03 -0.001155999 -0.40515934 0.025844226
cg06 0.005978497 -0.11603036 0.042036480
cg07 -0.007562015 -0.22561636 -0.099875045
cg08 0.001218960 0.31464004 -0.004936685
cg11 -0.005869372 -0.05148366 -0.055976223
cg13 -0.007449367 0.31006435 -0.024036692
cg15 0.005066157 0.31238951 0.022554201
cg17 0.007900907 0.29434232 -0.029640418
cg18 -0.002510744 -0.06016831 -0.077772915
Intercept 1.211000000 0.01900000 0.937000000
methyl_surro Object
methyl: Numeric matrix of methylation data
weights: Named numeric vector of surrogate weights
intercept: Optional character string identifying the name of the intercept in the weights object
lin_surrogate <- surro_set(methyl = beta_matrix_miss,
                           weights = wts_vec_lin,
                           intercept = "Intercept")
print(lin_surrogate)
$methyl
samp1 samp2 samp3 samp4 samp5
cg02 0.2875775 NA NA 0.8830174 NA
cg07 0.8998250 0.2460877 NA 0.3279207 0.95450365
cg08 0.8895393 0.6928034 0.6405068 0.9942698 0.65570580
cg13 0.9630242 0.9022990 0.6907053 0.7954674 0.02461368
cg17 0.1428000 0.4145463 0.4137243 0.3688455 0.15244475
cg03 NA NA NA NA NA
cg06 NA NA NA NA NA
cg11 NA NA NA NA NA
cg15 NA NA NA NA NA
cg18 NA NA NA NA NA
$weights
cg02 cg03 cg06 cg07 cg08 cg11
-0.009083377 -0.001155999 0.005978497 -0.007562015 0.001218960 -0.005869372
cg13 cg15 cg17 cg18
-0.007449367 0.005066157 0.007900907 -0.002510744
$intercept
Intercept
1.211
attr(,"class")
[1] "methyl_surro"
| | samp1 | samp2 | samp3 | samp4 | samp5 |
|---|---|---|---|---|---|
| cg02 | 0.288 | NA | NA | 0.883 | NA |
| cg07 | 0.900 | 0.246 | NA | 0.328 | 0.955 |
| cg08 | 0.890 | 0.693 | 0.641 | 0.994 | 0.656 |
| cg13 | 0.963 | 0.902 | 0.691 | 0.795 | 0.025 |
| cg17 | 0.143 | 0.415 | 0.414 | 0.369 | 0.152 |
| cg03 | NA | NA | NA | NA | NA |
| cg06 | NA | NA | NA | NA | NA |
| cg11 | NA | NA | NA | NA | NA |
| cg15 | NA | NA | NA | NA | NA |
| cg18 | NA | NA | NA | NA | NA |
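The surro_set() output above suggests an alignment step: probes without a weight are dropped, probes with a weight but no measurements are added as all-NA rows, and the intercept is separated from the probe weights. The following base-R sketch illustrates that idea on toy data. It is not the package's implementation; the helper name align_to_weights() and the toy values are assumptions made for illustration.

```r
# Hypothetical helper; NOT part of MethylSurroGetR.
align_to_weights <- function(methyl, weights, intercept = NULL) {
  # Split the intercept off from the probe weights, if one is named
  int_val <- NULL
  if (!is.null(intercept)) {
    int_val <- weights[intercept]
    weights <- weights[names(weights) != intercept]
  }
  probes  <- names(weights)
  present <- intersect(rownames(methyl), probes)  # order follows the matrix
  absent  <- setdiff(probes, present)             # weighted probes with no data

  # Keep measured probes, then append all-NA rows for the absent ones
  na_rows <- matrix(NA_real_, nrow = length(absent), ncol = ncol(methyl),
                    dimnames = list(absent, colnames(methyl)))
  list(methyl    = rbind(methyl[present, , drop = FALSE], na_rows),
       weights   = weights,
       intercept = int_val)
}

# Toy inputs: three measured probes, only one of which (cg02) carries a weight
toy_methyl <- matrix(c(0.10, 0.29, 0.43,
                       0.32, NA,   0.19),
                     nrow = 3,
                     dimnames = list(c("cg01", "cg02", "cg04"),
                                     c("samp1", "samp2")))
toy_weights <- c(cg02 = -0.009, cg03 = -0.001, Intercept = 1.211)

aligned <- align_to_weights(toy_methyl, toy_weights, intercept = "Intercept")
print(aligned$methyl)
```

In this sketch cg01 and cg04 disappear because they have no weight, while cg03 appears as a row of NA values because it has a weight but was never measured.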
methyl_surro: methyl_surro object created with surro_set()
Missing Data Summary for methyl_surro Object
============================================
Total probes: 10
Total samples: 5
Complete probes: 3 (30.0%)
Probes with missing observations: 2 (20.0%)
Completely missing probes: 5 (50.0%)
Overall missing rate: 58.0%
Probes with partial missing data:
cg02 cg07
0.6 0.2
Completely missing probes:
cg03, cg06, cg11, cg15, cg18
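The figures in this summary can be checked by hand with base R. The sketch below rebuilds the aligned matrix from the rounded table shown earlier and recomputes the per-probe and overall missing rates. It only illustrates the arithmetic and makes no claim about how the package computes its summary.

```r
# Aligned matrix from the example (values rounded to three decimals)
m <- rbind(
  cg02 = c(0.288, NA,    NA,    0.883, NA),
  cg07 = c(0.900, 0.246, NA,    0.328, 0.955),
  cg08 = c(0.890, 0.693, 0.641, 0.994, 0.656),
  cg13 = c(0.963, 0.902, 0.691, 0.795, 0.025),
  cg17 = c(0.143, 0.415, 0.414, 0.369, 0.152),
  cg03 = rep(NA_real_, 5), cg06 = rep(NA_real_, 5), cg11 = rep(NA_real_, 5),
  cg15 = rep(NA_real_, 5), cg18 = rep(NA_real_, 5)
)
colnames(m) <- paste0("samp", 1:5)

miss_prop <- rowMeans(is.na(m))                         # proportion missing per probe
complete  <- names(miss_prop)[miss_prop == 0]           # no missing values
partial   <- miss_prop[miss_prop > 0 & miss_prop < 1]   # some missing values
all_miss  <- names(miss_prop)[miss_prop == 1]           # no observed values
overall   <- mean(is.na(m))                             # share of missing cells

print(partial)
print(overall)
```

With 29 of the 50 cells missing, the overall rate works out to 58%, and the two partially missing probes are cg02 (3 of 5 missing) and cg07 (1 of 5 missing).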
methyl_surro: methyl_surro object
method: Character string indicating the imputation method
min_nonmiss_prop: Optional minimum proportion of non-missing data required in a probe for imputation to proceed
lin_surrogate <- impute_obs(methyl_surro = lin_surrogate,
                            method = "mean",
                            min_nonmiss_prop = 0)
print(lin_surrogate)
$methyl
samp1 samp2 samp3 samp4 samp5
cg02 0.2875775 0.5852975 0.5852975 0.8830174 0.58529746
cg07 0.8998250 0.2460877 0.6070843 0.3279207 0.95450365
cg08 0.8895393 0.6928034 0.6405068 0.9942698 0.65570580
cg13 0.9630242 0.9022990 0.6907053 0.7954674 0.02461368
cg17 0.1428000 0.4145463 0.4137243 0.3688455 0.15244475
cg03 NA NA NA NA NA
cg06 NA NA NA NA NA
cg11 NA NA NA NA NA
cg15 NA NA NA NA NA
cg18 NA NA NA NA NA
$weights
cg02 cg03 cg06 cg07 cg08 cg11
-0.009083377 -0.001155999 0.005978497 -0.007562015 0.001218960 -0.005869372
cg13 cg15 cg17 cg18
-0.007449367 0.005066157 0.007900907 -0.002510744
$intercept
Intercept
1.211
attr(,"class")
[1] "methyl_surro"
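In the output above, the gaps in cg02 and cg07 are replaced by each probe's own mean across samples, while probes with no observations stay NA. A minimal base-R sketch of that behaviour follows. The helper impute_row_means() is hypothetical, and treating min_nonmiss_prop as a "skip the probe if its observed proportion is below this value" rule is an assumption rather than documented package behaviour.

```r
# Hypothetical helper; NOT part of MethylSurroGetR.
impute_row_means <- function(m, min_nonmiss_prop = 0) {
  nonmiss_prop <- rowMeans(!is.na(m))   # observed proportion per probe
  for (probe in rownames(m)) {
    p <- nonmiss_prop[[probe]]
    # Skip probes with no observed values or too few observed values
    if (p == 0 || p < min_nonmiss_prop) next
    gaps <- is.na(m[probe, ])
    m[probe, gaps] <- mean(m[probe, ], na.rm = TRUE)
  }
  m
}

# Three probes taken from the example: two partially missing, one fully missing
m <- rbind(
  cg02 = c(0.2875775, NA,        NA, 0.8830174, NA),
  cg07 = c(0.8998250, 0.2460877, NA, 0.3279207, 0.95450365),
  cg03 = rep(NA_real_, 5)
)

imputed <- impute_row_means(m, min_nonmiss_prop = 0)
print(imputed)
```

Raising min_nonmiss_prop to 0.5 in this sketch would leave cg02 untouched, because only 2 of its 5 values are observed.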
methyl_surro: methyl_surro object
reference: Named numeric vector of methylation reference values
type: Character string to identify which probes to fill
mean median
cg01 0.3992451 0.3694889
cg02 0.6616689 0.7883051
cg03 0.4948262 0.5281055
cg04 0.5278895 0.4348927
cg05 0.6770353 0.7822943
cg06 0.5526592 0.5726334
cg07 0.4940793 0.3279207
cg08 0.7745650 0.6928034
cg09 0.5281033 0.6087350
cg10 0.4982656 0.4667790
cg11 0.4566024 0.5440660
cg12 0.3856340 0.4045103
cg13 0.6752219 0.7954674
cg14 0.5866269 0.6192565
cg15 0.4004940 0.3181810
cg16 0.4996738 0.3328235
cg17 0.2984722 0.3688455
cg18 0.3923206 0.2659726
cg19 0.3747138 0.3435165
cg20 0.7031609 0.7370777
lin_surrogate <- reference_fill(methyl_surro = lin_surrogate,
                                reference = ref_vec_mean,
                                type = "probes")
print(lin_surrogate)
$methyl
samp1 samp2 samp3 samp4 samp5
cg02 0.2875775 0.5852975 0.5852975 0.8830174 0.58529746
cg07 0.8998250 0.2460877 0.6070843 0.3279207 0.95450365
cg08 0.8895393 0.6928034 0.6405068 0.9942698 0.65570580
cg13 0.9630242 0.9022990 0.6907053 0.7954674 0.02461368
cg17 0.1428000 0.4145463 0.4137243 0.3688455 0.15244475
cg03 0.4948262 0.4948262 0.4948262 0.4948262 0.49482616
cg06 0.5526592 0.5526592 0.5526592 0.5526592 0.55265924
cg11 0.4566024 0.4566024 0.4566024 0.4566024 0.45660238
cg15 0.4004940 0.4004940 0.4004940 0.4004940 0.40049405
cg18 0.3923206 0.3923206 0.3923206 0.3923206 0.39232059
$weights
cg02 cg03 cg06 cg07 cg08 cg11
-0.009083377 -0.001155999 0.005978497 -0.007562015 0.001218960 -0.005869372
cg13 cg15 cg17 cg18
-0.007449367 0.005066157 0.007900907 -0.002510744
$intercept
Intercept
1.211
attr(,"class")
[1] "methyl_surro"
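The output above shows each completely missing probe taking its reference value in every sample. The sketch below reproduces that pattern in base R for a small subset of probes. fill_missing_probes() is a hypothetical helper, and it covers only the case in which whole probes are missing, which is an assumption about what type = "probes" means.

```r
# Hypothetical helper; NOT part of MethylSurroGetR.
fill_missing_probes <- function(m, reference) {
  all_na   <- rownames(m)[rowSums(!is.na(m)) == 0]   # probes with no data at all
  fillable <- intersect(all_na, names(reference))    # only those with a reference
  for (probe in fillable) {
    m[probe, ] <- reference[[probe]]                 # same value for every sample
  }
  m
}

# One observed probe and two completely missing probes from the example
m <- rbind(
  cg17 = c(0.1428000, 0.4145463, 0.4137243),
  cg03 = rep(NA_real_, 3),
  cg06 = rep(NA_real_, 3)
)
# Reference means copied from the reference table for these three probes
ref_means <- c(cg03 = 0.4948262, cg06 = 0.5526592, cg17 = 0.2984722)

filled <- fill_missing_probes(m, ref_means)
print(filled)
```

Observed probes such as cg17 are left unchanged even though a reference value exists for them.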
methyl_surro: methyl_surro object
transform: Character string specifying the transformation to apply
"linear": For surrogates estimated with Gaussian models
"count": For surrogates estimated with Poisson models
"probability": For surrogates estimated with binomial models
estimates <- beta_matrix_miss |>
  surro_set(weights = wts_vec_lin, intercept = "Intercept") |>
  impute_obs(method = "mean") |>
  reference_fill(reference = ref_vec_mean, type = "probes") |>
  surro_calc(transform = "linear")
print(estimates)
samp1 samp2 samp3 samp4 samp5
1.197718 1.202317 1.201093 1.199796 1.201382
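For the linear case, the estimates above can be reproduced by hand as the intercept plus the weighted sum of the completed methylation values. The base-R sketch below does this for samp1 and samp2. calc_surrogate() is a hypothetical helper, and mapping "count" to exp() and "probability" to plogis() is an assumption based on the usual inverse link functions for Poisson and binomial models, not something the slides confirm.

```r
# Hypothetical helper; NOT part of MethylSurroGetR.
calc_surrogate <- function(m, weights, intercept = 0,
                           transform = c("linear", "count", "probability")) {
  transform <- match.arg(transform)
  # Linear predictor: intercept plus weighted sum of probes for each sample.
  # Rows are reordered to match the weights, so row i is scaled by weights[i].
  eta <- intercept + colSums(m[names(weights), , drop = FALSE] * weights)
  switch(transform,
         linear      = eta,          # identity link (Gaussian)
         count       = exp(eta),     # assumed inverse log link (Poisson)
         probability = plogis(eta))  # assumed inverse logit link (binomial)
}

# Linear weights and completed values for samp1 and samp2, copied from above
w <- c(cg02 = -0.009083377, cg03 = -0.001155999, cg06 = 0.005978497,
       cg07 = -0.007562015, cg08 = 0.001218960, cg11 = -0.005869372,
       cg13 = -0.007449367, cg15 = 0.005066157, cg17 = 0.007900907,
       cg18 = -0.002510744)
m <- cbind(
  samp1 = c(cg02 = 0.2875775, cg07 = 0.8998250, cg08 = 0.8895393,
            cg13 = 0.9630242, cg17 = 0.1428000, cg03 = 0.4948262,
            cg06 = 0.5526592, cg11 = 0.4566024, cg15 = 0.4004940,
            cg18 = 0.3923206),
  samp2 = c(0.5852975, 0.2460877, 0.6928034, 0.9022990, 0.4145463,
            0.4948262, 0.5526592, 0.4566024, 0.4004940, 0.3923206)
)

est <- calc_surrogate(m, w, intercept = 1.211, transform = "linear")
print(round(est, 6))
```

Under these assumptions the result should agree with the first two printed estimates to about six decimal places. Small differences can arise because the inputs here are the rounded values shown on the slides.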