I'm working with DNA methylation (DNAm) data such as TCGA (e.g., BRCA, KIRP, KIRC). I am currently trying to use my model to predict DNAm age on test sets, but many of the data sets are missing key probe values used in my clock/model. I have inspected approximately 30 GEO data series and the following TCGA cancer samples: KIRP, KIRC, LUAD, LUSC, BRCA, THCA. Each of them is missing key probes that I am using. Is there a common way to impute the missing values in R?
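To make the problem concrete, here is a minimal base R sketch of how such a check might look. It assumes a beta-value matrix with probes in rows and samples in columns; the names `betas`, `clock_probes`, and `impute_row_mean`, the probe IDs, and the toy values are all hypothetical and not taken from my actual data. The row-mean fill is only one simple fallback, not an established method for this clock.

```r
# Toy beta-value matrix: probes in rows, samples in columns (hypothetical values)
betas <- matrix(
  c(0.10, 0.20, NA,
    0.80, 0.85, 0.90),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("cg00000001", "cg00000002"), c("s1", "s2", "s3"))
)

# Probes the clock needs (hypothetical IDs)
clock_probes <- c("cg00000001", "cg00000002", "cg00000003")

# Probes that are absent from the data set altogether
absent <- setdiff(clock_probes, rownames(betas))

# Probes that are present, with a count of NA values per probe
present <- intersect(clock_probes, rownames(betas))
n_na <- rowSums(is.na(betas[present, , drop = FALSE]))

# Simple fallback: replace each NA with that probe's mean across samples
impute_row_mean <- function(m) {
  mu <- rowMeans(m, na.rm = TRUE)
  idx <- which(is.na(m), arr.ind = TRUE)
  m[idx] <- mu[idx[, "row"]]
  m
}

imputed <- impute_row_mean(betas[present, , drop = FALSE])
```

Here `absent` lists the clock probes with no row at all, and `n_na` counts missing values for the probes that do have a row. Probes that are absent altogether cannot be filled by `impute_row_mean`, since there are no values to average within the data set.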
-
Could you please clarify what "my clock/model" means, and what specifically is missing? How do you know that probes are missing? If this is from methylation chip data, have you confirmed that the probes exist in the probes used for that chip? – gringer Apr 05 '22 at 00:33
-
Hi, could you show what kind of data you have and how you identify these missing probes? Having some code to build upon would help in answering your question. I don't think there is any general way to impute missing values, so knowing the data and its sparsity might help in proposing imputation methods. – llrs Apr 06 '22 at 10:23