First, let's load the libraries:
library("ADNIMERGE")
library(cusp)
library(psych) # for composite scores
Now we want to build a composite score for a cognitive domain. Let's try delayed recall in the memory domain. We are going to build the composite score from the delayed recall results of the ADAS-Cog and the Rey AVLT. So first we must get the data:
tmp <- merge(adas, neurobat, by=c("RID", "VISCODE"))
m <- merge(tmp, adnimerge, by=c("RID", "VISCODE"))
rm(tmp)
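One thing worth keeping in mind here: with its default arguments, merge() keeps only the rows whose key values appear in both tables. A minimal sketch on toy tables (the values and the x and y columns are made up, only the key names mimic the ones used above):

```r
# Toy tables with made-up values; RID and VISCODE mimic the ADNI keys
a <- data.frame(RID = c(1, 1, 2, 3),
                VISCODE = c("bl", "m06", "bl", "bl"),
                x = c(10, 11, 12, 13))
b <- data.frame(RID = c(1, 2, 2, 4),
                VISCODE = c("bl", "bl", "m06", "bl"),
                y = c(0.5, 0.6, 0.7, 0.8))

# Default merge: only key pairs present in both tables survive
inner <- merge(a, b, by = c("RID", "VISCODE"))

# all = TRUE keeps every key pair and fills the gaps with NA
outer <- merge(a, b, by = c("RID", "VISCODE"), all = TRUE)

print(c(inner = nrow(inner), outer = nrow(outer)))
```

So every extra table merged this way can only keep or reduce the number of visits available.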
The independent variables and the covariates must also be defined this early. Why? Because we are going to ask for a complete-cases set, so the full set of variables itself determines how much data is available. That is, if FDG and whole brain volume are going to be the independent variables, the number of subjects is determined by the availability of those measurements in the sample.
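To see why the choice of variables fixes the sample size, here is a small sketch on a toy data frame (the values are made up and are not ADNI data):

```r
# Toy data with made-up values; NA marks a missing measurement
toy <- data.frame(
  score = c(1.2, 0.4, -0.3, 0.9, -1.1),
  FDG   = c(1.3, NA, 1.1, 1.2, NA),
  AV45  = c(NA, 1.0, 1.4, NA, NA)
)

# Complete cases when only FDG is required alongside the score
n_fdg <- sum(complete.cases(toy[, c("score", "FDG")]))

# Complete cases when AV45 is required as well
n_both <- sum(complete.cases(toy[, c("score", "FDG", "AV45")]))

print(c(fdg_only = n_fdg, fdg_and_av45 = n_both))
```

Adding one more required variable can only keep or shrink the number of usable rows, never grow it.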
Important Remark: The adnimerge table already provides a useful merged environment for a large set of variables. For neuroimaging markers there are two amyloid burden measurements (PiB and AV45), some morphometry measurements (Ventricles, Hippocampus, WholeBrain, Entorhinal, Fusiform, MidTemp, ICV) and one metabolic measurement (FDG). It also gathers the age, gender, education, APOE-4 status and diagnosis of the subjects. On top of that, there are several diagnosis-related metrics such as FAQ (Functional Activities Questionnaire), MMSE (Mini-Mental State Examination), ADAS (Alzheimer's Disease Assessment Scale), CDRSB (Clinical Dementia Rating Sum of Boxes), MoCA (Montreal Cognitive Assessment) and RAVLT (Rey Auditory Verbal Learning Test).
Let's first try FDG and WholeBrain as the independent variables, with age, education, gender and APOE-4 as the covariates of the model.
# Calculate the right age for each visit
m$cAGE <- m$AGE + m$Years
# Select the proper data (columns get names like m.FDG, m.WholeBrain, ...)
data <- data.frame(m$FDG, m$WholeBrain, m$ICV, m$cAGE, m$PTGENDER, m$PTEDUCAT, m$APOE4, m$AVDEL30MIN, m$Q4SCORE)
# Keep only complete cases
datac <- data[complete.cases(data),]
# Look here: since the two tests go in opposite directions,
# I must choose one direction as the proper one (no matter which)
datac$zavd <- (datac$m.AVDEL30MIN - mean(datac$m.AVDEL30MIN))/sd(datac$m.AVDEL30MIN)
datac$zdr <- (mean(datac$m.Q4SCORE) - datac$m.Q4SCORE)/sd(datac$m.Q4SCORE)
# Now get the composite score from a one-factor model
gfam <- data.frame(datac$zavd, datac$zdr)
famod <- fa(gfam, scores="regression")
# fa() returns the scores as a one-column matrix; store them as a plain vector
datac$drcs <- as.numeric(famod$scores)
# Convert the gender into an integer (Male = 1, Female = 2)
datac$Gender <- as.integer(factor(datac$m.PTGENDER, levels=c("Male","Female"), labels=c(1,2)))
# Normalize the brain volume by ICV (there are other ways to do this)
datac$wb <- datac$m.WholeBrain/datac$m.ICV
# And roll the dice
fit <- cusp(y ~ drcs + m.cAGE + Gender + m.PTEDUCAT + m.APOE4, alpha ~ m.FDG + wb, beta ~ m.FDG + wb, datac)
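The sign flip in the z-score step is easy to miss, so here is a minimal sketch of it on toy vectors. The names recall and errors and their values are made up for illustration: one test counts remembered words (higher is better) and the other counts missed words (higher is worse).

```r
# Toy values (made up): recall counts words remembered (higher is better),
# errors counts words missed (higher is worse)
recall <- c(2, 5, 7, 9, 12)
errors <- c(9, 7, 4, 3, 1)

# Standard z-score for the test where higher is better
z_recall <- (recall - mean(recall)) / sd(recall)

# Reversed z-score: subtracting from the mean flips the sign
z_errors <- (mean(errors) - errors) / sd(errors)

# Before the flip the tests correlate negatively, after it positively
r_before <- cor(recall, errors)
r_after <- cor(z_recall, z_errors)
print(c(before = r_before, after = r_after))
```

Once both z-scores point in the same direction, they can be fed to a single factor and the resulting composite has an unambiguous meaning.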