I tried to draw some restricted spline plots to test the linearity assumption of logistic regression, but the for loop doesn't seem to treat the variables in the "a" vector as row names in the "data". What is the solution?
my code:
dd <- datadist(data)
dd$limits$Age[2] <- 0
dd$limits$SBP[2] <- 0
dd$limits$FT4[2] <- 0
dd$limits$Glu[2] <- 0
options(datadist="dd")
a <- c("Age","SBP","DBP","TSH", "FT4", "FT3","Glu")
for (x in a) {
b <- lrm(outcomes_at_discharge~ rcs(x,3)+ Age+Sex+SBP+Glu+NIHSS+TOAST,data=data,x=TRUE,y=TRUE)
plot(rms::Predict(b,x,ref.zero=TRUE))
anova(b)
}
rcs is a linear tail-restricted cubic spline function (natural spline, for which the rcspline.eval function generates the design matrix ...
The argument x in the for loop is set to pass character values begining with Age. It's unlikely that a cubic spline of character values can have any role in linear regression.
Compare with this example.
library(rms)
#> Loading required package: Hmisc
#>
#> Attaching package: 'Hmisc'
#> The following objects are masked from 'package:base':
#>
#> format.pval, units
#> Loading required package: survival
#> Loading required package: lattice
#> Loading required package: ggplot2
#> Loading required package: SparseM
#>
#> Attaching package: 'SparseM'
#> The following object is masked from 'package:base':
#>
#> backsolve
getHdata(titanic3)
v <- c('pclass', 'survived', 'age', 'sex', 'sibsp', 'parch')
t3 <- titanic3[, v]
units(t3$age) <- 'years'
describe(t3)
#> t3
#>
#> 6 Variables 1309 Observations
#> --------------------------------------------------------------------------------
#> pclass
#> n missing distinct
#> 1309 0 3
#>
#> Value 1st 2nd 3rd
#> Frequency 323 277 709
#> Proportion 0.247 0.212 0.542
#> --------------------------------------------------------------------------------
#> survived : Survived
#> n missing distinct Info Sum Mean Gmd
#> 1309 0 2 0.708 500 0.382 0.4725
#>
#> --------------------------------------------------------------------------------
#> age : Age [years]
#> n missing distinct Info Mean Gmd .05 .10
#> 1046 263 98 0.999 29.88 16.06 5 14
#> .25 .50 .75 .90 .95
#> 21 28 39 50 57
#>
#> lowest : 0.1667 0.3333 0.4167 0.6667 0.7500
#> highest: 70.5000 71.0000 74.0000 76.0000 80.0000
#> --------------------------------------------------------------------------------
#> sex
#> n missing distinct
#> 1309 0 2
#>
#> Value female male
#> Frequency 466 843
#> Proportion 0.356 0.644
#> --------------------------------------------------------------------------------
#> sibsp : Number of Siblings/Spouses Aboard
#> n missing distinct Info Mean Gmd
#> 1309 0 7 0.67 0.4989 0.777
#>
#> lowest : 0 1 2 3 4, highest: 2 3 4 5 8
#>
#> Value 0 1 2 3 4 5 8
#> Frequency 891 319 42 20 22 6 9
#> Proportion 0.681 0.244 0.032 0.015 0.017 0.005 0.007
#> --------------------------------------------------------------------------------
#> parch : Number of Parents/Children Aboard
#> n missing distinct Info Mean Gmd
#> 1309 0 8 0.549 0.385 0.6375
#>
#> lowest : 0 1 2 3 4, highest: 3 4 5 6 9
#>
#> Value 0 1 2 3 4 5 6 9
#> Frequency 1002 170 113 8 6 6 2 2
#> Proportion 0.765 0.130 0.086 0.006 0.005 0.005 0.002 0.002
#> --------------------------------------------------------------------------------
f1 <- lrm(survived ~ sex*pclass * rcs(age,5) + rcs(age,5)*(sibsp + parch ), data = t3)
anova(f1)
#> Warning in anova.rms(f1): tests of nonlinear interaction with respect to single component
#> variables ignore 3-way interactions
#> Wald Statistics Response: survived
#>
#> Factor Chi-Square d.f. P
#> sex (Factor+Higher Order Factors) 187.15 15 <.0001
#> All Interactions 59.74 14 <.0001
#> pclass (Factor+Higher Order Factors) 100.10 20 <.0001
#> All Interactions 46.51 18 0.0003
#> age (Factor+Higher Order Factors) 56.20 32 0.0052
#> All Interactions 34.57 28 0.1826
#> Nonlinear (Factor+Higher Order Factors) 28.66 24 0.2331
#> sibsp (Factor+Higher Order Factors) 19.67 5 0.0014
#> All Interactions 12.13 4 0.0164
#> parch (Factor+Higher Order Factors) 3.51 5 0.6217
#> All Interactions 3.51 4 0.4761
#> sex * pclass (Factor+Higher Order Factors) 42.43 10 <.0001
#> sex * age (Factor+Higher Order Factors) 15.89 12 0.1962
#> Nonlinear (Factor+Higher Order Factors) 14.47 9 0.1066
#> Nonlinear Interaction : f(A,B) vs. AB 4.17 3 0.2441
#> pclass * age (Factor+Higher Order Factors) 13.47 16 0.6385
#> Nonlinear (Factor+Higher Order Factors) 12.92 12 0.3749
#> Nonlinear Interaction : f(A,B) vs. AB 6.88 6 0.3324
#> age * sibsp (Factor+Higher Order Factors) 12.13 4 0.0164
#> Nonlinear 1.76 3 0.6235
#> Nonlinear Interaction : f(A,B) vs. AB 1.76 3 0.6235
#> age * parch (Factor+Higher Order Factors) 3.51 4 0.4761
#> Nonlinear 1.80 3 0.6147
#> Nonlinear Interaction : f(A,B) vs. AB 1.80 3 0.6147
#> sex * pclass * age (Factor+Higher Order Factors) 8.34 8 0.4006
#> Nonlinear 7.74 6 0.2581
#> TOTAL NONLINEAR 28.66 24 0.2331
#> TOTAL INTERACTION 75.61 30 <.0001
#> TOTAL NONLINEAR + INTERACTION 79.49 33 <.0001
#> TOTAL 241.93 39 <.0001