Restricted Cubic Spline Function - Summary Interpretation

I am trying to incorporate spline transformation into my logistic regression and finally piece together the following (working) R code (pls see them below). However, I am struggling with interpreting some results. My outcome is a binary variable (disease; yes/no) and my predictor is a spline-transformed continuous variable (percentage).

1) Does this mean the odds ratio of my spline-transformed predictor is 15.3 at the 1st quartile and 38 at the 3rd quartile (pls see the first output)?
2) How to interpret percentage' and percentage'' (see the second output)? Which R code should I use so I can obtain the odds ratio and the 95% CI for all percentage, percentage', percentage''?
3) What does the effect and S.E. in the first output tell us?

library(rms) 
fit <- lrm(formula = disease ~ rcs(percentage, 4), data = final)
summary (fit)  
             Effects              Response : disease 

 Factor      Low  High Diff. Effect   S.E.    Lower 0.95 Upper 0.95
 percentage  15.3 38   22.7  -0.29743 0.77288 -1.81220   1.2174    
  Odds Ratio 15.3 38   22.7   0.74273      NA  0.16329   3.3784    
fit
 Logistic Regression Model
  
  lrm(formula = disease ~ rcs(percentage, 4), data = final)
  
                        Model Likelihood    Discrimination    Rank Discrim.    
                              Ratio Test           Indexes          Indexes    
  Obs            47    LR chi2      2.75    R2       0.078    C       0.608    
   none          17    d.f.            3    g        0.631    Dxy     0.216    
   disease       30    Pr(> chi2) 0.4315    gr       1.880    gamma   0.222    
  max |deriv| 9e-06                         gp       0.107    tau-a   0.102    
                                            Brier    0.221                     
  
               Coef    S.E.   Wald Z Pr(>|Z|)
  Intercept     0.7586 1.0646  0.71  0.4761  
  percentage    0.0028 0.0927  0.03  0.9758  
  percentage'  -0.1540 0.2829 -0.54  0.5860  
  percentage''  0.5206 0.7076  0.74  0.4619  

Here is my data:

structure(list(percentage = c(5.5, 72.1, 7.9, 80.6, 56.3, 11.5, 
15.3, 12.3, 30.9, 27.5, 0.3, 5.3, 19.6, 19.8, 0.3, 40.5, 16.8, 
38, 13.8, 29.9, 15.8, 15.3, 22.8, 17.2, 41.2, 17.2, 31.6, 41.2, 
19.6, 38, 41.2, 29.9, 15.3, 29.9, 38, 30.9, 31.6, 15.3, 15.3, 
38, 31.6, 41.3, 21.4, 0.4, 41.2, 7.6, 29.9), 
    disease = structure(c(1L, 
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 
    1L, 1L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 
    2L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 
    2L), .Label = c("none", "disease"), class = "factor")), row.names = c(NA, 
-47L), class = "data.frame")

I get an error here, which is why it's good practice always to share code with data in the form of a reprex. See the FAQ: How to do a minimal reproducible example reprex for beginners.

library(rms)
#> Loading required package: Hmisc
#> Loading required package: lattice
#> Loading required package: survival
#> Loading required package: Formula
#> Loading required package: ggplot2
#> 
#> Attaching package: 'Hmisc'
#> The following objects are masked from 'package:base':
#> 
#>     format.pval, units
#> Loading required package: SparseM
#> 
#> Attaching package: 'SparseM'
#> The following object is masked from 'package:base':
#> 
#>     backsolve
#> Warning in .recacheSubclasses(def@className, def, env): undefined subclass
#> "numericVector" of class "Mnumeric"; definition not updated
final <- structure(list(
  percentage = c(
    5.5, 72.1, 7.9, 80.6, 56.3, 11.5,
    15.3, 12.3, 30.9, 27.5, 0.3, 5.3, 19.6, 19.8, 0.3, 40.5, 16.8,
    38, 13.8, 29.9, 15.8, 15.3, 22.8, 17.2, 41.2, 17.2, 31.6, 41.2,
    19.6, 38, 41.2, 29.9, 15.3, 29.9, 38, 30.9, 31.6, 15.3, 15.3,
    38, 31.6, 41.3, 21.4, 0.4, 41.2, 7.6, 29.9
  ),
  disease = structure(c(
    1L,
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L,
    1L, 1L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L,
    2L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
    2L
  ), .Label = c("none", "disease"), class = "factor")
), row.names = c(
  NA,
  -47L
), class = "data.frame")

fit <- lrm(formula = disease ~ rcs(percentage, 4), data = final)
summary(fit)
#> Error in summary.rms(fit): adjustment values not defined here or with datadist for percentage

@technocrat Thank you for pointing this out. I have fixed (updated) the code in my original post. Hope this works now and I look forward to your valuable feedback

I'm going to have to take a deeper look into Harrell's Regression Modeling Strategies Chapter 10 in the print version (chapter 8 online).

1 Like

Hi - I am following up with you re: my question. Would you be able to assist me with my question?

1 Like

See if this makes more sense.

library(rms)
#> Loading required package: Hmisc
#> Loading required package: lattice
#> Loading required package: survival
#> Loading required package: Formula
#> Loading required package: ggplot2
#> 
#> Attaching package: 'Hmisc'
#> The following objects are masked from 'package:base':
#> 
#>     format.pval, units
#> Loading required package: SparseM
#> 
#> Attaching package: 'SparseM'
#> The following object is masked from 'package:base':
#> 
#>     backsolve
#> Warning in .recacheSubclasses(def@className, def, env): undefined subclass
#> "numericVector" of class "Mnumeric"; definition not updated

final <- structure(list(
  percentage = c(
    5.5, 72.1, 7.9, 80.6, 56.3, 11.5,
    15.3, 12.3, 30.9, 27.5, 0.3, 5.3, 19.6, 19.8, 0.3, 40.5, 16.8,
    38, 13.8, 29.9, 15.8, 15.3, 22.8, 17.2, 41.2, 17.2, 31.6, 41.2,
    19.6, 38, 41.2, 29.9, 15.3, 29.9, 38, 30.9, 31.6, 15.3, 15.3,
    38, 31.6, 41.3, 21.4, 0.4, 41.2, 7.6, 29.9
  ),
  disease = structure(c(
    1L,
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 2L, 2L,
    1L, 1L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L,
    2L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
    2L
  ), .Label = c("none", "disease"), class = "factor")
), row.names = c(
  NA,
  -47L
), class = "data.frame")

dd <- datadist(final); options(datadist = 'dd')
fit <- lrm(disease ~ rcs(percentage, 4), data = final)
# adjustment values not defined here or with datadist for percentage
summary(fit)
#>              Effects              Response : disease 
#> 
#>  Factor      Low  High Diff. Effect   S.E.    Lower 0.95 Upper 0.95
#>  percentage  15.3 38   22.7  -0.29743 0.77288 -1.81220   1.2174    
#>   Odds Ratio 15.3 38   22.7   0.74273      NA  0.16329   3.3784

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.