Creating a multiple regression

Hi all,
I have some basic questions about the setup of a multiple regression and how to sort my dependent variable.

The photo below is an extract of a large dataset which measures donor loyalty and demographics. I wish to create a multiple regression that tests IVs such as gender, age and average donation amount against a DV which measures donor length (Age_by_Month).
My issue is that donors are classified into specific 'Cohorts' depending on the month they started offering financial support (in the form of monthly payments) to the NGO. Naturally, donors who started donating in January 2018 will have been members of the organisation for longer than donors who became members in November 2018. I am looking for suggestions on how to 'sort' my dependent variable so that it takes the cohort into consideration when running a multiple regression that looks at gender, age etc. My worry is that if I include cohort as an IV, the significance of the other variables drops. Is there any way to get around this? What are your suggestions?

Many thanks,
Naja

Hi,

Model design in ordinary least squares linear regression requires making a choice among three principal approaches for a base model:

  1. Begin with a fully saturated model y ~ x_1 + ... + x_n and subtract.
  2. Begin with a single covariate y ~ x_i and add.
  3. Purposely select covariates.

In your situation, you have covariates that are summary measures of another covariate: Age_by_Month and the one you mention, Cohort, present Age_by_Day in a lumpier or, perhaps, smoother form. Two others, Internal_recruit and External_recruit, simply recode Recruit.

I'm going to make what I hope is a reasonable assumption: your dependent variable is average_amount and you want to know how length of membership, age, gender and the manner of recruitment are associated with it. If that is the case:

  1. Pick the \alpha you will use as the significance threshold for p-values.
  2. Run fit <- lm(average_amount ~ Age_by_Day + age + gender + Recruit, data = name_of_your_data_frame)
  3. Check the F statistic, the coefficient p-values and the adjusted R^2, and run plot(fit).
    (See my post on how to interpret these results.)
  4. Re-run, adding Age_by_Month to see if you get an improved result, and then Cohort.
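
As a rough sketch of step 3, the pieces of summary(fit) can also be pulled out programmatically. The donor data are not available here, so the built-in mtcars data stand in for them; the variables and the value of alpha are placeholders for illustration, not a recommendation.

```r
# Stand-in example: mtcars replaces the donor data frame (an assumption).
fit <- lm(mpg ~ wt + qsec + am, data = mtcars)
s <- summary(fit)

# Overall F test: summary() stores the statistic and its degrees of freedom,
# so the p-value is computed from the F distribution.
f <- s$fstatistic
f_p_value <- unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))

# Coefficient p-values and adjusted R^2.
coef_p_values <- coef(s)[, "Pr(>|t|)"]
adj_r2 <- s$adj.r.squared

alpha <- 0.05  # placeholder; choose it before looking at the results
f_p_value < alpha
coef_p_values < alpha
adj_r2

# Diagnostic plots (residuals vs fitted, Q-Q, scale-location, leverage):
# plot(fit)
```

The same three quantities are what the printed summary shows; extracting them is only a convenience when several candidate models have to be compared side by side.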

Hi Technocrat,
Thank you for your reply. My dependent variable (DV) is Age_by_Month, as I am trying to figure out what factors affect the retention rate.

model.1 <- lm(Age_by_Month ~ average_amount + age + gender + Recruit, data=donors)

My question concerns the effect of the independent variable (IV) 'Cohort', and whether there is any way to hold this variable constant while measuring the effect of my other IVs. 'Cohort' will inevitably have a large effect on Age_by_Month, since it reflects the join date of the clients. I want to see what other factors affect Age_by_Month. Is there any way to do this?
Many Thanks,
Naja

So that we are working from the same data, let's motivate the problem with mtcars, a built-in data set that we can make structurally similar to your situation.

Using the transformation shown in the mtcars help page and adding cohort, which, like your Cohort, categorizes records (this time by month of manufacture), we can do a stepwise selection, successively leaving out variables that do not improve the p-value of the F statistic relative to the fully saturated model. In this example, cohort is eliminated on that basis. That shouldn't be surprising, given that it is randomly assigned. Your Cohort, the month of whichever year the customer first joined, may not be. But I'd still recommend going through the process, rather than assuming that Cohort is necessarily useful. An alternative is to run separate models on each Cohort group.
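
Before the worked elimination example that follows, here is a minimal sketch of that last alternative, separate models per group. With only 32 rows, twelve month groups in mtcars would be too thin to fit anything, so the number of cylinders (cyl, three groups) stands in for Cohort; that substitution, and the single predictor wt, are assumptions made purely for illustration.

```r
# cyl is a stand-in for Cohort (assumption): three groups of cars.
by_group <- split(mtcars, mtcars$cyl)

# Fit the same model within each group.
fits <- lapply(by_group, function(d) lm(mpg ~ wt, data = d))

# One row per group: intercept and slope for wt.
group_coefs <- t(sapply(fits, coef))
group_coefs
```

Each group then gets its own coefficients, at the cost of fewer observations per model, so this is only practical when every group is reasonably large.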

mtcars2 <- within(mtcars, {
   vs <- factor(vs, labels = c("V", "S"))
   am <- factor(am, labels = c("automatic", "manual"))
   cyl  <- ordered(cyl)
   gear <- ordered(gear)
   carb <- ordered(carb)
})

set.seed(137)
month_mfg <- factor(c("JAN", "FEB", "MAR", "APR", "MAY", "JUN", "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"))
cohort <- sample(month_mfg, 32, replace = TRUE)
mtcars2 <- cbind(mtcars2, cohort)

saturated_fit <- lm(mpg ~ cyl + disp + hp + drat + wt + qsec + vs + am + gear + carb + cohort, data = mtcars2)
summary(saturated_fit)
#> 
#> Call:
#> lm(formula = mpg ~ cyl + disp + hp + drat + wt + qsec + vs + 
#>     am + gear + carb + cohort, data = mtcars2)
#> 
#> Residuals:
#>           Mazda RX4       Mazda RX4 Wag          Datsun 710      Hornet 4 Drive 
#>          -1.203e-02           8.630e-17           1.874e-16          -1.202e+00 
#>   Hornet Sportabout             Valiant          Duster 360           Merc 240D 
#>          -5.161e-01          -3.447e-01          -7.302e-01          -1.532e+00 
#>            Merc 230            Merc 280           Merc 280C          Merc 450SE 
#>          -2.658e-02           4.868e-16           1.559e+00          -3.487e-01 
#>          Merc 450SL         Merc 450SLC  Cadillac Fleetwood Lincoln Continental 
#>           1.087e+00          -7.382e-01          -5.918e-01          -7.411e-01 
#>   Chrysler Imperial            Fiat 128         Honda Civic      Toyota Corolla 
#>           1.547e+00           2.520e+00          -5.918e-01           1.116e+00 
#>       Toyota Corona    Dodge Challenger         AMC Javelin          Camaro Z28 
#>          -8.266e-16          -3.839e-01          -1.559e+00           5.161e-01 
#>    Pontiac Firebird           Fiat X1-9       Porsche 914-2        Lotus Europa 
#>           4.006e+00          -2.090e+00           1.203e-02           1.535e+00 
#>      Ford Pantera L        Ferrari Dino       Maserati Bora          Volvo 142E 
#>          -1.547e+00          -6.061e-16           1.435e-16          -9.431e-01 
#> 
#> Coefficients:
#>              Estimate Std. Error t value Pr(>|t|)
#> (Intercept)  92.05920   48.59510   1.894    0.131
#> cyl.L        -9.53505   12.39861  -0.769    0.485
#> cyl.Q         3.08191    3.44601   0.894    0.422
#> disp          0.06208    0.06064   1.024    0.364
#> hp           -0.08646    0.06715  -1.288    0.267
#> drat         -3.36513    6.27561  -0.536    0.620
#> wt           -6.95625    4.65615  -1.494    0.209
#> qsec         -1.56006    2.17737  -0.716    0.513
#> vsS           1.13629    5.53658   0.205    0.847
#> ammanual     -3.73157    9.96853  -0.374    0.727
#> gear.L        0.02536    5.85367   0.004    0.997
#> gear.Q       -3.39014    5.35238  -0.633    0.561
#> carb.L       12.74030   11.03873   1.154    0.313
#> carb.Q        6.97853    8.00820   0.871    0.433
#> carb.C        2.71334    5.14883   0.527    0.626
#> carb^4        2.41526    5.16384   0.468    0.664
#> carb^5       -4.44901    5.18699  -0.858    0.439
#> cohortAUG   -13.88802    7.33987  -1.892    0.131
#> cohortDEC    -4.15082    4.92709  -0.842    0.447
#> cohortFEB    -4.10415    7.77175  -0.528    0.625
#> cohortJAN    -6.14826    7.23151  -0.850    0.443
#> cohortJUL    -5.40789    6.40341  -0.845    0.446
#> cohortJUN    -6.04268    5.66448  -1.067    0.346
#> cohortMAR    -9.28431    6.57149  -1.413    0.231
#> cohortMAY    -7.10596    6.33567  -1.122    0.325
#> cohortNOV    -6.73959    6.19751  -1.087    0.338
#> cohortOCT    -4.12539    8.46815  -0.487    0.652
#> cohortSEP   -12.46692    9.61299  -1.297    0.264
#> 
#> Residual standard error: 3.504 on 4 degrees of freedom
#> Multiple R-squared:  0.9564, Adjusted R-squared:  0.6619 
#> F-statistic: 3.248 on 27 and 4 DF,  p-value: 0.1299

saturated_ex_cohort <- lm(mpg ~ cyl + disp + hp + drat + wt + qsec + vs + am + gear + carb, data = mtcars2)
summary(saturated_ex_cohort)
#> 
#> Call:
#> lm(formula = mpg ~ cyl + disp + hp + drat + wt + qsec + vs + 
#>     am + gear + carb, data = mtcars2)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -3.5087 -1.3584 -0.0948  0.7745  4.6251 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)  
#> (Intercept) 26.57171   19.56616   1.358   0.1945  
#> cyl.L       -0.23770    5.06256  -0.047   0.9632  
#> cyl.Q        2.02541    2.14952   0.942   0.3610  
#> disp         0.03555    0.03190   1.114   0.2827  
#> hp          -0.07051    0.03943  -1.788   0.0939 .
#> drat         1.18283    2.48348   0.476   0.6407  
#> wt          -4.52978    2.53875  -1.784   0.0946 .
#> qsec         0.36784    0.93540   0.393   0.6997  
#> vsS          1.93085    2.87126   0.672   0.5115  
#> ammanual     1.21212    3.21355   0.377   0.7113  
#> gear.L       1.78785    2.64200   0.677   0.5089  
#> gear.Q       0.12235    2.40896   0.051   0.9602  
#> carb.L       6.06156    6.72822   0.901   0.3819  
#> carb.Q       1.78825    2.80043   0.639   0.5327  
#> carb.C       0.42384    2.57389   0.165   0.8714  
#> carb^4       0.93317    2.45041   0.381   0.7087  
#> carb^5      -2.46410    2.90450  -0.848   0.4096  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 2.833 on 15 degrees of freedom
#> Multiple R-squared:  0.8931, Adjusted R-squared:  0.779 
#> F-statistic:  7.83 on 16 and 15 DF,  p-value: 0.000124

saturated_ex_carb <- lm(mpg ~ cyl + disp + hp + drat + wt + qsec + vs + am + gear, data = mtcars2)
summary(saturated_ex_carb)
#> 
#> Call:
#> lm(formula = mpg ~ cyl + disp + hp + drat + wt + qsec + vs + 
#>     am + gear, data = mtcars2)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -3.4785 -1.3834 -0.0234  1.2211  4.2132 
#> 
#> Coefficients:
#>              Estimate Std. Error t value Pr(>|t|)  
#> (Intercept) 17.310552  16.249080   1.065    0.299  
#> cyl.L        1.762848   3.339081   0.528    0.603  
#> cyl.Q        1.632148   1.192410   1.369    0.186  
#> disp         0.005862   0.015225   0.385    0.704  
#> hp          -0.039460   0.021394  -1.844    0.080 .
#> drat         0.824468   1.960118   0.421    0.679  
#> wt          -2.853681   1.660172  -1.719    0.101  
#> qsec         0.643871   0.726451   0.886    0.386  
#> vsS          1.698405   2.292846   0.741    0.467  
#> ammanual     2.936464   2.201030   1.334    0.197  
#> gear.L       1.384525   1.951508   0.709    0.486  
#> gear.Q       0.717851   1.649410   0.435    0.668  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 2.588 on 20 degrees of freedom
#> Multiple R-squared:  0.881,  Adjusted R-squared:  0.8155 
#> F-statistic: 13.46 on 11 and 20 DF,  p-value: 6.004e-07


saturated_ex_gear <- lm(mpg ~ cyl + disp + hp + drat + wt + qsec + vs + am, data = mtcars2)
summary(saturated_ex_gear)
#> 
#> Call:
#> lm(formula = mpg ~ cyl + disp + hp + drat + wt + qsec + vs + 
#>     am, data = mtcars2)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -3.9978 -1.3551 -0.3108  1.1992  4.1102 
#> 
#> Coefficients:
#>              Estimate Std. Error t value Pr(>|t|)  
#> (Intercept) 19.540985  14.146419   1.381   0.1810  
#> cyl.L        0.342558   2.764833   0.124   0.9025  
#> cyl.Q        1.388429   1.112097   1.248   0.2250  
#> disp         0.006688   0.013512   0.495   0.6255  
#> hp          -0.029141   0.017182  -1.696   0.1040  
#> drat         0.588059   1.503111   0.391   0.6994  
#> wt          -3.155246   1.420235  -2.222   0.0369 *
#> qsec         0.523235   0.690130   0.758   0.4564  
#> vsS          1.237800   2.106056   0.588   0.5627  
#> ammanual     3.000910   1.853400   1.619   0.1197  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 2.514 on 22 degrees of freedom
#> Multiple R-squared:  0.8765, Adjusted R-squared:  0.826 
#> F-statistic: 17.35 on 9 and 22 DF,  p-value: 4.814e-08

saturated_ex_am <- lm(mpg ~ cyl + disp + hp + drat + wt + qsec + vs, data = mtcars2)
summary(saturated_ex_am)
#> 
#> Call:
#> lm(formula = mpg ~ cyl + disp + hp + drat + wt + qsec + vs, data = mtcars2)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -4.0403 -1.3502 -0.3096  1.1330  5.0843 
#> 
#> Coefficients:
#>              Estimate Std. Error t value Pr(>|t|)  
#> (Intercept) 26.176702  14.008939   1.869   0.0745 .
#> cyl.L       -1.530866   2.598096  -0.589   0.5615  
#> cyl.Q        1.227223   1.146013   1.071   0.2953  
#> disp         0.005948   0.013972   0.426   0.6743  
#> hp          -0.023866   0.017455  -1.367   0.1848  
#> drat         1.144625   1.513984   0.756   0.4573  
#> wt          -3.545791   1.448101  -2.449   0.0224 *
#> qsec         0.171817   0.677814   0.253   0.8021  
#> vsS          0.447950   2.119775   0.211   0.8345  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 2.601 on 23 degrees of freedom
#> Multiple R-squared:  0.8618, Adjusted R-squared:  0.8138 
#> F-statistic: 17.93 on 8 and 23 DF,  p-value: 3.535e-08

with_am_ex_vs <- lm(mpg ~ cyl + disp + hp + drat + wt + qsec + am, data = mtcars2)
summary(with_am_ex_vs)
#> 
#> Call:
#> lm(formula = mpg ~ cyl + disp + hp + drat + wt + qsec + am, data = mtcars2)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -3.7362 -1.3973 -0.3513  1.4311  4.2267 
#> 
#> Coefficients:
#>              Estimate Std. Error t value Pr(>|t|)  
#> (Intercept) 16.735984  13.126182   1.275   0.2150  
#> cyl.L       -0.369506   2.449630  -0.151   0.8814  
#> cyl.Q        1.172300   1.034497   1.133   0.2688  
#> disp         0.007572   0.013236   0.572   0.5728  
#> hp          -0.025353   0.015699  -1.615   0.1200  
#> drat         0.633293   1.479625   0.428   0.6726  
#> wt          -3.426768   1.323748  -2.589   0.0164 *
#> qsec         0.718097   0.596598   1.204   0.2410  
#> ammanual     2.748596   1.777155   1.547   0.1356  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 2.478 on 23 degrees of freedom
#> Multiple R-squared:  0.8746, Adjusted R-squared:  0.831 
#> F-statistic: 20.05 on 8 and 23 DF,  p-value: 1.206e-08

with_am_ex_qsec <- lm(mpg ~ cyl + disp + hp + drat + wt + am, data = mtcars2)
summary(with_am_ex_qsec)
#> 
#> Call:
#> lm(formula = mpg ~ cyl + disp + hp + drat + wt + am, data = mtcars2)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -3.8267 -1.4366 -0.4153  1.1649  5.0671 
#> 
#> Coefficients:
#>              Estimate Std. Error t value Pr(>|t|)    
#> (Intercept) 30.755744   6.108432   5.035  3.8e-05 ***
#> cyl.L       -1.797442   2.163142  -0.831   0.4142    
#> cyl.Q        1.433586   1.020878   1.404   0.1730    
#> disp         0.004395   0.013090   0.336   0.7400    
#> hp          -0.033038   0.014476  -2.282   0.0316 *  
#> drat         0.326616   1.471086   0.222   0.8262    
#> wt          -2.726729   1.200207  -2.272   0.0323 *  
#> ammanual     1.681130   1.554386   1.082   0.2902    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 2.501 on 24 degrees of freedom
#> Multiple R-squared:  0.8667, Adjusted R-squared:  0.8278 
#> F-statistic: 22.29 on 7 and 24 DF,  p-value: 4.768e-09

with_qsec_ex_wt <- lm(mpg ~ cyl + disp + hp + drat + qsec + am, data = mtcars2)
summary(with_qsec_ex_wt)
#> 
#> Call:
#> lm(formula = mpg ~ cyl + disp + hp + drat + qsec + am, data = mtcars2)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -4.8814 -1.5577 -0.3294  1.2071  5.0712 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)  
#> (Intercept) 25.09113   14.15396   1.773   0.0890 .
#> cyl.L       -0.44501    2.72491  -0.163   0.8716  
#> cyl.Q        2.18396    1.06555   2.050   0.0515 .
#> disp        -0.01453    0.01125  -1.292   0.2087  
#> hp          -0.03901    0.01645  -2.371   0.0261 *
#> drat         0.49570    1.64495   0.301   0.7657  
#> qsec         0.03955    0.59620   0.066   0.9477  
#> ammanual     3.22115    1.96655   1.638   0.1145  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 2.756 on 24 degrees of freedom
#> Multiple R-squared:  0.8381, Adjusted R-squared:  0.7908 
#> F-statistic: 17.74 on 7 and 24 DF,  p-value: 4.565e-08

with_wt_ex_drat <- lm(mpg ~ cyl + disp + hp + wt + qsec + am, data = mtcars2)
summary(with_wt_ex_drat)
#> 
#> Call:
#> lm(formula = mpg ~ cyl + disp + hp + wt + qsec + am, data = mtcars2)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -3.9501 -1.4335 -0.1542  1.3632  4.1917 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)  
#> (Intercept) 19.77177   10.85529   1.821   0.0810 .
#> cyl.L       -0.69084    2.29172  -0.301   0.7657  
#> cyl.Q        1.22117    1.01053   1.208   0.2386  
#> disp         0.00680    0.01289   0.528   0.6026  
#> hp          -0.02477    0.01537  -1.612   0.1201  
#> wt          -3.40642    1.30019  -2.620   0.0150 *
#> qsec         0.67413    0.57760   1.167   0.2546  
#> ammanual     2.91836    1.70260   1.714   0.0994 .
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 2.435 on 24 degrees of freedom
#> Multiple R-squared:  0.8736, Adjusted R-squared:  0.8367 
#> F-statistic:  23.7 on 7 and 24 DF,  p-value: 2.566e-09

with_wt_ex_hp <- lm(mpg ~ cyl + disp + wt + qsec + am, data = mtcars2)
summary(with_wt_ex_hp)
#> 
#> Call:
#> lm(formula = mpg ~ cyl + disp + wt + qsec + am, data = mtcars2)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -3.8692 -1.3132 -0.3668  1.4187  4.5144 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)   
#> (Intercept) 11.55295    9.88420   1.169  0.25350   
#> cyl.L       -1.54479    2.29975  -0.672  0.50792   
#> cyl.Q        0.76403    1.00041   0.764  0.45218   
#> disp         0.00688    0.01329   0.518  0.60927   
#> wt          -4.11995    1.26093  -3.267  0.00315 **
#> qsec         1.07389    0.53803   1.996  0.05694 . 
#> ammanual     2.54221    1.73957   1.461  0.15636   
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 2.512 on 25 degrees of freedom
#> Multiple R-squared:  0.8599, Adjusted R-squared:  0.8263 
#> F-statistic: 25.58 on 6 and 25 DF,  p-value: 1.584e-09

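# Note: despite its name, this model still includes disp, so the formula and
# output are identical to with_wt_ex_drat above.
with_hp_ex_disp <- lm(mpg ~ cyl + disp + hp + wt + qsec + am, data = mtcars2)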
summary(with_hp_ex_disp)
#> 
#> Call:
#> lm(formula = mpg ~ cyl + disp + hp + wt + qsec + am, data = mtcars2)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -3.9501 -1.4335 -0.1542  1.3632  4.1917 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)  
#> (Intercept) 19.77177   10.85529   1.821   0.0810 .
#> cyl.L       -0.69084    2.29172  -0.301   0.7657  
#> cyl.Q        1.22117    1.01053   1.208   0.2386  
#> disp         0.00680    0.01289   0.528   0.6026  
#> hp          -0.02477    0.01537  -1.612   0.1201  
#> wt          -3.40642    1.30019  -2.620   0.0150 *
#> qsec         0.67413    0.57760   1.167   0.2546  
#> ammanual     2.91836    1.70260   1.714   0.0994 .
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 2.435 on 24 degrees of freedom
#> Multiple R-squared:  0.8736, Adjusted R-squared:  0.8367 
#> F-statistic:  23.7 on 7 and 24 DF,  p-value: 2.566e-09

with_hp_ex_cyl <- lm(mpg ~ disp + hp + wt + qsec + am, data = mtcars2)
summary(with_hp_ex_cyl)
#> 
#> Call:
#> lm(formula = mpg ~ disp + hp + wt + qsec + am, data = mtcars2)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -3.5399 -1.7398 -0.3196  1.1676  4.5534 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)   
#> (Intercept) 14.36190    9.74079   1.474  0.15238   
#> disp         0.01124    0.01060   1.060  0.29897   
#> hp          -0.02117    0.01450  -1.460  0.15639   
#> wt          -4.08433    1.19410  -3.420  0.00208 **
#> qsec         1.00690    0.47543   2.118  0.04391 * 
#> ammanual     3.47045    1.48578   2.336  0.02749 * 
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 2.429 on 26 degrees of freedom
#> Multiple R-squared:  0.8637, Adjusted R-squared:  0.8375 
#> F-statistic: 32.96 on 5 and 26 DF,  p-value: 1.844e-10

ex_cyl_ex_hp <- lm(mpg ~ disp + wt + qsec + am, data = mtcars2)
summary(ex_cyl_ex_hp)
#> 
#> Call:
#> lm(formula = mpg ~ disp + wt + qsec + am, data = mtcars2)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -3.5078 -1.4121 -0.6645  1.3611  4.7150 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)  6.44238    8.25723   0.780 0.442054    
#> disp         0.00769    0.01053   0.730 0.471709    
#> wt          -4.58828    1.16677  -3.932 0.000529 ***
#> qsec         1.41696    0.39149   3.619 0.001200 ** 
#> ammanual     3.31015    1.51241   2.189 0.037448 *  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 2.48 on 27 degrees of freedom
#> Multiple R-squared:  0.8526, Adjusted R-squared:  0.8307 
#> F-statistic: 39.04 on 4 and 27 DF,  p-value: 7.465e-11

ex_cyl_ex_hp_ex_disp <- lm(mpg ~ wt + qsec + am, data = mtcars2)
summary(ex_cyl_ex_hp_ex_disp)
#> 
#> Call:
#> lm(formula = mpg ~ wt + qsec + am, data = mtcars2)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -3.4811 -1.5555 -0.7257  1.4110  4.6610 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)   9.6178     6.9596   1.382 0.177915    
#> wt           -3.9165     0.7112  -5.507 6.95e-06 ***
#> qsec          1.2259     0.2887   4.247 0.000216 ***
#> ammanual      2.9358     1.4109   2.081 0.046716 *  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 2.459 on 28 degrees of freedom
#> Multiple R-squared:  0.8497, Adjusted R-squared:  0.8336 
#> F-statistic: 52.75 on 3 and 28 DF,  p-value: 1.21e-11

fitted <- ex_cyl_ex_hp_ex_disp
summary(fitted)
#> 
#> Call:
#> lm(formula = mpg ~ wt + qsec + am, data = mtcars2)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -3.4811 -1.5555 -0.7257  1.4110  4.6610 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)   9.6178     6.9596   1.382 0.177915    
#> wt           -3.9165     0.7112  -5.507 6.95e-06 ***
#> qsec          1.2259     0.2887   4.247 0.000216 ***
#> ammanual      2.9358     1.4109   2.081 0.046716 *  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 2.459 on 28 degrees of freedom
#> Multiple R-squared:  0.8497, Adjusted R-squared:  0.8336 
#> F-statistic: 52.75 on 3 and 28 DF,  p-value: 1.21e-11

fitted_w_cohort <- lm(mpg ~ wt + qsec + am + cohort, data = mtcars2)
summary(fitted_w_cohort)
#> 
#> Call:
#> lm(formula = mpg ~ wt + qsec + am + cohort, data = mtcars2)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -3.8324 -1.0646  0.0000  0.6143  3.7044 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)  16.1874    10.0917   1.604 0.127124    
#> wt           -4.7057     0.9895  -4.756 0.000183 ***
#> qsec          1.1512     0.3938   2.924 0.009477 ** 
#> ammanual      1.4463     2.0472   0.706 0.489450    
#> cohortAUG    -5.3411     3.4340  -1.555 0.138272    
#> cohortDEC    -1.0956     2.5620  -0.428 0.674281    
#> cohortFEB    -2.6990     3.2768  -0.824 0.421525    
#> cohortJAN    -4.3491     2.9897  -1.455 0.163969    
#> cohortJUL    -2.0652     2.6523  -0.779 0.446892    
#> cohortJUN    -1.0003     2.5467  -0.393 0.699373    
#> cohortMAR    -1.4996     2.4070  -0.623 0.541535    
#> cohortMAY    -1.9723     2.5074  -0.787 0.442340    
#> cohortNOV    -3.3148     2.5280  -1.311 0.207215    
#> cohortOCT    -1.8676     3.5052  -0.533 0.601061    
#> cohortSEP    -6.1242     3.9160  -1.564 0.136260    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 2.594 on 17 degrees of freedom
#> Multiple R-squared:  0.8984, Adjusted R-squared:  0.8148 
#> F-statistic: 10.74 on 14 and 17 DF,  p-value: 7.968e-06

Created on 2019-11-28 by the reprex package (v0.3.0)
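
One further check, which is not part of the reprex above: because the last two models are nested, anova() gives a single partial F test of whether cohort as a whole adds anything, instead of reading its eleven dummy coefficients one by one. The sketch below rebuilds the example data so that it runs on its own; treat it as an illustration under those assumptions rather than a definitive procedure.

```r
# Rebuild the example data so the block is self-contained.
mtcars2 <- within(mtcars, am <- factor(am, labels = c("automatic", "manual")))
set.seed(137)
month_mfg <- factor(c("JAN", "FEB", "MAR", "APR", "MAY", "JUN",
                      "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"))
mtcars2$cohort <- sample(month_mfg, 32, replace = TRUE)

# The reduced model and the same model with cohort added.
fitted <- lm(mpg ~ wt + qsec + am, data = mtcars2)
fitted_w_cohort <- lm(mpg ~ wt + qsec + am + cohort, data = mtcars2)

# Partial F test: the null hypothesis is that all cohort coefficients are zero.
cmp <- anova(fitted, fitted_w_cohort)
cmp
cohort_p <- cmp[["Pr(>F)"]][2]
cohort_p
```

Keeping cohort in the formula is also what "holding it constant" amounts to in a regression: the remaining coefficients are then estimated with cohort adjusted for. In the output above, adding cohort lowers the adjusted R-squared from 0.8336 to 0.8148, which points the same way as dropping it.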
