--- title: "Partial Regression" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{PartialRegression} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ```{r setup} library(Keng) data(depress) ``` Aiming to help researchers to understand the role of *PRE* in regression, this vignette will present several ways of examining the unique effect of problem-focused coping(`pm1`) on depression(`dm1`) controlling for emotion-focused coping(`em1`) and avoidance coping(`am1`) using the first-wave data subset in internal data `depress`. Four ways will be present in the following: - Multiple regression with *t*-test. - Hierarchical regression with *F*-test. - The *PRE* of the single parameter, the partial regression coefficient of problem-focused coping. - One-predictor regression using the residuals. ## Multiple regression with *t*-test Firstly, examine the unique effect of `pm1` using *t*-test. Model C (Compact model) regresses `dm1` on `em1` and `am1`. Model A(Augmented model) regresses `dm1` on `pm1`, `em1`, and `am1`. ```{r} # multiple regression fitC <- lm(dm1 ~ em1 + am1, depress) fitA <- lm(dm1 ~ pm1 + em1 + am1, depress) summary(fitA) ``` As shown, the partial regression coefficient of `pm1` is -0.16705, *t*(90) = -3.349, *p* = 0.00119. ## Hierarchical regression with *F*-test Secondly, examine the unique effect of *pm1* using hierarchical regression and its *F*-test. In SPSS, this *F*-test is presented as the *F*-test for *R*^2^ change. ```{r} anova(fitC, fitA) ``` As shown, *F* (1, 90) = 11.217, *p* = 0.001185. This *F*-test is equivalent to the *t*-test above, since they both examine the unique effect of *pm1*. In the case that the *df* of *F*'s numerator is 1, *F* = *t*^2^, and *t*'s *df* equals to the *df* of *F*'s denominator. ## The *PRE* of the single parameter Thirdly, examine the unique effect of *pm1* using *PRE*. ```{r} format(compare_lm(fitC, fitA), digits = 3, nsmall = 3) ``` As shown, *F* (1, 90) = 11.217, *p* = 0.00119. The *F*-test of *PRE* is equivalent to the *F*-test of anova above. ## One-predictor regression using the residuals Fourthly, examine the unique effect of `pm1` using residuals. Regress `dm1` on `em1` and `am1`, and attain the residuals of `dm1`, `dm_res`, which partials out the effect of `em1` and `am1` on `dm1`. Regress `pm1` on `em1` and `am1`, and attain the residuals of `pm1`, `pm_res`, which partials out the effect of `em1` and `am1` on `pm1`. Correlate `dm_res` with `pm_res`, we attain the partial correlation of `dm1` and `pm1`. ```{r} dm_res <- lm(dm1 ~ em1 + am1, depress)$residuals pm_res <- lm(pm1 ~ em1 + am1, depress)$residuals resDat <- data.frame(dm_res, pm_res) cor(dm_res, pm_res) ``` As shown, the partial correlation of `dm1` and `pm1` is -0.3329009. Regress `dm_res` on `pm_res`, and we attain the unique effect of `pm1` on `dm1`. ```{r} summary(lm(dm_res ~ pm_res, data.frame(dm_res, pm_res))) ``` As shown, the regression coefficient of `pm_res` equals the partial regression coefficients of `pm1` in `fitA`. However, their *t*s, as well as *p*s, are different. Why? Let's examine the unique effect of `pm_res` using *PRE*. Note that the *F*-test of one parameter's *PRE* is equivalent to the *t*-test of this parameter. In addition, Model A is relative to Model C. With your statistical purpose changing, the referents of Model C and Model A change. 
```{r}
fitC <- lm(dm_res ~ 1, resDat)
fitA <- lm(dm_res ~ pm_res, resDat)
format(compare_lm(fitC, fitA), digits = 3, nsmall = 3)
```

Compare the *PRE* of `pm_res` with the *PRE* of `pm1`. The two *PRE*s are equal. However, their *df2*s differ, which makes the *F*s, as well as the *p*s, different. In other words, although the unique effect of `pm1` is constant, the compact and augmented models used to evaluate its significance differ, which leads to different comparison conclusions (i.e., different *F*-test and *t*-test results).

Rethinking the *F*-test formula of *PRE*,

$$F = \frac{PRE/(PA - PC)}{(1 - PRE)/(n - PA)},$$

where *PA* and *PC* are the numbers of parameters of Model A and Model C, we reach the following conclusion: with *PRE* held constant, the significance of *PRE* is determined by the *df* of Model C (*n* - *PC*) and the *df*-change of Model A against Model C (*PA* - *PC*).

Therefore, given the *PRE* of a specific set of predictors, the power of the test of this set is determined by the sample size *n* and the number of parameters (and hence the total number of predictors) in the regression model. Similarly, given the *PRE* of a specific set of predictors, the required power for this set, and the number of parameters in the regression model, we could compute the required sample size *n*.
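To make this concrete, the following chunk sketches such a power computation using base R's noncentral *F* distribution. The helper `power_from_PRE()` is written here purely for illustration (it is not a Keng function), and it assumes Cohen's (1988) noncentrality convention $\lambda = f^2(df_1 + df_2 + 1)$ with $f^2 = PRE/(1 - PRE)$; other conventions (e.g., $\lambda = f^2 n$) exist. The example values *PRE* = .111 and *n* = 94 are taken from the `pm1` analysis above (*PRE* = *F*/(*F* + *df2*) = 11.217/101.217).

```{r}
# Illustrative sketch: power of the F-test of PRE via the noncentral F
# distribution. Assumes Cohen's (1988) convention lambda = f2 * (df1 + df2 + 1).
power_from_PRE <- function(PRE, n, p_A, p_C, alpha = .05) {
  df1 <- p_A - p_C              # parameters added by Model A
  df2 <- n - p_A                # residual df of Model A
  f2 <- PRE / (1 - PRE)         # Cohen's f-squared
  lambda <- f2 * (df1 + df2 + 1)
  F_crit <- qf(1 - alpha, df1, df2)
  pf(F_crit, df1, df2, ncp = lambda, lower.tail = FALSE)
}

# Power for the unique effect of pm1: 4 vs. 3 parameters, n = 94
power_from_PRE(PRE = .111, n = 94, p_A = 4, p_C = 3)
```

Conversely, the required sample size for a target power can be found by increasing *n* until the returned power reaches that target.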