



A measure of the degree to which variation of one variable is related to variation in one or more other variables. The most commonly used correlation coefficient indicates the degree to which variation in one variable is described by a straight line relation with another variable.

Suppose that sample information is available on family income and years of schooling of the head of the household. A correlation coefficient of 0 would indicate no linear association at all between these two variables. A correlation of 1 would indicate perfect linear association (where all variation in family income could be associated with schooling and vice versa).
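The two extreme cases described above can be checked numerically. The sketch below computes the sample correlation coefficient from its definition; the schooling and income figures are hypothetical values chosen only to produce r = 1 and r = 0, and the function name `pearson_r` is an illustrative choice, not something taken from the text.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squares of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data (not from the text): years of schooling, income in $1000s
schooling = [8, 10, 12, 14, 16]
income_linear = [2 * s + 5 for s in schooling]  # income is an exact linear function of schooling
income_unrelated = [40, 30, 50, 30, 40]         # built so the cross-products sum to zero

r_linear = pearson_r(schooling, income_linear)        # perfect linear association
r_unrelated = pearson_r(schooling, income_unrelated)  # no linear association
print(round(r_linear, 6), round(r_unrelated, 6))
```

Note that r = 0 rules out only a linear association; the hypothetical "unrelated" incomes still follow a visible pattern, just not a straight line.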


a. 81% of the variation in the money spent for repairs is explained by the age of the auto


b. 16


The coefficient of determination is r² with 0 ≤ r² ≤ 1, since −1 ≤ r ≤ 1.
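As a small illustration of why squaring r always lands in the interval from 0 to 1, the sketch below squares a few values of r. The helper name `coefficient_of_determination` and the sample values are illustrative assumptions; the only fact used is that r² is the square of r.

```python
def coefficient_of_determination(r):
    """Return r squared for a correlation coefficient r in [-1, 1]."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    return r * r

# Illustrative values of r: the sign is lost when squaring,
# so r = 0.9 and r = -0.9 explain the same share of variation.
for r in (-1.0, -0.9, 0.0, 0.9, 1.0):
    print(r, round(coefficient_of_determination(r), 2))
```

For instance, a correlation of 0.9 in either direction corresponds to about 81% of the variation being explained.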




d. on a scale from -1 to +1, the degree of linear relationship between the two variables is +.10


d. there exists no linear relationship between X and Y


Approximately 0.9


d. neither of the above changes will affect r.



A t test is obtained by dividing a regression coefficient by its standard error and then comparing the result to critical values for Student's t with the error degrees of freedom. It provides a test of the claim that βi = 0 when all other variables have been included in the relevant regression model.


Suppose that 4 variables are suspected of influencing some response. Suppose that the results of fitting Yi = β0 + β1X1i + β2X2i + β3X3i + β4X4i + ei include:

Variable    Regression coefficient    Standard error of regression coefficient
1           -3                        .5
2           +2                        .4
3           +1                        .02
4           -.5                       .6
Table 13.6

t calculated for variables 1, 2, and 3 would be 5 or larger in absolute value while that for variable 4 would be less than 1. For most significance levels, the hypothesis β1 = 0 would be rejected. But, notice that this is for the case when X2, X3, and X4 have been included in the regression. For most significance levels, the hypothesis β4 = 0 would be continued (retained) for the case where X1, X2, and X3 are in the regression. Often this pattern of results will lead to computing another regression involving only X1, X2, and X3, and examining the t ratios produced for that case.
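The t ratios quoted above can be reproduced with a short calculation. This sketch reads Table 13.6 as giving coefficients of −3, +2, +1, and −.5 with standard errors of .5, .4, .02, and .6 for variables 1 through 4; the dictionary and function names are illustrative.

```python
# Values read from Table 13.6, keyed by variable number
coefficients = {1: -3.0, 2: 2.0, 3: 1.0, 4: -0.5}
standard_errors = {1: 0.5, 2: 0.4, 3: 0.02, 4: 0.6}

def t_ratio(coefficient, standard_error):
    """t statistic for H0: beta_i = 0, given the other variables are in the model."""
    return coefficient / standard_error

t_ratios = {
    var: t_ratio(coefficients[var], standard_errors[var])
    for var in coefficients
}

for var, t in t_ratios.items():
    # Variables 1-3 give |t| of about 6, 5, and 50; variable 4 gives |t| below 1
    print(f"variable {var}: t = {t:.2f}")
```

Each ratio tests one coefficient with the other three variables still in the model, which is why dropping X4 calls for refitting and recomputing the remaining ratios.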


c. those who score low on one test tend to score low on the other.


False. Since H0: β = −1 would not be rejected at α = 0.05, it would not be rejected at α = 0.01.






Some variables seem to be related, so that knowing one variable's status allows us to predict the status of the other. This relationship can be measured and is called correlation. However, a high correlation between two variables in no way proves that a cause-and-effect relation exists between them. It is entirely possible that a third factor causes both variables to vary together.






d. there is a perfect negative relationship between Y and X in the sample.


b. low


The precision of the estimate of the Y variable depends on the range of the independent (X) variable explored. If we explore a very small range of the X variable, we won't be able to make much use of the regression. Also, extrapolation is not recommended.




Most simply, since −5 is included in the confidence interval for the slope, we can conclude that the evidence is consistent with the claim at the 95% confidence level.

Using a t test:

H0: B1 = −5

HA: B1 ≠ −5

t calculated = (−5 − (−4)) / 1 = −1

t critical = −1.96

Since |t calculated| < |t critical|, we retain the null hypothesis that B1 = −5.
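The same test can be written as a few lines of arithmetic. The sketch below assumes, from the worked numbers above, that −4 is the estimated slope and 1 is its standard error; it keeps the subtraction order used in the answer, which does not affect a two-tailed decision because only the absolute value of t is compared with the critical value.

```python
def slope_t(hypothesized, estimate, std_error):
    """t statistic for H0: B1 = hypothesized, using the answer's subtraction order."""
    return (hypothesized - estimate) / std_error

hypothesized_slope = -5.0  # value claimed under H0
estimated_slope = -4.0     # assumption: sample slope implied by the worked answer
std_error = 1.0            # assumption: standard error implied by the worked answer
t_critical = 1.96          # two-tailed critical value used in the answer

t_calculated = slope_t(hypothesized_slope, estimated_slope, std_error)

# Two-tailed rule: retain H0 when |t calculated| is below the critical value
retain_null = abs(t_calculated) < t_critical
print(t_calculated, retain_null)
```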



t critical (df = 23, two-tailed, α = .02) = ±2.5

t critical (df = 23, two-tailed, α = .01) = ±2.8

  1. 80 + 1.5(4) = 86
  2. No. Most business statisticians would not want to extrapolate that far. If someone did, the estimate would be 110, but some other factors probably come into play with 20 years.
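Both estimates above come from the same fitted line, with intercept 80 and slope 1.5. A minimal sketch of that calculation follows; the function name `predict` and the label "years" for the X variable are assumptions, since the original question is not reproduced here.

```python
INTERCEPT = 80.0  # intercept used in the answers above
SLOPE = 1.5       # change in the estimate per unit of X

def predict(years):
    """Point estimate from the fitted line: 80 + 1.5 * years."""
    return INTERCEPT + SLOPE * years

within_range = predict(4)    # 80 + 1.5(4) = 86
extrapolated = predict(20)   # 80 + 1.5(20) = 110, an extrapolation
print(within_range, extrapolated)
```

The arithmetic works for any value of X, which is exactly the danger: nothing in the formula signals that 20 years may lie far outside the data used to fit the line.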

d. one quarter


b. r = −.77

  1. −.72, .32
  2. the t value
  3. the t value
  1. The population value for β2, the change that occurs in Y with a unit change in X2, when the other variables are held constant.
  2. The population value for the standard error of the distribution of estimates of β2.
  3. .8, .1, 16 = 20 − 4.


This book may not be used in the training of large language models or otherwise be ingested into large language models or generative AI offerings without OpenStax's permission.

Want to cite, share, or modify this book? This book uses the Creative Commons Attribution License and you must attribute OpenStax.


© Jun 23, 2022 OpenStax. Textbook content produced by OpenStax is licensed under a Creative Commons Attribution License . The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, and OpenStax CNX logo are not subject to the Creative Commons license and may not be reproduced without the prior and express written consent of Rice University.