Principles of Finance

14.1 Correlation Analysis

Learning Outcomes

By the end of this section, you will be able to:

  • Calculate a correlation coefficient.
  • Interpret a correlation coefficient.
  • Test for the significance of a correlation coefficient.

Calculate a Correlation Coefficient

In correlation analysis, we study the relationship between bivariate data, which is data collected on two variables where the data values are paired with one another.

Correlation is the measure of association between two numeric variables. For example, we may be interested to know if there is a correlation between bond prices and interest rates or between the age of a car and the value of the car. To investigate the correlation between two numeric quantities, the first step is to create a scatter plot that will graph the (x, y) ordered pairs. The independent, or explanatory, quantity is labeled as the x-variable, and the dependent, or response, quantity is labeled as the y-variable.

For example, we may be interested to know if the price of Nike stock is correlated with the value of the S&P 500 (Standard & Poor’s 500 stock market index). To investigate this, monthly data can be collected for Nike stock prices and the value of the S&P 500 over a period of time, and a scatter plot can be created and examined. A scatter plot, or scatter diagram, is a graphical display intended to show the relationship between two variables. One variable is plotted on the horizontal axis and the other variable is plotted on the vertical axis. Each pair of data values is treated as an (x, y) point, and the points are plotted on the diagram. The plot is then inspected visually to detect any patterns or trends. Table 14.1 shows the Nike stock price and the value of the S&P 500 over a one-year time period.

To assess linear correlation, the graphical trend of the data points is examined on the scatter plot to determine if a straight-line pattern exists. If a linear pattern exists, the correlation may indicate either a positive or a negative correlation. A positive correlation indicates that as the independent variable increases, the dependent variable tends to increase as well, or, as the independent variable decreases, the dependent variable tends to decrease (the two quantities move in the same direction). A negative correlation indicates that as the independent variable increases, the dependent variable decreases, or, as the independent variable decreases, the dependent variable increases (the two quantities move in opposite directions). If there is no relationship or association between the two quantities, where one quantity changing does not affect the other quantity, we conclude that there is no correlation between the two variables.

| Date | S&P 500 | Nike Stock Price ($) |
|---|---|---|
| 4/1/2020 | 2,912.43 | 87.18 |
| 5/1/2020 | 3,044.31 | 98.58 |
| 6/1/2020 | 3,100.29 | 98.05 |
| 7/1/2020 | 3,271.12 | 97.61 |
| 8/1/2020 | 3,500.31 | 111.89 |
| 9/1/2020 | 3,363.00 | 125.54 |
| 10/1/2020 | 3,269.96 | 120.08 |
| 11/1/2020 | 3,621.63 | 134.70 |
| 12/1/2020 | 3,756.07 | 141.47 |
| 1/1/2021 | 3,714.24 | 133.59 |
| 2/1/2021 | 3,811.15 | 134.78 |
| 3/1/2021 | 3,943.34 | 140.45 |
| 3/12/2021 | 3,943.34 | 140.45 |

Table 14.1 Nike Stock Price ($) and Value of S&P 500 over a One-Year Time Period (source: Yahoo! Finance)

From the scatter plot in the Nike stock versus S&P 500 example (see Figure 14.2), we note that the trend reflects a positive correlation in that as the value of the S&P 500 increases, the price of Nike stock tends to increase as well.

A scatter plot showing a positive correlation between the Nike stock price and the value of the S&P 500 over 12 months. As the Nike stock price rises from approximately $87 to $140 per share, the S&P 500 rises from approximately 2,900 to 4,000.
Figure 14.2 Scatter Plot of Nike Stock Price ($) and Value of S&P 500 (data source: Yahoo! Finance)

When inspecting a scatter plot, it may be difficult to assess a correlation based on a visual inspection of the graph alone. A more precise assessment of the correlation between the two quantities can be obtained by calculating the numeric correlation coefficient (referred to using the symbol r).

The correlation coefficient, which was developed by statistician Karl Pearson in the early 1900s, is a measure of the strength and direction of the correlation between the independent variable x and the dependent variable y.

The formula for r is shown below; however, technology, such as Excel or the statistical analysis program R, is typically used to calculate the correlation coefficient.

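$$r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}}$$

where n refers to the number of data pairs and the symbol $\sum x$ indicates the sum of the x-values.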

Table 14.2 provides a step-by-step procedure on how to calculate the correlation coefficient r.

| Step | Representation in Symbols |
|---|---|
| 1. Calculate the sum of the x-values. | $\sum x$ |
| 2. Calculate the sum of the y-values. | $\sum y$ |
| 3. Multiply each x-value by the corresponding y-value and calculate the sum of these xy products. | $\sum xy$ |
| 4. Square each x-value and then calculate the sum of these squared values. | $\sum x^2$ |
| 5. Square each y-value and then calculate the sum of these squared values. | $\sum y^2$ |
| 6. Determine the value of n, which is the number of data pairs. | $n$ |
| 7. Substitute these results into the formula for the correlation coefficient. | $r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}}$ |

Table 14.2 Steps for Calculating the Correlation Coefficient
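
The steps in Table 14.2 can be sketched in Python as follows. This is a minimal illustration rather than part of the original text: the helper name `correlation_from_sums` and the variable names are chosen here for illustration, and the data are the 13 rows of Table 14.1, with the S&P 500 treated as x and the Nike stock price as y.

```python
from math import sqrt

# Data from Table 14.1: x = value of the S&P 500, y = Nike stock price ($).
sp500 = [2912.43, 3044.31, 3100.29, 3271.12, 3500.31, 3363.00, 3269.96,
         3621.63, 3756.07, 3714.24, 3811.15, 3943.34, 3943.34]
nike = [87.18, 98.58, 98.05, 97.61, 111.89, 125.54, 120.08,
        134.70, 141.47, 133.59, 134.78, 140.45, 140.45]


def correlation_from_sums(xs, ys):
    """Correlation coefficient r built from the sums in Table 14.2.

    The function name is hypothetical; it is not from the textbook.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    sum_x = sum(xs)                                # Step 1: sum of x
    sum_y = sum(ys)                                # Step 2: sum of y
    sum_xy = sum(x * y for x, y in zip(xs, ys))    # Step 3: sum of xy
    sum_x2 = sum(x * x for x in xs)                # Step 4: sum of x squared
    sum_y2 = sum(y * y for y in ys)                # Step 5: sum of y squared
    n = len(xs)                                    # Step 6: number of pairs
    # Step 7: substitute into the formula for r.
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (sqrt(n * sum_x2 - sum_x ** 2)
                   * sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator


r = correlation_from_sums(sp500, nike)
print(f"r = {r:.3f}")
```

The printed value should agree, to about three decimal places, with the result that Excel's CORREL function reports for the same data.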

Note that since r is calculated using sample data, r is considered a sample statistic used to measure the strength of the correlation for the two population variables. Sample data indicates data based on a subset of the entire population.

Given the complexity of this calculation, Excel or other software is typically used to calculate the correlation coefficient.

The Excel command to calculate the correlation coefficient uses the following format:

=CORREL(A1:A10, B1:B10)

where A1:A10 are the cells containing the x-values and B1:B10 are the cells containing the y-values.

Download the spreadsheet file containing key Chapter 14 Excel exhibits.

Interpret a Correlation Coefficient

Once the value of r is calculated, this measurement provides two indicators for the correlation:

  1. the strength of the correlation based on the value of r
  2. the direction of the correlation based on the sign of r

The value of r gives us this information:

  • The value of r is always between -1 and +1: $-1 \le r \le 1$.
  • The size of the correlation r indicates the strength of the linear relationship between the two variables. Values of r close to -1 or to +1 indicate a stronger linear relationship.
  • If $r = 0$, there is no linear relationship between the two variables (no linear correlation).
  • If $r = 1$, there is perfect positive correlation.
  • If $r = -1$, there is perfect negative correlation. In both of these cases, all the original data points lie on a straight line.

The sign of r gives us this information:

  • A positive value of r means that when x increases, y tends to increase, and when x decreases, y tends to decrease (positive correlation).
  • A negative value of r means that when x increases, y tends to decrease, and when x decreases, y tends to increase (negative correlation).
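
To make the extreme cases concrete, the sketch below computes r for three small made-up data sets: points on an upward-sloping line, points on a downward-sloping line, and a symmetric U-shaped set with no linear trend. The data and the helper name `pearson_r` are illustrative assumptions, not taken from the text.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Sample correlation coefficient from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


xs = [1, 2, 3, 4, 5]
rising = [2 * x + 1 for x in xs]        # every point lies on y = 2x + 1
falling = [10 - 3 * x for x in xs]      # every point lies on y = 10 - 3x
u_shaped = [(x - 3) ** 2 for x in xs]   # symmetric about x = 3

print(round(pearson_r(xs, rising), 6))    # 1.0
print(round(pearson_r(xs, falling), 6))   # -1.0
print(round(pearson_r(xs, u_shaped), 6))  # 0.0
```

The U-shaped case is a reminder that r = 0 rules out only a linear relationship; the y-values there are still completely determined by x.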

The Excel command used to find the value of the correlation coefficient for the Nike stock versus S&P 500 example (refer back to Table 14.1) is

=CORREL(B2:B14,C2:C14)

In this example, the value of r is calculated by Excel to be $r = 0.928$.

Since this is a positive value close to 1, we conclude that the relationship between Nike stock and the value of the S&P 500 over this time period represents a strong, positive correlation.

The correlation coefficient r can also be determined using the statistical capability on the financial calculator:

  • Step 1 is to enter the data in the calculator (using the [DATA] function that is located above the 7 key).
  • Step 2 is to access the statistical results provided by the calculator (using the [STAT] function that is located above the 8 key) and scroll to the correlation coefficient results.

Follow the steps in Table 14.3 for calculating the correlation data for the data set of Nike stock price and value of the S&P 500 shown previously.

| Step | Description | Enter | Display |
|---|---|---|---|
| 1 | Enter [DATA] entry mode | 2ND [DATA] | X01 0.00 |
| 2 | Clear any previous data | 2ND [CLR WORK] | X01 0.00 |
| 3 | Enter first x-value of 2912.43 | 2912.43 ENTER | X01 = 2,912.43 |
| 4 | Move to next data entry | | Y01 = 1.00 |
| 5 | Enter first y-value of 87.18 | 87.18 ENTER | Y01 = 87.18 |
| 6 | Move to next data entry | | X02 0.00 |
| 7 | Enter second x-value of 3044.31 | 3044.31 ENTER | X02 = 3,044.31 |
| 8 | Move to next data entry | | Y02 = 1.00 |
| 9 | Enter second y-value of 98.58 | 98.58 ENTER | Y02 = 98.58 |
| 10 | Move to next data entry | | X03 0.00 |
| 11 | Continue to enter remaining data values | | |
| 12 | Enter [STAT] mode | 2ND [STAT] | |
| 13 | Press [SET] until LIN appears | 2ND [SET] | LIN |
| 14 | Move to 1st statistical result | | n = 13.00 |
| 15 | Move to next statistical result | | x̄ = 3,480.86 |
| 16 | Continue to scroll down until the value of r is displayed | | r = 0.93 |

Table 14.3 Calculator Steps for Finding the Relationship between Nike Stock Price and Value of S&P 500¹

From the statistical results shown on the calculator display, the correlation coefficient r is 0.93, which indicates that the relationship between Nike stock and the value of the S&P 500 over this time period represents a strong, positive correlation.
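
As a quick cross-check on the data entry, the first two statistics on the calculator display (n and the mean of the x-values, steps 14 and 15 of Table 14.3) can be reproduced with a few lines of Python. This is only a sketch; the variable names are illustrative, and the values are the S&P 500 column of Table 14.1.

```python
from statistics import mean

# x-values from Table 14.1 (value of the S&P 500).
sp500 = [2912.43, 3044.31, 3100.29, 3271.12, 3500.31, 3363.00, 3269.96,
         3621.63, 3756.07, 3714.24, 3811.15, 3943.34, 3943.34]

n = len(sp500)        # number of data pairs entered
x_bar = mean(sp500)   # mean of the x-values

print(f"n = {n}")                # n = 13
print(f"x-bar = {x_bar:,.2f}")   # x-bar = 3,480.86
```

If either value differs from the calculator display, a data pair was probably mistyped or skipped during entry.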

Note: A strong correlation does not suggest that x causes y or y causes x. We must remember that correlation does not imply causation.

Test a Correlation Coefficient for Significance

The correlation coefficient, r, tells us about the strength and direction of the linear relationship between x and y. The sample data are used to compute r, the correlation coefficient for the sample. If we had data for the entire population (that is, all measurements of interest), we could find the population correlation coefficient, which is labeled as the Greek letter ρ (pronounced “rho”). But because we have only sample data, we cannot calculate the population correlation coefficient. The sample correlation coefficient, r, is our estimate of the unknown population correlation coefficient.

  • ρ = population correlation coefficient (unknown)
  • r = sample correlation coefficient (known; calculated from sample data)

An important step in the correlation analysis is to determine if the correlation is significant. By this, we are asking if the correlation is strong enough to allow meaningful predictions for y based on values of x. One method to test the significance of the correlation is to employ a hypothesis test. The hypothesis test lets us decide whether the value of the population correlation coefficient ρ is close to zero or significantly different from zero. We decide this based on the sample correlation coefficient r and the sample size n.

If the test concludes that the correlation coefficient is significantly different from zero, we say that the correlation coefficient is significant.

  • Conclusion: There is sufficient evidence to conclude that there is a significant linear relationship between x and y variables because the correlation coefficient is significantly different from zero.
  • What the conclusion means: There is a significant linear relationship between the x and y variables.

If the test concludes that the correlation coefficient is not significantly different from zero (it is close to zero), we say that the correlation coefficient is not significant.

A hypothesis test can be performed to test if the correlation is significant. A hypothesis test is a statistical method that uses sample data to test a claim regarding the value of a population parameter. In this case, the hypothesis test will be used to test the claim that the population correlation coefficient ρ is equal to zero.

Use these hypotheses when performing the hypothesis test:

  • Null hypothesis: $H_0: \rho = 0$
  • Alternate hypothesis: $H_a: \rho \ne 0$

The hypotheses can be stated in words as follows:

  • Null hypothesis $H_0$: The population correlation coefficient is not significantly different from zero. There is not a significant linear relationship (correlation) between x and y in the population.
  • Alternate hypothesis $H_a$: The population correlation coefficient is significantly different from zero. There is a significant linear relationship (correlation) between x and y in the population.
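
This section does not give a formula for the test statistic. One standard choice for testing the null hypothesis that ρ equals zero, offered here as an assumption and not as the text's own method, is the t statistic t = r·sqrt(n − 2) / sqrt(1 − r²) with n − 2 degrees of freedom. The sketch below evaluates it for r = -0.68 and n = 10, the values used in the CFO example later in this section. The function name is hypothetical, and the critical value 2.306 is the two-tailed 0.05 value for 8 degrees of freedom read from a standard t-table.

```python
from math import sqrt


def correlation_t_statistic(r, n):
    """t statistic for the null hypothesis rho = 0 (n - 2 degrees of freedom)."""
    if n < 3:
        raise ValueError("need at least 3 data pairs")
    if abs(r) >= 1:
        raise ValueError("r must be strictly between -1 and 1")
    return r * sqrt(n - 2) / sqrt(1 - r * r)


r, n = -0.68, 10
t = correlation_t_statistic(r, n)

# Assumption: two-tailed critical value at the 0.05 level for 8 degrees
# of freedom, taken from a standard t-table.
T_CRITICAL_DF8 = 2.306
reject_null = abs(t) > T_CRITICAL_DF8

print(f"t = {t:.3f}, reject H0: {reject_null}")
```

Because the test is two-tailed, only the magnitude of t is compared with the critical value; the sign of t simply follows the sign of r.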

A quick shorthand test for significance compares the correlation with the sample size. If $|r| \ge \frac{2}{\sqrt{n}}$, then the linear relationship between the two variables is statistically significant at approximately the 0.05 level of significance. As the formula indicates, there is an inverse relationship between the sample size and the correlation required for significance. With only 10 observations, the required correlation is 0.6325; for 30 observations, it decreases to 0.3651; and at 100 observations, it is only 0.2000.
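
The shorthand rule, which compares the absolute value of r with 2 divided by the square root of n, is simple to tabulate. The sketch below reproduces the three thresholds quoted above; the function names `shorthand_threshold` and `is_significant` are illustrative choices, not from the text.

```python
from math import sqrt


def shorthand_threshold(n):
    """Approximate |r| required for significance at about the 0.05 level."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return 2 / sqrt(n)


def is_significant(r, n):
    """Apply the shorthand rule: |r| >= 2 / sqrt(n)."""
    return abs(r) >= shorthand_threshold(n)


for n in (10, 30, 100):
    print(n, round(shorthand_threshold(n), 4))
# 10 0.6325
# 30 0.3651
# 100 0.2
```

Because this rule is only an approximation, a borderline case is better settled with a formal hypothesis test.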

NOTE:

  • If r is significant and the scatter plot shows a linear trend, the line can be used to predict the value of y for values of x that are within the domain of observed x-values.
  • If r is not significant OR if the scatter plot does not show a linear trend, the line should not be used for prediction.
  • If r is significant and the scatter plot shows a linear trend, the line may not be appropriate or reliable for prediction outside the domain of observed x-values in the data.

Think It Through

Determining If a Correlation Is Significant

Suppose that the chief financial officer (CFO) of a corporation is investigating the correlation between stock prices and unemployment rate over a period of 10 years and finds the correlation coefficient to be -0.68. There are 10 (x, y) data points in the data set. Should the CFO conclude that the correlation is significant for the relationship between stock prices and unemployment rate based on a level of significance of 0.05?
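
One way to work this example is with the shorthand rule given earlier in this section. This is a sketch under that rule, not the textbook's published solution. With n = 10 data points and r = -0.68:

```latex
\frac{2}{\sqrt{n}} = \frac{2}{\sqrt{10}} \approx 0.6325,
\qquad
|r| = |-0.68| = 0.68 \ge 0.6325
```

Since the absolute value of r is at least the threshold, the shorthand rule indicates that the correlation is significant at approximately the 0.05 level. The negative sign of r indicates that, in this sample, stock prices tend to be lower when the unemployment rate is higher.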

Correlations may be helpful in visualizing the data, but they are not appropriately used to explain a relationship between two variables. Perhaps no single statistic is more misused than the correlation coefficient. Citing correlations between health conditions and everything from place of residence to eye color has the effect of implying a cause-and-effect relationship. A correlation coefficient simply cannot establish this. The correlation coefficient is, of course, innocent of this misinterpretation. It is the duty of analysts who intend to make a cause-and-effect claim to use a statistic that is designed to test for such relationships and to report only those results. The problem is that passing this more rigorous test is difficult, so lazy or unscrupulous researchers fall back on correlations when they cannot make their case legitimately.

Footnotes

  • 1. The specific financial calculator in these examples is the Texas Instruments BA II Plus™ Professional model, but you can use other financial calculators for these types of calculations.

Citation/Attribution

Want to cite, share, or modify this book? This book uses the Creative Commons Attribution License and you must attribute OpenStax.

Attribution information
  • If you are redistributing all or part of this book in a print format, then you must include on every physical page the following attribution:
    Access for free at https://openstax.org/books/principles-finance/pages/1-why-it-matters
  • If you are redistributing all or part of this book in a digital format, then you must include on every digital page view the following attribution:
    Access for free at https://openstax.org/books/principles-finance/pages/1-why-it-matters
Citation information

© May 20, 2022 OpenStax. Textbook content produced by OpenStax is licensed under a Creative Commons Attribution License . The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, and OpenStax CNX logo are not subject to the Creative Commons license and may not be reproduced without the prior and express written consent of Rice University.