14.1 Correlation Analysis

Correlation is a measure of the association between two numeric variables. A correlation coefficient, r, is used to assess the strength and direction of the correlation. The value of r is always between -1 and +1. The magnitude of r indicates the strength of the linear relationship between the two variables: values of r close to -1 or to +1 indicate a stronger linear relationship, and values close to 0 indicate a weaker one. A positive value of r means that when x increases, y tends to increase, and when x decreases, y tends to decrease (positive correlation). A negative value of r means that when x increases, y tends to decrease, and when x decreases, y tends to increase (negative correlation).
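
As a rough illustration of how r behaves, the following Python sketch computes the Pearson correlation coefficient from its definition. The function name pearson_r and the small data sets are made up for this example and do not come from the chapter.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient r for paired data."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / sqrt(sxx * syy)


# Made-up illustrative data (not from the chapter)
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 5, 4, 5]      # y tends to rise with x
y_down = [10, 8, 6, 4, 2]   # y falls exactly linearly as x rises

r_up = pearson_r(x, y_up)
r_down = pearson_r(x, y_down)
print(f"r_up = {r_up:.3f}, r_down = {r_down:.3f}")
```

Here r_up is positive but below +1 because the points only loosely follow a line, while r_down is -1 because the points lie exactly on a downward-sloping line.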

14.2 Linear Regression Analysis

Linear regression analysis uses a straight-line fit to model the relationship between two variables. Once a straight-line model is developed, it can be used to predict the value of the dependent variable for a specific value of the independent variable. Two parameters are calculated for the linear model: the slope of the best-fit line and the y-intercept of the best-fit line. The method of least squares is used to generate these parameters; this method is based on minimizing the sum of the squared differences between the observed and predicted values of y.
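
The least-squares idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the chapter's own procedure: the function names and the data are hypothetical, and the slope and intercept come from the standard closed-form least-squares formulas.

```python
def least_squares_fit(xs, ys):
    """Return (intercept, slope) of the least-squares line through the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sum_squared_errors(xs, ys, intercept, slope):
    """Sum of squared differences between observed and predicted y values."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Made-up illustrative data (not from the chapter)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

intercept, slope = least_squares_fit(x, y)
sse_best = sum_squared_errors(x, y, intercept, slope)
# Any other line, such as one with a slightly steeper slope, has a larger error
sse_other = sum_squared_errors(x, y, intercept, slope + 0.1)
print(f"intercept = {intercept:.2f}, slope = {slope:.2f}")
print(f"SSE (best fit) = {sse_best:.2f}, SSE (other line) = {sse_other:.2f}")
```

Comparing the two error totals shows what "best fit" means here: nudging the slope away from the least-squares value increases the sum of squared differences.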

14.3 Best-Fit Linear Model

Once a correlation has been deemed significant, a linear regression model is developed. The goal in the regression analysis is to determine the coefficients a and b in the following regression equation: ŷ = a + bx, where a is the y-intercept and b is the slope of the best-fit line. Typically some technology, such as Excel, the R statistical tool, or a calculator, is used to generate the coefficients a and b, since manual calculations are cumbersome.

14.4 Regression Applications in Finance

Regression analysis is used extensively in finance-related applications. Many typical applications involve determining if there is a correlation between various stock market indices such as the S&P 500, the DJIA, and the Russell 2000 index. The procedure is to first generate a scatter plot to determine if a visual trend is observed, then calculate a correlation coefficient and check for significance. If the correlation coefficient is significant, a linear model can then be generated and used for predictions.

14.5 Predictions and Prediction Intervals

A key use of the linear regression model is prediction, provided that the correlation is significant. To generate predictions or forecasts using the linear regression model, substitute the value of the independent variable (x) into the regression equation and solve for the dependent variable (y). When making predictions with the linear model, it is generally recommended to predict y only for values of x that lie within the range of the original data.
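
The substitution step and the range recommendation can be combined in a small Python sketch. The function name predict, the coefficients, and the data range are hypothetical values chosen for illustration; rejecting out-of-range inputs with an error is one possible design choice, not something the chapter prescribes.

```python
def predict(x_new, intercept, slope, x_min, x_max):
    """Predict y from the fitted line, refusing to extrapolate beyond the data."""
    if not (x_min <= x_new <= x_max):
        raise ValueError(
            f"x = {x_new} is outside the observed range [{x_min}, {x_max}]"
        )
    return intercept + slope * x_new


# Hypothetical fitted model and observed x range (not from the chapter)
intercept, slope = 2.2, 0.6
x_min, x_max = 1, 5

y_hat = predict(3.5, intercept, slope, x_min, x_max)
print(f"Predicted y at x = 3.5: {y_hat:.2f}")

try:
    predict(8, intercept, slope, x_min, x_max)
except ValueError as err:
    print(f"Prediction refused: {err}")
```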

14.6 Use of R Statistical Analysis Tool for Regression Analysis

R is an open-source statistical analysis tool that is widely used in the finance industry and is freely available online. R provides an integrated suite of functions for data analysis, graphing, and correlation and regression analysis. R is increasingly used as a data analysis and statistical tool because it is open source and its user community constantly adds new features. The tool runs on many different computing platforms.

This book may not be used in the training of large language models or otherwise be ingested into large language models or generative AI offerings without OpenStax's permission.

Want to cite, share, or modify this book? This book uses the Creative Commons Attribution License and you must attribute OpenStax.


© Jan 8, 2024 OpenStax. Textbook content produced by OpenStax is licensed under a Creative Commons Attribution License. The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, and OpenStax CNX logo are not subject to the Creative Commons license and may not be reproduced without the prior and express written consent of Rice University.