Principles of Data Science

5.4 Forecast Evaluation Methods


Learning Outcomes

By the end of this section, you should be able to:

  • 5.4.1 Explain the nature of error in forecasting a time series.
  • 5.4.2 Compute common error measures for time series models.
  • 5.4.3 Produce prediction intervals in a forecasting example.

A time series forecast is essentially a prediction of the most likely, middle-of-the-road values for future terms of the series. For the purposes of prediction, the model considers only the trend and any cyclical or seasonal variation that can be detected within the known data. Although noise and random variation certainly do influence future values of the time series, their precise effects cannot be predicted (by their very nature). Thus, a forecasted value is the model's best guess in the absence of an error term. On the other hand, error does affect how certain we can be of the model's forecasts. In this section, we focus on quantifying the error in forecasting using statistical tools such as prediction intervals.

Forecasting Error

Suppose that you have a model (x̂_n) for a given time series (x_n). Recall from Components of Time Series Analysis that the residuals quantify the error between the time series and the model and serve as an estimate of the noise or random variation.

ε_n = x_n − x̂_n

The better the fit of the model to the observed data, the smaller the values of εnεn will be. However, even very good models can turn out to be poor predictors of future values of the time series. Ironically, the better a model does at fitting to known data, the worse it may be at predicting future data. There is a danger of overfitting. (This term is defined in detail in Decision-Making Using Machine Learning Basics, but for now we do not need to go deeply into this topic.) A common technique to avoid overfitting is to build the model using only a portion of the known data, and then the model’s accuracy can be tested on the remaining data that was held out.
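The holdout idea can be sketched in a few lines of Python; the series below is synthetic and purely illustrative:

```python
import numpy as np

# Synthetic monthly series (illustrative only): a seasonal cycle plus a mild trend
series = np.sin(np.arange(60) * 2 * np.pi / 12) + 0.05 * np.arange(60)

# For time series, the split must respect temporal order:
# train on the earliest observations, test on the most recent ones.
split = int(len(series) * 0.8)   # hold out the final 20%
train, test = series[:split], series[split:]

# A model would be fit on `train`, and its forecasts compared against `test`.
```

Because the held-out observations come after every training observation, evaluating on them mimics genuine forecasting rather than interpolation.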

To discuss the accuracy of a model, we should ask the complementary question: How far away are the predictions from the true values? In other words, we should try to quantify the total error. There are several measures of error, or measures of fit, which are the metrics used to assess how well a model's predictions align with the observed data. We will only discuss a handful of them here that are most useful for time series. In each formula, x_i refers to the ith term of the time series, and ε_i is the error between the actual ith term and the predicted ith term of the series.

  1. Mean absolute error (MAE): (1/n) Σ_{i=1}^{n} |ε_i|. A measure of the average magnitude of errors.
  2. Root mean squared error (RMSE): √( (1/n) Σ_{i=1}^{n} ε_i² ). A measure of the standard deviation of errors, penalizing larger errors more heavily than MAE does.
  3. Mean absolute percentage error (MAPE): (1/n) Σ_{i=1}^{n} |ε_i / x_i|. A measure of the average relative error, that is, the percentage difference between the predicted values and the actual values (on average).
  4. Symmetric mean absolute percentage error (sMAPE): (1/n) Σ_{i=1}^{n} 2|ε_i| / (|x_i| + |x̂_i|). Similar to MAPE, a measure of the average relative errors, but scaled so that errors are measured in relation to both actual values and predicted values.

Of these error measures, MAE and RMSE are scale-dependent, meaning that the error is in direct proportion to the data itself. In other words, if all terms of the time series and the model were scaled by a factor of k, then the MAE and RMSE would both be multiplied by k as well. On the other hand, MAPE and sMAPE are not scale-dependent. These measures of error are often expressed as percentages (by multiplying the result of the formula by 100%). However, neither MAPE nor sMAPE should be used for data that is measured on a scale containing 0 and negative numbers. For example, it would not be wise to use MAPE or sMAPE as a measure of error for a time series model of Celsius temperature readings. For all of these measures of error, lower values indicate less error and hence more accuracy of the model. This is useful when comparing two or more models.
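All four error measures are straightforward to compute directly from the residuals. The following sketch uses NumPy with small illustrative numbers (not data from the text):

```python
import numpy as np

def forecast_errors(actual, predicted):
    """Return (MAE, RMSE, MAPE, sMAPE); the percentage measures are in percent."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    eps = actual - predicted  # residuals
    mae = np.mean(np.abs(eps))
    rmse = np.sqrt(np.mean(eps ** 2))
    mape = 100 * np.mean(np.abs(eps / actual))
    smape = 100 * np.mean(2 * np.abs(eps) / (np.abs(actual) + np.abs(predicted)))
    return mae, rmse, mape, smape

# Illustrative values only
actual = [100, 110, 120, 130]
predicted = [98, 112, 119, 135]
mae, rmse, mape, smape = forecast_errors(actual, predicted)
```

Note that the `mape` computation would fail (division by zero) if any actual value were 0, mirroring the caveat about data whose scale includes 0 and negative numbers.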

Example 5.9

Problem

Compute the MAE, RMSE, MAPE, and sMAPE for the EMA smoothing model for the S&P Index time series, shown in Table 5.3.

Prediction Intervals

A forecast is often accompanied by a prediction interval giving a range of values the variable could take with some level of probability. For example, if the prediction interval of a forecast is given at the 80% level, then the prediction interval contains a range of values that should include the actual future value with a probability of 0.8. Prediction intervals were introduced in Analysis of Variance (ANOVA) in the context of linear regression, so we won’t get into the details of the mathematics here. The key point is that we want a measure of margin of error, E_n (depending on n), such that future observations of the data will be within the interval x̂_n ± E_n with probability α, where α is a chosen level of confidence.
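To make the idea concrete: if the forecast errors are assumed to be approximately normal with standard deviation σ (estimated, say, by the RMSE of the residuals), then E_n ≈ z·σ, where z is the appropriate standard normal quantile. The sketch below uses only the Python standard library, and every number in it is an illustrative assumption:

```python
from statistics import NormalDist

forecast = 250.0   # hypothetical point forecast x̂_n
sigma = 12.0       # hypothetical std. dev. of forecast errors (e.g., residual RMSE)
alpha = 0.80       # desired coverage probability (80%)

# For a two-sided interval, z is the (0.5 + alpha/2) quantile of the standard normal
z = NormalDist().inv_cdf(0.5 + alpha / 2)
margin = z * sigma                          # E_n
lower, upper = forecast - margin, forecast + margin
```

In practice, libraries such as statsmodels compute these intervals for you, accounting for how uncertainty grows with the forecast horizon.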

Here, we will demonstrate how to use Python to obtain prediction intervals.

The Python module statsmodels.tsa.arima.model contains functions for finding confidence intervals as well. Note that the method get_forecast() is used here rather than forecast(), as the former provides more functionality, including the confidence intervals themselves.

Python Code

    ### Please run all code from previous section before running this ###
    
    # Forecast 24 steps ahead
    forecast_steps = 24
    forecast_results = results.get_forecast(steps=forecast_steps)
    
    # Extract forecast values and confidence intervals
    # (alpha=0.2 yields an 80% confidence interval)
    forecast_values = forecast_results.predicted_mean
    confidence_intervals = forecast_results.conf_int(alpha=0.2)
    
    # Plot the results
    plt.figure(figsize=(10, 6))
    
    # Plot original time series
    plt.plot(df['Value'], label='Original Time Series')
    
    # Plot fitted values
    plt.plot(results.fittedvalues, color='red', label='Fitted Values')
    
    # Plot forecasted values with confidence intervals
    plt.plot(forecast_values, color='red', linestyle='dashed', label='Forecasted Values')
    plt.fill_between(
      range(len(df), len(df) + forecast_steps),
      confidence_intervals.iloc[:, 0],
      confidence_intervals.iloc[:, 1],
      color='red', alpha=0.2,
      label='80% Confidence Interval' 
    )
    
    # Set labels and legend
    plt.xlabel('Months')
    plt.title('Monthly Consumption of Coal for Electricity Generation in the United States from 2016 to 2022')
    plt.legend()
    
    # Apply the formatter to the Y-axis
    plt.gca().yaxis.set_major_formatter(FuncFormatter(y_format))
    plt.show()
    

The resulting output will look like this:

Time series plot titled Monthly consumption of coal for electricity generation in the United States from 2016 to 2022. Y-axis ranges from -50,000 to 125,000, x-axis from 0 to 100. The blue line represents the actual coal consumption, which fluctuates seasonally. The red line represents the fitted values, which smooth out the seasonal fluctuations, and the dashed red line represents the forecasted values. The shaded area represents the 80% confidence interval for the forecasted values. Coal consumption shows a general downward trend from around 125,000 tons per month in 2016 to around 0 tons per month in 2022.

The forecast data (dashed curve) is now surrounded by a shaded region. With 80% probability, all future observations should fall within the shaded region. Of course, the further into the future we try to go, the more uncertain our forecasts become, which is indicated by the steadily widening confidence interval region.


Citation/Attribution

This book may not be used in the training of large language models or otherwise be ingested into large language models or generative AI offerings without OpenStax's permission.

Want to cite, share, or modify this book? This book uses the Creative Commons Attribution-NonCommercial-ShareAlike License and you must attribute OpenStax.

Attribution information
  • If you are redistributing all or part of this book in a print format, then you must include on every physical page the following attribution:
    Access for free at https://openstax.org/books/principles-data-science/pages/1-introduction
  • If you are redistributing all or part of this book in a digital format, then you must include on every digital page view the following attribution:
    Access for free at https://openstax.org/books/principles-data-science/pages/1-introduction
Citation information

© Dec 19, 2024 OpenStax. Textbook content produced by OpenStax is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, and OpenStax CNX logo are not subject to the Creative Commons license and may not be reproduced without the prior and express written consent of Rice University.