- alternative hypothesis
- a statement, complementary to the null hypothesis, regarding an unknown population parameter used in hypothesis testing
- analysis of variance (ANOVA)
- statistical method to compare three or more means and determine if the means are all statistically the same or if at least one mean is different from the others
- best-fit linear equation
- an equation of the form $\hat{y} = a + bx$ that provides the best-fit straight line to the data points
- bivariate data
- data collected on two variables where the data values are paired with one another
- bootstrapping
- a method to construct a confidence interval that is based on repeated sampling and does not rely on any assumptions regarding the underlying distribution
- central limit theorem
- describes the relationship between the sampling distribution of sample means and the underlying population
- confidence interval
- an interval where sample data is used to provide an estimate for a population parameter
- confidence level
- the probability that the interval estimate will contain the population parameter, given that the estimation process on the parameter is repeated over and over
- correlation
- a measure of association between two numeric variables
- correlation analysis
- a statistical method used to evaluate and quantify the strength and direction of the linear relationship between two quantitative variables
- correlation coefficient
- a measure of the strength and direction of the linear relationship between two variables
- critical value
- z-score that cuts off an area under the normal curve corresponding to a specified confidence level
- dependent samples
- samples from one population that can be paired or matched to the samples taken from the second population
- dependent variable
- in correlation analysis, the variable being studied or measured; the dependent variable is the outcome that is measured or observed to determine the impact of changes in the independent variable
- F distribution
- a skewed probability distribution that arises in statistical hypothesis testing, such as ANOVA
- hypothesis testing
- a statistical method to test claims regarding population parameters using sample data
- independent samples
- the sample from one population that is not related to the sample taken from the second population
- independent variable
- in correlation analysis, the variable that is manipulated or changed in an experiment or study; the value of the independent variable is controlled or chosen by the experimenter to observe its effect on the dependent variable
- inferential statistics
- statistical methods that allow researchers to infer or generalize observations from samples to the larger population from which they were selected
- least squares method
- a method used in linear regression that generates a straight line fit to the data values such that the sum of the squares of the residuals is as small as possible
- level of significance ($\alpha$)
- the maximum allowed probability of making a Type I error; the level of significance is the probability value used to determine when the sample data indicates significant evidence against the null hypothesis
- linear correlation
- a measure of the association between two variables that exhibit an approximate straight-line fit when plotted on a scatterplot
- margin of error
- an indication of the maximum error of the estimate
- matched pairs
- samples from one population that can be paired or matched to the samples taken from the second population
- method of least squares
- a mathematical method to generate a linear equation that is the “best fit” to the points on the scatterplot in the sense that the line minimizes the differences between the predicted values and observed values for y
- modeling
- the process of creating a mathematical representation that describes the relationship between different variables in a dataset; the model is then used to understand, explain, and predict the behavior of the data
- nonparametric methods
- statistical methods that do not rely on any assumptions regarding the underlying distribution
- null hypothesis
- statement of no effect or no change in the population
- p-value
- the probability of obtaining a sample statistic with a value as extreme as (or more extreme than) the value determined by the sample data under the assumption that the null hypothesis is true
- parametric methods
- statistical methods that assume a specific form for the underlying distribution
- point estimate
- a sample statistic used to estimate a population parameter
- prediction
- a forecast for the dependent variable based on a specific value of the independent variable generated using the linear model
- proportion
- a measure that expresses the relationship between a part and the whole; a proportion represents the fraction or percentage of a dataset that exhibits a particular characteristic or falls into a specific category
- regression analysis
- a statistical technique used to model the relationship between a dependent variable and one or more independent variables
- residual
- the difference between an observed y-value and the predicted y-value obtained from the linear regression equation
- sample mean
- a point estimate for the unknown population mean, chosen because it is an unbiased estimate of that mean
- sample proportion
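- a point estimate for the unknown population proportion, chosen because it is an unbiased estimate of that proportion; calculated as the number of successes divided by the sample size: $\hat{p} = x / n$, where $x$ is the number of successes and $n$ is the sample size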
- sample statistic
- a numerical summary or measure that describes a characteristic of a sample, such as a sample mean or sample proportion
- sampling distribution
- a probability distribution of a sample statistic based on all possible random samples of a certain size from a population or the distribution of a statistic (such as the mean) that would result from taking random samples from the same population repeatedly and calculating the statistic for each sample
- scatterplot (or scatter diagram)
- graphical display that shows values of the independent variable plotted on the x-axis and values of the dependent variable plotted on the y-axis
- standard error of the mean
- the standard deviation of the sample mean, calculated as the population standard deviation divided by the square root of the sample size
- standardized test statistic
- a numerical measure that describes how many standard deviations a particular value is from the mean of a distribution; a standardized test statistic is typically used to assess whether an observed sample statistic is significantly different from what would be expected under a null hypothesis
- t-distribution
- a bell-shaped, symmetric distribution similar to the normal distribution, though the t-distribution has “thicker tails” as compared to the normal distribution
- test statistic
- a numerical value, calculated from sample data, that is used in hypothesis testing to assess the strength of evidence against the null hypothesis
- Type I error
- an error made in hypothesis testing where a researcher rejects the null hypothesis when in fact the null hypothesis is actually true
- Type II error
- an error made in hypothesis testing where a researcher fails to reject the null hypothesis when the null hypothesis is actually false
- unbiased estimator
- a statistic that provides a valid estimate for the corresponding population parameter without overestimating or underestimating the parameter
- variable
- a characteristic or attribute that can be measured or observed
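
Several of the terms above (sample proportion, standard error of the mean, least squares method, residual) are defined by short calculations. The following is a minimal Python sketch of those calculations, not a reference implementation. It assumes the best-fit line is written as y-hat = a + b*x; the function names and the symbols `a` (intercept) and `b` (slope) are illustrative choices rather than part of the definitions.

```python
import math
import statistics


def sample_proportion(successes, n):
    """Number of successes divided by the sample size."""
    return successes / n


def standard_error_of_mean(population_sd, n):
    """Population standard deviation divided by the square root of n."""
    return population_sd / math.sqrt(n)


def least_squares_fit(xs, ys):
    """Return (a, b) for the line y_hat = a + b*x that minimizes
    the sum of squared residuals (assumed form of the best-fit line)."""
    x_bar = statistics.mean(xs)
    y_bar = statistics.mean(ys)
    # Sum of cross-deviations and sum of squared x-deviations.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx          # slope
    a = y_bar - b * x_bar    # intercept
    return a, b


def residuals(xs, ys, a, b):
    """Observed y minus predicted y for each data point."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]


if __name__ == "__main__":
    # Hypothetical bivariate data, used only for illustration.
    xs = [1, 2, 3, 4]
    ys = [2, 4, 5, 8]
    a, b = least_squares_fit(xs, ys)
    print("intercept:", a, "slope:", b)
    print("residuals:", residuals(xs, ys, a, b))
    print("prediction at x = 5:", a + b * 5)
```

For a least squares line with an intercept, the residuals sum to zero (up to floating-point rounding), which is a quick sanity check on any fit produced this way.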