Skip to ContentGo to accessibility pageKeyboard shortcuts menu
OpenStax Logo
Principles of Data Science

A | Appendix A: Review of Excel for Data Science

Principles of Data ScienceA | Appendix A: Review of Excel for Data Science

Microsoft® Excel® is a popular spreadsheet program developed by Microsoft Corporation. Excel is one of the first and simplest spreadsheet tools developed to help with the manipulation and analysis of data. It includes a range of functionalities that cater to the diverse needs of many data science applications. You are likely to have used Excel in some form or other—perhaps to organize the possible roommate combinations in your dorm room or to plan for a party or in some instructional context.

Excel was first released by Microsoft in 1987, and it has become the most popular choice for loading and analyzing tabular data.

Basic Data Analysis Using Excel

A screenshot of the main Excel menu when it is open is provided in Figure A1. As you see, Excel consists of a series of submenus such as Home, Insert, Draw, Page Layout, Formulas, etc. Clicking on each Menu item opens a submenu with additional functions. For example, clicking on the Formulas menu item opens a submenu as shown in Figure A2.

A screenshot of the main Excel menu displaying a series of submenus including File, Home, Insert, Draw, Page Layout, Formulas, Data, Review, and View. Home is underlined in green.
Figure A1 MS Excel Menu (Used with permission from Microsoft)
A screenshot of the main Excel menu with Formulas underlined in green. The Formulas submenu is open, showing additional functions.
Figure A2 Formulas Menu in MS Excel (Used with permission from Microsoft)

Excel presents data in the form of table with rows and columns. We call each rectangle, defined by the combination of row number and column name (e.g., A1, B3), a cell. We can use a mouse pointer to select a specific cell and then edit the data inside. Once a cell is selected, Namebox shows the identifier of the selected cell Figure A3.

Formula Bar is used to read/write a formula for a cell, which is identified by Namebox.

A screenshot of a MS Excel document showing the home menu and rows and columns. Cell A1 is highlighted in green and the cursor is in the Formula bar.
Figure A3 MS Excel Showing Rows and Columns (Used with permission from Microsoft)

Load Data Using Excel

Loading a dataset to Excel can be done simply by clicking File > Open File. That will open up a file explorer window, and you can simply select the CSV file you want to load. We are going to use a part of a public dataset, titled “SOCR- 1035 Records of Heights (in) and Weights (lbs) of Major League Baseball (MLB) Players” (ch1-SOCR-small.csv).

Once loaded, the dataset will show up on the Worksheet Window, as shown in Figure A4.

A screenshot of an MS Excel worksheet. It contains a dataset with 13 rows and 6 columns. The columns are labeled name, team, position, height (inches), weight (pounds), and age.
Figure A4 SOCR-small Dataset Opened with MS Excel (Used with permission from Microsoft)

Summarize Data Using Excel

We will continue to refer to the same dataset about the MLB players. For simplicity, only 12 items of SD Outfielders and Pitchers are included in Figure A5.

A table that contains 13 rows and 6 columns. The columns are labeled name, team, position, height (inches), weight (pounds), and age.
Figure A5 ch1-SOCR-small.csv Opened with MS Excel

EXAMPLE A.1

Problem

On ch1-SOCR-small.csv, write a formula that counts the number of Outfielders.

Solution

The formula is

=COUNTIF(C2:C13, "Outfielder")”

This will count the rows whose position, located at column C, is “Outfielder.” Note that double quotation marks (i.e., “ ”) are used to indicate “Outfielder” is a string. The result is 7.

Search Data Using Excel

You can search data in a table in multiple ways with Excel. We will first explore ways without using Excel formulas, and then we’ll look at some useful formulas.

You can search for the cells containing a certain number or text by using “Find,” as in Figure A6. The Find pop-up window will have a text box in which you can type in any number or text, and then Excel will search and find all the cells that include number/text you typed in.

A screenshot of the Find feature in MS Excel
Figure A6 Find Feature in MS Excel (Used with permission from Microsoft)

If you are more interested in viewing all the rows that fulfill certain criteria, you can use “Filter,” as in Figure A7. Once Filter is enabled, you will see drop-down arrows on each column of the table. The arrows allow you to select specific values that you are searching for and then presents only the rows with such values.

A screenshot of the Filter Feature in MS Excel with options to Sort A to Z, Sort Z to A, Custom Sort, and Filter.
Figure A7 Filter Feature in MS Excel (Used with permission from Microsoft)

For example, if you click “Outfielder” in the Position window, the resulting table will only have the rows whose Position is Outfielder, as shown in Figure A8.

A screenshot of the Filter by Position menu in MS Excel is popped up over a dataset. It has options to sort Ascending or Descending. The filter menu has four items with green check marks: Select All, Outfielder, Pitcher, and Blanks.
Figure A8 Filter by Position on MS Excel (Used with permission from Microsoft)

Now let’s do some more sophisticated searches using some Excel formulas.

INDEX

Index returns a cell value at a specific location, given a range of cells. The location is identified as a relative row and column number within the range. For example, the following formula on ch1-SOCR-small.csv will return 72 since the cell located at the fifth row and second column within the range of C8:E13 will refer to the row 12 and column D—so the cell D12. The value at D12 is 72.

=INDEX(C8:E13,5,2)

EXAMPLE A.2

Problem

On ch1-SOCR-small.csv, what value does the following formula return?

=INDEX(A3:D9,7,3)

Solution

This formula finds the seventh row and third column within the range of A3:D9, which indicates C9. Thus, the resulting value is “Outfielder.”

MATCH

Given a range of single-column cells, MATCH returns the relative row number at which the cell value matches the value you specify. MATCH takes three arguments—the value to match, the range of search, and the match type. This example illustrates how to locate the value 75 within the cell D2, D3, …, through D13. The third argument, match type, is set to 0 so that it seeks for an exact match (i.e., exactly 75, not 74 or 75.5).

=MATCH(75, D2:D13, 0)

The formula above returns 5, a relative row number. The relative row number 5 indicates the fifth row within the range, and this corresponds to the absolute row number 6. Notice that D6 has 75.

What happens if there are multiple matching rows?

If there are multiple rows with a match, MATCH only returns the very first instance. For example, this formula will simply return 2 (i.e., the second row within D2:D13, which is D3) even though there are three occurrences of 72 at rows 3, 5, and 12.

=MATCH(72, D2:D13, 0)

EXAMPLE A.3

Problem

Using MATCH, find the absolute row number of a first player whose weight is 185 lb on ch1-SOCR-small.csv.

Solution

Weight is located in column E. The correct use of MATCH is provided next. It looks through E2, E3, …, E13 and finds the first exact match of 185. The first occurrence of 185 is in the second row in the range E2:E13, which is E3, so this formula will return 2.

=MATCH(185, E2:E13, 0)

To get the absolute row number, you need to offset the output above with a constant number. Notice that the search range starts from E2, so you should offset the output by +1. Therefore, the final answer is:

=MATCH(185, E2:E13, 0) + 1

VLOOKUP

Given the search range, VLOOKUP locates a specific search key within the first column in terms of the row number and then returns a value at the column of your choice along that row. It requires four arguments—the search key, the range of search, the relative position of the column of your interest, and the match type. For example, the formula below locates the first pitcher along column C from C2 to C13 first. Similarly to MATCH, VLOOKUP supports multiple match types and FALSE indicates the exact match (i.e., not the “partial” match) for VLOOKUP.

=VLOOKUP(“Pitcher”, C2:E13, 3, FALSE)

Then, the formula reads a specific cell value in the same row. The column number is indicated as the third argument in the formula, called the relative position. It refers to how many columns the search key is away to the right from the column that is specified in the first argument. The first exact match is found at row 2, and the formula reads values in the third column to the right from column C: E. Thus, the value 220 in cell E2 is the result.

EXAMPLE A.4

Problem

Using VLOOKUP, find the age of Mike Thompson on ch1-SOCR-small.csv.

Solution

First you need to search for the player Mike Thompson. His row can be located by his name, “Mike_Thompson.” Then, the relative position argument of VLOOKUP allows you to read his age. The age column, F, is 6 columns to the left of Name, so the relative position argument should be 6. The fourth argument remains FALSE as you are looking for an exact match.

=VLOOKUP(“Mike_Thompson”, A2:F13, 6, FALSE)

The formula above will return 26.31.

Basic Visualization and Graphing Using Excel

Excel provides many options for generating graphs and other visualizations to assist in data science investigations.

The menu in Figure A9 shows some of the many options available within Excel for graphing.

A screenshot of a menu of various graphs available within Excel including Column, Line, Pie, Bar, Area, X Y (Scatter), and more.
Figure A9 A Menu of Various Graphs Available within MS Excel (Used with permission from Microsoft)

Graphically Represent Data Using Excel

Graphical representation of data, or data visualization, can be done easily with Excel. Let’s draw a bar chart as an example. You can start by simply clicking Insert > Bar chart > Clustered Column icon (Figure A10).

A screenshot of Excel’s Insert menu with a 2-D column Bar Chart Icon selected and highlighted with a red outline.
Figure A10 MS Excel’s Bar Chart Icon (Used with permission from Microsoft)

Once the icon is clicked, you will see an arbitrary bar chart. It could be just a plain whitespace too—it really depends on what cell was selected when you clicked the icon.

Now click the chart area and it will reveal two new tabs—Chart Design and Format. Go to the Chart Design tab and click Select Data (Figure A11).

A screenshot of Excel's Chart Design tab with the option to Select Data.
Figure A11 Select Data Menu on MS Excel (Used with permission from Microsoft)

In the Select Data window, define Chart data range by highlighting A1:F13 (i.e., the entire table). This will auto-fill the rest of the pop-up window (Figure A12).

A screenshot of Excel's Select Data Pop-Up Window. The menu is labeled Range Details. Chart data range is filled in with =Sheet1!$A$1:$F$13. Height (inches) is selected on the Legend entries (Series) menu.
Figure A12 Select Data Pop-Up Window on MS Excel (Used with permission from Microsoft)

Three entries—Height, Weight, Age—refer to each group of bars. Each group will have 13 bars, each of which represents different baseball players. For now, let’s plot only Height. You can do so by removing the other two by clicking the “-” button under the entry list.

Click Ok and the bar chart is generated as Figure A13. Notice that columns A through C are displayed as the x-axis labels, as indicated on the pop-up window (see the Horizontal [Category] axis labels part in Figure A12).

A bar chart labeled “Heights of Various Baseball Players.” The X axis has three kinds of labels (Position, Team, and Name). The Y axis is labeled Height (inches) and ranges from 64 to 84. The chart has 12 blue bars of varying heights.
Figure A13 The First Bar Chart with Three Kinds of x-Axis Labels (Position, Team, and Name)

If you wish to show only the player names, go back to the Select Data window and update the Horizontal axis labels to be just A2:A13 (Figure A14).

A screenshot of Excel's Select Data window with the Horizontal (category) axis labels option open and filled in with: =Sheet1!$A$2:$A$13. The final A is highlighted in green.
Figure A14 Keeping Only Name on the x-Axis Label (Used with permission from Microsoft)

The resulting plot will only have player names along the x-axis (Figure A15).

A bar chart labeled “Heights of Various Baseball Players.” The X axis has player names. The Y axis is labeled Height (inches) and ranges from 64 to 84. The chart has 12 blue bars of varying heights.
Figure A15 The Second Bar Chart with Only Name Displayed on the x-Axis

Changing the formatting of different chart elements is easy in Excel as well. You can 1) click the element to change the text or drag to move the position of the element or 2) double-click the element in the corresponding pop-up menu. Figure A16 shows the pop-up menu for the chart title element. Notice that you can change different aspects of the presentation with the title element, such as Fill and Border.

A screenshot of a bar chart in Excel with the title highlighted. The Format Chart Title pop-up menu is open. Title Options is selected and under the Fill menu, No fill is selected with a green button.
Figure A16 Changing the Formatting of the Bar Chart (Used with permission from Microsoft)

EXAMPLE A.5

Problem

Using MS Excel, draw a scatterplot between Weight (x-axis) and Height (y-axis). A scatterplot is a plot that depicts each data point as a dot. The location of a dot is typically determined by the x- and y-axis.

Solution

There is an icon for scatterplots under Insert (see Figure A17). On the Select Data pop-up window, select D1:E13 since the weights and heights are in column D and E, as shown in Figure A17 and Figure A18.

A screenshot of the Insert menu in Excel with the Scatter Icon selected.
Figure A17 Scatterplot Icon (Used with permission from Microsoft)
A screenshot of the Select Data Source pop-up window in Excel. The menu is labeled Range Details. Chart data range is filled in with =Sheet1!$D$1:$E$13. Weight (pounds) is selected on the Legend entries (Series) menu.
Figure A18 Select Data for Drawing a Scatterplot (Used with permission from Microsoft)

The resulting scatterplot is shown in Figure A19. Notice that the numbers along the x-axis are heights, while those along the y-axis are weights.

A scatterplot labeled “Scatterplot of Heights and Weights.” The X axis is labeled Height (inches) and ranges from 68 to 84. The Y axis is labeled Weight (pounds) and ranges from 0 to 300. The chart has 12 data points with most clustered between 70,200 and 76, 200. There is one outlier at 82,250.
Figure A19 The Initial Scatterplot

You can change the chart title and the axis titles by double-clicking the title text and the axes themselves, as in Figure A20.

A screenshot of a scatterplot in Excel. The title is highlighted and says: Height (inches) - Weight (pounds).
Figure A20 Changing the Title of the Scatterplot (Used with permission from Microsoft)

Once either the x- or y-axis is double-clicked, a different menu will show up and there you can add axis titles, as illustrated in Figure A21.

A screenshot of Excel's Chart Design menu. A submenu has Axis Titles highlighted in green and another submenu branches off of that with Primary Horizontal and Primary Value as the two choices for changing the Axis Titles of the scatterplot.
Figure A21 Changing the Axis Title of the Scatterplot (Used with permission from Microsoft)

Basic Statistical Analysis Using Excel

Excel provides many statistical functions and statistical distributions that are used in data science applications.

Excel Analysis for Measures of Center and Dispersion

Excel allows you to type a formula into a cell. Using a formula, you can refer to a value from another cell or produce an output based on a value in a cell (or values from multiple cells). A formula always starts with an equal sign (=). Following is an example formula, calculating an average of three numbers in cells A1, A2, and A3.

= AVERAGE(A1, A2, A3)

Suppose you have numbers in cells A1, A2, and A3 and want to show the average of the three numbers in cell B1. Figure A22 shows how to use AVERAGE in this scenario.

A screenshot of an Excel spreadsheet showing columns A, B, and C and rows 1, 2 and 3. Cell A1 is 10, cell A2 is 20 , and cell A3 is 30. Cell B2 has an AVERAGE formula that says =AVERAGE(A1, A2, A3).
Figure A22 Use of AVERAGE in MS Excel (Used with permission from Microsoft)

Once you are done typing the formula in B1, press enter. The formula in B1 will change to the result of the formula – 20 (see Figure A23).

A screenshot of an Excel spreadsheet showing columns A, B, and C and rows 1, 2 and 3. Cell A1 is 10, cell A2 is 20 , and cell A3 is 30. Cell B2 is 20, which shows the result of using the formula =AVERAGE(A1, A2, A3).
Figure A23 The Result of Using AVERAGE in MS Excel (Used with permission from Microsoft)

Excel does NOT save the formula when the file is saved as CSV

You can use Excel formulas with no problem on a CSV file, but Excel does not save those when the file is saved in the CSV format. If you want to keep the formula you have used, make sure to save your file as XLSX (File > Save As… > Excel Workbook (XLSX)). XLSX stores more information beyond text, such as formulas and drawings.

EXAMPLE A.6

Problem

On ch1-SOCR-small.csv, calculate the average height of the MLB players in the dataset using AVERAGE.

Solution

On the worksheet where the MLB Players dataset is open, type this formula in any empty cell and press the enter key (Figure A24).

=AVERAGE(D2:D13)

This will fill the average across cells D2, D3, D4, …, through D13. The resulting average height is 73.3333.

A screenshot of an MS Excel worksheet. It contains a dataset with 13 rows and 6 columns. The columns are labeled name, team, position, height (inches), weight (pounds), and age. The data in the Height column is highlighted and in row 14 the formula =AVERAGE(D2:D13) is being used to calculate the average height.
Figure A24 Calculate the Average Height Using AVERAGE in MS Excel (Used with permission from Microsoft)

Other Useful Formulas for Summarizing Data

There are a lot of other ways to use simple Excel formulas to analyze data values. You can type each of the example formulas listed in Table A1 and see how it works.

Formula Description
=MEDIAN(D2:D13) Compute median of D2, D3, …, and D13. Returns 72.5 as a result.
=STDEV(D2:D13) Compute standard deviation of D2, D3, …, and D13. Returns 3.28 as a result.
=MIN(D2:D13) Find the minimum value among D2, D3, …, and D13. Returns 70 as a result.
=MAX(D2:D13) Find the maximum value among D2, D3, …, and D13. Returns 82 as a result.
=SUM(D2:D13) Sum the values of D2, D3, …, through D13. Returns 880 as a result.
=COUNTIF(D2:D13, 72) Count the number of occurrences of “72” within the range of D2, D3, …, through D13. Returns 3 as a result (i.e., there are three 72s in D2:D13).
Table A1 Example MS Excel Formulas

EXAMPLE A.7

Problem

On ch1-SOCR-small.csv, write a formula that calculates the median weight of the players.

Solution

The formula is

=MEDIAN(E2:E13)

This will find the median across cell E2, E3, E4, …, through E13, which is 200. If you do not know what median is, don’t worry! You will learn more about it in Measures of Position.

Similar to the built-in function for the mean (=AVERAGE), Excel provides a function to calculate the sample standard deviation of a dataset =STDEV.S (for the sample standard deviation). To calculate these statistical results in Excel, enter the data values in a column. Let’s assume the data values are placed in cells A2 through A11. In any cell, type the Excel command =AVERAGE(A2:A11) and press enter. Excel will calculate the arithmetic mean in this cell. Then, in any other cell, type the Excel command =STDEV.S(A2:A11) and press enter. Excel will calculate the sample standard deviation in this cell. Figure A25 shows the mean and standard deviation for the 10 ages.

A screenshot of an Excel worksheet with column A labeled Ages. There are 10 numbers in the column. Overlaid text says: Excel command to calculate sample mean: =AVERAGE(A2:A11). Result: 46. Excel command to calculate sample standard deviation: =STDEV.S(A2:A11). Result: 6.50
Figure A25 Mean and Standard Deviation in MS Excel (Used with permission from Microsoft)

Here is a more comprehensive listing of various statistical functions available within Excel. The reader is encouraged to go to the Formulas menu in Excel and select any of the submenus such as “Financial,” “Math & Trig,” Lookup & Reference,” etc. to see various Excel formulas. Under the option for “More Functions,” select STATISTICAL to see a menu of various statistical functions available within Excel. Table A2 shows a sample of some statistical functions available within Excel:

Function Purpose
=AVERAGE(A1:A10) Find the mean of a set of numbers.
=MEDIAN(A1:A10) Find the median of a set of numbers.
=STDEV.S(A1:A10) Find the standard deviation for a set of numbers representing a sample.
=STDEV.P(A1:A10) Find the standard deviation for a set of numbers representing a population.
=VAR.S(A1:A10) Find the variance for a set of numbers representing a sample.
=VAR.P(A1:A10) Find the variance for a set of numbers representing a population.
=MIN(A1:A10) Find the minimum of a set of numbers.
=MAX(A1:A10) Find the maximum of a set of numbers.
=MAX(A1:A10) – MIN(A1:A10) Find the range of a set of numbers.
=CORREL(A1:A10, B1:B10) Find the correlation coefficient “r” for (x, y) data.
x-data is in cells A1 to A10; y-data is in cells B1 to B10.
=QUARTILE (A1:A10, 1) Find the first quartile for a set of numbers.
=QUARTILE (A1:A10, 2) Find the second quartile for a set of numbers (note that the second quartile is the same as the median).
=QUARTILE (A1:A10, 3) Find the third quartile for a set of numbers.
Table A2 Statistical Functions Available within MS Excel

Excel Analysis for Probability Distributions

As discussed in Measures of Position, data scientists are often interested in various probability distributions such as the normal distribution, binomial distribution, and Poisson distribution. Excel provides built-in functions to analyze many probability distributions.

Excel uses the command =NORM.DIST to find the area under the normal curve to the left of a specified value:

=NORM.DIST(XVALUE, MEAN, STANDARD_DEV, TRUE)

For example, suppose that at a financial consulting company, the mean employee salary is $60,000 with a standard deviation of $7,500. A normal curve can be drawn to represent this scenario, in which the mean of $60,000 would be plotted on the horizontal axis, corresponding to the peak of the curve. Then, to find the probability that an employee earns more than $75,000, you would calculate the area under the normal curve to the right of the data value $75,000.

For example, at the financial consulting company mentioned previously, the mean employee salary is $60,000 with a standard deviation of $7,500. To find the probability that a random employee’s salary is less than $55,000 using Excel, this is the command you would use:

=NORM.DIST(55000, 60000, 7500, TRUE)

Result: 0.25249

Thus, there is a probability of about 25% that a random employee has a salary less than $55,000.

Excel also provides built-in functions for binomial and Poisson distributions as shown in Table A3:

Probability Distribution Excel Command Usage
Binomial =BINOMIAL.DIST =BINOMIAL.DIST(Number_s, Trials, Probability_S, Cumulative)
Poisson =POISSON =POSSON(X, Mean, Cumulative)
Table A3 Built-In Functions for Binomial and Poisson Distributions in MS Excel

Excel Analysis for Correlation and Regression

In Inferential Statistics and Regression Analysis, the concepts of correlation and regression were discussed.

In correlation analysis, the data scientist is often interested in calculating the correlation coefficient, which is a numerical measure of the direction and strength of the correlation between two numeric quantities.

The Excel command to calculate the correlation coefficient uses the following format:

=CORREL(A1:A10, B1:B10)

In regression analysis, the data scientist is interested in calculating the equation of the best-fit linear model for two numeric quantities (assuming the correlation is significant).

Recall the equation of the best fit line is

y^=a+bxy^=a+bx

where m is the slope of the line and b is the y-intercept of the line.

To calculate the slope and y-intercept of the linear model using Excel, start by entering the (x, y) data in two columns in Excel. Then the Excel commands =SLOPE and =INTERCEPT can be used to calculate the slope and intercept, respectively.

The dataset in Table A4 will be used as an example: the monthly amount spent on advertising and the monthly revenue for a Fortune 500 company for 12 months.

Month Amount Spent on Advertising Revenue
Jan 49 12210
Feb 145 17590
Mar 57 13215
Apr 153 19200
May 92 14600
Jun 83 14100
Jul 117 17100
Aug 142 18400
Sep 69 14100
Oct 106 15500
Nov 109 16300
Dec 121 17020
Table A4 Revenue versus Advertising for a Fortune 500 Company ($000s) Monthly Amount Spent on Advertising and the Monthly Revenue for a Fortune 500 Company

To calculate the slope of the regression model, use the Excel command

=SLOPE(y-data range, x-data range)

It’s important to note that this Excel command expects that the y-data range is entered first and the x-data range is entered second. Since revenue depends on amount spent on advertising, revenue is considered the y-variable and amount spent on advertising is considered the x-variable. Notice the y-data is contained in cells C2 through C13 and the x-data is contained in cells B2 through B13. Thus the Excel command for slope would be entered as

=SLOPE(C2:C13, B2:B13)

In the same way, the Excel command to calculate the y-intercept of the regression model is

=INTERCEPT(y-data range, x-data range)

For the dataset shown in the above table, the Excel command would be

=INTERCEPT(C2:C13, B2:B13)

The results are as follows (see Figure A26):

slope b=61.8intercept a=9,376.7slope b=61.8intercept a=9,376.7
A screenshot of an MS Excel worksheet. A table showing monthly advertising expenditure and revenue for a year. The table also shows the calculation of slope and intercept. It contains a dataset with 13 rows and 3 columns. The columns are labelled Month, Advertising Expenditure, and Revenue. In cell B15 is the formula =SLOPE(C2:C13, B2:B13). Cell B16 is 61.8. In cell C15 is the formula =INTERCEPT(C2:C13, B2:B13). Cell C16 is 9,376.7.
Figure A26 Revenue versus Advertising for a Fortune 500 Company ($000s) Showing Slope and y-Intercept Calculation in MS Excel (Used with permission from Microsoft)

Based on this, the regression equation can be written as

y^=a+bxy^=9,376.7+61.8xy^=a+bxy^=9,376.7+61.8x

where x represents the amount spent on advertising (in thousands of dollars) and y represents the amount of revenue (in thousands of dollars).

Exploring Further

Excel Resources

For additional resources, refer to the chapters “Advanced Excel Formulas, Functions, and Techniques” and “Advanced Excel Spreadsheets: Statistical and Data Analysis” in Bolling, T., Mitchell, A., Scott, T., & Wheeler, N. (2023). Workplace Software and Skills. OpenStax. Retrieved from https://openstax.org/books/workplace-software-skills/pages/1-chapter-scenario.

Citation/Attribution

This book may not be used in the training of large language models or otherwise be ingested into large language models or generative AI offerings without OpenStax's permission.

Want to cite, share, or modify this book? This book uses the Creative Commons Attribution-NonCommercial-ShareAlike License and you must attribute OpenStax.

Attribution information
  • If you are redistributing all or part of this book in a print format, then you must include on every physical page the following attribution:
    Access for free at https://openstax.org/books/principles-data-science/pages/1-introduction
  • If you are redistributing all or part of this book in a digital format, then you must include on every digital page view the following attribution:
    Access for free at https://openstax.org/books/principles-data-science/pages/1-introduction
Citation information

© Dec 19, 2024 OpenStax. Textbook content produced by OpenStax is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License . The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, and OpenStax CNX logo are not subject to the Creative Commons license and may not be reproduced without the prior and express written consent of Rice University.