1. Discuss how different ratios of training versus testing data can affect the model in terms of underfitting and overfitting. How does the testing set provide a means to identify issues with underfitting and overfitting?
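One way to explore this question is with a small experiment. The sketch below is only an illustration under stated assumptions: the synthetic data set (a noisy line), the helper names `split`, `mse`, `mean_model`, and `nearest_neighbor_model`, and the chosen split ratios are all hypothetical and do not come from the text. It compares a model that tends to underfit (always predicting the mean) with one that tends to overfit (memorizing the nearest training point) and prints training and testing error for several training fractions.

```python
import random

def split(data, train_fraction, seed=0):
    """Shuffle a copy of data and split it into (train, test)."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def mse(model, points):
    """Mean squared error of model over (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in points) / len(points)

def mean_model(train):
    """Prone to underfitting: ignores x and always predicts the mean y."""
    avg = sum(y for _, y in train) / len(train)
    return lambda x: avg

def nearest_neighbor_model(train):
    """Prone to overfitting: returns the y of the closest training x."""
    return lambda x: min(train, key=lambda p: abs(p[0] - x))[1]

# Synthetic data (an assumption for illustration): y = 2x plus Gaussian noise.
rng = random.Random(42)
data = [(i / 10, 2 * (i / 10) + rng.gauss(0, 3)) for i in range(200)]

for frac in (0.5, 0.8, 0.95):
    train, test = split(data, frac)
    for name, build in (("mean", mean_model), ("1-NN", nearest_neighbor_model)):
        model = build(train)
        print(f"train fraction {frac:.2f}, {name:>4}: "
              f"train MSE = {mse(model, train):.2f}, "
              f"test MSE = {mse(model, test):.2f}")
```

When reading the output, compare the two error columns for each model. A large error on both sets suggests underfitting, while a near-zero training error paired with a noticeably larger testing error suggests overfitting. Note also that a very small testing set gives a noisier error estimate.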
2. A university admissions office would like to use a multiple linear regression model with students’ high school GPA and scores on both the SAT and ACT (standardized tests) as input variables to predict whether the student would eventually graduate from the university if admitted. Assuming the following statements are all accurate, which statement would be a reason not to use multiple linear regression?
  1. Students' SAT and ACT scores are often highly correlated with one another.
  2. GPA scores are measured on a different scale than either SAT or ACT scores.
  3. Scores of 0 are impossible to obtain on the SAT or ACT.
  4. Students can have high GPA but sometimes not do well on standardized tests like the SAT or ACT.
3. Using the data about words found in news articles from Example 6.12, classify an article that contains all three words, today, disaster, and police, as real or fake.
Citation/Attribution

This book may not be used in the training of large language models or otherwise be ingested into large language models or generative AI offerings without OpenStax's permission.

Want to cite, share, or modify this book? This book uses the Creative Commons Attribution-NonCommercial-ShareAlike License and you must attribute OpenStax.

Attribution information
  • If you are redistributing all or part of this book in a print format, then you must include on every physical page the following attribution:
    Access for free at https://openstax.org/books/principles-data-science/pages/1-introduction
  • If you are redistributing all or part of this book in a digital format, then you must include on every digital page view the following attribution:
    Access for free at https://openstax.org/books/principles-data-science/pages/1-introduction
© Dec 19, 2024 OpenStax. Textbook content produced by OpenStax is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, and OpenStax CNX logo are not subject to the Creative Commons license and may not be reproduced without the prior and express written consent of Rice University.