Introduction to Python Programming

14.3 Files in different locations and working with CSV files


Learning objectives

By the end of this section you should be able to

  • Demonstrate how to access files within a file system.
  • Demonstrate how to process a CSV file.

Opening a file at any location

When only the filename is passed as the argument to the open() function, Python looks for the file in the current working directory, which is typically the folder containing the Python file that is executing. Ex: For fileobj = open("file1.txt") in files.py to execute successfully, the file1.txt file should be in the same folder as files.py.

Often a programmer needs to open files from folders other than the one in which the Python file exists. A path uniquely identifies a folder location on a computer. The path can be combined with the filename to open a file in any folder. Ex: To open a file named logfile.log located in /users/turtle/desktop, the following can be used:

fileobj = open("/users/turtle/desktop/logfile.log")
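As a minimal sketch of opening a file through its full path, the code below creates a temporary folder, writes a small file there, and then reopens that file using the folder path combined with the filename. The tempfile and os modules, and the names folder and full_path, are assumptions introduced for illustration so that the code can run on any computer; they are not part of the example above.

```python
import os
import tempfile

# Create a temporary folder so the example runs on any computer
with tempfile.TemporaryDirectory() as folder:
    # Join the folder path and the filename into one full path
    full_path = os.path.join(folder, "logfile.log")

    # Write one line to the file at that location
    fileobj = open(full_path, "w")
    fileobj.write("started\n")
    fileobj.close()

    # Reopen the same file for reading using the full path
    fileobj = open(full_path)
    contents = fileobj.read()
    fileobj.close()

print(contents)
```

Because open() receives the full path, the code works regardless of which folder the Python file itself is in.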

Operating System    File location        open() function example

Mac                 /users/student/      fileobj = open("/users/student/output.txt")

Linux               /usr/code/           fileobj = open("/usr/code/output.txt")

Windows             c:\projects\code\    fileobj = open("c:/projects/code/output.txt")
                                         or
                                         fileobj = open("c:\\projects\\code\\output.txt")

Table 14.2 Opening files on different paths. In each case, a file called output.txt is located in a different folder than the Python file. Windows paths use backslash \ characters instead of forward slash / characters. Because a backslash begins an escape sequence inside a Python string, each backslash written directly in open() must be doubled (\\) for Python to interpret the location correctly.
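The effect of backslashes in path strings can be checked without opening any file. The sketch below compares three ways of writing the Windows path from Table 14.2 and then shows what can go wrong with single backslashes. The raw string form (the r prefix) is a Python feature not covered in the table, and the path c:\new\table.txt is a hypothetical example chosen because \n and \t are escape sequences.

```python
# Three ways to write the same Windows path as a Python string
doubled = "c:\\projects\\code\\output.txt"   # each \\ becomes one backslash
raw = r"c:\projects\code\output.txt"         # raw string: backslashes kept as typed
forward = "c:/projects/code/output.txt"      # forward slashes need no escaping

print(doubled == raw)   # True
print(len(doubled))     # 27

# Single backslashes can silently change the string:
# \n becomes a newline character and \t becomes a tab character
risky = "c:\new\table.txt"
print("\n" in risky)    # True
print("\t" in risky)    # True
```

The last two lines print True because Python replaced \n and \t with a newline and a tab, so the string no longer names the intended file.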

Concepts in Practice

Opening files at different locations

For each question, assume that the Python file executing the open() function is not in the same folder as the out.txt file.

Each question indicates the location of out.txt, the type of computer, and the desired mode for opening the file. Choose which option is best for opening out.txt.

1. /users/turtle/files on a Mac for reading
  1. fileobj = open("out.txt")
  2. fileobj = open("/users/turtle/files/out.txt")
  3. fileobj = open("/users/turtle/files/out.txt", 'w')
2. c:\documents\ on a Windows computer for reading
  1. fileobj = open("out.txt")
  2. fileobj = open("c:/documents/out.txt", 'a')
  3. fileobj = open("c:/documents/out.txt")
3. /users/turtle/logs on a Linux computer for writing
  1. fileobj = open("out.txt")
  2. fileobj = open("/users/turtle/logs/out.txt")
  3. fileobj = open("/users/turtle/logs/out.txt", 'w')
4. c:\proj\assets on a Windows computer in append mode
  1. fileobj = open("c:\\proj\\assets\\out.txt", 'a')
  2. fileobj = open("c:\proj\assets\out.txt", 'a')

Working with CSV files

In Python, files are read from and written to as Unicode text by default. Many common file formats, such as text files (.txt), Python code files (.py), and other code files (.c, .java), use Unicode.

Comma-separated values (CSV, .csv) files are often used for storing tabular data. These files store cells of information as Unicode text separated by commas. CSV files can be read using the methods learned thus far, as seen in the example below.

Figure 14.2 CSV files. A CSV file is simply a text file with rows separated by newline \n characters and cells separated by commas.

Raw text of the file:

Title, Author, Pages\n1984, George Orwell, 268\nJane Eyre, Charlotte Bronte, 532\nWalden, Henry David Thoreau, 156\nMoby Dick, Herman Melville, 538

Example 14.3

Processing a CSV file

    """Processing a CSV file."""
    # Open the CSV file for reading
    file_obj = open("books.csv")

    # Rows are separated by newline \n characters, so readlines()
    # can be used to read all rows into a list of strings
    csv_rows = file_obj.readlines()

    # Close the file once all rows have been read
    file_obj.close()

    list_csv = []

    # Remove the \n character from each row, split the row by commas,
    # and save the cells into a 2D structure
    for row in csv_rows:
        # Remove \n character
        row = row.strip("\n")
        # Split using commas
        cells = row.split(",")
        list_csv.append(cells)
    
    # Print result
    print(list_csv)

The code's output is:

    [['Title', ' Author', ' Pages'], ['1984', ' George Orwell', ' 268'], ['Jane Eyre', ' Charlotte Bronte', ' 532'], ['Walden', ' Henry David Thoreau', ' 156'], ['Moby Dick', ' Herman Melville', ' 538']]
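In the output above, every cell after the first in each row begins with a space, because the file has a space after each comma and split(",") keeps it. As one possible follow-up, the sketch below strips that whitespace from each cell and converts the Pages column to integers. To stay self-contained, it uses a list of strings standing in for what readlines() returns for books.csv; the name total_pages is an assumption introduced for illustration.

```python
# The rows of books.csv as readlines() would return them
csv_rows = [
    "Title, Author, Pages\n",
    "1984, George Orwell, 268\n",
    "Jane Eyre, Charlotte Bronte, 532\n",
    "Walden, Henry David Thoreau, 156\n",
    "Moby Dick, Herman Melville, 538",
]

list_csv = []
for row in csv_rows:
    # strip() with no argument removes the newline and surrounding spaces
    cells = [cell.strip() for cell in row.split(",")]
    list_csv.append(cells)

# Skip the header row and convert the Pages column to integers
total_pages = 0
for cells in list_csv[1:]:
    total_pages += int(cells[2])

print(list_csv[1])   # ['1984', 'George Orwell', '268']
print(total_pages)   # 1494
```

Note that this variant produces cells without leading spaces, so its result differs from the output of Example 14.3.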

Concepts in Practice

File types and CSV files

5. Why does readlines() work for reading the rows in a CSV file?
  1. readlines() reads line by line using the newline \n character.
  2. readlines() is not appropriate for reading a CSV file.
  3. readlines() automatically recognizes a CSV file and works accordingly.
6. For the code in the example, what would be the output for the statement print(list_csv[1][2])?
  1. 532
  2. 268
  3. Jane Eyre
7. What is the output of the following code for the books.csv seen above?

    file_obj = open("books.csv")
    csv_read = file_obj.readline()
    print(csv_read)

  1. ['Title, Author, Pages\n', '1984, George Orwell, 268\n', 'Jane Eyre, Charlotte Bronte, 532\n', 'Walden, Henry David Thoreau, 156\n', 'Moby Dick, Herman Melville, 538']
  2. [['Title', ' Author', ' Pages'], ['1984', ' George Orwell', ' 268'], ['Jane Eyre', ' Charlotte Bronte', ' 532'], ['Walden', ' Henry David Thoreau', ' 156'], ['Moby Dick', ' Herman Melville', ' 538']]
  3. Title, Author, Pages

Exploring further

Files such as Word documents (.docx) and PDF documents (.pdf), image formats such as Portable Network Graphics (PNG, .png) and Joint Photographic Experts Group (JPEG, .jpeg or .jpg), and many other file types are encoded differently and cannot be read as plain Unicode text.

Some types of non-Unicode files can be read using specialized libraries that support the reading and writing of different file types.

PyPDF is a popular library that can be used to extract information from PDF files.

BeautifulSoup can be used to extract information from XML and HTML files. XML and HTML files usually contain Unicode text, with structure provided by tags enclosed in angle brackets < >.

python-docx can be used to read and write DOCX files.

Additionally, csv is a module in Python's standard library that can be used to read and write CSV files.
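As a brief sketch of the csv module, the code below reads the first rows of the books data with csv.reader. To run without a file on disk, it uses io.StringIO as a stand-in for an open file object; with a real file, the object returned by open("books.csv") could be passed instead. The skipinitialspace=True option is chosen here because the sample data has a space after each comma.

```python
import csv
import io

# Stand-in for an open books.csv file, so the example runs without a file on disk
raw_text = (
    "Title, Author, Pages\n"
    "1984, George Orwell, 268\n"
    "Jane Eyre, Charlotte Bronte, 532\n"
)
file_obj = io.StringIO(raw_text)

# skipinitialspace=True drops the space that follows each comma
reader = csv.reader(file_obj, skipinitialspace=True)
rows = list(reader)

print(rows[0])      # ['Title', 'Author', 'Pages']
print(rows[1][2])   # 268
```

Each row is returned as a list of strings, so numeric cells such as the page count still need to be converted with int() before doing arithmetic.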

Try It

Processing a CSV file

The file fe.csv contains scores for a group of students on a final exam. Write a program to display the average score.

Citation/Attribution

This book may not be used in the training of large language models or otherwise be ingested into large language models or generative AI offerings without OpenStax's permission.

Want to cite, share, or modify this book? This book uses the Creative Commons Attribution License and you must attribute OpenStax.

Attribution information
  • If you are redistributing all or part of this book in a print format, then you must include on every physical page the following attribution:
    Access for free at https://openstax.org/books/introduction-python-programming/pages/1-introduction
  • If you are redistributing all or part of this book in a digital format, then you must include on every digital page view the following attribution:
    Access for free at https://openstax.org/books/introduction-python-programming/pages/1-introduction
Citation information

© Jul 30, 2024 OpenStax. Textbook content produced by OpenStax is licensed under a Creative Commons Attribution License. The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, and OpenStax CNX logo are not subject to the Creative Commons license and may not be reproduced without the prior and express written consent of Rice University.

This book utilizes the OpenStax Python Code Runner. The code runner is developed by Wiley and is All Rights Reserved.