Reading from and Writing to Files

Introduction

So far, the data in our programs has either been hardcoded into the program itself or typed in by the user at the keyboard. This is limiting: we will want programs that can read data from files and write results to them.

In this lesson, we'll work with text files. Text files are files that use one of a number of standard encoding schemes so that their contents can be interpreted as printable characters. Later, you might learn about binary files, where the file contents are not viewable as characters, but we'll start with text files for now.

Opening a File

To open a file, we need to specify the name of the file using a string.

We can use a variable to represent the name and could:

  • Set it to a string literal, if the program is always going to use the same filename.
  • Set it to a filename entered by the user using input().

Next, we call the built-in function open with the name of the file:

In [1]:
filename = 'story.txt'
file = open(filename, 'r')
file
Out[1]:
<_io.TextIOWrapper name='story.txt' mode='r' encoding='UTF-8'>

This opens the file named story.txt from the current directory. It is open for reading (that's the 'r' mode). The object's type is io.TextIOWrapper, but just think of it as an open file. The important idea is that this object not only gives us access to the contents of the file, but also keeps track of our program's current position in it. Once our program starts reading, the object knows how much we have read and keeps giving us the next piece.
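
As a small sketch of the idea that an open file remembers its position, the code below reads a file in two pieces and shows that the second read continues where the first one stopped. The file name position_demo.txt and its contents are made up for this illustration, and the set-up lines that create it use mode 'w', which is covered later in this lesson. The attributes name, mode, and closed belong to the file object.

```python
# Set-up: create a small file so the sketch is self-contained.
# 'position_demo.txt' is a made-up name for this illustration.
demo = open('position_demo.txt', 'w')
demo.write('abcdefgh')
demo.close()

f = open('position_demo.txt', 'r')
print(f.name, f.mode)      # The file object knows its name and mode.

first = f.read(3)          # Reads 'abc'; the position is now after 'c'.
second = f.read(3)         # Continues from that position and reads 'def'.
print(first, second)

f.close()
print(f.closed)            # True once the file has been closed.
```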

Reading from a File

There are several ways to read from a file. In the following examples, the contents of story.txt are:

Mary had a little lamb

His fleece was white as snow
And everywhere that Mary went
The lamb was sure to go

1) Read a single line

In [2]:
myfile = open('story.txt', 'r')
s = myfile.readline()   # Read a line into s.
print(s)
s                       # Notice the \n that you only see when you look
                        # at the contents of the variable.
Mary had a little lamb

Out[2]:
'Mary had a little lamb\n'

The \n (backslash n) character is a single character representing a new line.

In [3]:
s = myfile.readline()   # The next call continues where we left off.
print(s)
s = myfile.readline()   # And so on...
print(s)
myfile.close()

His fleece was white as snow

Notice that there is a blank line before the line His fleece was white as snow. That is because the second line we read from the file was empty: it contained only the newline character.

We can use this approach to read an entire file, bit by bit, under our control.

2) Read a certain number of characters

In [4]:
filename = 'story.txt'
myfile = open(filename)   # With no mode given, open uses 'r'.
s = myfile.read(10)   # Read 10 characters into s.
print(s)
s = myfile.read(10)   # Read the next 10 characters into s.
print(s)
myfile.close()
Mary had a
 little la

We can also use this approach to read an entire file, bit by bit, under our control.
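
To illustrate reading a whole file piece by piece, here is a minimal sketch that keeps calling read(10) until it returns the empty string, which is how read signals that nothing is left. The file name chunk_demo.txt is made up, and the set-up lines recreate only the first line of story.txt. The pieces are collected in a list here only so that they can be displayed together.

```python
# Set-up: a made-up file holding the first line of story.txt.
out = open('chunk_demo.txt', 'w')
out.write('Mary had a little lamb\n')
out.close()

myfile = open('chunk_demo.txt')
chunks = []
piece = myfile.read(10)        # Read up to 10 characters.
while piece != '':             # An empty string means nothing is left.
    chunks.append(piece)
    piece = myfile.read(10)    # Read the next piece.
myfile.close()

print(chunks)   # ['Mary had a', ' little la', 'mb\n']
```

The last piece is shorter than 10 characters because the file ran out before 10 more characters could be read.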

3) Read one line at a time from beginning to end.

If we know we want to read line by line through to the end of the file, a for loop makes this easy. This is probably the most common way to read a file. Use this approach unless you have a reason not to.

In [5]:
f = open('story.txt')
for line in f:
    print(line)     # Or do whatever you wish to line

f.close()          # Good habit: close a file when you are done with it.
Mary had a little lamb



His fleece was white as snow

And everywhere that Mary went

The lamb was sure to go

Question: Why is the output from the for loop double-spaced? Answer: print appends a \n to the string and there is also a \n at the end of each line.

Question: How can you single space the output? Answer: Strip the newline character from the end of each line before you print.

In [6]:
f = open('story.txt')
for line in f:
    line = line.rstrip('\n')   # Remove the newline from the end of the line.
    print(line)
f.close()
Mary had a little lamb

His fleece was white as snow
And everywhere that Mary went
The lamb was sure to go
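
A common alternative, not otherwise used in this lesson, is Python's with statement, which closes the file automatically when the indented block ends. The sketch below repeats the single-spaced loop in that style. The file name with_demo.txt is made up, and the set-up lines recreate only the first three lines of story.txt.

```python
# Set-up: a made-up file holding the first three lines of story.txt.
out = open('with_demo.txt', 'w')
out.write('Mary had a little lamb\n\nHis fleece was white as snow\n')
out.close()

stripped = []
with open('with_demo.txt') as f:
    for line in f:
        line = line.rstrip('\n')   # Remove the trailing newline only.
        stripped.append(line)
        print(line)

# The with block has ended, so the file is already closed.
print(f.closed)
```

With this form there is no call to close to forget.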

4) Read the entire file contents into a single string.

In [7]:
filename = "story.txt"
myfile = open(filename)
s = myfile.read()  # Read the whole file and return it as a string.
print(s)
myfile.close()
Mary had a little lamb

His fleece was white as snow
And everywhere that Mary went
The lamb was sure to go
In [8]:
s
Out[8]:
'Mary had a little lamb\n\nHis fleece was white as snow\nAnd everywhere that Mary went\nThe lamb was sure to go'

5) Use readlines() to read the file into a list of lines.

In [9]:
myfile = open('story.txt')
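contents = myfile.readlines()   # A list with one string per line.
myfile.close()
type(contents)                  # list; not displayed, since only the last expression is shown.
contents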
Out[9]:
['Mary had a little lamb\n',
 '\n',
 'His fleece was white as snow\n',
 'And everywhere that Mary went\n',
 'The lamb was sure to go']

Beginners often use one of these last two approaches because they seem easy.

  • Question: What is the downside of reading it all in at once?
  • Answer: It can potentially take a lot of space!

Don't use either of these techniques unless you really need access to the whole file at once.

Usually, we can read a piece, deal with it, and toss it out.
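
As a sketch of the "read a piece, deal with it, and toss it out" pattern, the code below finds the longest line in a file while holding only the current line and the best line seen so far. The file name longest_demo.txt is made up, and the set-up lines recreate only the first three lines of story.txt.

```python
# Set-up: a made-up file holding the first three lines of story.txt.
out = open('longest_demo.txt', 'w')
out.write('Mary had a little lamb\n\nHis fleece was white as snow\n')
out.close()

longest = ''
count = 0
f = open('longest_demo.txt')
for line in f:
    count = count + 1
    line = line.rstrip('\n')
    if len(line) > len(longest):
        longest = line      # Keep only the longest line seen so far.
f.close()

print(count, longest)
```

However large the file is, the program never holds more than two lines of it at a time.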

Dealing with the end of a file

With the for loop approach, the loop automatically stops when the end of the file is reached. If the file is empty, the loop body never runs at all.

But what happens if you are at the end of the file when you call read or readline?
You get the empty string. You then know you can stop trying to read more.

Example

In [10]:
# Detecting the end of the file while reading line by line
myfile = open('story.txt')
next_line = myfile.readline()
while next_line != "":
    print(next_line)
    next_line = myfile.readline()
myfile.close()
Mary had a little lamb



His fleece was white as snow

And everywhere that Mary went

The lamb was sure to go
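
One detail worth checking is that a blank line in the middle of a file does not look like the end of the file: readline returns '\n' for a blank line and '' only when nothing is left. The sketch below calls readline more times than the file has lines. The file name eof_demo.txt and its contents are made up for this illustration.

```python
# Set-up: a made-up three-line file whose second line is blank.
out = open('eof_demo.txt', 'w')
out.write('first\n\nlast')
out.close()

f = open('eof_demo.txt')
results = []
for i in range(5):                 # Deliberately call readline 5 times.
    results.append(f.readline())
f.close()

print(results)   # ['first\n', '\n', 'last', '', '']
```

This is why the test next_line != "" in the while loop above does not stop early at the blank line.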

Practice Exercise: reading a file

The file january06.txt contains data from the UTM weather station for January 2006. Download it from the C4M website to your local machine and put it in the same directory as where Wing is storing your programs. Figuring out where to store the files or how to specify the paths to your file is half the battle!

  1. Open it up in Wing to see what it looks like.

  2. Write a Python program to open the file and read only the first line.

  3. Read the second line (this is still a header).

  4. Read the third line into a variable line.

  5. What is the type of the value that line refers to?

  6. Call the method split() on variable line and save the return value. What is the type that is returned by this method call?

  7. Look up the method split() in the Python 3 documentation.

Practice Exercise: getting information from a file

Write a program that:

  1. opens the file january06.txt
  2. reads in the header and ignores it
  3. uses a loop to read in all the rest of the lines one by one
  4. prints out only the day and the temperature from each line

Practice Exercise: find coldest day and time

Now, write a program to find the day and time of the coldest reading in the file and then print that information.

Hint: Be careful. You must convert the values to integers before you compare them. Compared as strings, '11' < '2' is True, but as numbers, 11 > 2.
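
The difference between comparing strings and comparing numbers can be seen in a short sketch. The values below are made up and are not taken from january06.txt. If the readings in a file contain decimal points, float() would be the conversion to use instead of int().

```python
a = '11'     # Made-up values; pieces read from a file are strings like these.
b = '2'

text_result = a < b               # Strings compare character by character: '1' < '2'.
number_result = int(a) < int(b)   # Integers compare by value: 11 < 2.

print(text_result)     # True
print(number_result)   # False
```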

Writing to a file

In addition to opening a file for reading using 'r', we'll explore two other modes: 'w' and 'a'. Both of those modes are used to write to a file.

Let's start with opening a file using mode 'w'. First, if the file does not exist, it is created:

In [11]:
new_file = open('example.txt', 'w')

Next, we use the write method to write the contents and then we close the file:

In [12]:
new_file.write('This is the first line.\n')
new_file.write('And the second\nand third.')
new_file.close()

We can then read and print the file contents:

In [13]:
new_file = open('example.txt', 'r')
print(new_file.read())
new_file.close()
This is the first line.
And the second
and third.

Now, let's open the file using mode 'a', which stands for append:

In [14]:
new_file = open('example.txt', 'a')
new_file.write('\nAdding another line!')  # Notice the \n character.
new_file.close()

# Next, read and print the file contents again.
new_file = open('example.txt', 'r')
print(new_file.read())
new_file.close()
This is the first line.
And the second
and third.
Adding another line!

Warning: if the file exists already, when it is opened using mode 'w', its contents will be deleted. This is different from mode 'a', which keeps the existing content and writes any new lines to the end of the file.

Let's open 'example.txt' using mode 'w' to see how the file changes:

In [15]:
# The file is opened and its contents are cleared.
new_file = open('example.txt', 'w')       

# This will be the one and only line in the file.
new_file.write('Adding some new content') 
new_file.close()

# Next, read and print the file contents again.  
new_file = open('example.txt', 'r')
print(new_file.read())
new_file.close()
Adding some new content
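
Unlike print, the write method does not add a newline for us, which is why the examples above include \n in the strings they write. As a sketch, the code below writes each item of a list on its own line. The list and the file name names_demo.txt are made up for this illustration.

```python
names = ['Ada', 'Grace', 'Alan']     # Made-up data for illustration.

out = open('names_demo.txt', 'w')
for name in names:
    out.write(name + '\n')           # Add the newline ourselves.
out.close()

# Read the file back to check what was written.
check = open('names_demo.txt')
contents = check.read()
check.close()

print(contents)
```

Without the + '\n', all three names would run together on a single line in the file.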

Practice Exercise: writing to a file

  1. Write your name and address to a file named contact.txt. Once you have executed your program, open contact.txt in Wing to verify that its contents are what you expect.
  2. Now, write a program to add your phone number to that file, using open's append mode. Again, open the file in Wing and check its contents.