Python File Objects #
Open file #
f = open('<path to file>')
Specify the mode: reading ('r'), writing ('w'), appending ('a'), or reading and writing ('r+'). If no mode is given, the file is opened for reading.
Open for reading:
f = open('<path to file>', 'r')
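To make the modes concrete, here is a minimal sketch that opens a scratch file with the default mode and then with 'r+'. The file name `modes_demo.txt` and the temporary directory are assumptions, used only so the example runs on its own.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'modes_demo.txt')

f = open(path, 'w')      # 'w' creates the file
f.write('hello\n')
f.close()

f = open(path)           # no mode given, so it defaults to 'r'
default_mode = f.mode
f.close()

f = open(path, 'r+')     # 'r+' allows both reading and writing
before = f.read()        # reads to the end of the file
f.write('world\n')       # position is at the end, so this is added after 'hello'
f.seek(0)                # go back to the start to read everything
after = f.read()
f.close()

print(default_mode)      # r
print(after, end='')
```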
Print file details #
File name:
print(f.name)
File mode that it’s open with:
print(f.mode)
Current position in the file. When reading the file in increments, this shows how far it has been read:
print(f.tell())
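A small sketch of these three attributes in action, assuming a scratch file named `details_demo.txt` (a made-up name) that holds ten ASCII characters. For ASCII-only content like this the position matches the number of characters read; in text mode the value returned by `.tell()` is best treated as opaque in general.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'details_demo.txt')
f = open(path, 'w')
f.write('abcdefghij')
f.close()

f = open(path, 'r')
name = f.name            # the path that was passed to open()
mode = f.mode            # 'r'
start = f.tell()         # 0: nothing has been read yet
f.read(4)                # read the first four characters
after_read = f.tell()    # 4 for this ASCII-only file
f.close()

print(name, mode, start, after_read)
```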
Close file #
After a file is opened, it needs to be closed.
An unclosed file keeps holding a file descriptor, and the operating system limits how many a process can have open, so leaked files can eventually lead to errors.
f.close()
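When closing manually, one way to make sure `.close()` runs even if the read fails is a `try`/`finally` block. This is only a sketch; the scratch file `close_demo.txt` is an assumption for illustration.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'close_demo.txt')
f = open(path, 'w')
f.write('some text\n')
f.close()

f = open(path, 'r')
try:
    data = f.read()
finally:
    f.close()            # runs whether or not read() raised

was_closed = f.closed    # True once close() has been called
print(was_closed)        # True
```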
Context manager #
Usually, working with files through a context manager is preferable.
The benefit: it lets us work with the file inside a block of code, and when the block finishes the file is closed automatically, even if an exception is raised inside the block. Super sanitary.
with open('<path to file>', 'r') as f:
    pass
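The automatic close can be checked with `f.closed`. The sketch below simulates a failure inside the block; the scratch file `with_demo.txt` is an assumption for illustration.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'with_demo.txt')
with open(path, 'w') as f:
    f.write('some text\n')

try:
    with open(path, 'r') as f:
        open_inside = f.closed       # False: still open inside the block
        raise ValueError('simulated failure')
except ValueError:
    pass

closed_outside = f.closed            # True: closed despite the exception
print(open_inside, closed_outside)   # False True
```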
Reading small files #
This is OK if the file is small:
with open('<path to file>', 'r') as f:
    f_contents = f.read()
    print(f_contents) # prints contents of file
Use .readlines() to get a list in which each line of the file is a distinct element.
with open('<path to file>', 'r') as f:
    f_contents = f.readlines()
    print(f_contents) # prints a list with one string per line
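For a sense of what the list looks like, here is a sketch with a three-line scratch file (`lines_demo.txt`, a made-up name). Each element keeps its trailing newline, so stripping it is a common follow-up step.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'lines_demo.txt')
with open(path, 'w') as f:
    f.write('alpha\nbeta\ngamma\n')

with open(path, 'r') as f:
    lines = f.readlines()

# Remove the trailing newline from each element.
stripped = [line.rstrip('\n') for line in lines]

print(lines)      # ['alpha\n', 'beta\n', 'gamma\n']
print(stripped)   # ['alpha', 'beta', 'gamma']
```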
Running .readline() gets one line at a time; each call moves to the next line.
with open('<path to file>', 'r') as f:
    f_contents = f.readline()
    print(f_contents, end='') # prints contents of first line. 'end' argument specifies how line ends (defaults to '\n')
    f_contents = f.readline() # this reads the next line
    print(f_contents, end='')
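Calling `.readline()` repeatedly can be turned into a loop, since it returns an empty string once the end of the file is reached. In this sketch the scratch file `readline_demo.txt` is an assumption; note that a blank line in the file comes back as '\n', not as an empty string, so it does not end the loop early.

```python
import os
import tempfile

# Hypothetical scratch file with a blank line in the middle.
path = os.path.join(tempfile.mkdtemp(), 'readline_demo.txt')
with open(path, 'w') as f:
    f.write('one\n\nthree\n')

collected = []
with open(path, 'r') as f:
    line = f.readline()
    while line != '':            # '' means end of file
        collected.append(line)
        line = f.readline()

print(collected)   # ['one\n', '\n', 'three\n']
```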
Reading large files #
with open('<path to file>', 'r') as f:
    for line in f:
        print(line, end='')
This goes through the file one line at a time, so the whole file is never loaded into memory at once.
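As an illustration of processing a file line by line without holding it all in memory, the sketch below counts lines and tracks the longest one. The scratch file `large_demo.txt` and its 1000 generated rows are assumptions standing in for a genuinely large file.

```python
import os
import tempfile

# Hypothetical scratch file standing in for a large file.
path = os.path.join(tempfile.mkdtemp(), 'large_demo.txt')
with open(path, 'w') as f:
    for i in range(1000):
        f.write(f'row {i}\n')

line_count = 0
longest = 0
with open(path, 'r') as f:
    for line in f:                       # only one line is held at a time
        line_count += 1
        longest = max(longest, len(line.rstrip('\n')))

print(line_count, longest)   # 1000 7
```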
Alternatively, use .read() with an argument that specifies how many characters to read.
Note that each call to .read() advances the position in the file.
with open('<path to file>', 'r') as f:
    f_contents = f.read(100)
    print(f_contents, end='')
    # this picks up where the previous chunk left off
    # if there's nothing left to read, it returns an empty string
    f_contents = f.read(100)
    print(f_contents, end='')
If we don’t know how large the file is, use a loop.
with open('<path to file>', 'r') as f:
    size_to_read = 10
    f_contents = f.read(size_to_read)
    while len(f_contents) > 0:
        print(f_contents, end='')
        f_contents = f.read(size_to_read)
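The same loop can be wrapped in a generator so the chunking logic is reusable. This is one possible sketch; the helper name `read_in_chunks` and the scratch file `chunks_demo.txt` are hypothetical.

```python
import os
import tempfile

def read_in_chunks(f, size):
    """Yield successive chunks of at most `size` characters from file object `f`."""
    while True:
        chunk = f.read(size)
        if not chunk:        # empty string: nothing left to read
            break
        yield chunk

# Hypothetical scratch file holding 25 characters.
path = os.path.join(tempfile.mkdtemp(), 'chunks_demo.txt')
with open(path, 'w') as f:
    f.write('abcdefghijklmnopqrstuvwxy')

with open(path, 'r') as f:
    chunks = list(read_in_chunks(f, 10))

print(chunks)   # ['abcdefghij', 'klmnopqrst', 'uvwxy']
```

The last chunk is shorter than the requested size because the file ran out of characters.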
Change position of file interaction #
Use .seek() to specify precisely where the next read or write happens. A value of 0 moves the position to the start.
f.seek(0)
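For instance, seeking back to 0 makes it possible to read the same characters twice. A minimal sketch, assuming a scratch file named `seek_demo.txt`:

```python
import os
import tempfile

# Hypothetical scratch file, created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'seek_demo.txt')
with open(path, 'w') as f:
    f.write('hello world')

with open(path, 'r') as f:
    first = f.read(5)        # 'hello'
    f.seek(0)                # jump back to the start
    position = f.tell()      # 0
    again = f.read(5)        # 'hello' again, not ' worl'

print(first, position, again)   # hello 0 hello
```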
Writing to files #
Proper way to write:
with open('<path to file>', 'w') as f:
    f.write('<some text>')
'w': if the file does not exist, it creates the file. If the file already exists, it overwrites the file. Use 'a' to append to an existing file.
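The difference between overwriting and appending can be seen in a short sketch. The scratch file `write_demo.txt` is an assumption for illustration.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'write_demo.txt')

with open(path, 'w') as f:   # file does not exist yet, so it is created
    f.write('old\n')

with open(path, 'w') as f:   # file exists, so its contents are replaced
    f.write('new\n')

with open(path, 'a') as f:   # existing contents are kept; text is added at the end
    f.write('extra\n')

with open(path, 'r') as f:
    result = f.read()

print(result, end='')
# new
# extra
```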
Writing to a file that was opened in read mode raises an error:
with open('<path to file>', 'r') as f:
    f.write('<some text>')
... not writable
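If the error needs to be handled rather than just observed, the exception type is `io.UnsupportedOperation`. A minimal sketch, assuming a scratch file named `readonly_demo.txt`:

```python
import io
import os
import tempfile

# Hypothetical scratch file, created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'readonly_demo.txt')
with open(path, 'w') as f:
    f.write('original\n')

message = None
try:
    with open(path, 'r') as f:
        f.write('new text')          # not allowed in read mode
except io.UnsupportedOperation as exc:
    message = str(exc)

with open(path, 'r') as f:
    unchanged = f.read()             # the file was not modified

print(message)   # not writable
```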
Read from one file, write to another #
with open('<path to read file>', 'r') as rf: # rf == "read file"
    with open('<path to write file>', 'w') as wf: # wf == "write file"
        for line in rf:
            wf.write(line)
This could be useful for transforming and serializing data.
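As one example of a transformation, the sketch below upper-cases each line while copying. The file names `source.txt` and `target.txt` are assumptions, and both files are opened in a single `with` statement, which behaves the same as nesting two.

```python
import os
import tempfile

# Hypothetical scratch files, created only for this illustration.
tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, 'source.txt')
dst = os.path.join(tmp_dir, 'target.txt')

with open(src, 'w') as f:
    f.write('first line\nsecond line\n')

# Read from one file, transform each line, write to the other.
with open(src, 'r') as rf, open(dst, 'w') as wf:
    for line in rf:
        wf.write(line.upper())

with open(dst, 'r') as f:
    transformed = f.read()

print(transformed, end='')
# FIRST LINE
# SECOND LINE
```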
Interacting with images #
When interacting with images, you need to work in binary mode ('rb', 'wb'), not text mode.
with open('<path to file>.jpg', 'rb') as rf:
    with open('<path to target>.jpg', 'wb') as wf:
        for line in rf:
            wf.write(line)
Binary data has no meaningful lines, so instead of going line by line, copy in fixed-size chunks:
with open('<path to file>.jpg', 'rb') as rf:
    with open('<path to target>.jpg', 'wb') as wf:
        chunk_size = 4096
        rf_chunk = rf.read(chunk_size)
        # keep reading until nothing left to read
        while len(rf_chunk) > 0:
            wf.write(rf_chunk)
            rf_chunk = rf.read(chunk_size)
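To check that a chunked copy really is byte-for-byte identical, the loop can be wrapped in a function and tested on generated bytes. In this sketch the helper name `copy_in_chunks`, the file names, and the generated payload (standing in for real image data) are all assumptions.

```python
import os
import tempfile

def copy_in_chunks(src_path, dst_path, chunk_size=4096):
    """Copy a file in binary mode, reading at most `chunk_size` bytes at a time."""
    with open(src_path, 'rb') as rf, open(dst_path, 'wb') as wf:
        while True:
            chunk = rf.read(chunk_size)
            if not chunk:            # b'' means nothing left to read
                break
            wf.write(chunk)

# Hypothetical scratch files; the payload stands in for image bytes.
tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, 'source.bin')
dst = os.path.join(tmp_dir, 'target.bin')
payload = bytes(range(256)) * 40     # 10240 bytes, not a multiple of 4096

with open(src, 'wb') as f:
    f.write(payload)

copy_in_chunks(src, dst)

with open(dst, 'rb') as f:
    copied = f.read()

print(copied == payload)   # True
```

The standard library's `shutil.copyfileobj` performs a similar chunked copy between two open file objects, so a hand-written loop is mainly useful when each chunk needs extra processing.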