Chapter - 3 Binary Files: 3.1 Reading and Writing To A Binary File
BINARY FILES
The open() function opens a file in text format by default. To open a file in binary
format, add 'b' to the mode parameter. Hence the "rb" mode opens the file in binary format
for reading, while the "wb" mode opens the file in binary format for writing. Unlike text
mode files, binary files are not human readable. When opened using any text editor, the
data is unrecognizable.
The following code stores a list of numbers in a binary file. The list is first converted
into a byte array before writing. The built-in function bytearray() returns a mutable
sequence of bytes built from the list, so each number must be in the range 0 to 255.
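num = [5, 10, 15, 20, 25]          # each value must fit in one byte (0-255)
arr = bytearray(num)
# The with statement closes the file automatically, even if an error occurs.
with open("binfile.bin", "wb") as f:
    f.write(arr)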
To read the above binary file, the bytes object returned by the read() method is
converted to a list using the list() function.
with open("binfile.bin", "rb") as f:
    num = list(f.read())           # each byte becomes an integer
print(num)                         # [5, 10, 15, 20, 25]
Method                        Description
file.close()                  Closes the file.
file.flush()                  Flushes the internal buffer.
next(file)                    Returns the next line from the file each time it is called.
file.read([size])             Reads at most size bytes from the file; reads until EOF if size is omitted.
file.readline()               Reads one entire line from the file.
file.readlines()              Reads until EOF and returns a list containing the lines.
file.seek(offset, whence)     Sets the file's current position.
file.tell()                   Returns the file's current position.
file.write(data)              Writes a string (or bytes, in binary mode) to the file and returns the number of characters or bytes written.
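As a minimal sketch of how several of these methods behave on a binary file, the example below writes five bytes and then moves around the file with seek() and tell(). The file name demo.bin and the temporary directory are assumptions made for illustration only.

```python
import os
import tempfile

# Hypothetical scratch file; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "demo.bin")

with open(path, "wb") as f:
    written = f.write(bytes([5, 10, 15, 20, 25]))  # returns the number of bytes written
    f.flush()                                      # push buffered data to the operating system

with open(path, "rb") as f:
    first_two = f.read(2)          # read at most 2 bytes
    pos_after_read = f.tell()      # the position has advanced by 2
    f.seek(-1, os.SEEK_END)        # move to 1 byte before the end of the file
    last_byte = f.read()           # read until EOF
    f.seek(0)                      # back to the start (whence defaults to 0)
    everything = list(f.read())

print(written, list(first_two), pos_after_read, list(last_byte), everything)
```

The second argument of seek() selects the reference point: the start of the file (0, the default), the current position (1), or the end of the file (2).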
3.2 Pickle Module
Python's pickle module is used for serializing and de-serializing a Python object
structure. Most Python objects can be pickled so that they can be saved on disk. Pickle
“serializes” the object first and then writes it to a file. Pickling is a way to convert
a Python object (list, dict, etc.) into a byte stream. The idea is that this byte
stream contains all the information necessary to reconstruct the object in another Python
script.
Pickling: It is a process where a Python object is converted into a byte stream. We also call
this ‘serialization’, ‘marshalling’, or ‘flattening’.
Unpickling: It is the inverse of the pickling process, where a byte stream is converted
back into an object.
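The two definitions above can be sketched with pickle.dumps() and pickle.loads(), which work on bytes in memory rather than on a file. The dictionary used here is hypothetical sample data.

```python
import pickle

original = {"name": "example", "scores": [5, 10, 15]}  # hypothetical sample data

byte_stream = pickle.dumps(original)    # pickling: object -> byte stream
restored = pickle.loads(byte_stream)    # unpickling: byte stream -> object

print(type(byte_stream))                # the pickled form is a bytes object
print(restored == original)             # same contents
print(restored is original)             # but a new, separate object
```

The restored object is equal to the original but is a distinct copy, which is what makes pickling suitable for moving data between scripts.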
Data Serialization
Before beginning to serialize data, it is important to decide how the data should be
structured during serialization: flat or nested. The differences between the two
styles are shown in the examples below.
Flat style:
Nested style:
{"A"
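As a rough illustration of the two styles, the sketch below stores the same values once with every value at the top level and once grouped under parent keys. The keys and values are hypothetical and are not taken from the original examples.

```python
import pickle

# Flat style (hypothetical data): every value sits at the top level under its own key.
flat = {"A_x": 1, "A_y": 2, "B_x": 3, "B_y": 4}

# Nested style (hypothetical data): related values are grouped under a parent key.
nested = {"A": {"x": 1, "y": 2}, "B": {"x": 3, "y": 4}}

# Both structures survive a serialization round trip unchanged.
flat_copy = pickle.loads(pickle.dumps(flat))
nested_copy = pickle.loads(pickle.dumps(nested))

print(flat_copy["A_y"], nested_copy["A"]["y"])
```

Either structure can be serialized; the choice mainly affects how the data is accessed after it is loaded.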
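A NumPy array can be serialized to, and deserialized from, its byte
representation.
Example:
import numpy as np
byte_output = np.array([1.0, 2.0, 3.0]).tobytes()  # sample data (assumed); serialize to bytes
array_format = np.frombuffer(byte_output)  # deserialize; the default dtype is float64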
Example:
import pickle
Output
Program 2:
Output
pickle.HIGHEST_PROTOCOL
This is an integer value representing the highest protocol version available. This
value can be passed as the protocol argument to the functions dump() and dumps().
pickle.DEFAULT_PROTOCOL
This is an integer value representing the default protocol used for pickling. Its
value may be lower than that of HIGHEST_PROTOCOL.
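A short sketch of how these two constants might be used is given below. The list being pickled is sample data; the actual numeric values of the constants depend on the Python version, so none are assumed here.

```python
import pickle

data = [5, 10, 15, 20, 25]  # sample data

# The concrete numbers vary between Python versions.
print(pickle.DEFAULT_PROTOCOL, pickle.HIGHEST_PROTOCOL)

default_bytes = pickle.dumps(data)                                     # uses DEFAULT_PROTOCOL
highest_bytes = pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)   # newest protocol
also_highest = pickle.dumps(data, protocol=-1)                         # a negative value selects HIGHEST_PROTOCOL

# The protocol is recorded in the stream, so loads() needs no protocol argument.
print(pickle.loads(default_bytes) == pickle.loads(highest_bytes))
```

A higher protocol is usually more compact, but the resulting file may not be readable by older Python versions.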
3.5 Python Pickle dump
In this section, we learn how to store data using Python pickle. To do so, first
import the pickle module.
Then use the pickle.dump() function to write the object to a file. pickle.dump() is
commonly called with three arguments. The first argument is the object that you want to
store. The second argument is the file object you get by opening the desired file in
write-binary (wb) mode. The third is an optional keyword argument, protocol, which
selects the pickle protocol version.
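The call described above might look like the following sketch. The file name data.pkl, the temporary directory, and the record being stored are all assumptions chosen for illustration.

```python
import os
import pickle
import tempfile

# Hypothetical file location and sample record.
path = os.path.join(tempfile.mkdtemp(), "data.pkl")
record = {"roll": 1, "name": "example", "marks": [75, 80, 92]}

# 1st argument: the object; 2nd: a file opened in "wb" mode; 3rd: the protocol (optional).
with open(path, "wb") as f:
    pickle.dump(record, f, protocol=pickle.HIGHEST_PROTOCOL)

size = os.path.getsize(path)
print("wrote", size, "bytes")
```

If the protocol argument is omitted, pickle.DEFAULT_PROTOCOL is used.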
Program
Output
To retrieve pickled data, use the pickle.load() function. Its primary argument is the
file object that you get by opening the file in read-binary (rb) mode.
Simple, isn’t it? Let’s write the code to retrieve the data we pickled using the pickle
dump code.
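A minimal sketch of retrieving pickled data is shown below. So that it runs on its own, it first writes a small file with pickle.dump(); the file name and the data are hypothetical.

```python
import os
import pickle
import tempfile

# Hypothetical file location and sample data, written first so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "data.pkl")
stored = {"roll": 1, "name": "example", "marks": [75, 80, 92]}
with open(path, "wb") as f:
    pickle.dump(stored, f)

# Retrieve the object: open the file in "rb" mode and pass it to pickle.load().
with open(path, "rb") as f:
    loaded = pickle.load(f)

print(loaded)
```

Only unpickle data from sources you trust, because loading a pickle can execute arbitrary code.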
Program
Output
1. exception pickle.PickleError
This exception inherits Exception. It is the base class for all other exceptions raised
in pickling.
2. exception pickle.PicklingError
This exception inherits PickleError. It is raised when an unpicklable object is
encountered while pickling.
3. exception pickle.UnpicklingError
This exception inherits PickleError. It is raised when there is a problem such as
data corruption or a security violation while unpickling an object.
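The exception hierarchy above can be checked with a short sketch. The invalid byte string and the lambda are arbitrary choices made to trigger failures; the behaviour shown is that of CPython's default pickle implementation, and other exception types (for example AttributeError or EOFError) can also occur in practice.

```python
import pickle

# Both specific exceptions derive from PickleError, which derives from Exception.
print(issubclass(pickle.PicklingError, pickle.PickleError))
print(issubclass(pickle.UnpicklingError, pickle.PickleError))
print(issubclass(pickle.PickleError, Exception))

# Unpickling bytes that are not a valid pickle stream.
try:
    pickle.loads(b"not a pickle")
    unpickle_error = None
except pickle.UnpicklingError as exc:
    unpickle_error = exc

# Pickling an object that cannot be pickled (a lambda has no importable name).
# AttributeError is caught as well, since some versions raise it for similar objects.
try:
    pickle.dumps(lambda x: x)
    pickle_error = None
except (pickle.PicklingError, AttributeError) as exc:
    pickle_error = exc

print(type(unpickle_error).__name__, type(pickle_error).__name__)
```

Catching pickle.PickleError alone handles both pickling and unpickling failures that belong to this hierarchy.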
3.8 Advantages of using Pickle Module
Recursive objects (objects containing references to themselves): Pickle keeps track of the
objects it has already serialized, so later references to the same object won’t be serialized
again. (The marshal module breaks for this.)
Object sharing (references to the same object in different places): This is similar to self-
referencing objects; pickle stores the object once, and ensures that all other references
point to the master copy. Shared objects remain shared, which can be very important for
mutable objects.
User-defined classes and their instances: Marshal does not support these at all, but
pickle can save and restore class instances transparently. The class definition must be
importable and live in the same module as when the object was stored.
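The first two points above, recursive objects and object sharing, can be demonstrated with a small sketch. The variable names and values are chosen only for illustration.

```python
import pickle

# Recursive object: a list that contains a reference to itself.
loop = [1, 2]
loop.append(loop)
loop_copy = pickle.loads(pickle.dumps(loop))
print(loop_copy[2] is loop_copy)     # the self-reference is preserved

# Shared object: two dictionary keys refer to the same list.
shared = [0]
holder = {"a": shared, "b": shared}
holder_copy = pickle.loads(pickle.dumps(holder))
print(holder_copy["a"] is holder_copy["b"])   # still one shared list after unpickling

# Because the list is shared, a change made through one key is visible through the other.
holder_copy["a"].append(99)
print(holder_copy["b"])
```

The original objects are untouched by changes to the unpickled copies, since unpickling builds new objects.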
Append mode ("ab") is used to add new data to the end of a binary file while retaining
the old data.
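One way to append pickled objects and read them all back is sketched below. The file name records.pkl and the three records are hypothetical; the reading loop relies on pickle.load() raising EOFError once the end of the file is reached.

```python
import os
import pickle
import tempfile

# Hypothetical file location; "ab" creates the file if it does not exist yet.
path = os.path.join(tempfile.mkdtemp(), "records.pkl")

# Each dump in append mode adds one more pickled object after the existing ones.
for record in ([1, 2, 3], {"k": "v"}, "third"):
    with open(path, "ab") as f:
        pickle.dump(record, f)

# Read the objects back one at a time until the end of the file.
records = []
with open(path, "rb") as f:
    while True:
        try:
            records.append(pickle.load(f))
        except EOFError:
            break

print(records)
```

Each call to pickle.load() returns exactly one object, so a file written with several dumps needs one load per object.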
Program