Python | OCR on All the Images present in a Folder Simultaneously

Last Updated : 11 Nov, 2019

If you have a folder full of images containing text that needs to be extracted, either into a separate folder with one text file named after each image or into a single file, this article shows how to do it. It covers the basics of applying OCR (Optical Character Recognition) in Python and shows how to create an output .txt file for every image inside the main folder and save it in a predetermined directory.

Libraries Needed -

    pip3 install pillow

The os module is part of the Python standard library and does not need to be installed.

You will also need Tesseract-OCR and the pytesseract library. Tesseract-OCR is a separate program that must be downloaded and installed on the system; pytesseract can be installed using

    pip3 install pytesseract

Below is the Python implementation -

    # Python program to extract text from all the images in a folder
    # storing the text in corresponding files in a different folder
    from PIL import Image
    import pytesseract as pt
    import os

    def main():
        # path for the folder for getting the raw images
        path = "E:\\GeeksforGeeks\\images"

        # path for the folder for getting the output
        tempPath = "E:\\GeeksforGeeks\\textFiles"

        # iterating the images inside the folder
        for imageName in os.listdir(path):
            inputPath = os.path.join(path, imageName)
            img = Image.open(inputPath)

            # applying ocr using pytesseract for python
            text = pt.image_to_string(img, lang="eng")

            # removing the extension (e.g. .jpg) from the image name
            baseName = os.path.splitext(imageName)[0]
            fullTempPath = os.path.join(tempPath, 'time_' + baseName + ".txt")
            print(text)

            # saving the text for every image in a separate .txt file
            with open(fullTempPath, "w") as file1:
                file1.write(text)

    if __name__ == '__main__':
        main()

Input Image : image_sample1

Output :

    geeksforgeeks geeksforgeeks

If you want to store all the text from the images in a single output file, the code is a little different.
When writing to a single output file, the main difference is that the file is opened in "a+" mode, which appends the text and creates the output file if it is not already present.

    # extract text from all the images in a folder
    # storing the text in a single file
    from PIL import Image
    import pytesseract as pt
    import os

    def main():
        # path for the folder for getting the raw images
        path = "E:\\GeeksforGeeks\\images"

        # path of the file in which the output needs to be kept
        fullTempPath = "E:\\GeeksforGeeks\\output\\outputFile.txt"

        # iterating the images inside the folder
        for imageName in os.listdir(path):
            inputPath = os.path.join(path, imageName)
            img = Image.open(inputPath)

            # applying ocr using pytesseract for python
            text = pt.image_to_string(img, lang="eng")

            # "a+" creates the file if it is not present
            # and appends the text if it is
            with open(fullTempPath, "a+") as file1:
                # writing the name of the image
                file1.write(imageName + "\n")

                # writing the content of the image
                file1.write(text + "\n")

        # printing the output file
        with open(fullTempPath, 'r') as file2:
            print(file2.read())

    if __name__ == '__main__':
        main()

Input Image : image_sample1 image_sample2

Output : The program prints the contents of the single file created after extracting the text from every image inside the folder. The format of the file is -

    Name of the image
    Content of the image
    Name of the next image
    and so on .....

Author : amankrsharma3
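Both programs pass every entry returned by os.listdir() straight to Image.open(), so a stray non-image file or a subfolder inside the images folder would raise an error. Removing the extension with a fixed-length slice such as [0:-4] also only suits three-letter extensions like .jpg. The following is a minimal sketch of two helpers that could be used around the OCR call. The names list_image_files and text_path_for, the IMAGE_EXTENSIONS set, and the prefix parameter are hypothetical and not part of the original article; the extension list is an assumption to adjust for your own data.

```python
import os

# Assumed set of extensions to treat as images; adjust as needed.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff"}


def list_image_files(folder):
    """Return the sorted names of regular files in `folder` that look like images."""
    names = []
    for name in os.listdir(folder):
        full_path = os.path.join(folder, name)
        extension = os.path.splitext(name)[1].lower()
        # Skip subfolders and files whose extension is not in the set.
        if os.path.isfile(full_path) and extension in IMAGE_EXTENSIONS:
            names.append(name)
    return sorted(names)


def text_path_for(image_name, output_folder, prefix="time_"):
    """Build the .txt output path for an image, whatever its extension length."""
    stem = os.path.splitext(image_name)[0]
    return os.path.join(output_folder, prefix + stem + ".txt")
```

In either program, the loop could then iterate over list_image_files(path) instead of os.listdir(path), and the first program could call text_path_for(imageName, tempPath) to build fullTempPath.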