Extract speech text from video in Python
Last Updated :
24 Apr, 2025
Videos have become an integral part of our lives, and many of them teach or inform through spoken words. In this article, we will learn how to extract speech text from a video using Python.
Extract Speech Text from the Video
To extract speech text from a video in Python, we need to install the following modules. Here we use pip to install them.
Moviepy Module
The moviepy module in Python is used to perform basic operations on a video. It is used in the video editing process to perform functions like cutting, adding text, merging videos, and many more. You can install the moviepy module by writing the following command in your terminal.
pip install moviepy
Note: Installing moviepy normally makes FFmpeg available as well. However, you might be prompted to install it yourself in some cases. If so, follow the FFmpeg installation instructions for your platform (Linux or Windows).
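If you are unsure whether a system-wide FFmpeg is installed, a quick check along these lines may help. This is only a sketch: ffmpeg_on_path is a hypothetical helper name, and it looks only at the PATH, so it says nothing about an FFmpeg binary that moviepy may have obtained for its own use.

```python
import shutil


def ffmpeg_on_path() -> bool:
    """Return True if an `ffmpeg` executable is found on the system PATH."""
    return shutil.which("ffmpeg") is not None


if __name__ == "__main__":
    if ffmpeg_on_path():
        print("FFmpeg found at:", shutil.which("ffmpeg"))
    else:
        print("FFmpeg not found on PATH; you may need to install it manually.")
```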
SpeechRecognition
The SpeechRecognition module in Python provides an easy way to work with speech and audio files. You can install it using the following command:
pip install SpeechRecognition
Steps to Extract Speech Text from Video in Python
Step 1: Import the required modules
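The first step is to import the required modules, moviepy and speech_recognition. Note that the moviepy.editor submodule was removed in MoviePy 2.0. With that version or later, use "from moviepy import VideoFileClip" and call VideoFileClip() directly instead of mp.VideoFileClip().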
import moviepy.editor as mp
import speech_recognition as sr
Step 2: Load the video
The next step is to load the video whose speech we want to extract. For this, we will use the VideoFileClip() function of the moviepy module.
video = mp.VideoFileClip("file_path")
Step 3: Extract audio from the video
Then get the audio track from the video's audio attribute and write it to a '.wav' file using the write_audiofile() function.
video.audio.write_audiofile("geeksforgeeks.wav")
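Loading the video and writing its audio can be wrapped in one function, sketched below. The helper names wav_path_for and extract_audio are hypothetical. The sketch assumes the audio attribute is None when the video has no audio track, and it tries the MoviePy 2.x import first, falling back to moviepy.editor on older releases.

```python
from pathlib import Path


def wav_path_for(video_path: str) -> str:
    """Return the video path with its extension replaced by .wav."""
    return str(Path(video_path).with_suffix(".wav"))


def extract_audio(video_path: str) -> str:
    """Write the video's audio track to a .wav file and return its path."""
    # Imported here so wav_path_for() works even without moviepy installed.
    try:
        from moviepy import VideoFileClip          # MoviePy 2.x
    except ImportError:
        from moviepy.editor import VideoFileClip   # MoviePy 1.x

    video = VideoFileClip(video_path)
    try:
        # Assumption: a clip without an audio stream exposes audio as None.
        if video.audio is None:
            raise ValueError(f"{video_path} has no audio track")
        wav_path = wav_path_for(video_path)
        video.audio.write_audiofile(wav_path)
        return wav_path
    finally:
        video.close()  # release the underlying file readers
```

Calling extract_audio("geeksforgeeks.mp4") would then be expected to write geeksforgeeks.wav next to the video and return that path.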
Step 4: Load audio
Create a Recognizer object, then load the newly converted audio file using the AudioFile() function of the speech_recognition module and read its contents with record().
r = sr.Recognizer()
with sr.AudioFile("geeksforgeeks.wav") as source:
    data = r.record(source)
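Before handing the file to AudioFile(), it can be useful to confirm that it is a readable WAV file and see how long it is. The sketch below uses only the standard-library wave module and assumes an uncompressed PCM WAV. Both wav_info and write_silence are hypothetical helpers, and the demo writes a one-second silent file named silence_demo.wav only so there is something to inspect.

```python
import wave


def wav_info(path: str) -> dict:
    """Return channel count, sample rate, and duration (seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "duration": frames / rate,
        }


def write_silence(path: str, seconds: int = 1, rate: int = 16000) -> None:
    """Write a mono, 16-bit silent WAV file (demo input only)."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * rate * seconds)


if __name__ == "__main__":
    write_silence("silence_demo.wav")
    print(wav_info("silence_demo.wav"))
```

The same wav_info() call can be pointed at the file produced from the video to check its duration before transcription.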
Step 5: Convert audio to text
The final step is to convert the recorded audio data to text. This can be done using the recognize_google() function, passing the recorded data as the parameter. This function sends the audio to Google's Web Speech API, so it needs an internet connection.
text = r.recognize_google(data)
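Recognition can fail when the audio contains no intelligible speech or when the API cannot be reached, and long recordings may be easier to handle in pieces. The sketch below is one possible way to deal with both. The names chunk_count and transcribe_wav are hypothetical, the 30-second chunk length is an arbitrary assumption, and splitting at fixed intervals may cut a word in half at a chunk boundary.

```python
import math


def chunk_count(total_seconds: float, chunk_seconds: float) -> int:
    """Number of chunks needed to cover total_seconds of audio."""
    if chunk_seconds <= 0:
        raise ValueError("chunk_seconds must be positive")
    return math.ceil(total_seconds / chunk_seconds)


def transcribe_wav(wav_path: str, chunk_seconds: float = 30,
                   language: str = "en-US") -> str:
    """Transcribe a WAV file chunk by chunk with recognize_google()."""
    # Imported here so chunk_count() can be used without the package installed.
    import speech_recognition as sr

    r = sr.Recognizer()
    pieces = []
    with sr.AudioFile(wav_path) as source:
        # source.DURATION holds the file length in seconds.
        for _ in range(chunk_count(source.DURATION, chunk_seconds)):
            # Successive record() calls continue where the previous one stopped.
            data = r.record(source, duration=chunk_seconds)
            try:
                pieces.append(r.recognize_google(data, language=language))
            except sr.UnknownValueError:
                # No intelligible speech in this chunk; skip it.
                continue
            except sr.RequestError as exc:
                # The API was unreachable or rejected the request.
                raise RuntimeError(f"Speech API request failed: {exc}") from exc
    return " ".join(pieces)
```

With this in place, text = transcribe_wav("geeksforgeeks.wav") could stand in for the single recognize_google() call when the video is long.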
Code Implementation:
Now, let us see the full implementation of the code to extract speech text from a video in Python. We will take geeksforgeeks.mp4 as an example video for this problem statement. Make sure the associated video is present in the folder where the script is located.
Python3
import moviepy.editor as mp
import speech_recognition as sr
# Load the video
video = mp.VideoFileClip("geeksforgeeks.mp4")
# Extract the audio from the video
audio_file = video.audio
audio_file.write_audiofile("geeksforgeeks.wav")
# Initialize recognizer
r = sr.Recognizer()
# Load the audio file
with sr.AudioFile("geeksforgeeks.wav") as source:
    data = r.record(source)
# Convert speech to text
text = r.recognize_google(data)
# Print the text
print("\nThe resultant text from video is: \n")
print(text)
Output:
Text extracted from geeksforgeeks.mp4