SpeechRecognition

The document provides an overview of speech recognition technology using Python, particularly focusing on the SpeechRecognition library. It includes instructions for installation, code examples for transcribing audio files, and capturing audio from a microphone. Additionally, it discusses the setup for using Google Speech Recognition and handling exceptions during the recognition process.

Uploaded by

gauravendra272002

SpeechRecognition

Speech recognition is a technology that allows computers to understand and
process human speech. Python, with its simplicity and robust libraries, offers
several modules to tackle speech recognition tasks effectively. One of the most
popular libraries for this purpose is the SpeechRecognition library.

With SpeechRecognition Library

In this section, we build a speech recognition system on top of the
SpeechRecognition library. The library supports several transcription engines,
including Google Speech Recognition, which is the one we'll be using.

Before we get started, let's install the required libraries:

$ pip install SpeechRecognition pydub

Open up a new file named speechrecognition.py, and add the following:

# importing libraries
import speech_recognition as sr
import os
from pydub import AudioSegment
from pydub.silence import split_on_silence

# create a speech recognition object

r = sr.Recognizer()

The function below loads the audio file, performs speech recognition, and
returns the text:

# a function to recognize speech in the audio file
# so that we don't repeat ourselves in other functions
def transcribe_audio(path):
    # use the audio file as the audio source
    with sr.AudioFile(path) as source:
        audio_listened = r.record(source)
        # convert it to text
        text = r.recognize_google(audio_listened)
    return text

Next, we make a function to split the audio file into chunks on silence:

# a function that splits the audio file into chunks on silence
# and applies speech recognition
def get_large_audio_transcription_on_silence(path):
    """
    Split the large audio file into chunks
    and apply speech recognition on each of these chunks
    """
    # open the audio file using pydub
    sound = AudioSegment.from_file(path)
    # split the audio where silence lasts 500 milliseconds or more and get chunks
    chunks = split_on_silence(sound,
        # experiment with this value for your target audio file
        min_silence_len=500,
        # adjust this per requirement
        silence_thresh=sound.dBFS - 14,
        # keep 500 milliseconds of silence around each chunk, adjustable as well
        keep_silence=500,
    )
    folder_name = "audio-chunks"
    # create a directory to store the audio chunks
    if not os.path.isdir(folder_name):
        os.mkdir(folder_name)
    whole_text = ""
    # process each chunk
    for i, audio_chunk in enumerate(chunks, start=1):
        # export audio chunk and save it in
        # the `folder_name` directory.
        chunk_filename = os.path.join(folder_name, f"chunk{i}.wav")
        audio_chunk.export(chunk_filename, format="wav")
        # recognize the chunk
        with sr.AudioFile(chunk_filename) as source:
            audio_listened = r.record(source)
            # try converting it to text
            try:
                text = r.recognize_google(audio_listened)
            except sr.UnknownValueError as e:
                print("Error:", str(e))
            else:
                text = f"{text.capitalize()}. "
                print(chunk_filename, ":", text)
                whole_text += text
    # return the text for all chunks detected
    return whole_text

print(get_large_audio_transcription_on_silence("7601-291468-0006.wav"))
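
To make the three tuning parameters more concrete, here is a simplified pure-Python sketch of the idea behind silence-based splitting. It is not pydub's implementation: the function name split_on_quiet and its input (a plain list holding one loudness value in dB per millisecond) are assumptions made for illustration, and unlike pydub it does not prevent neighbouring chunks from overlapping when keep_silence is large.

```python
def split_on_quiet(levels_db, min_silence_len, silence_thresh, keep_silence):
    """Return (start, end) index ranges of the non-silent chunks.

    levels_db is a hypothetical input: one loudness value per millisecond.
    """
    n = len(levels_db)

    # 1. Find runs of quiet samples that last at least min_silence_len.
    silences = []
    run_start = None
    # The trailing +inf sentinel is never "quiet", so it closes a final run.
    for i, level in enumerate(list(levels_db) + [float("inf")]):
        if level < silence_thresh:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_silence_len:
                silences.append((run_start, i))
            run_start = None

    # 2. Everything between two long silences is treated as one chunk,
    #    padded with up to keep_silence samples of silence on each side.
    chunks = []
    prev_end = 0
    for s_start, s_end in silences + [(n, n)]:
        if s_start > prev_end:
            chunks.append((max(0, prev_end - keep_silence),
                           min(n, s_start + keep_silence)))
        prev_end = s_end
    return chunks


# 5 loud, 4 quiet, 3 loud, 1 quiet, 2 loud "milliseconds"
levels = [-10] * 5 + [-50] * 4 + [-10] * 3 + [-50] * 1 + [-10] * 2
print(split_on_quiet(levels, min_silence_len=3, silence_thresh=-30, keep_silence=1))
# → [(0, 6), (8, 15)]
```

The one-sample pause at index 12 is shorter than min_silence_len, so it does not cause a split. In the same way, raising min_silence_len in the real function produces fewer, longer chunks.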

Implementing Speech Recognition with Python

A basic implementation using the SpeechRecognition library involves several steps:

Audio Capture: Capturing audio from the microphone using PyAudio.

Audio Processing: Converting the audio signal into data that the SpeechRecognition library can work
with.

Recognition: Calling the recognize_google() method (or another available recognition method) of
the Recognizer object to convert the audio data into text.

Pro_2

import speech_recognition as sr

# Initialize recognizer class (for recognizing the speech)
r = sr.Recognizer()

# Reading Microphone as source
# listening to the speech and storing it in the audio_text variable
with sr.Microphone() as source:
    print("Talk")
    audio_text = r.listen(source)
    print("Time over, thanks")

# recognize_google() raises a request error if the API is unreachable
# and an unknown value error if the speech is unintelligible,
# hence using exception handling
try:
    # using google speech recognition
    print("Text: " + r.recognize_google(audio_text))
except (sr.UnknownValueError, sr.RequestError):
    print("Sorry, I did not get that")

Speech Recognition in Python using Google Speech API

Install the SpeechRecognition package:

sudo pip install SpeechRecognition

PyAudio: Linux users can install it with the following command:

sudo apt-get install python-pyaudio python3-pyaudio

If the versions in the repositories are too old, install PyAudio using the following command:

sudo apt-get install portaudio19-dev python-all-dev python3-all-dev && sudo pip install pyaudio

On other platforms, pip alone may be enough:

pip install pyaudio

Microphone selection: the microphone to use is identified by its device name, for example:

USB Device 0x46d:0x825: Audio (hw:1, 0)

Make a note of this as it will be used in the program.
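
One way to turn a noted device name into the index the program needs is sketched below. The helper names find_device_index and list_microphones are made up for this illustration. sr.Microphone.list_microphone_names() is the library call that returns the device names, and it is imported lazily so the lookup helper can be tried without any audio hardware.

```python
def find_device_index(device_names, wanted):
    """Return the index of the first device whose name contains `wanted`.

    The match is case-insensitive; None is returned if nothing matches.
    """
    wanted = wanted.lower()
    for index, name in enumerate(device_names):
        if name and wanted in name.lower():
            return index
    return None


def list_microphones():
    """Ask SpeechRecognition (which uses PyAudio) for the device names."""
    import speech_recognition as sr  # lazy import: needs PyAudio installed
    return sr.Microphone.list_microphone_names()


# Example with a hard-coded list; the first name is a made-up placeholder.
names = ["Built-in Audio", "USB Device 0x46d:0x825: Audio (hw:1, 0)"]
print(find_device_index(names, "usb device"))
# → 1
```

On a machine with PyAudio installed, find_device_index(list_microphones(), "USB Device") would perform the same lookup against the real device list.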

Set Chunk Size: This involves specifying how much audio data we want to read at once. The
SpeechRecognition Microphone class counts this value in samples rather than bytes. Typically, it is
specified as a power of 2 such as 1024 or 2048.

Set Sampling Rate: The sampling rate defines how many audio samples are recorded per second for
processing.

Set Device ID to the selected microphone: In this step, we specify the device ID of the microphone
that we wish to use in order to avoid ambiguity in case there are multiple microphones. This also
helps with debugging: while running the program, we will know whether the specified
microphone is being recognized. In the program, we specify a parameter device_id. The
program will report that device_id could not be found if the microphone is not recognized.

Allow Adjusting for Ambient Noise: Since the surrounding noise varies, we must allow the program a
second or two to adjust the energy threshold of the recording so that it matches the external
noise level.

Speech to text translation: This is done with the help of Google Speech Recognition, which requires an
active internet connection to work. There are offline recognition systems such as
PocketSphinx, but they have a more demanding installation process that requires several dependencies.
Google Speech Recognition is one of the easiest to use.
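
The steps above can be put together roughly as follows. This is a minimal sketch, not the original program: the function names chunk_seconds and listen_and_transcribe are assumptions, and the defaults (device default sample rate, chunk_size of 1024 samples) simply mirror the Microphone constructor's own defaults. The library is imported inside the function so the small arithmetic helper can be run without a microphone.

```python
def chunk_seconds(chunk_size, sample_rate):
    """Length of audio, in seconds, delivered by one read of chunk_size samples."""
    return chunk_size / sample_rate


def listen_and_transcribe(device_id=None, sample_rate=None, chunk_size=1024):
    """Record one phrase from the chosen microphone and return its text, or None."""
    import speech_recognition as sr  # lazy import: needs PyAudio and a microphone

    r = sr.Recognizer()
    # device_index selects the microphone; None means the system default
    with sr.Microphone(device_index=device_id, sample_rate=sample_rate,
                       chunk_size=chunk_size) as source:
        # give the recognizer a second to adapt its energy threshold
        r.adjust_for_ambient_noise(source, duration=1)
        print("Say something")
        audio = r.listen(source)

    try:
        return r.recognize_google(audio)  # requires an internet connection
    except sr.UnknownValueError:
        print("Google Speech Recognition could not understand the audio")
    except sr.RequestError as e:
        print(f"Could not request results from Google Speech Recognition: {e}")
    return None


# With 1024 samples per chunk at 16000 samples per second,
# each read covers 64 milliseconds of audio.
print(round(chunk_seconds(1024, 16000) * 1000), "ms per chunk")
# → 64 ms per chunk
```

A smaller chunk size makes the program react to speech sooner at the cost of more frequent reads; a larger one does the opposite.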

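Speech Recognition in Hindi

Install the required packages:

pip install SpeechRecognition
pip install PyAudio

If installing PyAudio with pip fails on Windows, pipwin is an alternative:

pip install pipwin
pipwin install pyaudio

Program: Speech Recognition in Hindi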

# import required module
import speech_recognition as sr

# explicit function to take input commands
# and recognize them
def takeCommandHindi():
    r = sr.Recognizer()
    with sr.Microphone() as source:
        print('Listening')
        # seconds of non-speaking audio before
        # a phrase is considered complete
        r.pause_threshold = 0.7
        audio = r.listen(source)
        try:
            print("Recognizing")
            # recognize the command as Hindi speech
            Query = r.recognize_google(audio, language='hi-IN')
            print("the query is printed='", Query, "'")
        except Exception as e:
            # handling the exception, so that the assistant can
            # ask the user to repeat the command
            print(e)
            print("Say that again sir")
            return "None"
        return Query

# Driver Code
# call the function
takeCommandHindi()
