Offline Speech to Text Without any Popup Dialog in Android
Last Updated: 14 Apr, 2025
In this article, we implement speech-to-text in an Android app without showing the system popup dialog. The implementation works both online and offline. Without an internet connection, the recognizer falls back to the language model stored on the device, so recognition is less accurate but still usable. With a connection, recognition is generally more accurate. The project is written in Kotlin.
Note: Offline recognition does not work on devices running an API level below 23. The application is also not meant to be run on emulators.
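The two conditions in the note above can be checked at runtime before any recognition is attempted. The following is a minimal sketch, not part of the original project: the function name canAttemptOfflineRecognition is hypothetical, and the check only confirms that the API level is high enough and that some recognition service is installed. It cannot tell whether an offline language pack has actually been downloaded.

```kotlin
import android.content.Context
import android.os.Build
import android.speech.SpeechRecognizer

// Hypothetical helper (not in the original article): returns true when the
// device meets the basic requirements described in the note above.
fun canAttemptOfflineRecognition(context: Context): Boolean {
    // Offline recognition requires API level 23 (Android M) or higher
    val apiLevelOk = Build.VERSION.SDK_INT >= Build.VERSION_CODES.M
    // True when a speech recognition service is available on the device
    val serviceAvailable = SpeechRecognizer.isRecognitionAvailable(context)
    return apiLevelOk && serviceAvailable
}
```

A caller could use this to disable the mic button and show a message instead of starting a recognition session that is unlikely to succeed.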
Step-by-Step Implementation
Step 1: Create a New Project
To create a new project in Android Studio please refer to How to Create/Start a New Project in Android Studio.
Note: Select Kotlin as the programming language.
Step 2: Adding Permission
To access the device microphone, we have to add the RECORD_AUDIO permission to our AndroidManifest.xml file as shown below:
<uses-permission android:name="android.permission.RECORD_AUDIO"/>
Step 3: Modify the colors.xml file and Add the Resources
Add the following lines to the colors.xml file.
<color name="mic_enabled_color">#0E87E7</color>
<color name="mic_disabled_color">#6D6A6A</color>
Create a mic.xml file in the res > drawable folder and add the following code.
mic.xml:
XML
<vector xmlns:android="http://schemas.android.com/apk/res/android"
    android:width="24dp"
    android:height="24dp"
    android:viewportWidth="960"
    android:viewportHeight="960">
    <path
        android:pathData="M480,560q-50,0 -85,-35t-35,-85v-240q0,-50 35,-85t85,-35q50,0 85,35t35,85v240q0,50 -35,85t-85,35ZM480,320ZM440,840v-123q-104,-14 -172,-93t-68,-184h80q0,83 58.5,141.5T480,640q83,0 141.5,-58.5T680,440h80q0,105 -68,184t-172,93v123h-80ZM480,480q17,0 28.5,-11.5T520,440v-240q0,-17 -11.5,-28.5T480,160q-17,0 -28.5,11.5T440,200v240q0,17 11.5,28.5T480,480Z"
        android:fillColor="#e8eaed"/>
</vector>
Step 4: Working with the activity_main.xml file
Go to the activity_main.xml file and add the following code.
activity_main.xml:
<?xml version="1.0" encoding="utf-8"?>
<LinearLayout
    xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:app="http://schemas.android.com/apk/res-auto"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    android:gravity="center"
    android:orientation="vertical"
    tools:context=".MainActivity">

    <TextView
        android:id="@+id/speak_output_tv"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:layout_marginHorizontal="20dp"
        android:text="Output Text Here..."
        android:textAlignment="center"
        android:textSize="25sp" />

    <ImageView
        android:id="@+id/mic_speak_iv"
        android:layout_width="60dp"
        android:layout_height="60dp"
        android:layout_marginTop="20dp"
        android:src="@drawable/mic"
        app:tint="@color/mic_disabled_color" />

</LinearLayout>
Step 5: Working with the MainActivity.kt file
Go to the MainActivity.kt file and refer to the following code.
Checking Audio Permission:
To get started, the app needs permission to use the microphone. The function below checks whether the RECORD_AUDIO permission has been granted. If it has not, the function opens the app's settings page, where the user can grant the permission manually. Offline speech-to-text is not supported below API level 23, so the function first compares Build.VERSION.SDK_INT with Build.VERSION_CODES.M, a constant equal to 23. Replace the package name in the code with your own package name (you can find it in the AndroidManifest.xml file or in the module's build.gradle file).
Kotlin
private fun checkAudioPermission() {
    if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.M) { // M = 23
        if (ContextCompat.checkSelfPermission(this, "android.permission.RECORD_AUDIO") != PackageManager.PERMISSION_GRANTED) {
            // this will open the app's settings page, where the user can grant the permission
            // (replace the package name below with your own)
            val intent = Intent(Settings.ACTION_APPLICATION_DETAILS_SETTINGS, Uri.parse("package:com.programmingtech.offlinespeechtotext"))
            startActivity(intent)
            Toast.makeText(this, "Allow Microphone Permission", Toast.LENGTH_SHORT).show()
        }
    }
}
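Sending the user to the settings page works, but the permission can also be requested with the standard in-app dialog. The sketch below shows one possible way to do that with the AndroidX Activity Result API. It assumes the project depends on a recent androidx.activity version; the class name PermissionSketchActivity and the onMicrophoneReady() hook are hypothetical names used only for illustration.

```kotlin
import android.Manifest
import android.content.pm.PackageManager
import android.widget.Toast
import androidx.activity.result.contract.ActivityResultContracts
import androidx.appcompat.app.AppCompatActivity
import androidx.core.content.ContextCompat

// Hypothetical activity used only to illustrate the permission flow
class PermissionSketchActivity : AppCompatActivity() {

    // Registered as a property so it exists before the activity is started
    private val micPermissionLauncher =
        registerForActivityResult(ActivityResultContracts.RequestPermission()) { granted ->
            if (granted) {
                onMicrophoneReady()
            } else {
                Toast.makeText(this, "Microphone permission denied", Toast.LENGTH_SHORT).show()
            }
        }

    // Call this from the mic button's click listener
    private fun ensureMicrophonePermission() {
        val alreadyGranted = ContextCompat.checkSelfPermission(
            this, Manifest.permission.RECORD_AUDIO
        ) == PackageManager.PERMISSION_GRANTED

        if (alreadyGranted) {
            onMicrophoneReady()
        } else {
            // Shows the system permission dialog and reports the result to the callback above
            micPermissionLauncher.launch(Manifest.permission.RECORD_AUDIO)
        }
    }

    // Hypothetical hook: this is where speech recognition would be started
    private fun onMicrophoneReady() {}
}
```

The settings-page approach can then be kept as a fallback for users who have permanently denied the request.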
The Function which Handles Speech to Text:
This is the main function of the project; it handles speech recognition. First, create a SpeechRecognizer object for the current Context, i.e., this (inside a Fragment, AlertDialog, etc., replace this with the appropriate context). Then create an intent and attach EXTRA_LANGUAGE_MODEL with the value LANGUAGE_MODEL_FREE_FORM. The object passed to setRecognitionListener() must override all the callback functions, as shown below. The speech result arrives in onResults(), which reads the list of recognized strings from the Bundle; the element at the first index is the most likely transcription. Other callbacks are also useful: onBeginningOfSpeech() is called when the user starts speaking, and onEndOfSpeech() is called when the user stops speaking, before the results are delivered.
Kotlin
private fun startSpeechToText() {
    val speechRecognizer = SpeechRecognizer.createSpeechRecognizer(this)
    val speechRecognizerIntent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH)
    speechRecognizerIntent.putExtra(
        RecognizerIntent.EXTRA_LANGUAGE_MODEL,
        RecognizerIntent.LANGUAGE_MODEL_FREE_FORM
    )
    // EXTRA_LANGUAGE expects a BCP 47 language tag string such as "en-US"
    speechRecognizerIntent.putExtra(
        RecognizerIntent.EXTRA_LANGUAGE,
        Locale.getDefault().toLanguageTag()
    )

    speechRecognizer.setRecognitionListener(object : RecognitionListener {
        override fun onReadyForSpeech(bundle: Bundle?) {}
        override fun onBeginningOfSpeech() {}
        override fun onRmsChanged(v: Float) {}
        override fun onBufferReceived(bytes: ByteArray?) {}
        override fun onEndOfSpeech() {}
        override fun onError(i: Int) {}

        override fun onResults(bundle: Bundle) {
            val result = bundle.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION)
            if (!result.isNullOrEmpty()) {
                // result[0] will give the output of speech
            }
        }

        override fun onPartialResults(bundle: Bundle) {}
        override fun onEvent(i: Int, bundle: Bundle?) {}
    })

    // starts listening ...
    speechRecognizer.startListening(speechRecognizerIntent)
}
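Since this article is about offline recognition, it may help to ask the recognizer explicitly to prefer the on-device model. The sketch below builds the same intent and adds RecognizerIntent.EXTRA_PREFER_OFFLINE, which is available from API level 23. The function name buildSpeechIntent is hypothetical, and the extra is only a hint: the recognition service installed on the device decides whether to honor it.

```kotlin
import android.content.Intent
import android.os.Build
import android.speech.RecognizerIntent
import java.util.Locale

// Hypothetical helper (not in the original article): builds a recognition
// intent that asks the recognizer to prefer its offline engine.
fun buildSpeechIntent(locale: Locale = Locale.getDefault()): Intent =
    Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH).apply {
        putExtra(
            RecognizerIntent.EXTRA_LANGUAGE_MODEL,
            RecognizerIntent.LANGUAGE_MODEL_FREE_FORM
        )
        // The language is passed as a BCP 47 tag such as "en-US"
        putExtra(RecognizerIntent.EXTRA_LANGUAGE, locale.toLanguageTag())
        // EXTRA_PREFER_OFFLINE was added in API level 23 (Android M)
        if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.M) {
            putExtra(RecognizerIntent.EXTRA_PREFER_OFFLINE, true)
        }
    }
```

The returned intent can be passed to speechRecognizer.startListening() in place of the intent built inline above.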
Entire code for MainActivity.kt:
Below is the final code for the MainActivity.kt file. Comments are added inside the code to understand the code in more detail.
MainActivity.kt:
package org.geeksforgeeks.demo

import android.content.Intent
import android.content.pm.PackageManager
import android.net.Uri
import android.os.Build
import android.os.Bundle
import android.provider.Settings
import android.speech.RecognitionListener
import android.speech.RecognizerIntent
import android.speech.SpeechRecognizer
import android.widget.ImageView
import android.widget.TextView
import android.widget.Toast
import androidx.appcompat.app.AppCompatActivity
import androidx.core.app.ActivityCompat
import androidx.core.content.ContextCompat
import java.util.Locale

class MainActivity : AppCompatActivity() {

    private lateinit var micIV: ImageView
    private lateinit var outputTV: TextView
    private lateinit var speechRecognizer: SpeechRecognizer

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main)

        micIV = findViewById(R.id.mic_speak_iv)
        outputTV = findViewById(R.id.speak_output_tv)

        micIV.setOnClickListener {
            checkAudioPermission()
        }
    }

    private fun startSpeechToText() {
        speechRecognizer = SpeechRecognizer.createSpeechRecognizer(this)
        val speechRecognizerIntent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH)
        speechRecognizerIntent.putExtra(
            RecognizerIntent.EXTRA_LANGUAGE_MODEL,
            RecognizerIntent.LANGUAGE_MODEL_FREE_FORM
        )
        // EXTRA_LANGUAGE expects a BCP 47 language tag string such as "en-US"
        speechRecognizerIntent.putExtra(
            RecognizerIntent.EXTRA_LANGUAGE,
            Locale.getDefault().toLanguageTag()
        )

        speechRecognizer.setRecognitionListener(object : RecognitionListener {
            override fun onReadyForSpeech(bundle: Bundle?) {}
            override fun onBeginningOfSpeech() {}
            override fun onRmsChanged(v: Float) {}
            override fun onBufferReceived(bytes: ByteArray?) {}

            override fun onEndOfSpeech() {
                // changing the color of our mic icon to
                // gray to indicate it is not listening
                micIV.setColorFilter(
                    ContextCompat.getColor(
                        applicationContext,
                        R.color.mic_disabled_color
                    )
                ) // #FF6D6A6A
            }

            override fun onError(errorCode: Int) {
                // recognition has stopped, so show the mic as not listening
                micIV.setColorFilter(
                    ContextCompat.getColor(applicationContext, R.color.mic_disabled_color)
                )
                val message = when (errorCode) {
                    SpeechRecognizer.ERROR_AUDIO -> "Audio recording error"
                    SpeechRecognizer.ERROR_CLIENT -> "Client side error"
                    SpeechRecognizer.ERROR_INSUFFICIENT_PERMISSIONS -> "Please allow permissions"
                    SpeechRecognizer.ERROR_NETWORK -> "Network error"
                    SpeechRecognizer.ERROR_NETWORK_TIMEOUT -> "Network timeout"
                    SpeechRecognizer.ERROR_NO_MATCH -> "No voice detected"
                    SpeechRecognizer.ERROR_RECOGNIZER_BUSY -> "Already Listening"
                    SpeechRecognizer.ERROR_SERVER -> "Server error"
                    SpeechRecognizer.ERROR_SPEECH_TIMEOUT -> "No speech input"
                    // Add other cases based on SpeechRecognizer error codes
                    else -> "Unknown error"
                }
                Toast.makeText(applicationContext, message, Toast.LENGTH_SHORT)
                    .show()
            }

            override fun onResults(bundle: Bundle) {
                val result = bundle.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION)
                if (!result.isNullOrEmpty()) {
                    // attaching the most likely transcription
                    // to our textview
                    outputTV.text = result[0]
                }
            }

            override fun onPartialResults(bundle: Bundle) {}
            override fun onEvent(i: Int, bundle: Bundle?) {}
        })

        speechRecognizer.startListening(speechRecognizerIntent)
    }

    private fun checkAudioPermission() {
        if (ContextCompat.checkSelfPermission(this, android.Manifest.permission.RECORD_AUDIO)
            != PackageManager.PERMISSION_GRANTED
        ) {
            ActivityCompat.requestPermissions(
                this,
                arrayOf(android.Manifest.permission.RECORD_AUDIO),
                1
            )
        } else {
            // Permission is already granted
            micIV.setColorFilter(ContextCompat.getColor(this, R.color.mic_enabled_color))
            startSpeechToText()
        }
    }

    override fun onRequestPermissionsResult(
        requestCode: Int, permissions: Array<out String>, grantResults: IntArray
    ) {
        super.onRequestPermissionsResult(requestCode, permissions, grantResults)
        if (requestCode == 1) {
            if (grantResults.isNotEmpty() && grantResults[0] == PackageManager.PERMISSION_GRANTED) {
                Toast.makeText(this, "Permission Granted", Toast.LENGTH_SHORT).show()
                micIV.setColorFilter(ContextCompat.getColor(this, R.color.mic_enabled_color))
                startSpeechToText()
            } else {
                Toast.makeText(this, "Microphone permission denied", Toast.LENGTH_SHORT).show()
                // Optional: If the user denied with "Don't ask again"
                if (!ActivityCompat.shouldShowRequestPermissionRationale(this, android.Manifest.permission.RECORD_AUDIO)) {
                    val intent = Intent(
                        Settings.ACTION_APPLICATION_DETAILS_SETTINGS,
                        Uri.parse("package:$packageName")
                    )
                    startActivity(intent)
                    Toast.makeText(this, "Please allow Microphone permission from Settings", Toast.LENGTH_LONG).show()
                }
            }
        }
    }
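
    override fun onDestroy() {
        super.onDestroy()
        // speechRecognizer is only created after the first mic tap, so guard
        // against destroying a recognizer that was never initialized
        if (::speechRecognizer.isInitialized) {
            speechRecognizer.destroy()
        }
    }
}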
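onResults() receives a list of candidate transcriptions, and that list may be null, empty, or contain blank entries. The following is a small sketch of a helper that picks the text to display while handling those cases. The function name firstTranscriptOrNull is hypothetical; it is plain Kotlin with no Android dependency, so it can be unit tested on its own.

```kotlin
// Hypothetical helper (not in the original article): returns the first
// non-blank candidate, trimmed, or null when there is nothing usable.
fun firstTranscriptOrNull(hypotheses: List<String>?): String? =
    hypotheses
        ?.asSequence()
        ?.map { it.trim() }
        ?.firstOrNull { it.isNotEmpty() }
```

Inside onResults(), the list read from SpeechRecognizer.RESULTS_RECOGNITION could be passed to this helper, and the TextView updated only when the returned value is not null.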