
Project based learning on

“TEXT TO SPEECH CONVERTER USING PYTHON”

By

Arman Vats (Reg. no. 202100453)

BACHELOR OF ARTIFICIAL INTELLIGENCE AND DATA SCIENCE

SIKKIM MANIPAL INSTITUTE OF TECHNOLOGY


(A constituent college of Sikkim Manipal University)
MAJITAR, RANGPO, EAST SIKKIM – 737136

ABSTRACT

This Python-based desktop application utilizes tkinter for its graphical user interface (GUI)
and pyttsx3 for text-to-speech conversion, providing users with a streamlined and intuitive
experience. The home page of the application serves as an introductory guide, offering clear
navigation to the main text-to-speech page. On this page, users can easily input their text and
choose from two available voices, making the process straightforward and user-friendly.

The application's GUI is designed with thoughtful attention to detail, featuring clear labels,
well-placed buttons, and responsive hover effects that enhance usability. This design
approach not only makes the interface visually appealing but also ensures a smooth and
engaging user experience. By focusing on accessibility and ease of use, the application caters
to a diverse audience, including language learners and individuals with disabilities.

One of the standout features of the application is its customizable speech output, allowing
users to select their preferred voice. This flexibility is particularly beneficial for tailoring
speech output to individual needs and preferences. The text-to-speech functionality operates
efficiently, delivering accurate and timely conversions that contribute to an overall effective
and enjoyable user experience.

ACKNOWLEDGMENTS

We would like to extend our sincere gratitude to the following individuals and organizations
for their contributions to the development of this text-to-speech application:

• The Python community for providing a robust and versatile programming language that
enabled us to bring this project to life.

• The developers of tkinter and pyttsx3 for creating and maintaining these excellent libraries,
which were instrumental in building the application's GUI and text-to-speech
functionality.

• Our colleagues and peers for their valuable feedback and suggestions, which helped shape
the application into its current form.

• The online communities and forums that provided us with resources, tutorials, and
troubleshooting advice throughout the development process.

We are also grateful for the opportunity to work on this project, which has allowed us to
explore the intersection of GUI design and text-to-speech technology. We hope that this
application will be of use to others and serve as a foundation for future projects in this area.

LIST OF CONTENTS

S.No TITLE

1 Introduction

2 Tools Used

3 Introduction to Python

4 Introduction to Tkinter

5 Introduction to Pyttsx3

6 Project

7 Limitations and Future Development

8 Conclusion

LIST OF FIGURES

S.No CONTENT

1. Fig no 1: Code Snippet

2. Fig no 2: Fundamental structure of Tkinter program

3. Fig no 3: Architecture of tkinter

4. Fig no 4: pyttsx3 working

5. Fig no 5: Front page

6. Fig no 6: Hover effect

7. Fig no 7: User Interface for Text Entry

8. Fig no 8: User Interface after Entering the Text

1. INTRODUCTION

A text-to-speech application is developed using Python. The tkinter library is employed for
creating the graphical user interface (GUI), and the pyttsx3 library is utilized for converting
text into spoken words. A home page is designed, providing an introductory overview and a
button to navigate to the main functionality.

The main functionality of the application is found on the text-to-speech page. Here, text can
be entered into a designated text box. Two different voices are supported, offering users a
choice in the tone of the spoken text. Once the text is entered and a voice is selected, the text
can be read aloud in the chosen voice.

The GUI of the application is designed to be user-friendly and engaging. A sky blue color
scheme is used, creating a pleasing aesthetic. The "Convert" button is made prominent and
pink, making it easy for users to initiate the conversion process. Interactive elements, such as
hover effects on buttons, are also included to further enhance user engagement.

The Python Text to Speech Converter project is not just a demonstration of text-to-speech
technology, but also a practical tool. It can be used as an accessibility tool for individuals
with visual impairments, an educational aid for language learning, or even as an
entertainment application. The customizable features and simple design make it a versatile
tool that can be adapted to suit various user needs.

The main window of the application is created using the tkinter library. Within this main
window, a text entry widget is created for users to input the text they want to convert to
speech. A "Convert" button is also added, using the ttk.Button widget. When this button is
clicked, the text entered by the user is retrieved from the text entry widget and converted to
speech using the pyttsx3 library.

The pyttsx3 library is a crucial component of this project. Two voices are supported, allowing
users to choose between a female and a male voice. When the "Convert" button is clicked, the
tts command is run first, and the window freezes until pyttsx3 finishes speaking. Only then
does the window become responsive again.
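
The Convert step described above can be sketched as follows. This is a minimal sketch, not the project's actual source: it assumes the input widget is a `tk.Text` and that the engine comes from `pyttsx3.init()`, and the name `on_convert` is hypothetical. Because `runAndWait()` returns only after the speech has finished, the window cannot process events in the meantime, which matches the freeze described above.

```python
def on_convert(text_widget, engine):
    """Hypothetical Convert-button callback: read the text box and speak it."""
    # "1.0" means line 1, column 0; "end-1c" drops the newline tk.Text appends.
    text = text_widget.get("1.0", "end-1c")
    engine.say(text)       # queue the utterance on the pyttsx3 engine
    engine.runAndWait()    # blocks here until the engine has finished speaking
    return text
```

Under these assumptions the callback would be attached with something like `ttk.Button(root, text="Convert", command=lambda: on_convert(text_box, engine))`.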

2. TOOLS USED

I. Python Programming Language


Python is a high-level, interpreted, dynamically typed programming language. Its wide
range of applications, including web development, data analysis, artificial intelligence,
machine learning, and scientific computing, makes it one of the most widely used
programming languages in the world today. It is valued for its usability and readability:
its syntax is straightforward and quick to learn, making it a good choice for novices.

Python is a popular choice for scripting and automation because of its versatility.
Everything from small projects like automating email responses to more involved ones
like web scraping, data analysis, and even creating intricate software applications can be
done with it.

II. Tkinter
Tkinter is a standard Python GUI (Graphical User Interface) library that provides a set of
tools and widgets to create desktop applications with graphical interfaces. Tkinter is
included with most Python installations, making it easily accessible for developers who
want to build GUI applications without requiring additional installations or libraries.

Tkinter is the built-in Python module that is used to create GUI applications. It is one of
the most commonly used modules for creating GUI applications in Python as it is simple
and easy to work with.

III. pyttsx3
pyttsx3 is a text-to-speech conversion library for Python. Unlike many alternative libraries,
it works offline and is compatible with both Python 2 and 3. An application invokes the
pyttsx3.init() factory function to get a reference to a pyttsx3.Engine instance, which
converts the entered text into speech and is very easy to use. On Windows, pyttsx3 uses the
“sapi5” driver and exposes the voices installed on the system; this project offers two of
them, one female and one male.

3. INTRODUCTION TO PYTHON
Python is a high-level, interpreted programming language known for its readability and
simplicity. It was created by Guido van Rossum and first released in 1991. Python's design
emphasizes code readability with its use of significant whitespace, making it easier to
understand and write code.
Here are some key features of Python:
1. Readable and Simple Syntax: Python's syntax is designed to be easy to read and write,
which helps both beginners and experienced programmers write clean and maintainable code.
2. Interpreted Language: Python code is executed line-by-line, which allows for interactive
programming and easier debugging.
3. Dynamic Typing: Variables in Python do not need explicit type definitions; the type is
determined at runtime.
4. Extensive Standard Library: Python comes with a large standard library that supports many
common programming tasks, such as connecting to web servers, reading and writing files,
and handling data.
5. Versatility: Python is used in various domains, including web development, data analysis,
artificial intelligence, scientific computing, automation, and more.
6. Community and Ecosystem: Python has a strong community and a rich ecosystem of third-
party libraries and frameworks, such as Django for web development, Pandas for data
analysis, and TensorFlow for machine learning.
Overall, Python is popular for its ease of learning and its wide range of applications.
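
Two of the features listed above, dynamic typing and the standard library, can be illustrated with a short sketch. The variable names and the JSON string are made up for illustration.

```python
import json

value = 42                       # the name is bound to an int
print(type(value).__name__)      # int
value = "forty-two"              # the same name now refers to a str
print(type(value).__name__)      # str

# The standard library handles common tasks such as parsing JSON data.
record = json.loads('{"language": "Python", "first_release": 1991}')
print(record["first_release"])   # 1991
```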

What can Python do?

• Python can be used on a server to create web applications.
• Python can be used alongside software to create workflows.
• Python can connect to database systems. It can also read and modify files.
• Python can be used to handle big data and perform complex mathematics.
• Python can be used for rapid prototyping, or for production-ready software
development.

Why Python?

• Python works on different platforms (Windows, Mac, Linux, Raspberry Pi, etc).
• Python has a simple syntax similar to the English language.
• Python has syntax that allows developers to write programs with fewer lines than
some other programming languages.
• Python runs on an interpreter system, meaning that code can be executed as soon as it
is written. This means that prototyping can be very quick.
• Python can be treated in a procedural way, an object-oriented way or a functional
way.

Python Syntax compared to other programming languages

• Python was designed for readability, and has some similarities to the English
language with influence from mathematics.
• Python uses new lines to complete a command, as opposed to other programming
languages which often use semicolons or parentheses.
• Python relies on indentation, using whitespace, to define scope; such as the scope of
loops, functions and classes. Other programming languages often use curly-brackets
for this purpose.

Fig no 1: Code Snippet
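
The original figure is not reproduced here. As an illustrative stand-in (not the figure itself), the short sketch below shows the syntax points made above: each statement ends at the line break, and indentation alone marks the body of the function, the loop, and the branches. The function name `classify` is made up for this example.

```python
def classify(numbers):
    """Label each number as even or odd."""
    labels = []
    for n in numbers:              # the loop body is the indented block below
        if n % 2 == 0:             # no braces: indentation marks the scope
            labels.append("even")
        else:
            labels.append("odd")
    return labels                  # dedenting ends the loop body


print(classify([1, 2, 3]))         # ['odd', 'even', 'odd']
```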

4. INTRODUCTION TO TKINTER
Tkinter is a standard Python GUI (Graphical User Interface) library that provides a set of
tools and widgets to create desktop applications with graphical interfaces. Tkinter is included
with most Python installations, making it easily accessible for developers who want to build
GUI applications without requiring additional installations or libraries.

Full Form of Tkinter


The name “Tkinter” comes from “Tk interface”, referring to the Tk GUI toolkit that Tkinter
is based on. Tkinter provides a way to create windows, buttons, labels, text boxes, and other
GUI components to build interactive applications.

Significance of Tkinter
Tkinter is the built-in Python module that is used to create GUI applications. It is one of the
most commonly used modules for creating GUI applications in Python as it is simple and
easy to work with. You don’t need to worry about the installation of the Tkinter module
separately as it comes with Python already. It gives an object-oriented interface to the Tk
GUI toolkit. Of the GUI options available for Python, Tkinter is among the most widely used.

Where is Python Tkinter used?


Here are some common use cases for Tkinter:

• Creating windows and dialog boxes: Tkinter can be used to create windows and dialog
boxes that allow users to interact with your program. These can be used to display
information, gather input, or present options to the user.
• Building a GUI for a desktop application: Tkinter can be used to create the interface for a
desktop application, including buttons, menus, and other interactive elements.
• Adding a GUI to a command-line program: Tkinter can be used to add a GUI to a
command-line program, making it easier for users to interact with the program and
input arguments.
• Creating custom widgets: Tkinter includes a variety of built-in widgets, such as buttons,
labels, and text boxes, but it also allows you to create your own custom widgets.
• Prototyping a GUI: Tkinter can be used to quickly prototype a GUI, allowing you to test
and iterate on different design ideas before committing to a final implementation.
In summary, Tkinter is a useful tool for creating a wide variety of graphical user interfaces,
including windows, dialog boxes, and custom widgets. It is particularly well-suited for
building desktop applications and adding a GUI to command-line programs.

To create a Tkinter Python app, you follow these basic steps:

• Import the tkinter module: This is done just like importing any other module in Python.
Note that in Python 2.x, the module is named ‘Tkinter’, while in Python 3.x, it is
named ‘tkinter’.
• Create the main window (container): The main window serves as the container for all the
GUI elements you’ll add later.
• Add widgets to the main window: You can add any number of widgets like buttons,
labels, entry fields, etc., to the main window to design the interface as desired.
• Apply event triggers to the widgets: You can attach event triggers to the widgets to
define how they respond to user interactions.

Fig no 2 : Fundamental structure of Tkinter program
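
The four steps listed above can be sketched as below. This is a minimal illustration, not the figure from the report; the names `make_greeting` and `main` and the widget texts are made up. The import sits inside `main()` only so that the helper can be used on a machine without a display; in a normal program it would go at the top of the file.

```python
def make_greeting(name):
    """Pure helper so the button logic can be checked without a window."""
    name = name.strip()
    return f"Hello, {name}!" if name else "Hello!"


def main():
    import tkinter as tk               # step 1: import the module
    from tkinter import ttk

    root = tk.Tk()                     # step 2: create the main window
    root.title("Tkinter structure")

    entry = ttk.Entry(root)            # step 3: add widgets to the window
    entry.pack(padx=10, pady=5)
    label = ttk.Label(root, text="")
    label.pack(padx=10, pady=5)

    def on_click():                    # step 4: attach an event handler
        label.config(text=make_greeting(entry.get()))

    ttk.Button(root, text="Greet", command=on_click).pack(pady=5)
    root.mainloop()                    # hand control to the Tk event loop
```

Calling `main()` opens the window; it returns when the window is closed.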

ARCHITECTURE

Tcl/Tk is not a single library but rather consists of a few distinct modules, each with separate
functionality and its own official documentation. Python’s binary releases also ship an add-on
module together with it.

Tcl
Tcl is a dynamic interpreted programming language, just like Python. Though it can
be used on its own as a general-purpose programming language, it is most commonly
embedded into C applications as a scripting engine or an interface to the Tk toolkit.
The Tcl library has a C interface to create and manage one or more instances of a Tcl
interpreter, run Tcl commands and scripts in those instances, and add custom
commands implemented in either Tcl or C. Each interpreter has an event queue, and
there are facilities to send events to it and process them. Unlike Python, Tcl’s
execution model is designed around cooperative multitasking, and Tkinter bridges this
difference.

Tk
Tk is a Tcl package implemented in C that adds custom commands to create and
manipulate GUI widgets. Each Tk object embeds its own Tcl interpreter instance with
Tk loaded into it. Tk’s widgets are very customizable, though at the cost of a dated
appearance. Tk uses Tcl’s event queue to generate and process GUI events.

Ttk
Themed Tk (Ttk) is a newer family of Tk widgets that provide a much better
appearance on different platforms than many of the classic Tk widgets. Ttk is
distributed as part of Tk, starting with Tk version 8.5. Python bindings are provided in
a separate module, tkinter.ttk.

Internally, Tk and Ttk use facilities of the underlying operating system, i.e., Xlib on
Unix/X11, Cocoa on macOS, GDI on Windows.

When your Python application uses a class in Tkinter, e.g., to create a widget,
the tkinter module first assembles a Tcl/Tk command string. It passes that Tcl command
string to an internal _tkinter binary module, which then calls the Tcl interpreter to evaluate it.
The Tcl interpreter will then call into the Tk and/or Ttk packages, which will in turn make
calls to Xlib, Cocoa, or GDI.
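
The layering described above can be observed directly from Python. The sketch below uses `tkinter.Tcl()`, which creates a Tcl interpreter without loading Tk, and passes it a Tcl command string to evaluate. The helper name `eval_tcl` is made up for illustration, and the import is kept inside the function only so the sketch is self-contained.

```python
def eval_tcl(script):
    """Evaluate a Tcl script in a bare interpreter and return the result string."""
    import tkinter
    interp = tkinter.Tcl()       # Tcl interpreter only; Tk is not loaded, so no window
    return interp.eval(script)   # the string is handed to the Tcl interpreter
```

For example, `eval_tcl("expr {2 + 3}")` asks Tcl to evaluate an arithmetic expression and returns the result as a string. Widget classes work along the same lines, except that the command strings they assemble are Tk commands.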

Fig no 3: Architecture of tkinter

5. INTRODUCTION TO PYTTSX3
pyttsx3 is a text-to-speech conversion library for Python. Unlike many alternative libraries,
it works offline and is compatible with both Python 2 and 3. An application invokes the
pyttsx3.init() factory function to get a reference to a pyttsx3.Engine instance, which
converts the entered text into speech and is very easy to use. On Windows, pyttsx3 uses the
“sapi5” driver and exposes the voices installed on the system; this project offers two of
them, one female and one male.
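
Voice selection can be sketched as follows. The helper name `choose_voice` is hypothetical; `getProperty("voices")` and `setProperty("voice", ...)` are the pyttsx3 engine calls it relies on. Which list position holds the female or the male voice depends on the voices installed on the machine, so the index is left as a parameter rather than assumed.

```python
def choose_voice(engine, index):
    """Select one of the installed voices by position and return its id."""
    voices = engine.getProperty("voices")      # list of installed voice objects
    if not 0 <= index < len(voices):
        raise IndexError(f"voice index {index} out of range ({len(voices)} found)")
    engine.setProperty("voice", voices[index].id)
    return voices[index].id
```

With a real engine this would be used as `engine = pyttsx3.init()` followed by `choose_voice(engine, 0)` or `choose_voice(engine, 1)`.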

Fig no 4: pyttsx3 working

6. PROJECT

Before implementing the actual design of the project, a few user interface designs were
constructed to visualize the user interaction with the system.

Fig no 5: Front page

The above figure depicts the front interface of the Text to Speech application. On this
screen, users will see a welcome message and a brief description of the application's
functionality, which is a text-to-speech converter. Below this, there's a prominent button
labelled "Go to Text to Speech," inviting the user to start the application. By clicking the
button, users are taken to the main interface where they can enter text, select a voice, and
have the text spoken aloud. The interface is designed to be simple and user-friendly.
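
The report does not say how the switch from the front page to the main interface is implemented. One common Tkinter approach, shown here purely as an assumption, is to stack both pages as frames in the same grid cell and raise the one that should be visible. The names `show_page` and `main` and the label texts are illustrative.

```python
def show_page(pages, name):
    """Raise the frame registered under `name` above the other pages."""
    pages[name].tkraise()
    return name


def main():
    import tkinter as tk

    root = tk.Tk()
    pages = {}
    for name in ("home", "tts"):
        frame = tk.Frame(root)
        frame.grid(row=0, column=0, sticky="nsew")   # stack frames in one cell
        pages[name] = frame

    tk.Label(pages["home"], text="Welcome to the Text to Speech converter").pack(pady=10)
    tk.Button(pages["home"], text="Go to Text to Speech",
              command=lambda: show_page(pages, "tts")).pack(pady=10)
    tk.Label(pages["tts"], text="Enter the text to speak").pack(pady=10)

    show_page(pages, "home")                         # start on the front page
    root.mainloop()
```

Calling `main()` opens the window on the front page; the button raises the second frame.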

Fig no 6: Hover effect

The above figure demonstrates the hover effect applied when the user moves the cursor over
the button in the application. As seen, the button's background colour changes from its
default colour to orange, providing a visual cue that the button is interactive and ready to be
clicked. This effect enhances the user experience by making the interface more responsive
and engaging. The hover effect is designed to make it clear when the button is active,
improving the overall usability of the application. This feature is implemented to ensure that
users have a smooth and intuitive interaction with the interface.
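
A hover effect of this kind is usually built from Tk's `<Enter>` and `<Leave>` events. The sketch below is one way to do it and is not taken from the project's source; the helper name `add_hover` is made up, and it assumes a classic `tk.Button`, which accepts a `background` option (a `ttk.Button` is styled differently).

```python
def add_hover(widget, normal_bg, hover_bg="orange"):
    """Swap the widget's background while the pointer is over it."""
    def on_enter(_event):
        widget.configure(background=hover_bg)

    def on_leave(_event):
        widget.configure(background=normal_bg)

    widget.bind("<Enter>", on_enter)    # pointer moves onto the widget
    widget.bind("<Leave>", on_leave)    # pointer moves off the widget
    return on_enter, on_leave
```

For a button created as `tk.Button(root, text="Go to Text to Speech", background="pink")`, the call would be `add_hover(button, "pink")`.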

Fig no 7: User Interface for Text Entry

The displayed interface is the user-friendly section of the Text to Speech application, where
users can input the text they wish to convert to speech. This interface is designed to be
intuitive, making it easy for users to interact with the application. In addition to the text entry
field, the interface offers two options for selecting the voice of the virtual assistant. These
options allow users to customize the voice output, choosing between different voice types
based on their preferences. The selectable voices enhance the flexibility of the application,
catering to diverse user needs and preferences. Once the text is entered and the desired voice
is selected, users can click the "Speak" button to hear the text read aloud in the chosen voice.
This setup ensures a seamless and personalized experience, enabling users to easily convert
written text into spoken words with just a few clicks, all within a visually appealing and
functional interface.

Fig no 8: User Interface after Entering the Text

This interface showcases the hover effect that occurs when the user interacts with the "Speak"
button after entering text into the provided text box. When the user hovers over the button, its
colour changes, indicating that the button is active and ready to be clicked. This visual
feedback enhances the user experience by clearly signalling that the button will execute the
command to convert the entered text to speech.

After the user clicks the button, the application reads aloud the text that was input in the
text box, using the voice selected by the user. The ability to choose between different voices
allows users to personalize the output, making the interaction more engaging and tailored to
their preferences. The hover effect not only improves the usability of the interface but also
adds a layer of interactivity, ensuring that users can easily convert their written text into
spoken words in the synthesized voice of their choice.

7. LIMITATIONS AND FUTURE DEVELOPMENT

The current system has some limitations, which future development could address:

1. Limited Voice Options


o Currently, the application offers only two voice choices for text-to-speech
conversion. This limitation restricts users who may prefer a wider variety of
voices, accents, or languages to suit their specific needs or preferences.

2. Lack of Language Support


o The application is confined to converting text in a single language, typically
English. This restricts non-English speakers or those who wish to convert text
in other languages, limiting the application's global usability and accessibility.

3. No Control Over Speech Parameters


o Users do not have the ability to adjust speech parameters such as speed, pitch,
and volume within the application. This lack of customization can affect the
clarity and comfort of the audio output, especially for users with specific
auditory requirements.

4. Inability to Save Audio Output


o The application does not provide an option to save the generated speech as
audio files (e.g., MP3 or WAV). This limits the utility of the application for

14
users who wish to store, share, or use the audio output for offline purposes or
in other multimedia projects.

5. Basic Error Handling and Input Validation


o There is minimal error handling implemented in the current version. The
application may not appropriately handle scenarios such as empty text input,
unsupported characters, or runtime errors, which can lead to crashes or
unexpected behavior, affecting user experience and reliability.

The following future developments could address these limitations:

1. Expansion of Voice and Language Options


o Integrate additional voices covering various accents, as well as support for
multiple languages. This enhancement would cater to a broader audience and
make the application more versatile and inclusive.
2. Implementation of Adjustable Speech Parameters
o Introduce controls allowing users to modify speech speed, pitch, and volume
according to their preferences. This feature would enhance the flexibility and
usability of the application, making it suitable for diverse contexts and user
needs.
3. Enhanced User Interface and Design
o Improve the GUI by incorporating modern design principles, responsive
layouts, and customizable themes. This would make the application more
aesthetically pleasing and user-friendly, providing a better overall user
experience.
4. Integration of Text Input from External Sources
o Enable users to import text from external sources such as text files, PDFs, or
clipboard content. This feature would streamline the workflow for users
dealing with large amounts of text and enhance the application's efficiency.
5. Cross-Platform Compatibility
o Adapt the application to be compatible across different operating systems such
as macOS and Linux, in addition to Windows. Ensuring cross-platform
support would broaden the user base and make the application accessible to a
wider audience.
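
Several of the items above (adjustable speech parameters, saving audio, and input validation) map onto calls that pyttsx3 already provides. The sketch below is a possible starting point under stated assumptions, not an implemented feature of the project: the helper names and default values are made up, pitch is left out because the sketch relies only on the `rate` and `volume` properties, and the file format written by `save_to_file` depends on the speech driver in use.

```python
def configure_speech(engine, rate=150, volume=0.8):
    """Set speaking rate (words per minute) and volume (0.0 to 1.0)."""
    if not 0.0 <= volume <= 1.0:
        raise ValueError("volume must be between 0.0 and 1.0")
    engine.setProperty("rate", rate)
    engine.setProperty("volume", volume)


def speak_or_save(engine, text, filename=None):
    """Speak validated text, or write it to `filename` when one is given."""
    text = text.strip()
    if not text:
        raise ValueError("nothing to speak: the input text is empty")
    if filename is None:
        engine.say(text)                      # queue for playback
    else:
        engine.save_to_file(text, filename)   # queue for writing to a file
    engine.runAndWait()                       # process the queued command
```

A GUI could catch the `ValueError` and show it in a message box instead of letting the application fail on empty input.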

8. CONCLUSION

The development of this Python-based desktop application, leveraging tkinter for the
graphical user interface (GUI) and pyttsx3 for text-to-speech functionality, marks a
significant achievement in delivering a practical and accessible tool for users with various
needs. The project focused on creating a user-friendly and intuitive interface that caters to
both novice and experienced users. By emphasizing simplicity and clarity, the design ensures
that users can easily navigate the application, making it suitable for a wide audience,
including those with limited technical expertise.

The application’s functionality centers around converting text into speech, a task it performs
with notable efficiency and accuracy. The integration of pyttsx3 allows the application to
operate offline, providing a crucial advantage over many other text-to-speech solutions that
require an active internet connection. This offline capability enhances the application’s
versatility, making it useful in environments where connectivity might be an issue. The quick
and seamless conversion of text to speech ensures that the application is not only functional
but also reliable for daily use.

A standout feature of the application is its ability to customize the speech output, offering
users two different voices to choose from. This feature is particularly valuable for those who
require specific types of speech output, such as language learners who may benefit from a
particular voice for clarity, or individuals with disabilities who need tailored auditory support.
The ability to select a preferred voice adds a layer of personalization, making the application
more engaging and user-centered. This flexibility in voice selection demonstrates the
application’s commitment to catering to the diverse needs of its users.

Accessibility was a central consideration throughout the development of this project. The
application’s capacity to convert text into spoken words opens up possibilities for users with
visual impairments or reading difficulties, allowing them to access written content audibly.
The clean and simple GUI design further supports accessibility by ensuring that users with
varying levels of ability can interact with the application without encountering unnecessary
barriers. The hover effects on buttons, while aesthetically pleasing, also serve a practical
purpose by providing visual cues that guide users through the application’s features.

In conclusion, this project has successfully created a robust and user-friendly tool that
effectively meets the needs of a broad user base. The combination of an intuitive interface,
reliable text-to-speech functionality, and customizable features makes this application a
valuable resource for various users, from language learners to individuals with accessibility
needs. The project underscores the importance of user-centered design and demonstrates how
thoughtful integration of functionality and accessibility can result in a powerful and versatile
application. As it stands, the application provides a strong foundation for future
enhancements, such as expanding voice options or adding additional accessibility features,
which could further broaden its appeal and usability.

