
PROJECT REPORT

VIRTUAL ASSISTANT
SUBMITTED IN PARTIAL FULFILLMENT FOR AWARD
OF
Diploma
IN
COMPUTER ENGINEERING

DEPARTMENT OF COMPUTER
ENGINEERING
Seth Jai Parkash Polytechnic,
Damla, Haryana

Hardik (210150800120) Under the guidance of


Gautam (210150800119) Ms. Pooja Saini
Aditya (210150800102)
Shubham (21015822107)
Kamini (210150800127)

BONAFIDE CERTIFICATE

Certified that this project report titled “VIRTUAL
ASSISTANT” is the bona fide work of “HARDIK,
GAUTAM KAPOOR, ADITYA DANGWAL, SHUBHAM,
KAMINI” of Computer Engineering, who carried out this
work under my supervision.

SIGNATURE
Ms. Pooja Saini
Teaching Staff
Department of COE,
Seth Jai Parkash Polytechnic,
Damla, Haryana

Submitted for the project viva-voce examination held on

Internal Examiner External Examiner

Acknowledgement

We are always thankful to our college, Seth Jai Parkash
Polytechnic, which supported us throughout this project.

Our heartfelt thanks to HOD Dr. Karanbir Singh,
Department of Computer Engineering, S.J.P.P. College of
Diploma in Engineering, Damla, for the prompt and limitless
help in providing excellent infrastructure to do the project and
prepare the thesis.
We express our deep sense of gratitude to our guide, Ms.
Pooja Saini, Teaching Staff, Department of Computer
Engineering, for her invaluable support, guidance, and
encouragement toward the successful completion of this project.
Her vision and spirit always inspire and enlighten us.
We express our sincere thanks to the project committee
members, Mr. Amandeep Singh Gulati, Mr. Rohit
Mandhar, Ms. Sandeep Kaur, Ms. Shifa Ansari and Ms.
Richa Kharbanda, Teaching Staff, Department of Computer
Science and Engineering, Damla, for their invaluable guidance
and technical support.

We thank all faculty members of the Department of Computer
Science and Engineering for their valuable help and guidance.
We are grateful to our family and friends for their constant
support and encouragement. We thank the almighty, whose
shower of blessings made this project a reality.

ABSTRACT
The project aims to develop a personal virtual assistant for a
Windows-based system. Jarvis draws its inspiration from
virtual assistants like Cortana for Windows and Siri for iOS. It
has been designed to provide a user-friendly interface for
carrying out a variety of tasks by employing certain well-
defined commands. The user can interact with the assistant
either through a voice command or using keyboard input. As a
personal assistant, Jarvis assists the end user with day-to-day
activities like general human conversation, searching queries in
Google or Yahoo, searching for videos, playing songs, live
weather conditions, searching for medicine details, and
reminding the user about scheduled events and tasks. The
virtual assistant takes the user's voice through the microphone,
converts it into a computer-understandable form, and gives the
required solutions and answers to what the user asks. The
assistant connects with the internet to provide the results the
user has asked for. The project works on voice input, gives
output through voice, and displays the text on the screen. The
main agenda of our virtual assistant is to help people work
smarter by giving instant, computed results.

TABLE OF CONTENTS

Chapter No.  Title  Page No.

Abstract
List of Figures
List of Abbreviations
1. Introduction 11-14

1.1 Objective of the project


1.2 Motivation
1.3 Scope of the project
1.4 Applicability
2. Preliminaries 15-17

2.1 Existing System


2.1.1 Disadvantages
2.2 Proposed System
2.2.1 Advantages
3. Literature Survey 18-19
4. System Specification 20

4.1 Hardware Requirements


4.2 Software Requirements
5. System Design 21-31

5.1 Architecture Diagram


5.2 UML Diagram
5.2.1 Use Case Diagram
5.2.2 Class Diagram
5.2.3 Activity Diagram
5.2.4 Sequence Diagram
6. System Implementation 32-36

6.1 Modules
6.2 Modules Description
7. System Study 37-38

7.1 Feasibility Study
7.1.1 Economical Feasibility
7.1.2 Technical Feasibility
7.1.3 Social Feasibility
8. System Testing 39-41

8.1 Testing
8.2 Types of Testing
8.2.1 Unit Testing
8.2.2 Integration Testing
8.2.3 Functional Testing
8.2.4 System Testing
8.2.5 White Box Testing
8.2.6 Black Box Testing
9. Appendix-1 Source Code 42-63
Appendix-2 Snapshots
10. Conclusion & Future Work 64-65

10.1 Conclusion
10.2 Future Work

Reference

List of Figures

Figure No.  Name of the Figure
1.1         Virtual Assistant
5.1         System Architecture
5.2         Use Case Diagram
5.3         Class Diagram
5.4         Activity Diagram
5.5 (a)     Sequence Diagram for Query-Response
5.5 (b)     Sequence Diagram for Task Execution

LIST OF ABBREVIATIONS

ACRONYM  ABBREVIATION
UML      UNIFIED MODELING LANGUAGE
UI       USER INTERFACE
NLP      NATURAL LANGUAGE PROCESSING
API      APPLICATION PROGRAMMING INTERFACE
OS       OPERATING SYSTEM
GUI      GRAPHICAL USER INTERFACE
AI       ARTIFICIAL INTELLIGENCE
IoT      INTERNET OF THINGS
SAPI     SPEECH APPLICATION PROGRAMMING INTERFACE

Chapter -1
Introduction

1.1 OBJECTIVE OF THE PROJECT

The development of technology allows for the introduction of
more advanced solutions in everyday life. This makes work less
exhausting for employees and also increases work safety. As
technology is developing day by day, people are becoming
more dependent on it. One of the most commonly used
platforms is a computer. We all want to make the most of these
computers by using the keyboard, but a more convenient way
is to input commands through voice. Giving input through
voice is not only beneficial for normal people but also for those
who are visually impaired and are not able to give the input by
using a keyboard. For this purpose, there is a need for a virtual
assistant that can not only take commands through voice but
also execute the desired instruction and give output either in the
form of voice or any other means.
A virtual assistant is software that can perform tasks and
provide different services to an individual as per the
individual’s dictated commands. It does this through a
synchronous process involving the recognition of speech
patterns and then responding via synthetic speech. Through
these assistants, a user can automate tasks ranging from, but not
limited to, mailing, task management, and media playback. A
virtual assistant is typically a cloud-based program that requires
an internet-connected device and/or application to work. The
technologies that power virtual assistants are machine learning,
natural language processing, and speech recognition. They use
sophisticated algorithms to learn from data input and become
better at predicting the end user’s needs.

1.2 MOTIVATION

The main purpose of this project is to build a program that can
serve humans like a personal assistant. This is an interesting
concept, and many people around the globe are working on it.
Today, time and security are the two things people are most
sensitive about: no one has time to waste, and nobody wants
their security breached. This project is mainly for those kinds
of people.
This system is designed to be used efficiently on desktops.
Virtual assistant software improves user productivity by
managing routine tasks for the user and by providing
information from an online source to the user. This project was
started on the premise that there is a sufficient amount of openly
available data and information on the web that can be utilized
to build a virtual assistant that has access to making intelligent
decisions for routine user activities.

1.3 SCOPE OF THE PROJECT

Virtual assistants will continue to offer more individualized
experiences as they get better at differentiating between voices.
However, it is not just developers who need to address the
complexity of developing for voice; brands also need to
understand the capabilities of each device and integration, and
whether it makes sense for their specific brand. They will also
need to focus on maintaining a consistent user experience in the
coming years as complexity becomes more of a concern. This
is because the visual interface with the virtual assistant is
missing: a user simply cannot see or touch a voice interface.
Virtual assistants are software programs that help you ease your
day-to-day tasks, such as showing weather reports, playing
music, etc.

1.4 APPLICABILITY
The mass adoption of artificial intelligence in users' everyday
lives is also fueling the shift towards voice. The number of
devices, such as smart thermostats and speakers, is giving voice
assistants more utility in a connected user's life. Smart speakers
are the number one way we are seeing voice being used.
Many industry experts even predict that nearly every
application will integrate voice technology in some way in the
next 5 years. The use of virtual assistants can also enhance the
IoT (Internet of Things) system. Twenty years from now,
Microsoft and its competitors will be offering personal digital
assistants that will offer the services of a full-time employee,
usually reserved for the rich and famous.

Figure 1.1 Virtual Assistant

CHAPTER 2
PRELIMINARIES

2.1 EXISTING SYSTEM

This project describes one of the most efficient approaches to
voice recognition. It overcomes many of the drawbacks of
existing solutions to make the Virtual Assistant more efficient.
It uses natural language processing to carry out the specified
tasks. It has various functionalities, like managing network
connections and activities through voice commands alone, and
it reduces the use of input devices like the keyboard.

This project describes a method to implement a desktop virtual
assistant using APIs. In this module, voice commands are
converted to text through the Google Speech API. The text
input is stored in a database for further processing, where it is
recognized and matched against the commands available in the
database. Once a command is found, its respective task is
executed, with output delivered as voice, text, or through the
user interface.

2.1.1 DISADVANTAGES

 They propose a new detection scheme that gets two
similar results, which could confuse the user when
deciding on the actual/desired output.

 Though the efficiency of the proposed module is high,
the time taken to complete each task is higher, and the
complexity of the algorithms would make it very tough
to tweak in the future if needed.

2.2 PROPOSED SYSTEM

1. QUERIES FROM THE WEB:

Making queries is an essential part of one's life. We have
addressed this essential part of a netizen's life by enabling our
voice assistant to search the web. The Virtual Assistant supports
a plethora of search engines, like Google, and displays the
result by scraping the searched queries.

2. ACCESSING NEWS:

Being up to date in this modern world is very important. News
plays a crucial role in keeping ourselves updated: it keeps you
informed and also helps in spreading knowledge.

3. TO SEARCH SOMETHING ON WIKIPEDIA:

Wikipedia's purpose is to benefit readers by acting as a widely
accessible and free encyclopaedia: a comprehensive written
compendium that contains information on all branches of
knowledge.

4. ACCESSING MUSIC PLAYLIST:


Music has remained a main source of entertainment and is one
of the most prioritized tasks of virtual assistants. You can play
any song of your choice, or play a random song with the help
of the random module. Every time you command it to play
music, the Virtual Assistant will play a random song from the
song directory.
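The random-song behaviour described above might be sketched as follows. This is a minimal sketch, not the project's actual code (the report's source simply plays the first file it finds); `pick_random_song` and the extension filter are illustrative.

```python
import os
import random

def pick_random_song(music_dir):
    """Return the full path of a random audio file in music_dir, or None.

    The assistant would then pass the returned path to os.startfile()
    (Windows) to actually play it.
    """
    songs = [f for f in os.listdir(music_dir) if f.endswith((".mp3", ".wav"))]
    if not songs:
        return None  # nothing playable in the directory
    return os.path.join(music_dir, random.choice(songs))
```

Because `random.choice` is used instead of indexing `songs[0]`, repeated "play music" commands cycle through the library unpredictably rather than always starting the same track.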

5. OPENING CODE EDITOR:


Virtual Assistant is capable of opening your code editor or IDE
with a single voice command.

2.2.1 ADVANTAGES

 Platform independence
 Increased flexibility
 Saves time by automating repetitive tasks
 Accessibility options for mobility- and visually-impaired
users
 Reducing our dependence on screens
 Adding personality to our daily lives
 More human touch
 Coordination of IoT devices
 Accessible and inclusive
CHAPTER 3
Literature Survey

3.1 "VOICE ASSISTANT USING PYTHON"


YEAR: 2020
AUTHORS: Subhash Mani Kaushal, Megha Mishra
CONCEPT: Natural Language Processing.

3.2 "DESKTOP VOICE ASSISTANT"


YEAR: 2020
AUTHORS: Gaurav Agarwal, Harsha Gupta, Chinmay Jain
CONCEPT: Prerequisite APIs For Virtual Assistant.

3.3 "SMART PYTHON CODING THROUGH VOICE
RECOGNITION"
YEAR: 2019
AUTHORS: M. A. Jawale, A. B. Pawar, D. N. Kyatanavar
CONCEPT: User experience improvements for a better
programming Integrated Development Environment (IDE).
3.4 "VPA: VIRTUAL PERSONAL ASSISTANT"
YEAR: 2018
AUTHORS: Nikita Saibewar, Yash Shah, Monika Das
CONCEPT: Implementation of Functionalities of VPA.

3.5 "SURVEY ON VIRTUAL ASSISTANT"
YEAR: 2018
AUTHORS: Amrita Sunil Tulshan, Sudhir Namdeorao
CONCEPT: Functionalities of Existing IVAS.

CHAPTER – 4
SYSTEM SPECIFICATION

4.1 HARDWARE REQUIREMENTS

 Processor - Intel Pentium 4
 RAM - 512 MB
 Hard disk capacity - 80 GB
 Monitor type - 15-inch colour monitor
 CD-drive type - 52x max
 Mouse
 Microphone
 Personal Computer / Laptop

4.2 SOFTWARE REQUIREMENTS

 Operating System - Windows
 Simulation Tools - Visual Studio Code
 Python - Version 3.9.6
 Packages
1. Pyttsx3
2. SpeechRecognition
3. Wikipedia
CHAPTER – 5
SYSTEM DESIGN

An architectural diagram is a diagram of a system that is used
to abstract the overall outline of the software system and the
relationships, constraints, and boundaries between components.
It is an important tool, as it provides an overall view of the
physical deployment of the software system and its evolution
roadmap. An architecture description is a formal description
and representation of a system, organized in a way that supports
reasoning about the structures and behaviours of the system.
After going through the above process, we have successfully
enabled the model to understand the features.

Figure 5.1 System Architecture Diagram

5.2 UML DIAGRAM
The Unified Modeling Language is a general-purpose,
developmental modeling language in the field of software
engineering that is intended to provide a standard way to
visualize the design of a system.
A UML diagram is a diagram based on the UML (Unified
Modeling Language) with the purpose of visually representing
a system along with its main actors, roles, actions, artifacts, or
classes in order to better understand, alter, maintain, or
document information about the system.

UML defines several kinds of models and diagrams for
representing systems.


ADVANTAGES

 Most used and flexible
 Development time is reduced
 Provides a standard for software development
 It has a large set of visual elements to construct with and
is easy to follow

5.2.1 USE CASE DIAGRAM

In UML, use-case diagrams model the behaviour of a system
and help to capture the requirements of the system. Use-case
diagrams describe the high-level functions and scope of a
system. These diagrams also identify the interactions between
the system and its actors. In this project there is only one user.
The user issues a command to the system; the system then
interprets it and fetches the answer. The response is sent back
to the user.

Figure 5.2 Use Case Diagram

5.2.2 CLASS DIAGRAM

A class diagram is a static diagram: it represents the static view
of an application. The class diagram is used not only for
visualizing, describing, and documenting different aspects of a
system but also for constructing executable code of the
software application.
The User class has two attributes: the command it sends as
audio and the response it receives, which is also audio. It
performs a function to listen to the user's command, interpret
it, and then reply or send back a response accordingly. The
Question class holds the command in string form, as it is
interpreted by the Interpret class, and passes it to the general,
about, or search function based on its identification. The Task
class also holds the interpreted command in string format.
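The classes described above can be sketched as plain Python classes. This is an illustrative sketch of the design, not the project's actual definitions; the attribute and method names (`category`, `listen`) are assumptions made for the example.

```python
class Question:
    """Holds a command in string form after interpretation."""
    def __init__(self, text: str):
        self.text = text

    def category(self) -> str:
        # Route to the general, about, or search handler based on keywords.
        if self.text.startswith("who are you"):
            return "about"
        if self.text.startswith("search"):
            return "search"
        return "general"


class Task:
    """Holds an interpreted command that maps to an action to execute."""
    def __init__(self, text: str):
        self.text = text


class User:
    """Sends commands (audio in) and receives responses (audio out)."""
    def __init__(self):
        self.command = None
        self.response = None

    def listen(self, audio_text: str) -> Question:
        # Stand-in for microphone capture: treat raw input as a string.
        self.command = audio_text
        return Question(audio_text.lower())
```

Separating Question from Task mirrors the diagram's split between queries that fetch an answer and commands that perform an action.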

Figure 5.3 Class Diagram

5.2.3 ACTIVITY DIAGRAM

An activity diagram is a behavioural diagram: it depicts the
behaviour of a system. An activity diagram portrays the control
flow from a start point to a finish point, showing the various
decision paths that exist while the activity is being executed.

Initially, the system is in idle mode. When it receives a wake-up
call, it begins execution. The received command is identified as
either a question or a task to be performed, and the appropriate
action is taken. After the question is answered or the task is
performed, the system waits for another command. This loop
continues until it receives a quit command.
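The idle/identify/execute loop described above can be sketched as follows. This is a minimal, scripted stand-in (commands come from a list rather than the microphone), and the question-vs-task test is an illustrative simplification.

```python
def run_assistant(commands):
    """Process a stream of commands until 'quit' is received.

    Returns a log of the actions taken, so the control flow of the
    activity diagram can be inspected.
    """
    log = []
    for command in commands:
        command = command.lower().strip()
        if command == "quit":
            log.append("shutting down")
            break  # leave the listen loop
        elif command.endswith("?"):
            # Identified as a question: fetch and speak an answer.
            log.append(f"answering question: {command}")
        else:
            # Identified as a task: perform the requested action.
            log.append(f"performing task: {command}")
    return log
```

Note how the loop returns to waiting after every command; only the quit command exits, exactly as the diagram's final node describes.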

Figure 5.4 Activity Diagram

5.2.4 SEQUENCE DIAGRAM

A sequence diagram is a Unified Modeling Language (UML)
diagram that illustrates the sequence of messages between
objects in an interaction. A sequence diagram consists of a
group of objects that are represented by lifelines, and the
messages that they exchange over time during the interaction.

The sequence diagram below shows how the answer to a user's
question is fetched from the internet. The audio query is
interpreted and sent to the web scraper, which searches for and
finds the answer. The answer is then sent to the speaker, which
speaks it to the user.

Figure 5.5 (a) Sequence Diagram for Query-Response

The user sends a command to the virtual assistant in audio
form. The command is passed to the interpreter, which
identifies what the user has asked and directs it to the task
executor. If the task is missing some information, the virtual
assistant asks for it. The received information is sent back to
the task, which is then accomplished. After execution, feedback
is sent back to the user.

Figure 5.5 (b) Sequence Diagram for Task Execution

CHAPTER – 6
SYSTEM IMPLEMENTATION

6.1 MODULES

pyaudio, speech_recognition, pyttsx3, datetime, wikipedia,
webbrowser, os, smtplib, requests, json, time, tkinter
from tkinter: messagebox, simpledialog, Label, filedialog,
scrolledtext, Menu
from PIL: Image, ImageTk
threading
google.generativeai
from transformers: pipeline
from tkinter.font: Font
from youtubesearchpython: VideosSearch

6.2 MODULE DESCRIPTION

6.2.1 import pyaudio


PyAudio is a Python library that provides bindings for
PortAudio, a cross-platform audio library. It allows users to
record and play audio in real-time with a simple and easy-to-
use interface. PyAudio supports various audio functionalities,
including input and output streams, and is commonly used for
audio processing and analysis tasks.
6.2.2 import speech_recognition as sr

speech_recognition is a Python library that allows for:


 Converting spoken language into written text.
 Supporting multiple speech recognition engines and APIs
like Google Web Speech API, Microsoft Bing Voice
Recognition, and more.
 Capturing audio input from microphones and recognizing
speech in real-time or from audio files.

6.2.3 import pyttsx3


pyttsx3 is a Python library that provides text-to-speech
conversion using various speech engines. It supports both
offline and online text-to-speech services, and it allows you to
adjust speech rate, volume, and voice properties. It works on
multiple platforms including Windows, macOS, and Linux.
6.2.4 import datetime

datetime is a built-in Python module that supplies classes for
manipulating dates and times. It allows you to perform a variety
of operations, such as getting the current date and time,
formatting and parsing dates, calculating time differences, and
more.
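For instance, this project's `wishMe()` function uses the current hour to pick a greeting, and the time query uses `strftime`. A small sketch (the `greeting_for` helper is illustrative, factored out of the logic shown in Appendix-1):

```python
import datetime

def greeting_for(hour: int) -> str:
    """Map an hour (0-23) to the salutation the assistant speaks."""
    if 0 <= hour < 12:
        return "Good Morning!"
    elif 12 <= hour < 18:
        return "Good Afternoon!"
    else:
        return "Good Evening!"

now = datetime.datetime.now()
print(greeting_for(now.hour))
print(now.strftime("%H:%M:%S"))  # the format used for the "the time" query
```

`strftime("%H:%M:%S")` always yields a fixed-width 24-hour timestamp, which is convenient to speak aloud or display in the GUI.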
6.2.5 import wikipedia
wikipedia is a Python library that allows users to access and
interact with the data on Wikipedia. It provides functionalities
to search for articles, retrieve article summaries, full article
content, and other meta-data such as page links and categories.
6.2.6 import webbrowser
A built-in Python module that provides a high-level interface
to allow displaying web-based documents to users. It can open
URLs in a web browser.

6.2.7 import os
A built-in Python module that provides a way to interact with
the operating system. It offers functionalities for file and
directory manipulation, environment variables, and more.
6.2.8 import openai
A Python library for interacting with OpenAI's API, enabling
tasks such as natural language processing, generation, and other
AI functionalities.
6.2.9 import smtplib
A built-in Python module for sending emails using the Simple
Mail Transfer Protocol (SMTP).
6.2.10 import requests
A Python library for making HTTP requests in a simple and
human-friendly way. It supports various HTTP methods like
GET, POST, PUT, DELETE, and more.
6.2.11 import json
A built-in Python module for parsing JSON (JavaScript Object
Notation) data and converting Python data structures to JSON
strings.
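A sketch of the JSON parsing this project performs on API responses. The payload below is a made-up fragment shaped like an OpenWeatherMap reply, not real API output:

```python
import json

# A hypothetical response fragment in the shape the weather handler expects.
payload = '{"main": {"temp": 28.4}, "weather": [{"description": "clear sky"}]}'

data = json.loads(payload)               # JSON text -> Python dict
temp = data["main"]["temp"]              # nested lookup, like the report's code
desc = data["weather"][0]["description"]
round_trip = json.dumps(data)            # Python dict -> JSON text again
```

In practice the project gets the dict directly via `response.json()` from requests, which wraps exactly this `json.loads` step.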
6.2.12 import time
A built-in Python module that provides various time-related
functions, such as getting the current time, sleeping for a
specified number of seconds, and measuring intervals.
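A minimal illustration of the two time functions the assistant relies on, pausing and measuring an interval:

```python
import time

start = time.time()   # seconds since the epoch, as a float
time.sleep(0.1)       # pause execution for at least 0.1 seconds
elapsed = time.time() - start  # measured interval
```

`time.sleep` guarantees a pause of at least the requested duration, so `elapsed` will be 0.1 seconds or slightly more.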
6.2.13 import tkinter as tk
from tkinter import messagebox, simpledialog, Label,
PhotoImage

tkinter (tk): A built-in Python library for creating graphical
user interfaces (GUIs).
 messagebox: A module in tkinter that provides standard
pop-up message dialogs.
 simpledialog: A module in tkinter that provides simple
dialogs to prompt the user for input.
 Label: A widget in tkinter used to display text or images.
 PhotoImage: A class in tkinter used to display images in
supported formats like GIF, PGM, and PPM.

6.2.14 from PIL import Image, ImageTk

import threading
PIL (Pillow): The Python Imaging Library (PIL) fork, Pillow,
adds powerful image processing capabilities.
 Image: A module in Pillow for opening, manipulating,
and saving various image file formats.
 ImageTk: A module in Pillow for using PIL images with
tkinter applications.
threading: A built-in Python module for creating, controlling,
and managing threads, enabling concurrent execution of code.
6.2.15 import threading
Threading can be used to run multiple tasks concurrently. For
example, you can use threading to run multiple YouTube
searches simultaneously or process data while performing other
tasks.
6.2.16 import google.generativeai as genai
To use Google Generative AI, you'll need the google-
generativeai library. Ensure you have the required API keys and
setup.
6.2.17 from transformers import pipeline
The transformers library by Hugging Face allows you to use
various pre-trained AI models.
6.2.18 from tkinter.font import Font
Using Tkinter, you can create GUI applications and customize
fonts.

CHAPTER – 7
SYSTEM STUDY

7.1 FEASIBILITY STUDY

A feasibility study can help you determine whether or not you
should proceed with your project. It is essential to evaluate the
cost and benefit of the proposed system.
Three key considerations involved in the feasibility analysis
are:
• Economical feasibility
• Technical feasibility
• Social feasibility
7.1.1 ECONOMICAL FEASIBILITY
Here, we compare the total cost and benefit of the proposed
system against the current system. For this project, the main
cost is the documentation cost. Users would also have to pay
for microphones and speakers, but these are cheap and readily
available.
7.1.2 TECHNICAL FEASIBILITY
This includes identifying the technologies for the project, both
hardware and software. For a virtual assistant, the user must
have a microphone to convey their message and a speaker to
listen to what the system speaks. These are very cheap
nowadays, and most people already possess them.
7.1.3 SOCIAL FEASIBILITY
This aspect of the study checks the level of acceptance of the
system by the user. This includes the process of training the
user to use the system efficiently. The user must not feel
threatened by the system, but instead accept it as a necessity.
The level of acceptance by the users depends solely on the
methods employed to educate users about the system and make
them familiar with it. Their confidence must be raised so that
they can also offer constructive criticism, which is welcomed,
as they are the final users of the system.

CHAPTER 8
SYSTEM TESTING

8.1 TESTING
The purpose of testing is to discover errors. Testing is the
process of trying to discover every conceivable fault or
weakness in a work product. It provides a way to check the
functionality of components, sub-assemblies, assemblies,
and/or a finished product. It is the process of exercising
software with the intent of ensuring that the software system
meets its requirements and user expectations and does not fail
in an unacceptable manner. There are various types of tests;
each test type addresses a specific testing requirement.
8.2 TYPES OF TESTING

8.2.1 UNIT TESTING


Unit testing involves the design of test cases that validate that
the internal program logic is functioning properly, and that
program inputs produce valid outputs. It is the testing of
individual software units of the application. It is done after the
completion of an individual unit, before integration. This is
structural testing that relies on knowledge of the unit's
construction and is invasive. Unit tests perform basic tests at
component level and test a specific business process,
application, and/or system configuration.
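For this project, a unit test might validate the query-cleaning logic in isolation, before it is wired to the microphone or Wikipedia. A sketch using Python's built-in unittest; `clean_wikipedia_query` is an illustrative helper mirroring the handler in Appendix-1, not a function from the project's code:

```python
import unittest

def clean_wikipedia_query(query: str) -> str:
    """Strip the trigger word before searching, as the handler does."""
    return query.lower().replace("wikipedia", "").strip()

class TestCleanQuery(unittest.TestCase):
    # Each method exercises one input class with a known expected output.
    def test_trigger_removed(self):
        self.assertEqual(clean_wikipedia_query("wikipedia Alan Turing"),
                         "alan turing")

    def test_no_trigger(self):
        self.assertEqual(clean_wikipedia_query("Alan Turing"),
                         "alan turing")
```

Run with `python -m unittest` to execute both cases; because the helper is pure, the test needs no audio hardware or network access.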

8.2.2 INTEGRATION TESTING
Integration tests are designed to test integrated software
components to determine if they actually run as one program.
Testing is event driven and is more concerned with the basic
outcome of screens or fields. Integration tests demonstrate that
although the components were individually satisfactory, as
shown by successful unit testing, the combination of
components is correct and consistent. Integration testing is
specifically aimed at exposing the problems that arise from the
combination of components.
8.2.3 FUNCTIONAL TESTING
Functional tests provide systematic demonstrations that
functions tested are available as specified by the business and
technical 29 requirements, system documentation, and user
manuals. Functional testing is centred on the following items:
• Valid input: identified classes of valid input must be accepted.
• Invalid input: identified classes of invalid input must be
rejected.
• Functions: identified functions must be exercised.
• Output: identified classes of application outputs must be
exercised.
8.2.4 SYSTEM TESTING
System testing ensures that the entire integrated software
system meets its requirements. It tests a configuration to ensure
known and predictable results. System testing is based on
process descriptions and flows, emphasizing pre-driven process
links and integration points.
8.2.5 WHITE BOX TESTING
White box testing is testing in which the software tester has
knowledge of the inner workings, structure, and language of
the software, or at least its purpose. It is used to test areas that
cannot be reached from a black-box level.
8.2.6 BLACK BOX TESTING
Black box testing is testing the software without any knowledge
of the inner workings, structure, or language of the module
being tested. Black box tests, like most other kinds of tests,
must be written from a definitive source document, such as a
specification or requirements document.

CHAPTER 9
APPENDIX -1
SOURCE CODE
import pyaudio
import speech_recognition as sr
import pyttsx3
import datetime
import wikipedia
import webbrowser
import os
import smtplib
import requests
import json
import time
import tkinter as tk
from tkinter import messagebox, simpledialog, Label, filedialog, scrolledtext, Menu
from PIL import Image, ImageTk
import threading
import google.generativeai as genai
from transformers import pipeline
from tkinter.font import Font
from youtubesearchpython import VideosSearch
# Initialize the text-to-speech engine
engine = pyttsx3.init('sapi5')
voices = engine.getProperty('voices')
engine.setProperty('voice', voices[1].id)

# Set your Gemini API key here
os.environ["API_KEY"] = 'AIzaSyCQqLLXfcL3JNlJ7Ds8_AWSeVy8TsFImt4'
genai.configure(api_key=os.environ["API_KEY"])

# Email credentials
email_user = '[email protected]'
email_password = 'your_email_password'

# Global variables to update the GUI from different threads
last_command = ""

def speak(audio):
    engine.say(audio)
    engine.runAndWait()

def update_command_label(command):
    global last_command
    last_command = command
    gui.last_command_label.config(text=f"Last Command: {command}")

def wishMe():
    hour = int(datetime.datetime.now().hour)
    if 0 <= hour < 12:
        speak("Good Morning!")
    elif 12 <= hour < 18:
        speak("Good Afternoon!")
    else:
        speak("Good Evening!")
    speak("I am your virtual assistant, How may I help you?")

def takeCommand():
    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("Listening...")
        r.pause_threshold = 1
        audio = r.listen(source)
    try:
        print("Recognizing...")
        query = r.recognize_google(audio, language="en-in")
        print(f"User said: {query}\n")
        update_command_label(query)
        return query
    except Exception as e:
        print(f"Could not understand the audio: {e}")
        return "None"

def chatGPTResponse(prompt):
    model = genai.GenerativeModel('gemini-1.5-pro-latest')
    response = model.generate_content(prompt)
    return response.text

def youtube():
    webbrowser.open("https://www.youtube.com/")

def google(query):
    speak("Opening on Google...")
    query = query.replace("google", "")
    webbrowser.open(f"https://www.google.com/search?q={query}")

def sendEmail(to, content):
    try:
        server = smtplib.SMTP('smtp.gmail.com', 587)
        server.starttls()
        server.login(email_user, email_password)
        server.sendmail(email_user, to, content)
        server.close()
        speak("Email has been sent.")
    except Exception as e:
        speak("Sorry, I am not able to send this email.")
        print(f"Email error: {e}")

def getWeather(city):
    try:
        api_key = "your_openweathermap_api_key"
        base_url = f"http://api.openweathermap.org/data/2.5/weather?q={city}&appid={api_key}&units=metric"
        response = requests.get(base_url)
        data = response.json()
        if data["cod"] != "404":
            main = data["main"]
            weather_description = data["weather"][0]["description"]
            temperature = main["temp"]
            speak(f"The temperature in {city} is {temperature} degrees Celsius with {weather_description}.")
        else:
            speak("City not found.")
    except Exception as e:
        speak("Sorry, I couldn't retrieve the weather information.")
        print(f"Weather error: {e}")

def getNews():
    try:
        api_key = "your_newsapi_api_key"
        base_url = f"https://newsapi.org/v2/top-headlines?country=us&apiKey={api_key}"
        response = requests.get(base_url)
        news_data = response.json()
        articles = news_data["articles"]
        speak("Here are the top headlines:")
        for article in articles[:5]:
            speak(article["title"])
    except Exception as e:
        speak("Sorry, I couldn't retrieve the news.")
        print(f"News error: {e}")
def tellJoke():
try:
response = requests.get("https://official-joke-
api.appspot.com/random_joke")
joke = response.json()
speak(f"Here is a joke: {joke['setup']} ...
{joke['punchline']}")
except Exception as e:
speak("Sorry, I couldn't retrieve a joke.")
print(f"Joke error: {e}")

def tellQuote():
try:
response = requests.get("https://api.quotable.io/random")
quote = response.json()
speak(f"Here is an inspirational quote: {quote['content']}
by {quote['author']}")
except Exception as e:
speak("Sorry, I couldn't retrieve a quote.")
print(f"Quote error: {e}")

def handle_command(query):
    query = query.lower()
    if "wikipedia" in query:
        speak("Searching Wikipedia...")
        query = query.replace("wikipedia", "")
        results = wikipedia.summary(query, sentences=2)
        speak("According to Wikipedia")
        speak(results)
        return results
    elif 'open youtube' in query:
        youtube()
        return "Opening YouTube..."
    elif 'google' in query:
        google(query)
        return "Opening Google..."
    elif 'play' in query:
        videosSearch = VideosSearch(query, limit=2)
        webbrowser.open(videosSearch.result()['result'][0]["link"])
        return "Opening on YouTube..."
    elif 'the time' in query:
        strTime = datetime.datetime.now().strftime("%H:%M:%S")
        speak(f"The time is {strTime}")
        return strTime
    elif 'local' in query:
        speak("Playing the song...")
        music_dir = "D:\\snaptube audios"
        songs = os.listdir(music_dir)
        if songs:
            os.startfile(os.path.join(music_dir, songs[0]))
            return "Playing a local song..."
        else:
            speak("No songs found in the directory.")
            return "No songs found in the directory."
    elif 'explain yourself' in query:
        speak("I am Realtalk voice communication utility for Windows and Linux.")
        speak("My creators are Hardik, Shubham, Aditya Dangwal, Gautam, and Kamini.")
        return ("I am Realtalk voice communication utility for Windows and Linux.\n"
                "My creators are Hardik, Shubham, Aditya Dangwal, Gautam, and Kamini.")
    elif 'open code' in query:
        codePath = "C:\\Users\\DELL\\AppData\\Local\\Programs\\Microsoft VS Code\\Code.exe"
        os.startfile(codePath)
        return "Opening VS Code..."
    elif 'activate' in query or 'talk' in query or 'question' in query:
        speak("Sure, let's chat. What would you like to know?")
        chat_query = takeCommand()
        if chat_query:
            response = chatGPTResponse(chat_query)
            speak(response)
            return response
    elif 'gmail' in query:
        try:
            speak("What should I say?")
            content = simpledialog.askstring("Gmail", "What should I say?")
            speak("To whom should I send it?")
            to = simpledialog.askstring("Gmail", "Enter the recipient email:")
            if content and to:
                sendEmail(to, content)
        except Exception as e:
            speak("Sorry, I am not able to send this email.")
            print(f"Gmail error: {e}")
    elif 'weather' in query:
        speak("Which city's weather would you like to know?")
        city = simpledialog.askstring("Weather", "Enter city name:")
        if city:
            getWeather(city)
    elif 'news' in query:
        getNews()
    elif 'joke' in query:
        tellJoke()
    elif 'quote' in query:
        tellQuote()
    elif 'set reminder' in query:
        speak("What should I remind you about?")
        reminder = simpledialog.askstring("Reminder", "What should I remind you about?")
        speak("In how many seconds should I remind you?")
        delay = simpledialog.askinteger("Reminder", "Enter the delay in seconds:")
        if reminder and delay:
            time.sleep(delay)
            speak(f"Reminder: {reminder}")
    elif 'write' in query:
        response = chatGPTResponse(query)
        speak("Do you want to save?")
        while True:
            speak("If you want to save, say yes; otherwise, say no.")
            save = takeCommand()
            if "yes" in save:
                with open("a.txt", "w") as f:
                    f.write(response)
                break
            elif "no" in save:
                break
        return response
    elif 'exit' in query:
        speak("Happy to help you! Have a good day ahead.")
        root.destroy()
    else:
        gemini_response = chatGPTResponse(query)
        speak(gemini_response)
        return gemini_response
    return None

def start_listening():
    speak("How can I help you?")
    query = takeCommand()
    if query != "None":
        handle_command(query)

def listen_for_wake_word():
    recognizer = sr.Recognizer()
    microphone = sr.Microphone()

    while True:
        with microphone as source:
            print("Listening for wake word...")
            recognizer.adjust_for_ambient_noise(source)
            audio = recognizer.listen(source)

        try:
            wake_word = recognizer.recognize_google(audio, language="en-in").lower()
            if "jarvis" in wake_word:
                print("Wake word detected!")
                speak("Yes, how can I help you?")
                start_listening()
        except sr.UnknownValueError:
            continue
        except sr.RequestError as e:
            print(f"Recognition error: {e}")

class VirtualAssistantGUI:
    def __init__(self, root):
        self.root = root
        self.root.title("Virtual Assistant")
        self.root.geometry("800x600")
        self.root.configure(bg="#f7f7f7")

        # Create custom fonts
        self.font_regular = Font(family="Helvetica", size=12)
        self.font_bold = Font(family="Helvetica", size=12, weight="bold")

        # Create a menu
        self.menu = Menu(self.root, bg="#f7f7f7", fg="#333333", tearoff=0)
        self.root.config(menu=self.menu)

        self.options_menu = Menu(self.menu, bg="#f7f7f7", fg="#333333", tearoff=0)
        self.menu.add_cascade(label="Options", menu=self.options_menu)
        self.options_menu.add_command(label="Clear History", command=self.clear_history)

        # Create frames
        self.top_frame = tk.Frame(root, bg="#f7f7f7")
        self.top_frame.pack(pady=10, padx=10)

        self.middle_frame = tk.Frame(root, bg="#ffffff", bd=1, relief="solid")
        self.middle_frame.pack(pady=10, padx=10, fill=tk.BOTH, expand=True)

        self.bottom_frame = tk.Frame(root, bg="#f7f7f7")
        self.bottom_frame.pack(pady=10, padx=10, fill=tk.X)

        # Conversation history (ScrolledText widget)
        self.conversation_history = scrolledtext.ScrolledText(
            self.middle_frame, wrap=tk.WORD, width=70, height=20,
            state='disabled', bg="#ffffff", fg="#333333",
            font=self.font_regular, bd=0, relief="solid")
        self.conversation_history.pack(fill=tk.BOTH, expand=True, padx=10, pady=10)

        # Entry field for input
        self.input_field = tk.Entry(
            self.bottom_frame, width=50, font=self.font_regular,
            bg="#ffffff", fg="#333333", bd=1, relief="solid")
        self.input_field.pack(side=tk.LEFT, padx=10, fill=tk.X, expand=True)
        self.input_field.bind('<Return>', self.process_input)

        # Send button
        self.send_button = tk.Button(
            self.bottom_frame, text="Send", command=self.process_input,
            font=self.font_bold, bg="#007bff", fg="#ffffff", bd=0, padx=10, pady=5)
        self.send_button.pack(side=tk.LEFT, padx=5)

        # Voice command button
        self.voice_button = tk.Button(
            self.bottom_frame, text="🎤", command=self.process_voice_command,
            font=self.font_bold, bg="#28a745", fg="#ffffff", bd=0, padx=10, pady=5)
        self.voice_button.pack(side=tk.LEFT, padx=5)

        # Upload image button
        self.upload_button = tk.Button(
            self.bottom_frame, text="📷", command=self.upload_image,
            font=self.font_bold, bg="#17a2b8", fg="#ffffff", bd=0, padx=10, pady=5)
        self.upload_button.pack(side=tk.LEFT, padx=5)

        # Last command label
        self.last_command_label = tk.Label(
            self.top_frame, text="Last Command: None",
            font=self.font_regular, bg="#f7f7f7", fg="#333333")
        self.last_command_label.pack()

        # Command suggestions
        self.suggestions_label = tk.Label(
            self.top_frame, text="Suggestions: hello, how are you, what's your name",
            font=("Helvetica", 10), bg="#f7f7f7", fg="#888888")
        self.suggestions_label.pack()

    def process_input(self, event=None):
        user_input = self.input_field.get()
        if user_input:
            self.update_conversation_history(f"You: {user_input}")
            self.input_field.delete(0, tk.END)
            response = handle_command(user_input)
            if response:
                self.update_conversation_history("Assistant: " + response)

    def process_voice_command(self):
        query = takeCommand()
        if query != "None":
            self.update_conversation_history(f"You: {query}")
            response = handle_command(query)
            if response:
                self.update_conversation_history("Assistant: " + response)

    def upload_image(self):
        file_path = filedialog.askopenfilename()
        if file_path:
            self.update_conversation_history(f"Uploaded image: {file_path}")

    def clear_history(self):
        self.conversation_history.config(state=tk.NORMAL)
        self.conversation_history.delete(1.0, tk.END)
        self.conversation_history.config(state=tk.DISABLED)

    def update_conversation_history(self, text):
        self.conversation_history.config(state=tk.NORMAL)
        self.conversation_history.insert(tk.END, text + "\n")
        self.conversation_history.config(state=tk.DISABLED)
        self.conversation_history.yview(tk.END)

# Initialize the Tkinter root and GUI
root = tk.Tk()
gui = VirtualAssistantGUI(root)

# Start the wake word listening thread
wake_word_thread = threading.Thread(target=listen_for_wake_word, daemon=True)
wake_word_thread.start()

# Run the Tkinter event loop
root.mainloop()

Appendix-2
Snapshots

[Screenshots of the running application appeared here in the original report.]
CHAPTER 10
CONCLUSION AND FUTURE WORK

10.1 CONCLUSION

Through this virtual assistant, we have automated various services with single-line voice commands. It eases most of the user's routine tasks, such as searching the web, playing music, and performing Wikipedia searches. We aim to develop this project into a complete server assistant, smart enough to act as a replacement for general server administration. The project is built from freely available open-source software modules with Visual Studio Code community backing, so it can accommodate future updates. Its modular design makes it flexible and easy to extend with additional features without disturbing current system functionality. It not only executes spoken commands, such as opening applications and performing operations, but also responds to the questions the user asks. The application thus eliminates much of the unnecessary manual work involved in performing everyday tasks.
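The modular extensibility claimed above can be illustrated with a small sketch. This is a hypothetical refactor, not the project's current code: the report's actual handle_command() uses an if/elif chain, and the handler names below are purely illustrative. A registry-based dispatcher would let a new feature be added by defining one decorated function, without touching existing handlers.

```python
# Hypothetical sketch: a registry-based command dispatcher.
# Handler names and return strings are illustrative only.

COMMANDS = {}

def command(keyword):
    """Register a handler for any query containing `keyword`."""
    def decorator(func):
        COMMANDS[keyword] = func
        return func
    return decorator

@command("joke")
def handle_joke(query):
    return "Here is a joke..."

@command("time")
def handle_time(query):
    return "The time is..."

def dispatch(query):
    """Run the first handler whose keyword appears in the query."""
    query = query.lower()
    for keyword, handler in COMMANDS.items():
        if keyword in query:
            return handler(query)
    return None  # the real assistant would fall back to the LLM here
```

With this layout, adding a "weather" feature is one new decorated function; no existing branch needs editing.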
10.2 FUTURE WORK

The virtual assistants currently available are fast and responsive, but there is still a long way to go: the understanding and reliability of current systems need considerable improvement, and today's assistants are still not dependable in critical scenarios. Our future plans include integrating the assistant with mobile devices using React Native, to provide a synchronised experience across connected devices. In the longer run, the assistant is planned to support automated deployment (for example, to Elastic Beanstalk), file backup, and the other operations a general server administrator performs. Future assistants will increasingly incorporate Artificial Intelligence, including machine learning and neural networks, together with IoT.

REFERENCES

1. *Google Assistant SDK*: The Google Assistant SDK lets you add voice control, natural language understanding, and Google's intelligence to your devices. You can create a custom voice interface using the SDK.
   - Documentation: [Google Assistant SDK](https://developers.google.com/assistant/sdk)

2. *Dialogflow*: Dialogflow is a natural language understanding platform used to design and integrate conversational user interfaces into mobile apps, web applications, devices, bots, and more.
   - Documentation: [Dialogflow](https://cloud.google.com/dialogflow/docs)
   - GitHub Repository: [dialogflow/dialogflow-fulfillment-nodejs](https://github.com/dialogflow/dialogflow-fulfillment-nodejs)

3. *Actions on Google*: This platform allows you to extend the functionality of Google Assistant by building apps that can handle specific types of user requests.
   - Documentation: [Actions on Google](https://developers.google.com/assistant)
   - GitHub Repository: [actions-on-google](https://github.com/actions-on-google)

4. *Google Cloud Speech-to-Text*: This service converts audio to text by applying powerful neural network models in an easy-to-use API.
   - Documentation: [Google Cloud Speech-to-Text](https://cloud.google.com/speech-to-text)

5. *Google Cloud Text-to-Speech*: This service converts text into natural-sounding speech using deep learning models.
   - Documentation: [Google Cloud Text-to-Speech](https://cloud.google.com/text-to-speech)

6. *Sample Projects and Code Repositories*:
   - [Google Assistant Python SDK Sample](https://github.com/googlesamples/assistant-sdk-python)
   - [Google Assistant Node.js SDK Sample](https://github.com/googlesamples/assistant-sdk-nodejs)
   - [Google Cloud Samples](https://github.com/GoogleCloudPlatform/python-docs-samples)