VIRTUAL ASSISTANT
SUBMITTED IN PARTIAL FULFILLMENT FOR AWARD
OF
Diploma
IN
COMPUTER ENGINEERING
DEPARTMENT OF COMPUTER ENGINEERING
Seth Jai Parkash Polytechnic
Damla, Haryana
PROJECT REPORT
VIRTUAL ASSISTANT
SUBMITTED IN PARTIAL FULFILLMENT FOR AWARD
OF
Diploma
IN
COMPUTER ENGINEERING
DEPARTMENT OF COMPUTER ENGINEERING
Seth Jai Parkash Polytechnic
Damla, Haryana
BONAFIDE CERTIFICATE
SIGNATURE
Ms. Pooja Saini
Teaching Staff
Department of COE,
Seth Jai Parkash Polytechnic,
Damla, Haryana
Acknowledgement
ABSTRACT
The project aims to develop a personal virtual assistant for a Windows-based system. Jarvis draws its inspiration from virtual assistants like Cortana for Windows and Siri for iOS. It has been designed to provide a user-friendly interface for carrying out a variety of tasks through a set of well-defined commands. The user can interact with the assistant either through voice commands or through keyboard input. As a personal assistant, Jarvis helps the end user with day-to-day activities such as general conversation, searching queries on Google or Yahoo, searching for videos, playing songs, reporting live weather conditions, searching for medicine details, and reminding the user about scheduled events and tasks. The assistant captures the user's voice through the microphone, converts it into a form the computer can understand, and returns the solutions and answers the user has asked for. It connects to the internet to fetch the results the user has requested. The project works on voice input, gives output through voice, and also displays the text on the screen. The main goal of our virtual assistant is to help users work smarter by giving instant, computed results.
TABLE OF CONTENTS
6.1 Modules
6.2 Module Description
7. System Study
7.1 Feasibility Study
7.1.1 Economical Feasibility
7.1.2 Technical Feasibility
7.1.3 Social Feasibility
8. System Testing
8.1 Testing
8.2 Types of Testing
8.2.1 Unit Testing
8.2.2 Integration Testing
8.2.3 Functional Testing
8.2.4 System Testing
8.2.5 White Box Testing
8.2.6 Black Box Testing
9. Appendix-1 Source Code
Appendix-2 Snapshots
10. Conclusion & Future Work
10.1 Conclusion
10.2 Future Work
References
List of Figures
LIST OF ABBREVIATIONS
ACRONYM   ABBREVIATION
UML       UNIFIED MODELING LANGUAGE
UI        USER INTERFACE
NLP       NATURAL LANGUAGE PROCESSING
API       APPLICATION PROGRAMMING INTERFACE
OS        OPERATING SYSTEM
GUI       GRAPHICAL USER INTERFACE
AI        ARTIFICIAL INTELLIGENCE
IoT       INTERNET OF THINGS
SAPI      SPEECH APPLICATION PROGRAMMING INTERFACE
CHAPTER 1
Introduction
1.2 MOTIVATION
1.3 APPLICABILITY
The mass adoption of artificial intelligence in users' everyday lives is also fueling the shift towards voice. The growing number of connected devices, such as smart thermostats and speakers, gives voice assistants more utility in a connected user's life. Smart speakers are currently the number one way voice is being used. Many industry experts even predict that nearly every application will integrate voice technology in some way in the next five years. The use of virtual assistants can also enhance IoT (Internet of Things) systems. Twenty years from now, Microsoft and its competitors will be offering personal digital assistants that provide the services of a full-time employee, something usually reserved for the rich and famous today.
Figure 1.1 Virtual Assistant
CHAPTER 2
PRELIMINARIES
This project describes one of the most efficient approaches to voice recognition. It overcomes many of the drawbacks of existing solutions to make the virtual assistant more efficient. It uses natural language processing to carry out the specified tasks. It offers various functionalities, such as managing network connections and other activities through voice commands alone, and it reduces reliance on input devices like the keyboard.
2.1.1 DISADVANTAGES
Although the efficiency of the proposed module is high, each task takes longer to complete, and the complexity of the algorithms would make the system difficult to modify if changes are needed in the future.
2. ACCESSING NEWS:
2.1.2 ADVANTAGES
Platform independence
Increased flexibility
Saves time by automating repetitive tasks
Accessibility options for the mobility and visually impaired
Reducing our dependence on screens
Adding personality to our daily lives
More human touch
Coordination of IoT devices
Accessible and inclusive
CHAPTER 3
Literature Survey
3.5 "SURVEY ON VIRTUAL ASSISTANT"
YEAR: 2018
AUTHORS: Amrita Sunil Tulshan, Sudhir Namdeorao Dhage
CONCEPT: Functionalities of existing IVAs.
CHAPTER – 4
SYSTEM SPECIFICATION
4.1 HARDWARE REQUIREMENTS
4.2 SOFTWARE REQUIREMENTS
5.2 UML DIAGRAM
The Unified Modeling Language is a general-purpose,
developmental modeling language in the field of software
engineering that is intended to provide a standard way to
visualize the design of a system.
A UML diagram is a diagram based on the UML (Unified
Modeling Language) with the purpose of visually representing
a system along with its main actors, roles, actions, artifacts, or
classes in order to better understand, alter, maintain, or
document information about the system.
ADVANTAGES
Provides a standard for software development
It has large visual elements that are easy to construct and follow
Figure 5.2 Use Case Diagram
5.2.2 CLASS DIAGRAM
Figure 5.3 Class Diagram
5.2.3 ACTIVITY DIAGRAM
Figure 5.4 Activity Diagram
5.2.4 SEQUENCE DIAGRAM
Figure 5.5 (a) Sequence Diagram for Query Response
The user sends a command to the virtual assistant in audio form. The command is passed to the interpreter, which identifies what the user has asked and directs it to the task executor. If the task is missing some information, the virtual assistant asks the user for it. The received information is sent back to the task executor and the task is carried out. After execution, feedback is sent back to the user.
CHAPTER – 6
SYSTEM IMPLEMENTATION
6.1 MODULES
6.2 MODULE DESCRIPTION
6.2.4 import datetime
datetime is a built-in Python module that supplies classes for manipulating dates and times. It allows you to perform a variety of operations, such as getting the current date and time, formatting and parsing dates, calculating time differences, and more.
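For illustration, a minimal sketch of reading the current date, time, and hour with datetime (the same rule the wishMe() function in Appendix 1 uses to pick a greeting):

import datetime

now = datetime.datetime.now()                  # current date and time
print(now.strftime("%d-%m-%Y %H:%M:%S"))       # e.g. 04-07-2025 18:30:15

hour = now.hour                                # pick a greeting from the hour
if hour < 12:
    print("Good Morning!")
elif hour < 18:
    print("Good Afternoon!")
else:
    print("Good Evening!")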
6.2.5 import wikipedia
wikipedia is a Python library that allows users to access and
interact with the data on Wikipedia. It provides functionalities
to search for articles, retrieve article summaries, full article
content, and other meta-data such as page links and categories.
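For illustration, a short sketch of fetching a two-sentence article summary, which is how the assistant answers "wikipedia ..." queries in Appendix 1:

import wikipedia

# Fetch a two-sentence summary of an article
summary = wikipedia.summary("Python (programming language)", sentences=2)
print(summary)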
6.2.6 import webbrowser
A built-in Python module that provides a high-level interface
to allow displaying web-based documents to users. It can open
URLs in a web browser.
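For illustration, a minimal sketch of opening a site and a search result page in the default browser:

import webbrowser

webbrowser.open("https://www.youtube.com/")                           # open a site
webbrowser.open("https://www.google.com/search?q=virtual+assistant")   # open a search result page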
6.2.7 import os
A built-in Python module that provides a way to interact with
the operating system. It offers functionalities for file and
directory manipulation, environment variables, and more.
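For illustration, a minimal sketch of listing a music folder and opening the first file, similar to the assistant's 'local' command in Appendix 1; the path below is only a placeholder:

import os

music_dir = "D:\\music"                        # placeholder path for illustration
if os.path.isdir(music_dir):
    songs = os.listdir(music_dir)              # files in the folder
    if songs:
        # Windows-only: open the file with its default application
        os.startfile(os.path.join(music_dir, songs[0]))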
6.2.8 import openai
A Python library for interacting with OpenAI's API, enabling
tasks such as natural language processing, generation, and other
AI functionalities.
6.2.9 import smtplib
A built-in Python module for sending emails using the Simple
Mail Transfer Protocol (SMTP).
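For illustration, a minimal sketch of sending a plain-text mail through Gmail's SMTP server; the addresses and password below are placeholders (in practice a Gmail app password is required):

import smtplib

server = smtplib.SMTP("smtp.gmail.com", 587)
server.starttls()                              # switch to an encrypted connection
server.login("[email protected]", "your_app_password")     # placeholder credentials
server.sendmail("[email protected]", "[email protected]", "Hello from the assistant")
server.quit()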
6.2.10 import requests
A Python library for making HTTP requests in a simple and
human-friendly way. It supports various HTTP methods like
GET, POST, PUT, DELETE, and more.
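For illustration, a minimal sketch of a GET request and reading the JSON body, the same pattern used by the weather and news functions in Appendix 1:

import requests

response = requests.get("https://api.github.com")   # simple GET request
response.raise_for_status()                          # turn HTTP errors into exceptions
print(response.status_code)                          # 200
print(response.json()["current_user_url"])           # parsed JSON body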
6.2.11 import json
A built-in Python module for parsing JSON (JavaScript Object
Notation) data and converting Python data structures to JSON
strings.
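For illustration, a minimal sketch of converting a Python dictionary to a JSON string and back:

import json

data = {"city": "Delhi", "temp": 31.5}
text = json.dumps(data)            # serialise to a JSON string
parsed = json.loads(text)          # parse it back into a dictionary
print(parsed["city"])              # Delhi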
6.2.12 import time
A built-in Python module that provides various time-related
functions, such as getting the current time, sleeping for a
specified number of seconds, and measuring intervals.
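For illustration, a minimal sketch of sleeping for a delay and measuring it, which is how the reminder command waits in Appendix 1:

import time

start = time.time()                # seconds since the epoch
time.sleep(2)                      # pause for two seconds
print(f"Slept for {time.time() - start:.1f} seconds")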
6.2.13 import tkinter as tk
from tkinter import messagebox, simpledialog, Label, PhotoImage
tkinter is Python's standard GUI toolkit. It is used here to build the assistant's window, dialogs, labels, and buttons.
6.2.14 from PIL import Image, ImageTk
import threading
PIL (Pillow): The Python Imaging Library (PIL) fork, Pillow,
adds powerful image processing capabilities.
Image: A module in Pillow for opening, manipulating,
and saving various image file formats.
ImageTk: A module in Pillow for using PIL images with
tkinter applications.
threading: A built-in Python module for creating, controlling,
and managing threads, enabling concurrent execution of code.
6.2.15 import threading
Threading can be used to run multiple tasks concurrently. For
example, you can use threading to run multiple YouTube
searches simultaneously or process data while performing other
tasks.
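For illustration, a minimal sketch of running work in a background thread, the same mechanism the assistant uses for its wake-word listener:

import threading

def background_task(name):
    # work that runs concurrently with the main program
    print(f"{name} is running in a separate thread")

worker = threading.Thread(target=background_task, args=("listener",), daemon=True)
worker.start()
worker.join()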
6.2.16 import google.generativeai as genai
To use Google Generative AI, you'll need the google-
generativeai library. Ensure you have the required API keys and
setup.
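For illustration, a minimal sketch matching the way Appendix 1 calls the Gemini model; the API key below is a placeholder:

import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")                    # placeholder key
model = genai.GenerativeModel("gemini-1.5-pro-latest")
response = model.generate_content("Explain what a virtual assistant is in one line.")
print(response.text)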
6.2.17 from transformers import pipeline
The transformers library by Hugging Face allows you to use
various pre-trained AI models.
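For illustration, a minimal sketch of loading a pre-trained sentiment-analysis pipeline (the model is downloaded on first use):

from transformers import pipeline

classifier = pipeline("sentiment-analysis")
result = classifier("I love using my virtual assistant!")[0]
print(result["label"], result["score"])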
6.2.18 from tkinter.font import Font
Using Tkinter, you can create GUI applications and customize
fonts.
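For illustration, a minimal sketch of defining a custom font and applying it to a label:

import tkinter as tk
from tkinter.font import Font

root = tk.Tk()
heading_font = Font(family="Helvetica", size=14, weight="bold")
tk.Label(root, text="Virtual Assistant", font=heading_font).pack(padx=20, pady=20)
root.mainloop()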
CHAPTER – 7
SYSTEM STUDY
CHAPTER 8
SYSTEM TESTING
8.1 TESTING
The purpose of testing is to discover errors. Testing is the process of trying to discover every conceivable fault or weakness in a work product. It provides a way to check the functionality of components, sub-assemblies, assemblies, and/or a finished product. It is the process of exercising software with the intent of ensuring that the software system meets its requirements and user expectations and does not fail in an unacceptable manner. There are various types of tests, and each test type addresses a specific testing requirement.
8.2 TYPES OF TESTING
8.2.1 UNIT TESTING
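As a hypothetical illustration, the greeting rule used by wishMe() in Appendix 1 can be exercised with Python's built-in unittest module:

import unittest

def choose_greeting(hour):
    # same rule as the wishMe() function in Appendix 1
    if 0 <= hour < 12:
        return "Good Morning!"
    elif 12 <= hour < 18:
        return "Good Afternoon!"
    return "Good Evening!"

class TestGreeting(unittest.TestCase):
    def test_morning(self):
        self.assertEqual(choose_greeting(9), "Good Morning!")

    def test_afternoon(self):
        self.assertEqual(choose_greeting(15), "Good Afternoon!")

    def test_evening(self):
        self.assertEqual(choose_greeting(22), "Good Evening!")

if __name__ == "__main__":
    unittest.main()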
8.2.2 INTEGRATION TESTING
Integration tests are designed to test integrated software components to determine whether they actually run as one program. Testing is event driven and is more concerned with the basic outcome of screens or fields. Integration tests demonstrate that although the components were individually satisfactory, as shown by successful unit testing, the combination of components is also correct and consistent. Integration testing is specifically aimed at exposing the problems that arise from the combination of components.
8.2.3 FUNCTIONAL TESTING
Functional tests provide systematic demonstrations that the functions tested are available as specified by the business and technical requirements, system documentation, and user manuals. Functional testing is centred on the following items:
• Valid input: identified classes of valid input must be accepted.
• Invalid input: identified classes of invalid input must be rejected.
8.2.4 SYSTEM TESTING
System testing ensures that the entire integrated software system meets requirements. It is based on process descriptions and flows, emphasizing pre-driven process links and integration points.
8.2.5 WHITE BOX TESTING
White box testing is testing in which the software tester has knowledge of the inner workings, structure, and language of the software, or at least its purpose. It is used to test areas that cannot be reached from a black-box level.
8.2.6 BLACK BOX TESTING
Black box testing is testing the software without any knowledge of the inner workings, structure, or language of the module being tested. Black box tests, like most other kinds of tests, must be written from a definitive source document, such as a specification or requirements document.
CHAPTER 9
APPENDIX -1
SOURCE CODE
import pyaudio
import speech_recognition as sr
import pyttsx3
import datetime
import wikipedia
import webbrowser
import os
import smtplib
import requests
import json
import time
import tkinter as tk
from tkinter import messagebox, simpledialog, Label, filedialog, scrolledtext, Menu
from PIL import Image, ImageTk
import threading
import google.generativeai as genai
from transformers import pipeline
from tkinter.font import Font
from youtubesearchpython import VideosSearch
# Initialize the text-to-speech engine
engine = pyttsx3.init('sapi5')
voices = engine.getProperty('voices')
engine.setProperty('voice', voices[1].id)
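# NOTE: speak() is called throughout this listing but its definition does not
# appear in the appendix; a minimal pyttsx3-based version is assumed here.
def speak(audio):
    engine.say(audio)
    engine.runAndWait()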
# Email credentials
email_user = '[email protected]'
email_password = 'your_email_password'
def update_command_label(command):
    global last_command
    last_command = command
    gui.last_command_label.config(text=f"Last Command: {command}")
def wishMe():
    hour = int(datetime.datetime.now().hour)
    if 0 <= hour < 12:
        speak("Good Morning!")
    elif 12 <= hour < 18:
        speak("Good Afternoon!")
    else:
        speak("Good Evening!")
    speak("I am your virtual assistant, How may I help you?")
def takeCommand():
    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("Listening...")
        r.pause_threshold = 1
        audio = r.listen(source)
    try:
        print("Recognizing...")
        query = r.recognize_google(audio, language="en-in")
        print(f"User said: {query}\n")
        update_command_label(query)
        return query
    except Exception as e:
        print(f"Could not understand the audio: {e}")
        return "None"
def chatGPTResponse(prompt):
    model = genai.GenerativeModel('gemini-1.5-pro-latest')
    response = model.generate_content(prompt)
    return response.text

def youtube():
    webbrowser.open("https://www.youtube.com/")

def google(query):
    speak("Opening on Google...")
    query = query.replace("google", "")
    webbrowser.open(f"https://www.google.com/search?q={query}")
# NOTE: the def line for sendEmail is missing at a page break in the original
# listing; it is reconstructed here from the call sendEmail(to, content) below.
def sendEmail(to, content):
    try:
        server = smtplib.SMTP('smtp.gmail.com', 587)
        server.starttls()
        server.login(email_user, email_password)
        server.sendmail(email_user, to, content)
        server.close()
        speak("Email has been sent.")
    except Exception as e:
        speak("Sorry, I am not able to send this email.")
        print(f"Email error: {e}")
def getWeather(city):
    try:
        api_key = "your_openweathermap_api_key"
        base_url = f"http://api.openweathermap.org/data/2.5/weather?q={city}&appid={api_key}&units=metric"
        response = requests.get(base_url)
        data = response.json()
        if data["cod"] != "404":
            main = data["main"]
            weather_description = data["weather"][0]["description"]
            temperature = main["temp"]
            speak(f"The temperature in {city} is {temperature} degrees Celsius with {weather_description}.")
        else:
            speak("City not found.")
    except Exception as e:
        speak("Sorry, I couldn't retrieve the weather information.")
        print(f"Weather error: {e}")
def getNews():
    try:
        api_key = "your_newsapi_api_key"
        base_url = f"https://newsapi.org/v2/top-headlines?country=us&apiKey={api_key}"
        response = requests.get(base_url)
        news_data = response.json()
        articles = news_data["articles"]
        speak("Here are the top headlines:")
        for article in articles[:5]:
            speak(article["title"])
    except Exception as e:
        speak("Sorry, I couldn't retrieve the news.")
        print(f"News error: {e}")
def tellJoke():
    try:
        response = requests.get("https://official-joke-api.appspot.com/random_joke")
        joke = response.json()
        speak(f"Here is a joke: {joke['setup']} ... {joke['punchline']}")
    except Exception as e:
        speak("Sorry, I couldn't retrieve a joke.")
        print(f"Joke error: {e}")

def tellQuote():
    try:
        response = requests.get("https://api.quotable.io/random")
        quote = response.json()
        speak(f"Here is an inspirational quote: {quote['content']} by {quote['author']}")
    except Exception as e:
        speak("Sorry, I couldn't retrieve a quote.")
        print(f"Quote error: {e}")
def handle_command(query):
    query = query.lower()
    if "wikipedia" in query:
        speak("Searching Wikipedia...")
        query = query.replace("wikipedia", "")
        results = wikipedia.summary(query, sentences=2)
        speak("According to Wikipedia")
        speak(results)
        return results
    elif 'open youtube' in query:
        youtube()
        return "opening youtube..."
    elif 'google' in query:
        google(query)
        return "opening google..."
    elif 'play' in query:
        videosSearch = VideosSearch(query, limit=2)
        webbrowser.open(videosSearch.result()['result'][0]["link"])
        return "opening on youtube..."
    elif 'the time' in query:
        strTime = datetime.datetime.now().strftime("%H:%M:%S")
        speak(f"The time is {strTime}")
        return strTime
    elif 'local' in query:
        speak("Playing the song...")
        music_dir = "D:\\snaptube audios"
        songs = os.listdir(music_dir)
        if songs:
            os.startfile(os.path.join(music_dir, songs[0]))
        else:
            speak("No songs found in the directory.")
    elif 'explain yourself' in query:
        speak("I am Realtalk voice communication utility for Windows and Linux.")
        speak("My creators are Hardik, Shubham, Aditya Dangwal, and Gautam.")
        return "I am Realtalk voice communication utility for Windows and Linux. \n My creators are Hardik, Shubham, Aditya Dangwal, Gautam, and Kamni"
    # NOTE: the elif header for this branch falls on a page missing from the
    # original listing; a 'chat' trigger word is assumed here.
    elif 'chat' in query:
        chat_query = takeCommand()
        if chat_query:
            response = chatGPTResponse(chat_query)
            speak(response)
    elif 'gmail' in query:
        try:
            speak("What should I say?")
            content = simpledialog.askstring("Gmail", "What should I say?")
            speak("To whom should I send it?")
            to = simpledialog.askstring("Gmail", "Enter the recipient email:")
            if content and to:
                sendEmail(to, content)
        except Exception as e:
            speak("Sorry, I am not able to send this gmail.")
            print(f"Gmail error: {e}")
    elif 'weather' in query:
        speak("Which city's weather would you like to know?")
        city = simpledialog.askstring("Weather", "Enter city name:")
        if city:
            getWeather(city)
    elif 'news' in query:
        getNews()
    elif 'joke' in query:
        tellJoke()
    elif 'quote' in query:
        tellQuote()
    elif 'set reminder' in query:
        speak("What should I remind you about?")
        reminder = simpledialog.askstring("Reminder", "What should I remind you about?")
        speak("In how many seconds should I remind you?")
        delay = simpledialog.askinteger("Reminder", "Enter the delay in seconds:")
        if reminder and delay:
            time.sleep(delay)
            speak(f"Reminder: {reminder}")
    elif 'write' in query:
        response = chatGPTResponse(query)
        while True:
            speak("If you want to save, say yes; otherwise, say no.")
            save = takeCommand()
            if "yes" in save:
                with open("a.txt", "w") as f:
                    f.write(response)
                break
            elif "no" in save:
                break
            else:
                continue
        return response
    elif 'exit' in query:
        speak("Happy to help you! Have a good day ahead.")
        root.destroy()
    else:
        Gemini_response = chatGPTResponse(query)
        speak(Gemini_response)
        return Gemini_response
def start_listening():
    speak("How can I help you?")
    query = takeCommand()
    if query != "None":
        handle_command(query)
def listen_for_wake_word():
    recognizer = sr.Recognizer()
    microphone = sr.Microphone()
    while True:
        with microphone as source:
            print("Listening for wake word...")
            recognizer.adjust_for_ambient_noise(source)
            audio = recognizer.listen(source)
        try:
            wake_word = recognizer.recognize_google(audio, language="en-in").lower()
            if "jarvis" in wake_word:
                print("Wake word detected!")
                speak("Yes, how can I help you?")
                start_listening()
        except sr.UnknownValueError:
            continue
        except sr.RequestError as e:
            print(f"Recognition error: {e}")
class VirtualAssistantGUI:
    def __init__(self, root):
        self.root = root
        self.root.title("Virtual Assistant")
        self.root.geometry("800x600")
        self.root.configure(bg="#f7f7f7")

        # Create a menu
        self.menu = Menu(self.root, bg="#f7f7f7", fg="#333333", tearoff=0)
        self.root.config(menu=self.menu)

        # Create frames
        self.top_frame = tk.Frame(root, bg="#f7f7f7")
        self.top_frame.pack(pady=10, padx=10)
        self.input_field.pack(side=tk.LEFT, padx=10, fill=tk.X, expand=True)
        self.input_field.bind('<Return>', self.process_input)

        # Send button
        self.send_button = tk.Button(self.bottom_frame, text="Send", command=self.process_input, font=self.font_bold, bg="#007bff", fg="#ffffff", bd=0, padx=10, pady=5)
        self.send_button.pack(side=tk.LEFT, padx=5)
        self.last_command_label = tk.Label(self.top_frame, text="Last Command: None", font=self.font_regular, bg="#f7f7f7", fg="#333333")
        self.last_command_label.pack()

        # Command suggestions
        self.suggestions_label = tk.Label(self.top_frame, text="Suggestions: hello, how are you, what's your name", font=("Helvetica", 10), bg="#f7f7f7", fg="#888888")
        self.suggestions_label.pack()
    def process_voice_command(self):
        query = takeCommand()
        if query != "None":
            self.update_conversation_history(f"You: {query}")
            handle_command(query)
    def upload_image(self):
        file_path = filedialog.askopenfilename()
        if file_path:
            self.update_conversation_history(f"Uploaded image: {file_path}")

    def clear_history(self):
        self.conversation_history.config(state=tk.NORMAL)
        self.conversation_history.delete(1.0, tk.END)
        self.conversation_history.config(state=tk.DISABLED)
# Start the wake word listening thread
wake_word_thread = threading.Thread(target=listen_for_wake_word, daemon=True)
wake_word_thread.start()
Appendix-2
Snapshots
CHAPTER 10
CONCLUSION AND FUTURE WORK
10.1 CONCLUSION
The virtual assistants that are currently available are fast and responsive, but we still have a long way to go. The understanding and reliability of the current systems need to be improved considerably, and the assistants available today are still not reliable in critical scenarios.
10.2 FUTURE WORK
Future plans include integrating the virtual assistant with mobile devices using React Native to provide a synchronised experience across the connected devices. Further, in the long run, the virtual assistant is planned to support automatic deployment on Elastic Beanstalk, file backups, and the other operations that a general server administrator performs. The future of these assistants will see them incorporate artificial intelligence, including machine learning and neural networks, together with IoT.
REFERENCES
- Documentation: [Dialogflow](https://cloud.google.com/dialogflow/docs)
- GitHub Repository: [dialogflow/dialogflow-fulfillment-nodejs](https://github.com/dialogflow/dialogflow-fulfillment-nodejs)
- Documentation: [Actions on Google](https://developers.google.com/assistant)
- GitHub Repository: [actions-on-google](https://github.com/actions-on-google)
- [Google Cloud Samples](https://github.com/GoogleCloudPlatform/python-docs-samples)