Team Omega
SMART INDIA HACKATHON 2024

TITLE PAGE
• Problem Statement ID - SIH1604
• Problem Statement Title - Conversational image recognition chatbot
• Theme - Smart Automation
• PS Category - Software
• Team ID -
• Team Name - Team Omega

IMAGE RECOGNITION CHATBOT
• The app includes a chatbot organized into topic sections such as cooking, studying, and travelling; users select the section that matches their question.
• Users can upload pictures, and the chatbot analyzes them to answer the user's questions.
• It provides personalized responses based on user data, using deep learning.
• The chatbot can be controlled with voice commands.
• With a single click of the power button, users can take a picture, select a section, and ask a question by voice; the chatbot answers by voice.
• The chatbot delivers responses as both voice and text in the user's native language.
• It can convert uploaded images into videos for multiple purposes.
• Google's APIs are used to keep the information accurate and up to date.
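
The "upload a picture, get an answer" step can be sketched as follows. This is a minimal illustration, not the team's implementation: it assumes the image model returns a list of (label, confidence) pairs, and the function name `summarize_detections` and the 0.5 threshold are hypothetical choices made for this example.

```python
from collections import Counter


def summarize_detections(detections, min_confidence=0.5):
    """Turn (label, confidence) pairs into a one-sentence description.

    The input format and the default threshold are assumptions for this
    sketch; a real detector would define its own output structure.
    """
    # Keep only detections the model is reasonably sure about.
    kept = [label for label, conf in detections if conf >= min_confidence]
    if not kept:
        return "I could not recognize anything in this picture."

    counts = Counter(kept)
    # Sort by count (descending), then by label, so the output is stable.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    parts = [f"{label} (x{n})" for label, n in ordered]
    return "I can see: " + ", ".join(parts) + "."


if __name__ == "__main__":
    sample = [("person", 0.91), ("person", 0.84), ("dog", 0.77), ("kite", 0.30)]
    print(summarize_detections(sample))
```

A sentence like this could then be passed to the conversation layer as context for the user's question; how that hand-off works is left open here.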
@SIH Idea submission- Template 2

TECHNICAL APPROACH
• Use Flask for the app interface and Node.js for the backend.
• Integrate Dialogflow for chatbot conversations.
• Use pretrained datasets such as COCO and pretrained models such as YOLO to analyze uploaded images.
• Use Google Speech-to-Text for voice input and Text-to-Speech for responses.
• Fetch real-time information using Google APIs.
• Translate responses using the Google Translate API.
• Convert images into videos with FFmpeg.

Technologies:
Frontend: Flask
Backend: Node.js
Chatbot: Dialogflow
Image Processing: COCO dataset, YOLO model
Voice: Google Speech-to-Text, Text-to-Speech
Multilingual: Google Translate API
Video Conversion: FFmpeg
Cloud: Google Cloud with Kubernetes
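
The image-to-video step could look roughly like the sketch below, which turns one still image into a short clip by calling the FFmpeg command-line tool. The function names, the 5-second duration, and the 25 fps rate are assumptions for illustration; the slide does not say which FFmpeg options the team intends to use.

```python
import shutil
import subprocess


def build_still_to_video_cmd(image_path, output_path, seconds=5, fps=25):
    """Build an FFmpeg command that loops one image into a short video."""
    return [
        "ffmpeg",
        "-y",                   # overwrite the output file if it exists
        "-loop", "1",           # repeat the single input image
        "-i", image_path,
        "-t", str(seconds),     # stop encoding after this many seconds
        "-r", str(fps),         # output frame rate
        # H.264 with yuv420p requires even width and height.
        "-vf", "scale=trunc(iw/2)*2:trunc(ih/2)*2",
        "-c:v", "libx264",
        "-pix_fmt", "yuv420p",  # widely playable pixel format
        output_path,
    ]


def image_to_video(image_path, output_path, seconds=5, fps=25):
    """Run FFmpeg; raises if the tool is missing or the encode fails."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg executable not found on PATH")
    cmd = build_still_to_video_cmd(image_path, output_path, seconds, fps)
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    print(" ".join(build_still_to_video_cmd("upload.jpg", "clip.mp4")))
```

Separating command construction from execution keeps the option list easy to inspect and unit-test on machines where FFmpeg is not installed.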

FEASIBILITY AND VIABILITY

Feasibility and viability:
• Combining NLP and image recognition is achievable using current deep learning models and frameworks.
• Significant training data, AI expertise, and cloud resources are needed for development and integration.
• Development and maintenance costs include cloud infrastructure, model training, and updates.
• High potential for applications in industries like e-commerce and healthcare.
• Cost of scaling: as usage scales, cloud infrastructure and computing power expenses may rise significantly.
• Risks include technological advancements and ensuring user adoption.
• The market for chatbots is competitive, but few solutions integrate image recognition, offering a unique value proposition.

Challenges and risks:
• The chatbot may have trouble recognizing images due to quality and lighting differences, and it also needs to protect user data.
• It depends on external services like Google, which might fail or change unexpectedly.
• Translating multiple languages can be hard, and the chatbot might not always have the right information for every image.

Strategies to overcome:
• Improve image recognition by regularly upgrading the system and training it with a wider variety of images.
• Safeguard user data with advanced encryption and security practices.
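
The slide names the risk that external services such as Google's might fail. One common way to soften that risk, offered here as an illustration and not as something the slide proposes, is to retry a failed call with exponential backoff and return a fallback value if every attempt fails. The helper name `call_with_retry`, its parameters, and the `flaky_translate` stub are hypothetical.

```python
import time


def call_with_retry(func, *args, retries=3, base_delay=0.5,
                    fallback=None, sleep=time.sleep):
    """Call func(*args); on failure, wait and retry with doubling delays.

    Returns the fallback value if every attempt raises. Catching Exception
    is a simplification for this sketch; real code should catch only the
    error types its API client documents.
    """
    for attempt in range(retries):
        try:
            return func(*args)
        except Exception:
            # Back off before the next attempt, but not after the last one.
            if attempt < retries - 1:
                sleep(base_delay * (2 ** attempt))
    return fallback


if __name__ == "__main__":
    calls = {"count": 0}

    def flaky_translate(text):
        # Hypothetical stand-in for a remote call: fails twice, then works.
        calls["count"] += 1
        if calls["count"] < 3:
            raise ConnectionError("service unavailable")
        return text.upper()

    print(call_with_retry(flaky_translate, "hello", base_delay=0.01))
```

Passing the sleep function as a parameter lets tests run without real delays; the fallback could be, for example, the untranslated English answer.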

IMPACT AND BENEFITS

Positive impact:
• Enhance learning with quick help for school tasks.
• Improve travel experiences by easily identifying landmarks and translating signs.
• Make informed decisions with personalized insights for products and equipment without external assistance.
• Save time and improve daily life with efficient solutions for various tasks.

Negative impact:
• Automation may reduce the need for human customer service roles, leading to job losses.
• Over-reliance on automated systems could reduce human interaction and adaptability.

Benefits:
• Empowers individuals with disabilities or language barriers.
• Allows users to instantly access information 24/7.
• Opens new markets with innovative services like virtual assistance in retail or healthcare, creating new revenue streams.
• Users benefit from increased productivity, as the chatbot offers quick, accurate answers that allow them to resolve issues.

RESEARCH AND REFERENCES
References
[1] Stanford CS224N Custom Project. report015.pdf (stanford.edu)
[2] Siri. Siri - Apple.
[3] Google Assistant. Google Assistant, your own personal Google.
[4] Visual Dialog. arXiv:1611.08669 (arxiv.org)
[5] Microsoft COCO: Common Objects in Context. COCO - Common Objects in Context (cocodataset.org)
[6] Bengio, Y., Ducharme, R., Vincent, P., and Jauvin, C. (2003). A neural probabilistic language model. Journal of Machine Learning Research, 3(Feb):1137–1155. [...] for neural dialogue generation. arXiv preprint arXiv:1701.06547.
[7] Harris, Z. S. (1954). Distributional structure. Word, 10(2-3):146–162.
[8] maskrcnn-benchmark. https://github.com/facebookresearch/maskrcnn-benchmark
[9] arXiv:1805.08318
[10] NLTK. http://www.nltk.org/
[11] Torch. Torch | Scientific computing for LuaJIT.
[12] Django. The web framework for perfectionists with deadlines | Django (djangoproject.com)
