Visual Image Caption Generator
A PROJECT REPORT
Submitted by
MANIKANDAN
(Reg.No. 0122127038)
of
in Computer Science
VIGNESHNARTHI, M.Sc., M.Phil., Ph.D.
March 2025
ABSTRACT
The combination of computer vision and natural language processing in artificial intelligence
has attracted considerable research interest in recent years, driven by the advent of deep
learning. Image captioning automatically describes the content of a photograph in English:
when a picture is captioned, the computer learns to interpret the visual information of the
image and express it in one or more sentences. Generating a meaningful description of
high-level image semantics requires the ability to analyze the state, properties, and
relationships of the objects in the scene. In this work, we apply CNN-LSTM architectural
models to image captioning with the goal of detecting objects and reporting them to users as
text messages. To identify objects correctly, the input image is first reduced to grayscale and
then processed by a Convolutional Neural Network (CNN). The COCO 2017 dataset was used.
The proposed method is intended to be extended for blind and visually impaired users by
converting the generated captions into speech messages, helping them reach their full
potential. In this project, we follow the key concepts of image captioning and its standard
pipeline, and develop a generative CNN-LSTM model that outperforms human baselines.
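Although the report does not include code, the grayscale preprocessing step described above can be illustrated with a minimal sketch. The example below assumes OpenCV and NumPy are available; the function name, file path, and target size are illustrative placeholders rather than values taken from this report.

```python
# Minimal sketch of the preprocessing step: reduce the input image to
# grayscale before it is passed to the CNN. Path and size are placeholders.
import cv2
import numpy as np

def preprocess_image(path, size=(224, 224)):
    """Load an image, convert it to grayscale, and scale pixel values to [0, 1]."""
    image = cv2.imread(path)                        # BGR image as a NumPy array
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  # reduce to a single channel
    gray = cv2.resize(gray, size)                   # fixed input size for the CNN
    gray = gray.astype(np.float32) / 255.0          # normalise intensities
    return gray[..., np.newaxis]                    # add channel axis: (H, W, 1)

# Example usage: features = preprocess_image("example.jpg")
```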
EXISTING SYSTEM
The existing systems for image captioning primarily rely on traditional machine learning
models and rule-based algorithms, which lack the capability to generate meaningful captions
due to limited contextual understanding of images. These systems focus on identifying
objects but fail to analyze their relationships and interactions, resulting in generic and less
descriptive outputs. Additionally, they rely heavily on handcrafted features and basic natural
language processing techniques, making them inefficient and unsuitable for diverse datasets
like COCO 2017. Moreover, these systems are not designed with accessibility features,
limiting their usability for visually impaired individuals.
DISADVANTAGES
● Struggles with diverse and complex datasets like COCO 2017, limiting scalability.
PROPOSED SYSTEM
The proposed system addresses these limitations by utilizing a deep learning approach that
combines Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM)
networks. This system analyzes the context, properties, and relationships between objects in
an image to generate high-quality captions in natural language. By training on the COCO
2017 dataset, it adapts to diverse real-world scenarios, starting with feature extraction using
a CNN and passing the results to an LSTM for caption generation. Additionally, the system
incorporates speech generation to convert captions into audio, making it accessible to
visually impaired users. This innovative approach aims to provide meaningful descriptions
while outperforming human baselines in accuracy and efficiency, offering an inclusive and
intelligent solution.
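The CNN-LSTM pipeline described above can be sketched in Keras. The sketch below is illustrative only: the vocabulary size, maximum caption length, feature dimension, and layer sizes are assumptions, not values specified in this report.

```python
# Minimal sketch of a CNN-LSTM (encoder-decoder) captioning model in Keras.
# A CNN feature vector and the partial caption are merged to predict the next word.
from tensorflow.keras.layers import Input, Dense, Embedding, LSTM, Dropout, add
from tensorflow.keras.models import Model

vocab_size = 8000   # assumed vocabulary size after tokenising the COCO captions
max_length = 34     # assumed maximum caption length in tokens
feature_dim = 2048  # assumed size of the CNN feature vector

# Image branch: project the CNN feature vector into the decoder space.
image_input = Input(shape=(feature_dim,))
image_dense = Dense(256, activation="relu")(Dropout(0.5)(image_input))

# Text branch: embed the previously generated words and encode them with an LSTM.
caption_input = Input(shape=(max_length,))
caption_embed = Embedding(vocab_size, 256, mask_zero=True)(caption_input)
caption_lstm = LSTM(256)(Dropout(0.5)(caption_embed))

# Merge both branches and predict the next word of the caption.
decoder = Dense(256, activation="relu")(add([image_dense, caption_lstm]))
output = Dense(vocab_size, activation="softmax")(decoder)

model = Model(inputs=[image_input, caption_input], outputs=output)
model.compile(loss="categorical_crossentropy", optimizer="adam")
model.summary()
```

For the speech output intended for visually impaired users, the generated caption string could be passed to a text-to-speech library; pyttsx3 is used here only as an example, since the report does not name a specific library.

```python
# Hedged sketch of the speech step: read a generated caption aloud.
# pyttsx3 is an assumed choice; the report does not specify a TTS library.
import pyttsx3

engine = pyttsx3.init()
engine.say("a man riding a horse on the beach")  # example caption text
engine.runAndWait()
```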
ADVANTAGES
1. HARDWARE REQUIREMENTS
● System: i3 Processor
● Hard Disk: 500 GB
● Monitor: 15" LED
● Input Devices: Keyboard, Mouse
● RAM: 4 GB
2. SOFTWARE REQUIREMENTS
3. ALGORITHMS
4. METHODOLOGY