Multimodal Content Processing Across Media Sources
INTRODUCTION
● The Visual Audio Text Summarizer is introduced to meet the increasing demand for
efficient tools in online education by streamlining the process of distilling essential
insights from educational videos.
● With a user-friendly design, this system aims to revolutionize how users interact
with educational videos, contributing to a more effective and personalized learning
experience in the digital education era.
OBJECTIVE
● Develop a system that automatically extracts key information from videos and
creates a concise, informative summary of the video.
● Provide viewers with a more user-friendly and time-efficient way to navigate and
access video content, reducing the need to watch lengthy videos in their entirety.
PROBLEM STATEMENT
● In today's digital landscape, the vast and ever-growing volume of video content poses a
significant challenge for users seeking to efficiently access and comprehend
information.
● Videos often require a substantial time investment to watch in full. Furthermore,
content creators frequently publish lengthy videos, making it impractical for users
with time constraints or limited attention spans to consume the entire content.
● The rapid growth of online audio and video content presents a dual challenge and
opportunity for product teams working in this space. To tackle this, we propose an
innovative Visual Audio Text Summarization approach.
LITERATURE SURVEY
TITLE: Abstractive Summarizer for YouTube Videos (Sulochana Devi, Rahul Nadar,
Alfredprem Lucas, 2023)
TECHNOLOGY/PROTOCOL: Natural Language Processing, Machine Learning, Abstractive
Summarization
ADVANTAGES: Summarizing videos based on their subtitles is the fastest way of
generating a summary.
DISADVANTAGES: It does not make use of sentences from the original content.
PROPOSED SYSTEM
PREPROCESSING
● The system includes a preprocessing module dedicated to categorizing videos
into two distinct paths: subtitle extraction or video to audio conversion.
● As part of its initial processing step, the system determines the presence of
subtitles within the video content.
● If subtitles are detected within the video, it is routed to the path designated
for subtitle extraction.
● Otherwise, the video is directed to the path designated for audio extraction.
● Video categorization and path assignment are carried out using ffmpeg.
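The routing decision above can be sketched with ffmpeg's companion tool ffprobe, which can report a file's subtitle streams as JSON. This is a minimal illustration, assuming ffmpeg is installed; the function names (`probe_command`, `has_subtitles`, `route`) are hypothetical, not part of the deck's implementation.

```python
import json
import subprocess

def probe_command(video_path):
    # ffprobe invocation that lists only subtitle streams ("-select_streams s") as JSON
    return [
        "ffprobe", "-v", "error",
        "-select_streams", "s",
        "-show_entries", "stream=index,codec_name",
        "-of", "json",
        video_path,
    ]

def has_subtitles(video_path):
    """Return True if the video contains at least one embedded subtitle stream."""
    out = subprocess.run(probe_command(video_path),
                         capture_output=True, text=True, check=True)
    return len(json.loads(out.stdout).get("streams", [])) > 0

def route(video_path):
    # Decide which preprocessing path the video takes
    return "subtitle_extraction" if has_subtitles(video_path) else "audio_extraction"
```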
AUDIO EXTRACTION MODULE
● Process of extracting the audio track from the video.
● Performed using MoviePy.
● MoviePy is a Python library used for video editing, processing, and
manipulation tasks. When it comes to extracting audio from a video file,
MoviePy offers a convenient and straightforward solution.
● After extracting the audio, the output is then forwarded to a text conversion
module to transcribe the spoken words into written text.
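The MoviePy step can be sketched as below. This is a minimal sketch assuming MoviePy 1.x (where the import path is `moviepy.editor`); the helper names are illustrative, and the third-party import is kept inside the function so the module loads even without MoviePy installed.

```python
def audio_output_path(video_path, ext=".wav"):
    # Derive the audio filename from the video filename, e.g. lecture.mp4 -> lecture.wav
    stem = video_path.rsplit(".", 1)[0]
    return stem + ext

def extract_audio(video_path):
    """Extract the audio track and write it next to the video file."""
    # Lazy import: requires `pip install moviepy` (import path shown is MoviePy 1.x)
    from moviepy.editor import VideoFileClip
    out_path = audio_output_path(video_path)
    with VideoFileClip(video_path) as clip:
        clip.audio.write_audiofile(out_path)
    return out_path
```

The returned path is then handed to the text conversion module.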
SUBTITLE EXTRACTION MODULE
● Extraction of subtitles from the video.
● Performed using the YouTube Transcript API when the user input is a YouTube URL.
● YouTube Transcript API: accesses and retrieves transcripts of videos hosted on
the YouTube platform.
● If an .srt subtitle file is available for the video, it is fetched directly and
the subtitles are extracted.
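A sketch of the YouTube path: first isolate the video ID from the URL, then fetch the transcript. This assumes the classic `YouTubeTranscriptApi.get_transcript` call from the `youtube-transcript-api` package (newer releases expose a slightly different interface); the helper names are illustrative.

```python
from urllib.parse import urlparse, parse_qs

def video_id_from_url(url):
    """Pull the video ID out of a YouTube URL (watch?v=... or youtu.be/... form)."""
    parsed = urlparse(url)
    if parsed.netloc.endswith("youtu.be"):
        return parsed.path.lstrip("/")
    return parse_qs(parsed.query)["v"][0]

def fetch_transcript_text(url):
    # Lazy import: requires `pip install youtube-transcript-api`
    from youtube_transcript_api import YouTubeTranscriptApi
    segments = YouTubeTranscriptApi.get_transcript(video_id_from_url(url))
    # Each segment carries "text", "start", and "duration"; join the text parts
    return " ".join(seg["text"] for seg in segments)
```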
TEXT CONVERSION MODULE
● Process of converting audio into text.
● The conversion can be performed by either of two models: IBM Watson and Whisper.
● IBM Watson: extraction of text from audio using the IBM Watson speech-to-text
model.
● Whisper: extraction of text using a pretrained OpenAI Whisper model.
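The Whisper path can be sketched as follows, using the `openai-whisper` package's `load_model`/`transcribe` calls; the wrapper and cleanup helper names are illustrative, and the model size "base" is an assumed default.

```python
def transcribe_with_whisper(audio_path, model_size="base"):
    """Transcribe an audio file with a pretrained Whisper model."""
    # Lazy import: requires `pip install openai-whisper` (and ffmpeg on the PATH)
    import whisper
    model = whisper.load_model(model_size)
    result = model.transcribe(audio_path)
    return result["text"]

def normalize_transcript(text):
    # Collapse runs of whitespace so downstream summarization sees clean sentences
    return " ".join(text.split())
```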
SUMMARIZATION MODULE
● Condenses the transcribed or extracted text into a concise summary of the video.
● It presents a promising solution for product teams working in the online audio
and video content space, enabling users to quickly grasp the key points of
lengthy videos.
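The deck does not name a specific summarization model, so as a stand-in, here is a minimal extractive sketch in the spirit of frequency-based sentence ranking (related to the TextRank approach cited in the references): sentences are scored by the average frequency of their words, and the top scorers are kept in original order. All names here are illustrative.

```python
import re
from collections import Counter

def summarize(text, max_sentences=2):
    """Keep the highest-scoring sentences, preserving their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]
    if len(sentences) <= max_sentences:
        return " ".join(sentences)
    # Word frequencies over the whole transcript
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        toks = re.findall(r"[a-z']+", sentence.lower())
        # Average frequency, so long sentences are not favored automatically
        return sum(freq[t] for t in toks) / max(len(toks), 1)

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)
```

A real deployment would substitute a trained abstractive model, but the interface (text in, short summary out) is the same.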
REFERENCES
[1] Z. Cernekova, I. Pitas, and C. Nikou, "Information Theory-Based Shot Cut/Fade
Detection and Video Summarization," IEEE.
[2] M. B. Andra, Department of Computer Science, Kumamoto University, Japan,
"Automatic Lecture Video Content Summarization with Attention-based Recurrent
Neural Network."
[3] S. E. F. de Avila, A. da Luz Jr., A. de A. Araújo, and M. Cord, Computer
Science Department, Federal University of Minas Gerais, "VSUMM: An Approach for
Automatic Video Summarization and Quantitative Evaluation," in IEEE Conference,
2016.
[4] P. Banerjee, Society for Natural Language Technology Research, Kolkata,
India, "Automatic Detection of Handwritten Texts from Video Frames of Lectures."
[5] S. Sah et al., "Semantic Text Summarization of Long Videos," in 2017 IEEE
Winter Conference on Applications of Computer Vision (WACV), 2017, pp. 989-997.
DOI: 10.1109/WACV.2017.115.
[6] A. Dilawari and M. U. G. Khan, "ASoVS: Abstractive Summarization of Video
Sequences," IEEE Access, vol. 7, 2019, pp. 29253-29263.
DOI: 10.1109/ACCESS.2019.2902507.
[8] H. Li et al., "Read, Watch, Listen, and Summarize: Multi-Modal Summarization
for Asynchronous Text, Image, Audio and Video," IEEE Transactions on Knowledge
and Data Engineering, vol. 31, no. 5, 2019, pp. 996-1009.
DOI: 10.1109/TKDE.2018.2848260.
[9] C. Zhang and Y. Tian, "Automatic video description generation via LSTM with
joint two-stream encoding," in 2016 23rd International Conference on Pattern
Recognition (ICPR), 2016, pp. 2924-2929. DOI: 10.1109/ICPR.2016.7900081.
[10] D. Ten, "Keyword and Sentence Extraction with TextRank (pytextrank)," 2018
(accessed November 7, 2020). URL: https://xang1234.github.io/textrank/.
[11] K. Davila and R. Zanibbi, "Whiteboard video summarization via spatiotemporal
conflict minimization," in Proc. 14th IAPR Int. Conf. Document Anal. Recognit.
(ICDAR), Nov. 2017, pp. 355-362.
THANK YOU