PROJECT 1
Title: Automatic Image Captioning
Objective: Build an image captioning model to generate captions for an image using a CNN.
Dataset Link: Flickr8k_dataset
Dataset description: A collection of sentence-based image descriptions.
● Dataset consists of 8k images in JPEG format with different shapes and sizes.
● Each image is paired with five different captions that provide clear descriptions of the salient entities and events.
● The images were chosen from six different Flickr groups and include a variety of scenes and situations.
Project Overview: Captioning images with a proper description is a popular research area of Artificial Intelligence. A good description of an image is often described as “Visualizing a picture in the mind”. Generating descriptions from images is a challenging task that can have a great impact in various applications such as virtual assistants, image indexing, recommendations in editing applications, assistance for visually impaired persons, and several other natural language processing applications. In this project, we need to create a multimodal neural network that combines the concepts of Computer Vision and Natural Language Processing to recognize the context of images and describe them in a natural language (e.g., English). Deploy the model and evaluate it on 10 different real-time images.
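As a starting point, here is a minimal sketch of one common design, a pretrained CNN encoder feeding an LSTM decoder. The vocabulary size, embedding/hidden dimensions, and dummy tensors are placeholders for the Flickr8k pipeline you will build; this is an illustrative sketch, not a prescribed architecture.

# Minimal sketch of a CNN encoder + LSTM decoder for image captioning.
# Assumptions: vocabulary size, embedding/hidden dims, and dummy tensors stand
# in for the Flickr8k data pipeline (tokenised captions, preprocessed images).
import torch
import torch.nn as nn
import torchvision.models as models


class EncoderCNN(nn.Module):
    """Encode an image into a fixed-length feature vector with a pretrained CNN."""
    def __init__(self, embed_size: int):
        super().__init__()
        resnet = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
        self.backbone = nn.Sequential(*list(resnet.children())[:-1])  # drop the FC head
        self.fc = nn.Linear(resnet.fc.in_features, embed_size)

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():                      # keep the backbone frozen at first
            feats = self.backbone(images).flatten(1)
        return self.fc(feats)                      # (batch, embed_size)


class DecoderRNN(nn.Module):
    """Generate a caption token by token, conditioned on the image feature."""
    def __init__(self, embed_size: int, hidden_size: int, vocab_size: int):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_size)
        self.lstm = nn.LSTM(embed_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, vocab_size)

    def forward(self, features: torch.Tensor, captions: torch.Tensor) -> torch.Tensor:
        # Prepend the image feature as the first "token" of the sequence.
        embeddings = torch.cat([features.unsqueeze(1), self.embed(captions)], dim=1)
        hiddens, _ = self.lstm(embeddings)
        return self.fc(hiddens)                    # (batch, seq_len + 1, vocab_size)


if __name__ == "__main__":
    encoder, decoder = EncoderCNN(256), DecoderRNN(256, 512, vocab_size=5000)
    images = torch.randn(4, 3, 224, 224)           # dummy batch in place of Flickr8k images
    captions = torch.randint(0, 5000, (4, 20))     # dummy tokenised captions
    logits = decoder(encoder(images), captions)
    print(logits.shape)                            # torch.Size([4, 21, 5000])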
Tools: Natural Language Toolkit, TensorFlow, PyTorch, Keras
Deployments: FastAPI, Streamlit | Cloud computing & hosting platforms: Heroku, Google Cloud
Final Submissions:
● GitHub Repository of the project
● Project Technical Report
● Project Presentation with desired outcomes
● Summary of 3 research papers
PROJECT 2
Project Title: AI-based Generative QA System
Objective: Finetune any GPT variant model for two tasks:
1. Given the body of an email, generate a succinct subject for the same.
2. Given a question pertaining to the AIML subject, model a system to generate its
corresponding answer.
Project Overview: This project intends to familiarize participants with generative text systems. It consists of two distinct tasks pertaining to the field. In the first task, participants work with a clean, prepared dataset and get hands-on experience finetuning a GPT model of their choice. While learning how to implement the finetuning of a GPT model on the subject line generation task, they will also be creating a fresh, new dataset for the second task. The trained QA model can then be deployed and tested on answering new AIML queries.
The project offers a complete learning experience, with a project cycle consisting of dataset curation, ideation, implementation, and deployment. We provide an overview of each of the two tasks below.
1. Email Subject Line Generation
As opposed to commonly solved tasks in the domain of news summarization or headline generation, which are closely related to this problem, this task [1] is unique in having to generate an extremely short, concise summary in the form of an email subject. This involves identifying the most salient sentences in the email body and abstracting the message contained in those sentences into only a few words. From the implementation point of view, this task offers an opportunity to work with generative models in NLP, using any GPT variant [2] of one's choice. Participants will also get to study the evaluation of text generation through different metrics.
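A hedged sketch of the finetuning setup is given below. The dataset id "aeslc" and its "email_body"/"subject_line" fields are assumptions based on the public Hugging Face copy of the AESLC corpus, and the prompt format and hyperparameters are illustrative choices, not requirements.

# Hedged sketch: finetune GPT-2 on email -> subject pairs with Hugging Face.
# The "aeslc" dataset id, its field names, the "TL;DR:" prompt format, and the
# hyperparameters are assumptions for illustration only.
from datasets import load_dataset
from transformers import (GPT2LMHeadModel, GPT2TokenizerFast,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token          # GPT-2 has no pad token by default
model = GPT2LMHeadModel.from_pretrained("gpt2")

raw = load_dataset("aeslc")                        # assumed id on the Hub; the GitHub release is the fallback

def to_features(batch):
    # Train the causal LM on "<email> TL;DR: <subject>" so that generation can be
    # prompted with "TL;DR:" at inference time. Long emails are simply truncated.
    texts = [f"{body}\nTL;DR: {subj}{tokenizer.eos_token}"
             for body, subj in zip(batch["email_body"], batch["subject_line"])]
    return tokenizer(texts, truncation=True, max_length=512)

tokenized = raw.map(to_features, batched=True, remove_columns=raw["train"].column_names)
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

args = TrainingArguments(output_dir="subject-gpt2", num_train_epochs=1,
                         per_device_train_batch_size=2, logging_steps=100)
trainer = Trainer(model=model, args=args,
                  train_dataset=tokenized["train"],
                  eval_dataset=tokenized["validation"],   # split name assumed from the Hub copy
                  data_collator=collator)
trainer.train()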
2. Question Answering on AIML Queries
Having learnt the process of model finetuning and evaluation in the first task, participants will turn to the primary objective of the second task: modeling a domain-specific GPT-variant model that can answer questions specific to the AIML course. It has been observed that while pretrained models can produce relevant textual output for general, open-domain prompts, they lack the capability of producing finer outputs when it comes to domain-specific tasks. For this purpose, we commonly finetune the model on a dataset specific to that task, to tailor its expertise to it. Here, the participants will work together to build a novel, relevant dataset for the task. Post finetuning, they will observe its performance on unseen, related questions.
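Once finetuned, querying the model might look like the sketch below. The checkpoint directory "aiml-qa-gpt2", the prompt template, and the generation settings are hypothetical placeholders for whatever your team ends up training.

# Hedged sketch of querying the finetuned QA model; the checkpoint path,
# prompt template, and generation settings are assumptions for illustration.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("aiml-qa-gpt2")   # hypothetical checkpoint dir
model = GPT2LMHeadModel.from_pretrained("aiml-qa-gpt2").eval()

def answer(question: str, max_new_tokens: int = 60) -> str:
    prompt = f"Question: {question}\nAnswer:"
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model.generate(**inputs, max_new_tokens=max_new_tokens,
                             do_sample=False, pad_token_id=tokenizer.eos_token_id)
    # Strip the prompt and return only the generated continuation.
    return tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

print(answer("What is the difference between bagging and boosting?"))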
[1] Introduced in "This Email Could Save Your Life: Introducing the Task of Email Subject Line Generation", ACL 2019.
[2] GPT-2 (an example of a GPT-variant model) implementation on Hugging Face: https://huggingface.co/docs/transformers/model_doc/gpt2
Dataset:
1. The Annotated Enron Subject Line Corpus: https://github.com/ryanzhumich/AESLC
This will be used for the first task.
2. AIML QA Corpus: To be curated as a collective effort of all the NLP project teams
Dataset description:
1. The Annotated Enron Subject Line Corpus
● The dataset consists of a subset of cleaned, filtered, and deduplicated emails from the Enron Email Corpus, which comprises employee email inboxes from the Enron Corporation.
● The evaluation (dev, test) splits of the data contain three subject lines per email written by human annotators. Multiple possible references facilitate a better evaluation of the generated subject, since it is difficult to have only one unique, appropriate subject per email.
● Some dataset statistics:
○ Sizes of train / dev / test splits: 14,436 / 1,960 / 1,906
○ An email contains an average of 75 words
○ A subject contains an average of 4 words
2. AIML QA Corpus
● This dataset will be created as a collective effort of all the teams participating
in NLP projects as a part of the AIML course, and later used for fine tuning
the GPT model.
● Each team will be provided with a question bank of 250 questions. Each question is to be given a short, 1-2 line answer and entered into a CSV file (see the sketch after this list).
● The given questions will be extracted from the course material already
covered through the AIML lectures.
● Participants will have to adhere to a strict deadline to complete the dataset
creation task (within 1 month of the project start) to facilitate sufficient time
for QA modeling.
● Post completion of the dataset, a common train / dev / test split will be
provided to the teams for experimenting on the main task.
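Once the per-team CSVs are collected, a minimal sketch of loading the curated questions and carving out provisional splits is given below. The file name aiml_qa.csv, the question/answer column names, and the 80/10/10 ratio are hypothetical; the common split distributed to the teams will supersede this.

# Hedged sketch of loading the team-curated QA CSV and making a provisional
# train/dev/test split; file name, column names, and ratios are placeholders.
import pandas as pd

df = pd.read_csv("aiml_qa.csv").sample(frac=1.0, random_state=42)  # assumed columns: question, answer
n = len(df)
splits = {
    "train": df.iloc[: int(0.8 * n)],
    "dev":   df.iloc[int(0.8 * n): int(0.9 * n)],
    "test":  df.iloc[int(0.9 * n):],
}
for name, split in splits.items():
    split.to_csv(f"aiml_qa_{name}.csv", index=False)   # one CSV per split
    print(name, len(split))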
Tools: Hugging Face, PyTorch, TensorFlow, Keras, WandB, NLTK
Deployments: FastAPI, Streamlit | Cloud computing & hosting platforms: Heroku, Google Cloud
Final Submissions:
● Project technical report & presentation with desired outcomes
● An overview of the modeling techniques used for the problem
● GitHub Repository of the project
● Summary of 3 research papers
PROJECT 3
Title: Image Tagging and Road Object Detection
Objective: Detect and tag objects in video, and examine how parallel object detection on multiple patches allows detection of smaller objects in the overall image without decreasing the resolution.
Dataset Link: BDD 100K Dataset.
Dataset description: The Berkeley DeepDrive (BDD) dataset is one of the largest and most diverse video datasets for autonomous vehicles.
● The dataset contains 100,000 video clips collected from more than 50,000 rides covering New York, the San Francisco Bay Area, and other regions.
● The dataset contains diverse scene types such as city streets, residential areas, and
highways.
● Furthermore, the videos were recorded in diverse weather conditions at different times
of the day.
Project Overview: Object detection and segmentation are among the most challenging problems in computer vision; they aim to identify all target objects and determine their categories and position information. Numerous approaches have been proposed to solve these problems, mainly inspired by methods from computer vision and deep learning. In this project, we aim to build a model that performs multi-object detection and segmentation in a moving video, e.g., image tagging, lane detection, drivable area segmentation, road object detection, semantic segmentation, instance segmentation, multi-object detection tracking, multi-object segmentation tracking, domain adaptation, and imitation learning.
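A hedged sketch of the patch-based idea on a single frame is given below, using a pretrained torchvision detector purely for illustration; the tile size, overlap, and NMS threshold are arbitrary choices to be tuned on BDD100K, and a purpose-trained road-object detector would replace the off-the-shelf model.

# Hedged sketch of patch-wise detection on one frame: tile the image, run a
# pretrained detector on all tiles in one batch, shift boxes back to full-frame
# coordinates, and merge overlapping detections with class-aware NMS.
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn, FasterRCNN_ResNet50_FPN_Weights
from torchvision.ops import batched_nms

model = fasterrcnn_resnet50_fpn(weights=FasterRCNN_ResNet50_FPN_Weights.DEFAULT).eval()

def detect_on_tiles(frame: torch.Tensor, tile: int = 640, overlap: int = 128):
    """frame: float tensor (3, H, W) in [0, 1]. Returns merged boxes, scores, labels."""
    _, H, W = frame.shape
    step = tile - overlap
    tiles, offsets = [], []
    for y in range(0, max(H - overlap, 1), step):
        for x in range(0, max(W - overlap, 1), step):
            tiles.append(frame[:, y:y + tile, x:x + tile])   # edge tiles may be smaller
            offsets.append((x, y))
    with torch.no_grad():
        outputs = model(tiles)                      # the detector accepts a list of images
    boxes, scores, labels = [], [], []
    for out, (ox, oy) in zip(outputs, offsets):
        # Translate tile-local boxes back into full-frame coordinates.
        boxes.append(out["boxes"] + torch.tensor([ox, oy, ox, oy], dtype=torch.float32))
        scores.append(out["scores"]); labels.append(out["labels"])
    boxes, scores, labels = torch.cat(boxes), torch.cat(scores), torch.cat(labels)
    keep = batched_nms(boxes, scores, labels, iou_threshold=0.5)  # merge duplicates from overlaps
    return boxes[keep], scores[keep], labels[keep]

if __name__ == "__main__":
    frame = torch.rand(3, 720, 1280)                # dummy frame in place of a BDD100K frame
    boxes, scores, labels = detect_on_tiles(frame)
    print(boxes.shape, scores.shape)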
Tools: TensorFlow, PyTorch, Keras
Deployments: FastAPI, Streamlit | Cloud computing & hosting platforms: Heroku, Google Cloud
Final Submissions:
● GitHub Repository of the project
● Project Technical Report
● Project Presentation with desired outcomes
● Summary of 3 research papers
PROJECT 4
Title: Automatic Speech Recognition (ASR)
Objective: Build an ASR model for converting speech to text.
Dataset Link: LibriSpeech
Dataset description: LibriSpeech is a corpus of read English speech, suitable for training and evaluating speech recognition systems, published in 2015 by Johns Hopkins University. It is derived from audiobooks that are part of the LibriVox project and contains 1,000 hours of speech from about 2,000 speakers, sampled at 16 kHz. The LibriVox project, a volunteer effort, is responsible for the creation of approximately 8,000 public domain audiobooks, the majority of which are in English. Most of the recordings are based on texts from Project Gutenberg, also in the public domain. The data is already divided into train/dev/test sets. The total size of the data is 60 GB, and subsets of different sizes are available.
Initially, we recommend working only with the 'dev-clean' and 'test-clean' subsets for building the model. Either one, or a combination of both, can be used as the training set, and a held-out subset of 'dev-clean' or 'test-clean' can be used for testing. Once modeling is done with these smaller subsets, start modeling with the larger 'train-clean'/'train-other' subsets as the training set; 'dev-clean', 'test-clean', and 'test-other' are then used for validation/testing purposes only.
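For reference, the recommended 'dev-clean' subset can be pulled directly with torchaudio, as in the sketch below; the local data directory is a placeholder, and downloading directly from openslr.org works just as well.

# Hedged sketch of loading the 'dev-clean' subset with torchaudio; each item
# yields the waveform, sample rate, transcript, and speaker/chapter/utterance ids.
import torchaudio

dev_clean = torchaudio.datasets.LIBRISPEECH(root="./data", url="dev-clean", download=True)
waveform, sample_rate, transcript, speaker_id, chapter_id, utterance_id = dev_clean[0]
print(sample_rate, waveform.shape, transcript[:80])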
Project Overview: Automatic speech recognition is an application of machine learning or AI in which human speech is processed and converted into readable text. Numerous applications exist, such as real-time captions on Instagram, podcast transcriptions on Spotify, YouTube video transcription, Zoom meeting transcriptions, etc. The field has grown exponentially over the last few years, with an explosion of applications taking advantage of ASR technology to make audio and video data more accessible.
There are different approaches to Automatic Speech Recognition, viz. traditional HMM (Hidden Markov Model) and GMM (Gaussian Mixture Model) pipelines and end-to-end deep learning models. In this project, we aim to build and deploy a model that can generate written text from speech with decent accuracy.
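As a quick baseline and a sanity check for the data pipeline, a pretrained wav2vec 2.0 bundle from torchaudio with a naive greedy CTC decoder can transcribe a single utterance, as sketched below; this is only a hedged reference point, not the required approach for the project.

# Hedged sketch: end-to-end inference with torchaudio's pretrained wav2vec 2.0
# bundle plus a naive greedy CTC decoder. The sample file path is hypothetical.
import torch
import torchaudio

bundle = torchaudio.pipelines.WAV2VEC2_ASR_BASE_960H
model = bundle.get_model().eval()
labels = bundle.get_labels()                        # CTC label set; index 0 is the blank

def transcribe(path: str) -> str:
    waveform, sr = torchaudio.load(path)
    if sr != bundle.sample_rate:                    # resample to the model's expected rate
        waveform = torchaudio.functional.resample(waveform, sr, bundle.sample_rate)
    with torch.no_grad():
        emissions, _ = model(waveform)
    ids = emissions[0].argmax(dim=-1)               # greedy decoding over frames
    ids = torch.unique_consecutive(ids)             # collapse repeated symbols (CTC)
    tokens = [labels[i] for i in ids if i != 0]     # drop the blank symbol at index 0
    return "".join(tokens).replace("|", " ").strip()

print(transcribe("sample.flac"))                    # hypothetical dev-clean utterance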
Tools: Kaldi, PyTorch, Audio Processing tool/library.
Operating system: Linux - Ubuntu
Deployments: FastAPI, Streamlit | Cloud computing & hosting platforms: Heroku, Google Cloud
Reference: Papers using LibriSpeech
Final Submissions:
● GitHub Repository of the project
● Project Technical Report
● Project Presentation with desired outcomes
● Summary of 3 research papers
PROJECT 5
Title: Automatic Number Plate Recognition (ANPR)
Objective: Build a CV model for recognizing the Number Plate and Displaying the Number.
Dataset Link: Image Dataset
Dataset description: The dataset consists of 433 images with bounding box annotations of the car license plates within the images. Annotations are provided in the PASCAL VOC format, i.e., each image is accompanied by an XML file containing the object annotations.
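The VOC-style XML files can be parsed with the standard library, as in the hedged sketch below; the file path and the exact object class name are assumptions based on the usual VOC tag layout (object/name, object/bndbox with xmin/ymin/xmax/ymax).

# Hedged sketch of reading one PASCAL VOC annotation file from the dataset.
import xml.etree.ElementTree as ET

def load_voc_boxes(xml_path: str):
    """Return a list of (class_name, xmin, ymin, xmax, ymax) tuples."""
    root = ET.parse(xml_path).getroot()
    boxes = []
    for obj in root.iter("object"):
        name = obj.findtext("name")                 # plate class name as stored in the dataset
        bb = obj.find("bndbox")
        boxes.append((name,
                      int(float(bb.findtext("xmin"))), int(float(bb.findtext("ymin"))),
                      int(float(bb.findtext("xmax"))), int(float(bb.findtext("ymax")))))
    return boxes

print(load_voc_boxes("annotations/Cars0.xml"))      # hypothetical path into the dataset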
Project Overview:
AI and deep learning are being used everywhere, from voice assistants to self-driving cars.
One such application is the Automatic Number Plate Recognition (ANPR) of Vehicles.
ANPR is a technology that uses the power of AI and deep learning to automatically detect
and recognize the characters of a vehicle’s license plate.
With the increase in the number of vehicles, vehicle tracking has become an important
research area for efficient traffic control, surveillance, and finding stolen cars. The specific
use cases may be traffic violation control, parking management, toll booth payments, etc. For
this purpose, efficient real-time license plate detection and recognition are of great
importance.
Challenges: Due to the variation in background and font color, font style, and license plate size, along with non-standard characters and the issue of robustness in varying weather conditions, license plate recognition is a great challenge, especially in developing countries. The given dataset is for building a baseline model. You are expected to add a further 100 images captured under Indian conditions and try to overcome these challenges by applying different techniques. You may have to create bounding boxes for the number plate regions in the additional images. Overall, you have to complete two tasks. Task 1: Data collection and creation of bounding boxes. Task 2: ANPR.
Automatic Number Plate Recognition (ANPR) implementation involves the following three steps:
Step 1: Detect and localize a license plate in an input image/frame
Step 2: Extract the characters from the license plate
Step 3: Apply some form of Optical Character Recognition (OCR) to recognize the extracted characters
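A hedged sketch of Steps 2-3 is shown below, assuming the plate bounding box comes from your Step 1 detector (e.g., a YOLO or Faster R-CNN model); pytesseract and its page-segmentation setting are one possible OCR choice, not a requirement, and the image path and box values are hypothetical.

# Hedged sketch of Steps 2-3: crop the detected plate region and run OCR on it.
import cv2
import pytesseract

def read_plate(image_path: str, box) -> str:
    """box: (xmin, ymin, xmax, ymax) produced by the Step 1 plate detector."""
    img = cv2.imread(image_path)
    xmin, ymin, xmax, ymax = box
    plate = img[ymin:ymax, xmin:xmax]               # Step 2: extract the plate region
    gray = cv2.cvtColor(plate, cv2.COLOR_BGR2GRAY)
    # Binarise so the characters stand out for the OCR engine.
    _, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    text = pytesseract.image_to_string(
        thresh, config="--psm 7")                   # Step 3: psm 7 treats the crop as one text line
    return "".join(ch for ch in text if ch.isalnum())

print(read_plate("car.jpg", (120, 340, 360, 400)))  # hypothetical image and detector output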
Tools: TensorFlow, PyTorch, Keras, OpenCV, OCR-Tool, Yolo
Deployments: FastAPI, Streamlit, Gradio | Cloud computing & hosting platforms: Heroku, AWS, Google Cloud, etc.
Final Submissions:
● GitHub Repository of the project
● Project Technical Report
● Project Presentation with desired outcomes
● Summary of 3 research papers