manas-resume.pdf
Education
8/10 BTech in Computer Science, Guru Gobind Singh Indraprastha University | Delhi, India 2022‑26
Achievements: Finalist @ Hustle X (Business Incubator, IIM Lucknow) | Winner @ HackMait 4.0, Projexon BVP | Finalist @ Innovate X
(NSUT) | Beta MLSA
Courses: Supervised Machine Learning: Regression and Classification | Neural Networks and Deep Learning |
Deep Learning and Transformers | Working with LLMs | Generative AI with LLMs
Skills
Programming Python, C/C++ (DSA), Git, Scripting (Bash), HTML, CSS, JavaScript
Software TensorFlow, PyTorch, Docker, OpenCV, Flask, FastAPI, AWS
Projects
Dockerized‑Whisper Nov 2023 ‑ Dec 2023
Whisper ASR with FastAPI in Docker
• Developed a containerized application with Docker that integrates OpenAI's Whisper Automatic Speech Recognition (ASR)
model with a FastAPI backend.
• Used FastAPI to build a high-performance API for communicating with the Whisper ASR model.
• Packaged the ASR service in Docker for easy deployment and scalability.
• Followed a microservices architecture, isolating the ASR functionality for modularity and flexibility.
• Applied RESTful API design principles to make the ASR service accessible and easy to use.
Advocate Falcon Oct 2023 ‑ Nov 2023
Legal Chatbot
• Built a simple graphical user interface (GUI) with PySimpleGUI for language selection; users choose between
English, Spanish, and French from a combo box.
• Integrated the Hugging Face Hub to access a language model for generating specific answers related to education,
initializing the model through the HuggingFaceHub class with parameters such as temperature and maximum new
tokens.
• Designed a chatbot using the Chainlit framework that lets users upload an image and ask questions about it, and that
stores the chat history.
Visual‑Voice Sept 2023 ‑ Oct 2023
Revolutionizing accessibility and communication through seamless image captioning and spoken
language‑to‑Hindi speech conversion.
• Implemented a Streamlit web application using the Transformers library to perform image captioning on uploaded images.
• Extended functionality to translate the generated English caption to Hindi using the ’translate’ library and converted the translated
caption to speech using the gTTS library, enhancing accessibility.
• Used VisionEncoderDecoderModel, ViTFeatureExtractor, and AutoTokenizer from the Hugging Face Transformers library
to process images and generate captions.