My 2nd blog post for the Vizuara Substack is live now! It covers embeddings in NLP in detail.

First, we go through the evolution of approaches for working with text:
- Bag of words
- TF-IDF
- Word2Vec
- Transformers and sentence embeddings

Then we discuss how to tell whether two texts have similar meanings, and finally we look at different approaches to visualising text embeddings.

Here is the link to the blog post: https://lnkd.in/gn3YGZqM

Raj Abhijit Dandekar

#nlp #embeddings #vectors #word2vec #tf-idf #bagofwords #sentenceembeddings
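To make the similarity idea concrete, here is a minimal sketch (not taken from the blog post) that compares meanings via sentence embeddings and cosine similarity; the sentence-transformers library and the "all-MiniLM-L6-v2" checkpoint are just one common choice:

```python
# Minimal sketch: sentence embeddings + cosine similarity.
# Assumes `pip install sentence-transformers`; the model name is one common choice.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "The cat sat on the mat.",
    "A feline rested on the rug.",
    "Stock prices fell sharply today.",
]
embeddings = model.encode(sentences)

# Cosine similarity close to 1.0 means similar meaning.
print(util.cos_sim(embeddings[0], embeddings[1]))  # high: paraphrases
print(util.cos_sim(embeddings[0], embeddings[2]))  # low: unrelated topics
```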
-
🚀📚 Day 03/50: Spam Detection

Today, I took on the challenge of building a spam filter using logistic regression as part of my 50 Days of NLP journey! 📧🚫

Objective: Classify emails as spam or ham (non-spam) using a logistic regression model with TF-IDF features.

Key Learnings:
📊 Data Exploration and Preprocessing: Explored the dataset, tokenized the text, removed stop words, and used TF-IDF for feature extraction.
🔍 Model Building: Trained a logistic regression model to distinguish between spam and ham emails.
📈 Evaluation Metrics:
- Accuracy: 82.38%
- Precision: 81.48%
- Recall: 83.54%
- F1 Score: 82.50%

Visual Insights:
- Confusion Matrix: Displayed the true positives, true negatives, false positives, and false negatives.
- Performance Metrics: Visualized the accuracy, precision, recall, and F1 score with a bar plot.

Conclusion: Building a spam filter highlights the importance of effective text preprocessing and feature extraction in NLP tasks. The logistic regression model performed well, offering valuable insights into text classification.

Stay tuned as I continue this exciting journey into the world of NLP! Connect with me to follow my progress and learn more about NLP applications.

#NLP #SpamDetection #MachineLearning #DataScience #50DaysOfNLP #NaturalLanguageProcessing

GitHub Repository: https://lnkd.in/dMJYN3XR
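Here is a minimal sketch of the pipeline described above (TF-IDF features feeding a logistic regression classifier in scikit-learn); the toy emails and labels are placeholders, not the dataset from the repository:

```python
# Minimal sketch: TF-IDF + logistic regression spam filter.
# The texts/labels below are toy placeholders for the real email dataset.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

texts = [
    "Win a FREE prize, click now", "Lowest price meds, buy today",
    "Meeting moved to 3pm", "Can you review the attached report?",
    "Congratulations, you have been selected", "Lunch tomorrow?",
]
labels = [1, 1, 0, 0, 1, 0]  # 1 = spam, 0 = ham

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.33, random_state=42, stratify=labels
)

# TF-IDF turns each email into a weighted term vector; logistic regression
# learns a linear decision boundary over those features.
clf = make_pipeline(
    TfidfVectorizer(stop_words="english"),
    LogisticRegression(max_iter=1000),
)
clf.fit(X_train, y_train)

print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```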
-
🌟 Day 05/50 of the 50 Days of NLP Challenge 🌟

Today's adventure in the world of Natural Language Processing was all about Text Summarization! 📄✨

🔍 Objective: Generate extractive and abstractive summaries of news articles to provide concise and informative content.

🔧 Approaches Explored:
1. Extractive Summarization: Identified key sentences to form a summary.
2. Abstractive Summarization with BART: Generated new sentences to create a summary using the BART model from Hugging Face.

📊 Results:

Extractive Summarization:
- ROUGE-1: Precision: 0.3074, Recall: 0.9326, F-measure: 0.4624
- ROUGE-2: Precision: 0.2119, Recall: 0.6477, F-measure: 0.3193
- ROUGE-L: Precision: 0.2630, Recall: 0.7978, F-measure: 0.3955

Abstractive Summarization:
- ROUGE-1: Precision: 1.0, Recall: 0.1686, F-measure: 0.2886
- ROUGE-2: Precision: 0.8710, Recall: 0.1011, F-measure: 0.1812
- ROUGE-L: Precision: 1.0, Recall: 0.1686, F-measure: 0.2886

🔍 Insights:
- Extractive summarization excels in recall, capturing most key information but including some irrelevant details.
- Abstractive summarization with BART achieves high precision and generates concise summaries but may miss some key points.

🚀 Future Directions:
1. Integrate extractive and abstractive techniques.
2. Fine-tune the BART model parameters.
3. Experiment with other advanced models like T5 or PEGASUS.
4. Test on a variety of articles to enhance robustness.

Text summarization is a powerful tool for quickly understanding lengthy documents, which is crucial for news aggregation, research, and content curation. This journey into text summarization has provided valuable insights and set the stage for further exploration and innovation! 🌐📚

Stay tuned for more exciting projects in the 50 Days of NLP Challenge! 💡✨

#NLP #DataScience #MachineLearning #ArtificialIntelligence #TextSummarization #BART #50DaysOfNLP #Python #DataScienceJourney

GitHub Repository: [50 Days of NLP](https://lnkd.in/dmAQAaut)
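For the abstractive side, here is a hedged sketch of BART summarization plus ROUGE scoring. "facebook/bart-large-cnn" is one common checkpoint and the rouge-score package one common scorer; the repository may use different ones, and the article and reference texts are placeholders:

```python
# Minimal sketch: abstractive summarization with BART + ROUGE evaluation.
# Assumes `pip install transformers rouge-score`; checkpoint is an assumption.
from transformers import pipeline
from rouge_score import rouge_scorer

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

article = (
    "The central bank raised interest rates by a quarter point on Wednesday, "
    "citing persistent inflation. Officials signalled that further increases "
    "remain possible if price growth does not slow in the coming months."
)
reference = "The central bank raised rates and may raise them again."

summary = summarizer(article, max_length=40, min_length=10, do_sample=False)
generated = summary[0]["summary_text"]
print(generated)

# ROUGE-1/2/L precision, recall, and F-measure against the reference summary.
scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
print(scorer.score(reference, generated))
```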
-
🚀 NLP Journey: Days 06 to 12 🚀

Excited to share my latest progress in my Natural Language Processing (NLP) journey! Over the past week, I've dived deep into several fascinating projects and learned a ton about the intricacies of NLP. Here's a recap of what I've accomplished:

📌 Day 06: Part-of-Speech Tagging
I explored how to tag parts of speech in sentences using the NLTK library. Understanding the grammatical structure of sentences has been crucial in enhancing text analysis.

📌 Day 07: Language Detection
I built a model to detect the language of a given text using character n-grams. This project was enlightening in terms of how subtle patterns can reveal the language of a text.

📌 Day 08: Word Cloud Generator
I created a word cloud from a collection of text documents. This visual representation helps in quickly grasping the most frequent terms in a dataset.

📌 Day 09: Text Classification
I classified news articles into various categories such as sports, politics, and technology. Leveraging models like Logistic Regression, Naive Bayes, SVM, Random Forests, and Neural Networks provided insights into the strengths of different approaches.

📌 Day 10: Chatbot Basics
I developed a simple rule-based chatbot. It was fascinating to see how even basic rules can enable a machine to simulate a conversation.

📌 Day 11: Text Similarity
I calculated the similarity between two sentences using cosine similarity (a minimal sketch follows below). This project highlighted the importance of measuring text similarity in tasks like plagiarism detection and document clustering.

📌 Day 12: Keyword Extraction
I extracted keywords from a document using TF-IDF. This technique is incredibly powerful for identifying the most significant terms in a text, aiding in tasks like summarization and topic modeling.

This week has been a whirlwind of learning and growth, and I'm excited to continue this journey. Each project has not only broadened my technical skills but also deepened my understanding of how NLP can transform data into meaningful insights. Stay tuned for more updates!

GitHub Repository: https://lnkd.in/dmAQAaut

#NLP #MachineLearning #ArtificialIntelligence #DataScience #TechLearning #Python #DeepLearning #NaturalLanguageProcessing
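As a taste of the Day 11 project, here is a minimal sketch of sentence similarity via cosine similarity over TF-IDF vectors (one standard approach; the repository's implementation may differ):

```python
# Minimal sketch: cosine similarity between two sentences over TF-IDF vectors.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

a = "The weather is sunny and warm today."
b = "Today the weather feels warm and sunny."

vectors = TfidfVectorizer().fit_transform([a, b])        # one row per sentence
print(cosine_similarity(vectors[0], vectors[1])[0, 0])   # near 1.0 for paraphrases
```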
-
Just finished the course Fundamentals of NLP: Introducing Natural Language Processing.
Skillsoft® Digital Badge: Fundamentals of NLP: Introducing Natural Language Processing • Pratik Lahamge
-
Excited to share my latest article diving deep into the intricacies of BERT! 🚀

Discover how BERT's attention mechanism and novel pretraining techniques are revolutionizing natural language processing, and read on for a comprehensive look at BERT's role in shaping the field's future.

#BERT #NLP #AI

https://lnkd.in/g3T5vmRJ
Understanding BERT: A Comprehensive Theoretical Study
-
🚀 New Medium Post 🚀

Excited to share my latest blog on Transformers: Self-Attention Mechanism from Scratch using PyTorch! 🤖

In this post, I break down the fundamentals of self-attention and walk through a step-by-step implementation using PyTorch. Whether you're diving into NLP or interested in understanding the mechanics behind state-of-the-art models, this guide will provide you with hands-on experience. Check it out and let me know your thoughts! 🛠️

#Transformers #DeepLearning #PyTorch #NLP #SelfAttention #AI #MachineLearning #ArtificialIntelligence #DataScience

Read “Transformers: Self-Attention Mechanism from scratch using PyTorch.” by Devmallya Karar on Medium: https://lnkd.in/gvxnFA2X
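In the spirit of the post, here is a condensed single-head self-attention layer in PyTorch (a sketch, not the article's actual code):

```python
# Minimal sketch: single-head self-attention, softmax(QK^T / sqrt(d)) V.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfAttention(nn.Module):
    def __init__(self, embed_dim: int):
        super().__init__()
        # Learned projections for queries, keys, and values.
        self.q = nn.Linear(embed_dim, embed_dim)
        self.k = nn.Linear(embed_dim, embed_dim)
        self.v = nn.Linear(embed_dim, embed_dim)

    def forward(self, x):  # x: (batch, seq_len, embed_dim)
        Q, K, V = self.q(x), self.k(x), self.v(x)
        # Scaled dot-product scores; each token attends over all tokens.
        scores = Q @ K.transpose(-2, -1) / (x.size(-1) ** 0.5)
        weights = F.softmax(scores, dim=-1)
        return weights @ V

x = torch.randn(2, 5, 16)          # 2 sequences, 5 tokens, 16-dim embeddings
print(SelfAttention(16)(x).shape)  # torch.Size([2, 5, 16])
```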
-
🔍 Curious about the technology behind modern breakthroughs in natural language processing? Look no further! 🚀

Dive into my latest blog post, where I unravel the intricate workings of transformers, the powerhouse behind NLP's evolution. Here's a glimpse of what awaits you:

🔍 Unlocking NLP's Potential: Explore the significance of transformers and their pivotal role in shaping the future of language understanding.
💡 Insider Insights: Gain a deeper understanding of attention mechanisms and their variants, and how transformers effectively capture contextual information.
🛠️ Practical Implementation: Follow along as I break down the fundamental components of the transformer model with Python code snippets, allowing you to roll up your sleeves and experiment firsthand.
🔮 Beyond the Basics: Journey into advanced concepts like BERT and GPT, and discover their real-world applications across various NLP tasks.

Ready to embark on this enlightening journey with me? Check out the full blog post and unlock the secrets of NLP's core technology! 💬✨

https://lnkd.in/gT_bSYeU

#NLP #Transformers #MachineLearning #DeepLearning #Python #DataScience #ArtificialIntelligence #Technology #Programming #TechTrends
Understanding Transformers: A Deep Dive into NLP's Core Technology
-
New Article Alert: Exploring the Self-Attention Mechanism

I've just published a new blog post diving deep into the self-attention mechanism, one of the most transformative concepts behind the recent advancements in NLP models like Transformers.

🌐 Check it out here: https://lnkd.in/d_4WdC4M

Your thoughts and feedback would mean a lot to me! If you spot anything that could use some improvement or have any suggestions, feel free to share. 🙌 Let's keep learning and growing together.

#selfattention #nlp #transformers #machinelearning #datascience
A Deep Dive into the Self-Attention Mechanism of Transformers
-
🎉🎊 Thrilled to announce the publication of my latest article, 'Navigating the GenAI Frontier: Transformers, GPT, and the Path to Accelerated Innovation,' written during my internship at Innomatics Research Labs. Even amidst the hustle, the urge to explore groundbreaking NLP research couldn't be stifled!

Under the guidance of Kanav Bansal, my mentor at Innomatics Research Labs, I embarked on a journey through pivotal moments in NLP: from the pioneering Seq2Seq and NMT models to the revolutionary Transformers and the ascent of GPT. The article offers a concise yet comprehensive analysis of these game-changing works, unveiling their historical context, significance, and impact on NLP's evolution.

I extend an open invitation to dive into the article, share your insights, and witness firsthand the transformative power of these architectures. Your feedback fuels my learning journey as we pave the way for the future of NLP.

#NLP #MachineLearning #DeepLearning #AI #InnomaticsResearchLabs
-
🚀 Day 04/50: Exploring Named Entity Recognition (NER) 🌐

Today, I embarked on a journey into Named Entity Recognition (NER) as part of my 50 Days of NLP challenge. NER is a crucial task in Natural Language Processing that involves identifying and categorizing names of entities such as people, organizations, and locations within text.

Project Highlights:
🔍 Objective: Implement a NER model to automatically identify and classify named entities in textual data.
🔧 Techniques: Leveraged spaCy's 'en_core_web_sm' model for robust entity recognition and classification.
🌟 Achievements: Successfully extracted and categorized entities across diverse texts, demonstrating the model's accuracy and versatility.

Why NER Matters: Named Entity Recognition is pivotal in:
- Information Extraction: Automating the extraction of key entities from unstructured text for further analysis.
- Entity Linking: Connecting recognized entities to external knowledge bases for enriched understanding.
- Entity-centric Applications: Enhancing search engines, chatbots, and sentiment analysis systems.

Next Steps: Looking ahead, I plan to:
- Dive deeper into advanced NER techniques such as domain-specific entity recognition.
- Explore integrations with other NLP tasks like sentiment analysis and text summarization.
- Apply NER in real-world scenarios to solve industry-specific challenges and improve data-driven decision-making.

Join me on this journey as I continue to unravel the power of NLP in unlocking insights from textual data. Let's connect and exchange ideas on the transformative potential of Named Entity Recognition!

#NLP #NamedEntityRecognition #MachineLearning #DataScience #LinkedInLearning #50DaysOfNLP #NaturalLanguageProcessing

GitHub Repository: https://lnkd.in/dMJYN3XR
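Here is a minimal sketch of NER with the spaCy model named above (run `python -m spacy download en_core_web_sm` first; the example sentence is a placeholder, not from the repository):

```python
# Minimal sketch: named entity recognition with spaCy's en_core_web_sm.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple is opening a new office in Bangalore, Tim Cook said on Monday.")

for ent in doc.ents:
    # e.g. Apple -> ORG, Bangalore -> GPE, Tim Cook -> PERSON, Monday -> DATE
    print(ent.text, "->", ent.label_)
```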