🚀 Excited to Share My Latest AI Project! 🚀

I'm thrilled to present my recent work on training a small language model (SLM) inspired by Andrej Karpathy's nanoGPT. This experiment aimed to push the boundaries of what a relatively small model can achieve in terms of generating coherent text.

🔍 Project Highlights:
- Model Size: 123.59 million parameters.
- Datasets: High-quality datasets sourced from Hugging Face.
- Training: Initial training on ~1.4 billion tokens, followed by fine-tuning to enhance specific task performance.
- Results: The model generates coherent text but struggles with instruction-following and information retrieval, demonstrating both the capabilities and the limitations of smaller models.

📂 Explore the project on GitHub: https://lnkd.in/eMSBAQwx

💡 Acknowledgements: A huge thank you to Andrej Karpathy for nanoGPT and to the dataset providers on Hugging Face.

Feel free to fork the repo and send in your pull requests. Let's push the boundaries of AI together! 🌟

#AI #MachineLearning #LanguageModel #DeepLearning #HuggingFace #OpenSource #ArtificialIntelligence #DataScience
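
A quick sanity check on the 123.59 million figure, for anyone curious where a number like that can come from. The post does not state the model configuration, so everything in this sketch is an assumption: a GPT-2-small-like shape (12 layers, 768-dimensional embeddings, 1,024-token context), a vocabulary padded to 50,304, no bias terms, tied input/output embeddings, and position embeddings left out of the reported count. The function name `gpt_param_count` is hypothetical and is not taken from the repo.

```python
def gpt_param_count(n_layer, n_embd, vocab_size, block_size,
                    bias=False, count_pos_emb=False):
    """Count parameters of a GPT-2-style decoder with tied embeddings."""
    # LayerNorm: a weight vector, plus a bias vector if biases are enabled.
    layer_norm = n_embd * (2 if bias else 1)
    # Attention: fused QKV projection (n_embd x 3*n_embd) and output projection.
    attention = 3 * n_embd * n_embd + n_embd * n_embd
    # MLP: expand to 4*n_embd, then project back down.
    mlp = 2 * (n_embd * 4 * n_embd)
    if bias:
        attention += 3 * n_embd + n_embd   # QKV bias + output-projection bias
        mlp += 4 * n_embd + n_embd         # expansion bias + projection bias
    block = 2 * layer_norm + attention + mlp

    # Token embedding is shared with the output head, so it is counted once.
    total = vocab_size * n_embd + n_layer * block + layer_norm
    if count_pos_emb:
        total += block_size * n_embd
    return total


# Assumed configuration (not confirmed by the post).
params = gpt_param_count(n_layer=12, n_embd=768, vocab_size=50304, block_size=1024)
print(f"{params / 1e6:.2f}M parameters")        # 123.59M parameters
print(f"{1.4e9 / params:.1f} training tokens per parameter")
```

Under those assumptions the count is 123,587,328, which rounds to the 123.59 million quoted in the post, and ~1.4 billion tokens works out to roughly 11 tokens per parameter. If the actual model used a different vocabulary size or kept bias terms, the total would differ slightly.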
Daniel Sarfraz’s Post
More Relevant Posts
-
Free Resources to harness the power of AI:

Embrace the wealth of free AI resources. Dive in, BUT remember: transforming knowledge into AI expertise requires dedication and consistent effort.

𝗬𝗼𝘂𝗧𝘂𝗯𝗲 𝗖𝗵𝗮𝗻𝗻𝗲𝗹𝘀:
• mattwolfe
• dirkzee
• csdojo
• analyticsvidhya
• twominutepapers

𝗕𝗹𝗼𝗴 𝗪𝗲𝗯𝘀𝗶𝘁𝗲𝘀:
• towardsdatascience
• machinelearningisfun
• machinelearningmastery
• fastml
• Ai.googleblog

𝗗𝗮𝘁𝗮𝘀𝗲𝘁 𝗪𝗲𝗯𝘀𝗶𝘁𝗲𝘀:
• paperswithcode
• huggingfacedatasets
• openml
• machinehackdatasets
• Googleplatformdatasets

𝗟𝗲𝗮𝗿𝗻𝗶𝗻𝗴 𝗪𝗲𝗯𝘀𝗶𝘁𝗲𝘀:
• mygreatlearning
• classcentral
• dirkzee
• simplilearn
• Edx

𝗣𝗼𝗱𝗰𝗮𝘀𝘁𝘀:
• alunleashed
• theneuralnexus
• bytesofintelligence
• mindsandmachines
• Algorithmalley

𝗕𝗼𝗼𝗸𝘀:
• almodernapproach - Russell
• deeplearning - Goodfellow
• mlprobabilisticperspective - Murphy
• pythonml - Raschka
• aianewsynthesis - Nilsson

𝗔𝗜 𝗖𝗼𝗺𝗺𝘂𝗻𝗶𝘁𝘆:
• rmachinelearning (Reddit)
• stackoverflowal
• kaggleforums
• towardsdatascience
• alforumdscentral

What is your favorite? What would you add?

Follow Dirk Zee to stay ahead with AI: https://lnkd.in/gzfFuk4X
-
🌟 Excited to be diving deeper into the world of LLMs! 🌟

I've recently been exploring how to fine-tune large language models, and I came across a fantastic beginner-friendly tutorial that provided valuable insights. 📘

The learning journey has been rewarding, especially around concepts like Retrieval-Augmented Generation (RAG) and vector databases. For anyone starting out with LLMs, RAG is a game-changer. It lets a model pull specific, relevant information from vast datasets at query time, making responses more accurate and context-aware. And vector databases? They're crucial for storing and retrieving embeddings efficiently, supporting RAG and taking AI-driven applications to the next level.

A big thank you to the creators of this tutorial for making complex topics accessible! 🚀 Looking forward to implementing these techniques in upcoming projects.

If you want to check out the video, here's the YouTube link: https://lnkd.in/dqw4QkD6

#LLM #MachineLearning #AI #RAG #VectorDatabase #AICommunity
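
For readers new to the idea, the retrieve-then-augment flow described above can be sketched in a few lines. This is a toy illustration, not the tutorial's code: the "embedding" is a plain bag-of-words count standing in for a learned embedding model, the document list stands in for a vector database, and `embed`, `retrieve`, `build_prompt`, and `DOCS` are hypothetical names.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: word counts. A real system would use a learned model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


def build_prompt(query, documents, k=2):
    """Augment the question with retrieved context before it goes to the LLM."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return (f"Answer using only the context below.\n\n"
            f"Context:\n{context}\n\nQuestion: {query}")


DOCS = [
    "pgvector adds vector similarity search to PostgreSQL.",
    "Fine-tuning continues training a pre-trained model on domain data.",
    "RAG retrieves relevant documents and adds them to the model prompt.",
]

prompt = build_prompt("What does RAG add to the prompt?", DOCS, k=1)
print(prompt)
```

The generation step is left out on purpose: the resulting `prompt` string is what would be sent to whichever language model you use. Swapping `embed` for a real embedding model and the list scan for a vector database query keeps the same overall shape.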
-
🤯 𝗔 𝗪𝗲𝗲𝗸𝗲𝗻𝗱 𝗘𝘅𝗽𝗲𝗿𝗶𝗺𝗲𝗻𝘁 𝗧𝘂𝗿𝗻𝗲𝗱 𝗶𝗻𝘁𝗼 𝗢𝗯𝘀𝗲𝘀𝘀𝗶𝗼𝗻 𝘄𝗶𝘁𝗵 𝗩𝗲𝗰𝘁𝗼𝗿 𝗦𝗲𝗮𝗿𝗰𝗵!

A month ago, I began exploring LLMs (Large Language Models) to better understand all the buzz around AI. But something wasn't quite clicking. My experiments felt surface-level, as if I was getting started but never diving deep enough.

Then I discovered 𝘃𝗲𝗰𝘁𝗼𝗿 𝗱𝗮𝘁𝗮𝗯𝗮𝘀𝗲𝘀, and everything changed. I started small, transforming text into BERT embeddings and experimenting with pgvector. Before I knew it, I had built a full-fledged vector search API! 💻🔍

The moment I saw the similarity scores pop up in my terminal, it was like hitting a breakthrough. The sense of accomplishment was incredible! 🔥

Currently, I'm exploring 𝗥𝗔𝗚 (𝗥𝗲𝘁𝗿𝗶𝗲𝘃𝗮𝗹-𝗔𝘂𝗴𝗺𝗲𝗻𝘁𝗲𝗱 𝗚𝗲𝗻𝗲𝗿𝗮𝘁𝗶𝗼𝗻) applications and how they can enhance knowledge bases. I'm excited to build my first 𝗔𝗜 𝗮𝗴𝗲𝗻𝘁 using this foundation. The possibilities seem endless! 🚀

I'm still in learning mode, but the journey has been incredibly rewarding. If you've worked with 𝘃𝗲𝗰𝘁𝗼𝗿 𝗱𝗮𝘁𝗮𝗯𝗮𝘀𝗲𝘀 or have experiences to share, I'd love to hear from you! Drop a comment or connect. 🙌

If you're interested in my learnings and code snippets, or would like me to share some insights from this adventure, just let me know! 🧑‍💻

#VectorDatabases #AI #MachineLearning #DataScience #LLMs #RAG
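
For a feel of what those similarity scores are, here is a minimal in-memory sketch of vector search. It is not the author's API: the three-dimensional vectors are made up for illustration (BERT-base embeddings have 768 dimensions), and `VectorIndex` is a hypothetical stand-in for a pgvector-backed table.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class VectorIndex:
    """Brute-force index: stores (id, vector) rows and scans them all on search."""

    def __init__(self):
        self._rows = []

    def add(self, item_id, vector):
        self._rows.append((item_id, list(vector)))

    def search(self, query, k=3):
        # Score every stored vector against the query, highest similarity first.
        scored = [(item_id, cosine_similarity(query, vec))
                  for item_id, vec in self._rows]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]


index = VectorIndex()
index.add("cats", [0.9, 0.1, 0.0])      # made-up embeddings, for illustration only
index.add("dogs", [0.8, 0.3, 0.0])
index.add("stocks", [0.0, 0.1, 0.9])

results = index.search([1.0, 0.2, 0.0], k=2)
for item_id, score in results:
    print(f"{item_id}: {score:.3f}")
```

In pgvector the equivalent ranking is done in SQL with the `<=>` operator, which returns cosine distance (1 minus the similarity computed here), so results are ordered ascending rather than descending. A real deployment would also add an approximate index instead of scanning every row.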
-
🚀 OpenAI Announces New AI Model "Strawberry" for Advanced Problem Solving! 🤖🍓

OpenAI has just unveiled a groundbreaking AI model, code-named Strawberry (officially known as OpenAI o1), that takes AI to a new level. Unlike previous models, Strawberry doesn't just give instant answers: it reasons step by step to solve complex problems! 🧠✨

🔍 Key Features of OpenAI o1:
- Logical Reasoning: Solves problems that stump even GPT-4o by thinking out loud like a human.
- Reinforcement Learning: Improves its reasoning process with positive and negative feedback.
- Advanced Performance: Demonstrates remarkable improvements in math, chemistry, biology, physics, and coding problems.

OpenAI o1 represents a new paradigm in AI, focusing not just on scale, but on reasoning capabilities. The model performed exceptionally well on the American Invitational Mathematics Examination (AIME), solving 83% of problems, compared to 12% for GPT-4o! 💡📊

Exciting times ahead as OpenAI integrates this technology into future models like GPT-5. 🚀

🔗 Read more: https://lnkd.in/d4JbExJH

#AI #OpenAI #ArtificialIntelligence #TechInnovation #MachineLearning #ProblemSolving #StrawberryAI
-
🌟 Exploring the Power of Finetuning in Large Language Models (LLMs) 🌟

In the world of Generative AI and machine learning, finetuning is an essential step that takes a general-purpose, pre-trained model and refines it for specific tasks or industries.

🚀 What is Finetuning?
LLMs are initially trained on vast datasets (web, wiki, books, etc.), which give them a broad understanding of language. However, to make them more accurate and effective for specific needs, finetuning comes into play. It's the process of training these models further on domain-specific data, enabling them to provide more precise and context-aware responses.

🤖 How does it work?
- A base LLM is pre-trained on a massive dataset, making it good at general knowledge tasks.
- Through finetuning with an organization's or domain-specific dataset, the LLM is further optimized for specific needs.
- The resulting finetuned LLM becomes more effective at understanding and generating responses tailored to that particular domain.

🔍 Whether you're working with legal documents, medical data, or customer service queries, finetuning can make AI-powered tools much more relevant and powerful.

Feel free to share your thoughts on the significance of finetuning or how it's benefiting your field!

#AI #MachineLearning #LLM #Finetuning #DataScience #GenerativeAI #Innovation #Technology
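
The pre-train-then-finetune idea can be illustrated at toy scale. The sketch below is an assumption-laden stand-in, not how production LLMs are trained: the "model" is a character-level bigram table of logits, both corpora are invented one-liners, and training is plain full-batch gradient descent on cross-entropy. What it does share with real finetuning is the key step: the second training run starts from the pre-trained weights rather than from scratch.

```python
import math

GENERAL = "the cat sat on the mat. the dog sat on the log. "                     # toy "web" text
DOMAIN = "the court finds the contract void. the court finds the claim valid. "  # toy "legal" text


def bigram_pairs(text):
    """(current char, next char) training examples."""
    return list(zip(text, text[1:]))


def softmax(row):
    peak = max(row.values())
    exps = {ch: math.exp(v - peak) for ch, v in row.items()}
    total = sum(exps.values())
    return {ch: e / total for ch, e in exps.items()}


def avg_loss(logits, pairs):
    """Mean cross-entropy of predicting the next character."""
    return -sum(math.log(softmax(logits[a])[b]) for a, b in pairs) / len(pairs)


def train(logits, pairs, steps, lr=1.0):
    """Full-batch gradient descent; updates `logits` in place."""
    for _ in range(steps):
        grads = {a: dict.fromkeys(row, 0.0) for a, row in logits.items()}
        for a, b in pairs:
            for ch, p in softmax(logits[a]).items():
                # d(cross-entropy)/d(logit) = predicted prob - one-hot target
                grads[a][ch] += (p - float(ch == b)) / len(pairs)
        for a, row in logits.items():
            for ch in row:
                row[ch] -= lr * grads[a][ch]


vocab = sorted(set(GENERAL + DOMAIN))
base = {a: dict.fromkeys(vocab, 0.0) for a in vocab}
train(base, bigram_pairs(GENERAL), steps=200)            # "pre-training" on general text

domain_pairs = bigram_pairs(DOMAIN)
loss_before = avg_loss(base, domain_pairs)               # base model on domain text

finetuned = {a: dict(row) for a, row in base.items()}    # start from pre-trained weights
train(finetuned, domain_pairs, steps=50)                 # "finetuning" on domain text
loss_after = avg_loss(finetuned, domain_pairs)

print(f"domain loss before finetuning: {loss_before:.3f}")
print(f"domain loss after finetuning:  {loss_after:.3f}")
```

The domain loss drops after the second run, which is the effect the post describes: the same weights, nudged further on domain data, fit that domain better. With real LLMs the same pattern is applied through a training framework, usually with a smaller learning rate and far more data.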
-
Unlock AI Mastery for Free, But Only If You're Ready to Challenge Your Limits! No fluff, just actionable insights and hands-on experience to propel your AI journey forward.

Exciting news for those diving into the world of Large Language Models (LLMs)! The newly released "Mastering LLMs" course is a treasure trove of insights from over 25 industry veterans. This free, open course covers essential topics like retrieval-augmented generation (RAG), fine-tuning, and more, all geared towards practical applications in AI product development.

What sets this course apart is its practical focus and the wealth of real-world experience shared by experts in machine learning, data science, and MLOps. The course is meticulously organized, with chapter summaries, notes, and additional resources to help you navigate over 40 hours of content efficiently.

For those looking to deepen their understanding, the course encourages applying learned concepts to personal projects. Testimonials from past students highlight the course's impact, noting its comprehensive coverage and the invaluable community support on Discord.

Embark on this learning journey and elevate your AI expertise. Dive into the course, leverage the resources, and transform your understanding of LLMs.

Check this article: https://lnkd.in/dqBwNFZQ

#AI #MachineLearning #DataScience #MLOps #LLMs #AIProductDevelopment #OnlineLearning

======================
Hit 'Connect' now. I may have opportunities for business and jobs in the future.
If you're interested in content like this, hit the 'Observe' button now.