Daniel Sarfraz’s Post

3D Generalist - Bringing Fantastic CG Visuals To Films and Commercials.

🚀 Excited to Share My Latest AI Project! 🚀

I'm thrilled to present my recent work on training a small language model (SLM) inspired by Andrej Karpathy's nanoGPT. This experiment aimed to push the boundaries of what a relatively small model can achieve in generating coherent text.

🔍 Project Highlights:
- Model Size: 123.59 million parameters.
- Datasets: High-quality datasets sourced from Hugging Face.
- Training: Initial training on ~1.4 billion tokens, followed by fine-tuning to enhance specific task performance.
- Results: The model generates coherent text but struggles with instruction-following and information retrieval, which shows both the capabilities and the limitations of smaller models.

📂 Explore the project on GitHub: https://lnkd.in/eMSBAQwx

💡 Acknowledgements: A huge thank you to Andrej Karpathy for nanoGPT and to the dataset providers on Hugging Face.

Feel free to fork the repo and send in your pull requests. Let's push the boundaries of AI together! 🌟

#AI #MachineLearning #LanguageModel #DeepLearning #HuggingFace #OpenSource #ArtificialIntelligence #DataScience
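For readers wondering where a figure like 123.59 million comes from, here is a rough sanity check in Python. It is only a sketch under assumptions: the post does not state the architecture, so the values below (12 layers, embedding width 768, vocabulary padded to 50,304, no bias terms, tied input/output embeddings, position embeddings left out of the count) are assumed to match a GPT-2-small-sized nanoGPT-style setup. The helper name `gpt_param_count` is hypothetical and not taken from the project's repo.

```python
def gpt_param_count(n_layer, n_embd, vocab_size, bias=False):
    """Count parameters of a GPT-2 style decoder.

    Assumptions (not confirmed by the post): token embeddings are tied
    with the output head, and position embeddings are excluded from the
    reported total.
    """
    b = 1 if bias else 0
    wte = vocab_size * n_embd                      # token embedding (shared with output head)
    ln = n_embd * (1 + b)                          # LayerNorm weight (+ optional bias)
    attn = (n_embd * 3 * n_embd + b * 3 * n_embd   # fused q, k, v projection
            + n_embd * n_embd + b * n_embd)        # attention output projection
    mlp = (n_embd * 4 * n_embd + b * 4 * n_embd    # MLP up-projection
           + 4 * n_embd * n_embd + b * n_embd)     # MLP down-projection
    block = 2 * ln + attn + mlp                    # one transformer block
    return wte + n_layer * block + ln              # plus the final LayerNorm


# Assumed configuration; none of these values are given in the post.
params = gpt_param_count(n_layer=12, n_embd=768, vocab_size=50304)
print(f"{params / 1e6:.2f}M parameters")

# Training budget from the post: roughly 1.4 billion tokens.
tokens = 1.4e9
print(f"{tokens / params:.1f} tokens per parameter")
```

Under these assumed settings the count comes out to 123.59M, which matches the number in the post, and the ~1.4 billion training tokens work out to roughly 11 tokens per parameter.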
