# Great Deep Learning Tutorials for Natural Language Processing (NLP)
A Great Collection of Deep Learning Tutorials and Repositories for Natural Language Processing (NLP)

## General:
- [Great NLP Posts](http://jalammar.github.io/)  
- [Awesome NLP Paper Discussions - Hugging Face](https://github.com/huggingface/awesome-papers) [_Excellent_]  
- [Ten trends in Deep learning NLP](https://blog.floydhub.com/ten-trends-in-deep-learning-nlp/)  
- [Attention in RNNs](https://medium.com/datadriveninvestor/attention-in-rnns-321fbcd64f05)
- [Understanding self-attention and other types of attention mechanisms](https://www.linkedin.com/posts/sebastianraschka_understanding-and-coding-self-attention-activity-7152300807080546304-uu21?utm_source=share&utm_medium=member_desktop)  
- [BERT - TensorFlow](https://github.com/google-research/bert)  
- [Understanding XLNet](https://www.borealisai.com/en/blog/understanding-xlnet/)  
- [XLNet - TensorFlow](https://github.com/zihangdai/xlnet)  
- [XLM (PyTorch implementation of Cross-lingual Language Model Pretraining)](https://github.com/facebookresearch/XLM)  
- [Pretrained PyTorch models for BERT](https://github.com/huggingface/pytorch-pretrained-BERT)  
- [Library of state-of-the-art pretrained models for NLP](https://github.com/huggingface/pytorch-transformers#quick-tour) [_Excellent_]  
- [DistilBERT](https://medium.com/huggingface/distilbert-8cf3380435b5)
- [FastBert](https://arxiv.org/abs/2311.10770)
- [FastBert Linkedin Post](https://www.linkedin.com/posts/activity-7132888497119485952-GMsV?utm_source=share&utm_medium=member_desktop)  
- [PyTorch Hub - BERT](https://pytorch.org/hub/huggingface_pytorch-pretrained-bert_bert/)  
- [A Simple Guide On Using BERT for Binary Text Classification](https://medium.com/swlh/a-simple-guide-on-using-bert-for-text-classification-bbf041ac8d04)  
- [Core ML 3 implementation of BERT for Question answering](https://github.com/huggingface/swift-coreml-transformers)  
- [NLP - Keras - Intro](https://nlpforhackers.io/keras-intro/)  
- [AllenNLP](https://allennlp.org/)  [_General NLP_]
- [Stanza - A Python NLP Library for Many Human Languages](https://stanfordnlp.github.io/stanza/)  
- [The Best NLP Papers From ICLR 2020](https://www.topbots.com/best-nlp-papers-from-iclr-2020)  
- [Deep learning for natural language processing and information retrieval at the University of Waterloo](https://github.com/castorini)  
- [Natural Language Processing With spaCy in Python](https://realpython.com/natural-language-processing-spacy-python/)  [_Great_]  
- [NLP Papers](https://github.com/AliAkbarBadri/nlp-papers)   
- [A Great NLP Course](https://lena-voita.github.io/nlp_course.html)  
- [KerasNLP: Modular NLP Workflows for Keras](https://github.com/keras-team/keras-nlp)  
- [NLP Test: Deliver Safe & Effective Models](https://github.com/JohnSnowLabs/nlptest)
- [Karpathy minbpe](https://github.com/karpathy/minbpe)
- [Karpathy's 2 Hours Tutorial for Building GPT Tokenizer](https://www.linkedin.com/posts/liorsinclair_andrej-karpathy-just-uploaded-a-new-2-hour-activity-7165765602492571650-io92?utm_source=share&utm_medium=member_desktop)
- [Learning Core Foundational Concepts in NLP by Examples and by calculation by Hand](https://www.linkedin.com/posts/alphasignal_can-foundational-concepts-like-transformers-activity-7163890641054232576-B1ai?utm_source=share&utm_medium=member_android)  
- [SetFit: Efficient Few-shot Learning with Sentence Transformers](https://github.com/huggingface/setfit)  

## General Persian based libraries & Data Sets:
- [Parsivar: library for Persian text preprocessing](https://github.com/ICTRC/Parsivar)   
- [Hazm](https://github.com/sobhe/hazm)  
- [persianNLP](https://github.com/persiannlp)  
- [ParsiNLU: Comprehensive suit of high-level NLP tasks for Persian language](https://github.com/persiannlp/parsinlu)   
- [FarsTail: A Persian Natural Language Inference Dataset](https://github.com/dml-qom/FarsTail)  
- [wordfreq: Access a database of word frequencies](https://github.com/rspeer/wordfreq)  
- [Persian Stop Words List](https://github.com/kharazi/persian-stopwords)  
- [Persian Stop Words List in Hazm Repo](https://github.com/sobhe/hazm/blob/master/hazm/data/stopwords.dat)
- [PCoQA: Persian Conversational Question Answering Dataset](https://github.com/HamedHematian/PCoQA)
- [Khayyam Challenge (PersianMMLU): Is Your LLM Truly Wise to The Persian Language?](https://arxiv.org/html/2404.06644v1) [Good paper & dataset]
- [Basalam Dataset via RadeAI Team](https://www.linkedin.com/posts/rade-ai_datascience-machinelearning-basalam-activity-7193561781280157696-NF8T?utm_source=share&utm_medium=member_desktop)
- [Basalam Datasets for LLM Fine-tuning](https://www.linkedin.com/posts/mohammadreza-esmaeilian-572ba9193_%D8%A7%D9%86%D8%AA%D8%B4%D8%A7%D8%B1-%D8%AF%DB%8C%D8%AA%D8%A7%D8%B3%D8%AA%D9%87%D8%A7-%D9%88-llm%D9%87%D8%A7%DB%8C-%D9%81%D8%A7%DB%8C%D9%86%D8%AA%DB%8C%D9%88%D9%86-%D8%B4%D8%AF%D9%87-%D8%A7%D8%AE%D8%AA%D8%B5%D8%A7%D8%B5%DB%8C-activity-7204220860142989314-VDUO?utm_source=share&utm_medium=member_desktop)
- [ParsBench](https://www.linkedin.com/posts/shahriarshm_llm-dataset-syntheticabrdataset-activity-7278063501909098496-KR0O?utm_source=share&utm_medium=member_desktop)  

## Text Representation:
- [Beyond Word Embeddings Part 1](https://towardsdatascience.com/beyond-word-embeddings-part-1-an-overview-of-neural-nlp-milestones-82b97a47977f)  
- [Beyond Word Embeddings Part 2](https://towardsdatascience.com/beyond-word-embeddings-part-2-word-vectors-nlp-modeling-from-bow-to-bert-4ebd4711d0ec)  
- [Learning Word Embedding](https://lilianweng.github.io/lil-log/2017/10/15/learning-word-embedding.html)  
- [Introduction to Word Embedding and Word2Vec](https://towardsdatascience.com/introduction-to-word-embedding-and-word2vec-652d0c2060fa)  
- [Word Embedding](https://medium.com/data-science-group-iitr/word-embedding-2d05d270b285)  
- [Understanding Word Embeddings](https://hackernoon.com/understanding-word-embeddings-a9ff830403ce)  
- [Introduction to Word Vectors](https://medium.com/@jayeshbahire/introduction-to-word-vectors-ea1d4e4b84bf) 
- [Word2vec Made Easy](https://towardsdatascience.com/word2vec-made-easy-139a31a4b8ae)  
- [What is GloVe? Part I](https://towardsdatascience.com/emnlp-what-is-glove-part-i-3b6ce6a7f970)  
- [What is GloVe? Part II](https://towardsdatascience.com/emnlp-what-is-glove-part-ii-9e5ad227ee0)  
- [What is GloVe? Part III](https://towardsdatascience.com/emnlp-what-is-glove-part-iii-c6090bed114)  
- [What is GloVe? Part IV](https://towardsdatascience.com/emnlp-what-is-glove-part-iv-e605a4c407c8)  
- [What is GloVe? Part V](https://towardsdatascience.com/emnlp-what-is-glove-part-v-fa888272c290)  
- [ELMo: Deep Contextualized Word Representation](https://allennlp.org/elmo)  
- [A Step-by-Step NLP Guide to Learn ELMo](https://www.analyticsvidhya.com/blog/2019/03/learn-to-use-elmo-to-extract-features-from-text/)  
- [ELMo: Contextual language embedding](https://towardsdatascience.com/elmo-contextual-language-embedding-335de2268604)  
- [word embeddings with ELMo](https://medium.com/saarthi-ai/elmo-for-contextual-word-embedding-for-text-classification-24c9693b0045)  
- [Doc2Vec - Gensim](https://radimrehurek.com/gensim/models/doc2vec.html)  

## Self-Supervised Learning in NLP:
- [https://amitness.com/2020/05/self-supervised-learning-nlp/](https://amitness.com/2020/05/self-supervised-learning-nlp/)  
- [COSINE: Fine-Tuning Pre-trained Language Model with Weak Supervision](https://github.com/yueyu1030/COSINE)  

## RNN, LSTM, and GRU:
- [Understanding LSTM Networks](https://colah.github.io/posts/2015-08-Understanding-LSTMs/)  
- [Illustrated Guide to LSTM’s and GRU’s](https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21)  
- [Animated RNN, LSTM and GRU](https://towardsdatascience.com/animated-rnn-lstm-and-gru-ef124d06cf45)  
- [Recurrent Neural Networks and LSTM explained](https://medium.com/@purnasaigudikandula/recurrent-neural-networks-and-lstm-explained-7f51c7f6bbb9)  
- [Long Short-Term Memory (LSTM): Concept](https://medium.com/@kangeugine/long-short-term-memory-lstm-concept-cb3283934359)  
- [Understanding architecture of LSTM cell from scratch](https://hackernoon.com/understanding-architecture-of-lstm-cell-from-scratch-with-code-8da40f0b71f4)  
- [Basic understanding of LSTM](https://blog.goodaudience.com/basic-understanding-of-lstm-539f3b013f1e)  
- [Taming LSTMs with PyTorch](https://towardsdatascience.com/taming-lstms-variable-sized-mini-batches-and-why-pytorch-is-good-for-your-health-61d35642972e)  
- [Introduction to LSTM](https://www.analyticsvidhya.com/blog/2017/12/fundamentals-of-deep-learning-introduction-to-lstm/?utm_medium=ELMoNLParticle&utm_source=blog)  
- [Introduction to RNNs](https://www.jeremyjordan.me/introduction-to-recurrent-neural-networks/)
- [xLSTM - Post1](https://www.linkedin.com/posts/liorsinclair_is-this-the-end-of-transformers-the-team-activity-7194350205318701056-8yBr?utm_source=share&utm_medium=member_desktop)
- [Were RNNs All We Needed?](https://arxiv.org/abs/2410.01201) [Interesting Paper]  

## Transformers:
- [How Transformers Work](https://towardsdatascience.com/transformers-141e32e69591)  
- [The Illustrated Transformer](http://jalammar.github.io/illustrated-transformer/)
- [Transformers from Scratch](https://e2eml.school/transformers.html)  
- [What is a Transformer?](https://medium.com/inside-machine-learning/what-is-a-transformer-d07dd1fbec04)  
- [How Transformers work in deep learning and NLP](https://theaisummer.com/transformer/)    
- [Transformer: A Novel Neural Network Architecture for Language Understanding](https://ai.googleblog.com/2017/08/transformer-novel-neural-network.html)  
- [How do Transformers Work in NLP?](https://www.analyticsvidhya.com/blog/2019/06/understanding-transformers-nlp-state-of-the-art-models/)  
- [The Essence of Transformers](https://towardsdatascience.com/the-essence-of-transformers-9fb8e14cc465) [Good]    
- [Transformers and Multi Head Attention](https://uvadlc-notebooks.readthedocs.io/en/latest/tutorial_notebooks/tutorial6/Transformers_and_MHAttention.html)  
- [Multi Head Attention](https://d2l.ai/chapter_attention-mechanisms-and-transformers/multihead-attention.html)  
- [BERT for Dummies](https://towardsdatascience.com/bert-for-dummies-step-by-step-tutorial-fb90890ffe03)  
- [The Dark Secrets of BERT](https://text-machine-lab.github.io/blog/2020/bert-secrets/)  
- [A Survey of Long-Term Context in Transformers](https://www.pragmatic.ml/a-survey-of-methods-for-incorporating-long-term-context/) [_Great_]  
- [The Transformer Family](https://lilianweng.github.io/lil-log/2020/04/07/the-transformer-family.html)  
- [The Transformer Isn’t As Hard To Understand As You Might Think](https://towardsdatascience.com/knocking-on-transformers-door-attention-mechanism-explained-intuitively-df5d4fcecdf8)  
- [Review of Compact Transformer Architectures](https://medium.com/@jfd2139/review-of-compact-transformer-architectures-c477b797e2d5) [**Great**]   
- [REFORMER: The Efficient Transformer](https://arxiv.org/pdf/2001.04451.pdf)  
- [GPT-3: Language Models are Few-Shot Learners](https://github.com/openai/gpt-3)  
- [GPT-3 Sandbox](https://github.com/shreyashankar/gpt3-sandbox)  
- [Microsoft will launch GPT-4](https://medium.com/@yablonassaf/microsoft-will-launch-gpt-4-with-ai-videos-on-wednesday-75d882e0260e)  
- [OpenAI GPT-4](https://openai.com/research/gpt-4)  
- [Some information about GPT-4](https://www.linkedin.com/posts/damienbenveniste_machinelearning-datascience-artificialintelligence-activity-7041793426530390016-5P-n/?utm_source=share&utm_medium=member_android)  
- [Regular Expressions (Regex) Generated by GPT-3](https://losslesshq.com/)  
- [Auto Regex: Converting English description to Regex](https://www.autoregex.xyz/) [Good]  
- [minGPT](https://github.com/karpathy/minGPT)  
- [NVIDIA FasterTransformer: Transformer related optimization, including BERT & GPT](https://github.com/NVIDIA/FasterTransformer)  
- [OpenNMT CTranslate2: Fast inference engine for Transformer models](https://github.com/OpenNMT/CTranslate2/)  
- [Deploying GPT-J and T5 with FasterTransformer and Triton Inference Server](https://developer.nvidia.com/blog/deploying-gpt-j-and-t5-with-fastertransformer-and-triton-inference-server/?ncid=so-link-499508#cid=dl05_so-link_en-us) [Interesting]  
- [MEND: Fast Model Editing at Scale](https://github.com/eric-mitchell/mend) [**Excellent Work**]   
- [BorealisAI Transformers I: Introduction](https://www.borealisai.com/research-blogs/tutorial-14-transformers-i-introduction/)  
- [OpenAI Best Practices for Deploying Language Models](https://openai.com/blog/best-practices-for-deploying-language-models/)  
- [OPT-IML](https://github.com/facebookresearch/metaseq/tree/main/projects/OPT-IML)
- [RetNet: an Alternative to Transformers](https://www.linkedin.com/posts/aleksagordic_an-alternative-to-transformers-whoa-activity-7087790555190980608-66ZM?utm_source=share&utm_medium=member_android)
- [What comes after Transformers?](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_what-comes-after-transformers-neural-memory-activity-7402992391957270528-mj34?utm_source=share&utm_medium=member_desktop&rcm=ACoAAAgksdYBFu3_vG0bwXWdh93rSqV1J1ghMP4)  
- [Transformer Taxonomy](https://kipp.ly/blog/transformer-taxonomy/) [Great]
- [Generative AI exists because of the transformer: Great Visual Explanation](https://ig.ft.com/generative-ai/) [Great]  

### Reinforcement Learning from Human Feedback (RLHF):
- [RLHF Tutorial](https://vinija.ai/concepts/RLHF/)
- [New method instead of RLHF: Direct Preference Optimization: Your Language Model is Secretly a Reward Model](https://www.linkedin.com/posts/yoelzeldes_to-get-llms-as-good-as-openais-gpt-4-is-activity-7078958558519656451-N6Wo/?utm_source=share&utm_medium=member_android)
- [Finetuning an LLM: RLHF and alternatives (Part I)](https://argilla.io/blog/mantisnlp-rlhf-part-1/)
- [Finetuning an LLM: RLHF and alternatives (Part II)](https://argilla.io/blog/mantisnlp-rlhf-part-2/)
- [Finetuning an LLM: RLHF and alternatives (Part III)](https://argilla.io/blog/mantisnlp-rlhf-part-3/)
- [How good is AI feedback?](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_how-good-is-ai-feedback-and-does-it-really-activity-7171174171413102592-eVs9?utm_source=share&utm_medium=member_desktop)
- [Direct Preference Optimization (DPO) for LLM Alignment (From Scratch)](https://github.com/rasbt/LLMs-from-scratch/blob/main/ch07/04_preference-tuning-with-dpo/dpo-from-scratch.ipynb)  

### Tokenizer Notes:
- [𝗻𝗲𝘄 𝗽𝗮𝗽𝗲𝗿 𝗯𝘆 𝗠𝗲𝘁𝗮 𝗰𝗹𝗮𝗶𝗺𝘀 𝘁𝗵𝗮𝘁 𝘄𝗲 𝗰𝗮𝗻 𝗴𝗲𝘁 𝗿𝗶𝗱 𝗼𝗳 𝘁𝗼𝗸𝗲𝗻𝗶𝘇𝗲𝗿𝘀: Byte Latent Transformer: Patches Scale Better Than Tokens --> we could get rid of tokenizers](https://www.linkedin.com/posts/a-roucher_%F0%9D%97%A3%F0%9D%97%BC%F0%9D%98%81%F0%9D%97%B2%F0%9D%97%BB%F0%9D%98%81%F0%9D%97%B6%F0%9D%97%AE%F0%9D%97%B9-%F0%9D%97%BD%F0%9D%97%AE%F0%9D%97%BF%F0%9D%97%AE%F0%9D%97%B1%F0%9D%97%B6%F0%9D%97%B4%F0%9D%97%BA-%F0%9D%98%80%F0%9D%97%B5%F0%9D%97%B6%F0%9D%97%B3%F0%9D%98%81-activity-7273382398891810816-QfQo?utm_source=share&utm_medium=member_desktop)
- [Byte Latent Transformer: Patches Scale Better Than Tokens (paper)](https://dl.fbaipublicfiles.com/blt/BLT__Patches_Scale_Better_Than_Tokens.pdf)  

### Large Language Models (LLMs):
- [LLM Reading Papers](https://www.linkedin.com/posts/eric-vyacheslav-156273169_new-must-read-the-anti-hype-llm-reading-activity-7247244292568625152-DQsb?utm_source=share&utm_medium=member_desktop)  
- [LLaMA](https://github.com/facebookresearch/llama)  
- [Toolformer: Language Models Can Teach Themselves to Use Tools](https://arxiv.org/abs/2302.04761) [Great]  
- [Toolformer GitHub](https://github.com/lucidrains/toolformer-pytorch)  
- [Amazon Multimodal Chain-of-Thought Reasoning in Language Models](https://github.com/amazon-science/mm-cot)  
- [LLaMA-based ChatGPT Training](https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/chatllama) [Great]  
- [The Wisdom of Hindsight Makes Language Models Better Instruction Followers](https://arxiv.org/abs/2302.05206)  
- [Stanford Alpaca: An Instruction-following LLaMA model](https://github.com/tatsu-lab/stanford_alpaca)  
- [Alpaca: A Strong, Replicable Instruction-Following Model](https://crfm.stanford.edu/2023/03/13/alpaca.html)  
- [Fine-Tune Alpaca in Arabic](https://www.linkedin.com/posts/yassine-boukhari-006748217_alpaca-a-strong-replicable-instruction-following-activity-7043223149710036992-YUJb?utm_source=share&utm_medium=member_android)  
- [TRL: Transformer Reinforcement Learning](https://github.com/lvwerra/trl)  
- [Large Language Model (LLM) Primers Tutorial](https://www.linkedin.com/posts/amanc_artificialintelligence-machinelearning-ai-activity-7045245910850695168-Fp9K/?utm_source=share&utm_medium=member_android) [Great]  
- [Dolly](https://www.databricks.com/blog/2023/03/24/hello-dolly-democratizing-magic-chatgpt-open-models.html)  
- [Microsoft JARVIS & HuggingGPT](https://github.com/microsoft/JARVIS) [Interesting]  
- [open-source LLMs](https://www.linkedin.com/posts/sahar-mor_artificialintelligence-machinelearning-activity-7049789761728770049-QLsv/?utm_source=share&utm_medium=member_android)  
- [GPT4Free](https://github.com/xtekky/gpt4free)  
- [HuggingChat](https://huggingface.co/chat/)  
- [LaMini-LM: A Diverse Herd of Distilled Models](https://github.com/mbzuai-nlp/LaMini-LM/)  
- [RedPajama-Data: An Open Source Recipe to Reproduce LLaMA training dataset](https://github.com/togethercomputer/RedPajama-Data)  
- [BigCode](https://huggingface.co/bigcode)  
- [OpenLLaMA](https://github.com/openlm-research/open_llama)
- [Dromedary: towards helpful, ethical and reliable LLMs](https://github.com/IBM/Dromedary)  
- [MPT-7B Model with Commercial Licence](https://huggingface.co/mosaicml/mpt-7b/blob/main/README.md)  
- [MPT-7B Story Writer](https://huggingface.co/mosaicml/mpt-7b-storywriter)  
- [MPT-7B](https://github.com/mosaicml/llm-foundry)  
- [MPT-7B Blog](https://www.mosaicml.com/blog/mpt-7b)  
- [Open LLMs](https://github.com/eugeneyan/open-llms)  
- [Google PaLM 2](https://ai.google/discover/palm2)  
- [BLOOMChat](https://github.com/sambanova/bloomchat)
- [LLMs Practical Guide](https://github.com/Mooler0410/LLMsPracticalGuide)
- [FrugalGPT](https://www.linkedin.com/posts/sanyambhutani_saving-98-llm-usage-costs-stanford-activity-7062420577357037568-t0a8/?utm_source=share&utm_medium=member_android)  
- [ChatALL](https://github.com/sunner/ChatALL) [Great]   
- [Falcon LLM](https://falconllm.tii.ae/)  
- [The Falcon has landed in the Hugging Face ecosystem](https://huggingface.co/blog/falcon) [Great]  
- [Open LLMs](https://github.com/eugeneyan/open-llms) [Great]
- [OpenLLMs: Less is More for Open-source Models](https://github.com/imoneoi/openchat) [Great]
- [LLaMA2](https://www.llama2.ai/)
- [source code of llama2-chatbot](https://github.com/a16z-infra/llama2-chatbot/tree/main)  
- [Notes about OpenAI's GPT-4 Model](https://www.linkedin.com/posts/aleksagordic_openais-gpt-4-details-have-apparently-been-activity-7085226267712614400-T1d3/?utm_source=share&utm_medium=member_android)
- [GPT-4 is getting worse over time](https://www.linkedin.com/posts/svpino_gpt-4-is-getting-worse-over-time-not-better-activity-7087379892077481984-uORp/?utm_source=share&utm_medium=member_android)  
- [OpenChat: Less is More for Open-source Models](https://huggingface.co/openchat/openchat)
- [Instruction Tuning Datasets](https://github.com/raunak-agarwal/instruction-datasets)
- [ToolLLM](https://www.linkedin.com/posts/omarsar_enabling-llms-with-tool-use-capabilities-activity-7093299751571320832-1WHU/?utm_source=share&utm_medium=member_android)
- [Falcon 180B](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_falcon-180b-released-tii-just-released-activity-7105166508376367105-P7ws?utm_source=share&utm_medium=member_desktop)
- [Fine-tune Falcon 180B using QLoRA and Flash Attention on Amazon SageMaker](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_fine-tune-falcon-180b-with-qlora-and-flash-activity-7107387875515580416-zhSe?utm_source=share&utm_medium=member_desktop)
- [Large Language Models as Optimizers](https://arxiv.org/abs/2309.03409)
- [Favourite LLM Authors](https://www.linkedin.com/posts/sanyambhutani_curated-list-of-my-favourite-llm-authors-activity-7105896422226423808-Unev?utm_source=share&utm_medium=member_desktop)
- [Open Source LLMs for Commercial Use](https://www.linkedin.com/posts/armand-ruiz_top-open-source-llms-available-for-commercial-activity-7137772625468002304-jkMM?utm_source=share&utm_medium=member_desktop)  
- [Optimizing your LLM in production](https://huggingface.co/blog/optimize-llm) [Important]
- [In Context Vectors (ICV): an alternative to Few-Shot Learning and Finetuning techniques like LoRA to improve an LLMs performance](https://www.linkedin.com/posts/pramodith_in-context-vectors-icv-is-an-alternative-activity-7131970618467471360-67Z3?utm_source=share&utm_medium=member_desktop)
- [NexusRavan v2 13B Fuction Calling LLM Surpassing GPT-4](https://www.linkedin.com/posts/nexusflow-ai_nexusravenv2-opensource-genai-activity-7137805301323362304-U2Pl?utm_source=share&utm_medium=member_desktop)
- [Phixtral model](https://www.linkedin.com/posts/maxime-labonne_phixtral-i-made-the-first-efficient-mixture-activity-7150758415961620481-v0qx?utm_source=share&utm_medium=member_desktop)
- [Eagle-7B LLM: 100% attention-free RNN Model!](https://www.linkedin.com/posts/maxime-labonne_rwkv-released-eagle-7b-its-an-llm-that-activity-7157700712330661888-cdd1?utm_source=share&utm_medium=member_desktop)
- [Eagle-7B LLM: Blog Post](https://blog.rwkv.com/p/eagle-7b-soaring-past-transformers)   
- [Can LLMs improve themselves? Self-play fine-tuning (SPIN)](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_can-llms-improve-themselves-self-play-fine-tuning-activity-7150501901665542144-mk4K?utm_source=share&utm_medium=member_desktop)
- [AI2 OLMo Model: Linkedin Post](https://www.linkedin.com/posts/natolambert_allenaiolmo-7b-hugging-face-activity-7158834284689035264-vfu7?utm_source=share&utm_medium=member_desktop)
- [AI2 OLMo Model: HuggingFace](https://huggingface.co/allenai/OLMo-7B)
- [AI2 OLMo Model: Original Blog post](https://www.interconnects.ai/p/olmo)
- [Some Notes about OLMo Model](https://www.linkedin.com/posts/sebastianraschka_ive-been-working-with-the-1b7b-olmo-models-activity-7166067492778360832-kc3T?utm_source=share&utm_medium=member_desktop)  
- [Mixtral in colab](https://github.com/dvmazur/mixtral-offloading/blob/master/notebooks/demo.ipynb) [Great]
- [Grok-1 LLM with 314B Size: Post1](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_elon-musk-kept-his-word-and-released-grok-activity-7175221121472983040-F7zS?utm_source=share&utm_medium=member_desktop)
- [Grok-1 LLM: Post2](https://www.linkedin.com/posts/liorsinclair_big-news-grok-is-finally-open-source-with-activity-7175496738948968448--Ewx?utm_source=share&utm_medium=member_desktop)
- [Grok-3 LLM from xAI](https://x.com/lmarena_ai/status/1891706264800936307)
- [Grok-3 LLM from xAI - karpathy](https://x.com/karpathy/status/1891720635363254772)  
- [DBRX LLM](https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm)
- [DBRX LLM: Post1](https://www.linkedin.com/posts/mateizaharia_at-databricks-weve-built-an-awesome-model-activity-7178738621099769857-v4X8?utm_source=share&utm_medium=member_desktop)
- [DBRX LLM: Post2](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_new-state-of-the-art-open-llm-databricks-activity-7178748050117451776-Otgg?utm_source=share&utm_medium=member_desktop)
- [LLMs via Multi-Token Prediction](https://www.linkedin.com/posts/aiatmeta_new-research-from-fair-better-faster-large-activity-7194022959609438208-TH1u?utm_source=share&utm_medium=member_android)
- [Test Time Computing for Open LLMs](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_how-we-implemented-test-time-computing-for-activity-7274685354895458304-elNI?utm_source=share&utm_medium=member_desktop)  

### Merge LLMs:
- [Linkedin Post](https://www.linkedin.com/posts/maxime-labonne_merge-large-language-models-with-mergekit-activity-7150044812337901569-3zIu?utm_source=share&utm_medium=member_android)  
- [Colab Notebook](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb)
- [Main Github of Mergekit](https://github.com/cg123/mergekit)  
- [huggingface merge-models blog post](https://huggingface.co/blog/mlabonne/merge-models)
- [Making the NeuralBeagle14-7B LLM Model (via Merging models and other methods)](https://www.linkedin.com/posts/maxime-labonne_heres-how-i-made-the-new-best-performing-activity-7153302680780640256-1Sv7?utm_source=share&utm_medium=member_desktop)
- [Merge Large Language Models with mergekit](https://towardsdatascience.com/merge-large-language-models-with-mergekit-2118fb392b54)
- [Fine-tune a Mistral-7b model with Direct Preference Optimization](https://towardsdatascience.com/fine-tune-a-mistral-7b-model-with-direct-preference-optimization-708042745aac)
- [AutoMerger](https://www.linkedin.com/posts/maxime-labonne_automerger-how-i-automated-the-model-merging-activity-7172890188430454786-Djs7?utm_source=share&utm_medium=member_desktop)
- [Evolutionary LLM Merging - Post1](https://www.linkedin.com/posts/maxime-labonne_evolutionary-model-merge-sakana-ai-released-activity-7176527260097597440-52JT?utm_source=share&utm_medium=member_desktop)  
- [Evolutionary LLM Merging - Post2](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_the-evolution-of-llms-model-merging-is-activity-7176561819933671424-NNNX?utm_source=share&utm_medium=member_desktop)
- [Mixture of Experts (MoEs) Explained](https://huggingface.co/blog/moe) [Great]
- [Mixture of Experts (MoEs) Papers List](https://huggingface.co/collections/osanseviero/moes-papers-reading-list-65a83f8a9aec16459920ffe0)
- [Mixture of Experts (MoEs) Linkedin Post](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_mixture-of-experts-explained-activity-7179478562398187520-dbzM?utm_source=share&utm_medium=member_desktop)
- [Mixture-of-Depths - Post1](https://www.linkedin.com/posts/zaiinulabideen_crazy-ai-week-mixture-of-depths-qwen15-activity-7182746449921658880-aLVO?utm_source=share&utm_medium=member_desktop)
- [Mixture-of-Depths (MoD) - Post2](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_can-we-train-llms-to-allocate-flops-compute-activity-7182303286429917184-jkOm?utm_source=share&utm_medium=member_desktop)
- [AutoLoRA-Merging Linkedin Post](https://www.linkedin.com/posts/zaiinulabideen_autolora-merging-ties-dare-magnitudeprune-activity-7166081059166662658-OzxA?utm_source=share&utm_medium=member_desktop)  

### LLaMA2 Related Links:
- [A colab gradio web UI for running Large Language Models](https://github.com/camenduru/text-generation-webui-colab) [Great]  
- [llama-2-7b-chat-GPTQ-4bit](https://colab.research.google.com/github/camenduru/text-generation-webui-colab/blob/main/llama-2-7b-chat-GPTQ-4bit.ipynb)
- [camenduru](https://github.com/camenduru)
- [llama-2 philschmid](https://www.philschmid.de/llama-2)
- [fine-tuning LLMs with TRL](https://www.linkedin.com/posts/lvwerra_it-crazy-how-far-the-ml-field-has-come-when-activity-7087699813009383425-Sr1y/?utm_source=share&utm_medium=member_android)
- [lora tuning peft finetuning llama2](https://huggingface.co/docs/trl/main/en/lora_tuning_peft#finetuning-llama2-model)
- [LLaMA2 with PEFT](https://www.linkedin.com/posts/gante_unleash-the-true-llama-2-potential-from-day-activity-7087363261666328577-38jV/?utm_source=share&utm_medium=member_android)
- [Baby LLaMA2 in C](https://github.com/karpathy/llama2.c)
- [Releasing LLongMA-2 16k](https://www.linkedin.com/posts/enrico-shippole-495521b8_conceptofmindllongma-2-13b-16k-hugging-activity-7090718505183928320-DYtD/?utm_source=share&utm_medium=member_android)
- [LLaMA2 API in Hugging Face Inference](https://www.linkedin.com/feed/update/urn:li:activity:7089986843839979521/?utm_source=share&utm_medium=member_android)
- [LLaMA2 API in Monster API](https://monsterapi.ai/llama-2-7b-chat-api)  
- [LLaMA2-Accessory](https://github.com/Alpha-VLLM/LLaMA2-Accessory)
- [Hermes-LLongMA-2 8k](https://www.linkedin.com/posts/enrico-shippole-495521b8_conceptofmindhermes-llongma-2-13b-8k-hugging-activity-7092178977217282049-JZB8/?utm_source=share&utm_medium=member_android)
- [Training Llama 2](https://www.linkedin.com/posts/bhavsarpratik_llama2-finetuning-genai-activity-7092496767870509056-RojZ/?utm_source=share&utm_medium=member_android)
- [Llama-2-7B-32K-Instruct — and fine-tuning for Llama-2 models with Together API](https://together.ai/blog/llama-2-7b-32k-instruct)
- [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory)
- [LLaMA-Factory Notes](https://www.linkedin.com/posts/rorcde_llama-factory-ai-library-of-the-day-llama-activity-7138958059506143234-t5p2?utm_source=share&utm_medium=member_desktop)
- [Purple llama by Meta - Link1](https://github.com/facebookresearch/PurpleLlama)
- [Purple llama by Meta - Link2](https://www.linkedin.com/posts/aiatmeta_announcing-purple-llama-towards-open-trust-activity-7138536031858937857-edXE?utm_source=share&utm_medium=member_desktop)  
- [Purple llama by Meta - Link3](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_purple-llama-just-got-released-by-meta-activity-7138538944115200001-WKAR?utm_source=share&utm_medium=member_desktop)
- [TinyLLaMa-1.1B](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0)
- [Can llama learn new language?](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_is-it-possible-to-teach-llms-a-different-activity-7148653756165812226--l7o?utm_source=share&utm_medium=member_desktop)
- [Persian LLaMa](https://huggingface.co/spaces/mostafaamiri/persianllama)  

### LLaMA3 Related Links:
- [LLaMA3 Linkedin Post1](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_welcome-llama-3-metas-new-open-llm-activity-7186762894989012992-SBLe?utm_source=share&utm_medium=member_desktop)
- [Meta LLaMA3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B)
- [Fine tune LLaMA3](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_efficiently-fine-tune-llama-3-with-pytorch-activity-7188186109363859456-sYSR?utm_source=share&utm_medium=member_desktop)
- [LLaMA3 Long Context](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_llama-3-extended-to-almost-100000-token-activity-7189518531300904963-9Y9V?utm_source=share&utm_medium=member_desktop)
- [LLaMA3.1](https://ollama.com/library/llama3.1)
- [LLaMA 3.1 Some Notes](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_llama-405b-is-here-and-it-comes-with-more-activity-7221533382025822208-K-Zm?utm_source=share&utm_medium=member_desktop)
- [LLaMA 3.1 Model Finetunning](https://www.linkedin.com/posts/danielhanchen_google-colab-activity-7221621362417700867-y935/?utm_source=share&utm_medium=member_android)
- [LLaMA 3.1 Detail Notes](https://www.linkedin.com/posts/sebastianraschka_yesterdays-llama-31-release-marked-a-big-activity-7221861717876645888-wz3H?utm_source=share&utm_medium=member_android)
- [LLaMA 3.2 Detail Notes](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_llama-can-now-see-and-run-on-your-phone-activity-7244763879690354688-Iaan?utm_source=share&utm_medium=member_android)
- [Mobile LLaMA 3.2](https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices/)
- [Llama-3.3-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct)  
- [How an online gifting site is using Llama to help protect customer privacy](https://ai.meta.com/blog/untukmu-built-with-llama/) [interesting]  

### DeepSeek Models Related Links: 
- [DeepSeek-V3 Linkedin Post](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_yesterday-the-best-open-model-to-date-was-activity-7278313766679658498-6BCl?utm_source=share&utm_medium=member_desktop)
- [Train your own R1 reasoning model with Unsloth (GRPO)](https://unsloth.ai/blog/r1-reasoning)  

### Phi-3 Related Links:
- [Phi-3 Linkedin Post1](https://www.linkedin.com/posts/sebastianraschka_microsoft-just-casually-shared-theirnew-activity-7188544168380510208-AdDG?utm_source=share&utm_medium=member_desktop)  
- [Phi-3 Linkedin Post2](https://www.linkedin.com/posts/julienchaumond_in-case-you-missed-it-earlier-this-week-activity-7189273186256003072-91B0?utm_source=share&utm_medium=member_desktop)  

### Mistral & Mixtral Models Related Links:
- [Mistral AI models](https://github.com/mistralai/mistral-src)
- [Is Mistral's first model a good replacement for OpenAI?](https://blog.quivr.app/is-mistral-a-good-replacement-for-openai/)
- [Mistral Mixture of Experts (MoE) Model](https://www.linkedin.com/posts/liorsinclair_big-news-mistral-just-released-an-open-source-activity-7139323993253228544-5coS?utm_source=share&utm_medium=member_desktop)
- [Mixtral - a SOTA Mixture of Experts](https://huggingface.co/blog/mixtral)
- [Mistraltrx](https://www.linkedin.com/posts/allen-roush-27721011b_cultrixmistraltrix-v1-hugging-face-activity-7149086757945298944-T7IA?utm_source=share&utm_medium=member_desktop)
- [Nous-Hermes-Mixtral model](https://www.linkedin.com/posts/maxime-labonne_nousresearch-just-released-nous-hermes-activity-7152787405815566337-4aTY?utm_source=share&utm_medium=member_desktop)
- [Mixtral in colab](https://github.com/dvmazur/mixtral-offloading/blob/master/notebooks/demo.ipynb) [Great]  
- [Brev.dev Notebooks: Fine-tuning mistral, mixtral, phi-2 and etc](https://github.com/brevdev/notebooks/tree/main) [**Excellent**]
- [Optimized LLM inference api for mistral-7b using vllm and AWQ](https://lightning.ai/lightning-ai/studios/optimized-llm-inference-api-for-mistral-7b-using-vllm?view=public&section=blogs) [**Excellent**]
- [Run Mistral7b Quantized for free on any computer (CPU or GPU)](https://medium.com/artificial-corner/run-mistral7b-quantized-for-free-on-any-computer-2cadc18b45a2) [Interesting]
- [Mixtral 8x22B a 176B MoE Model](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_new-open-model-from-mistral-ai-yesterday-activity-7183816273053523971-Vgse?utm_source=share&utm_medium=member_desktop)
- [Mistral-7B-Instruct-v0.3](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_mistralaimistral-7b-instruct-v03-hugging-activity-7199103875348320256-lJ_A?utm_source=share&utm_medium=member_android)
- [Codestral: A model fluent in 80+ programming languages](https://mistral.ai/news/codestral/)
- [Mistral Finetune: the official repo and guide on how to fine-tune Mistral open-source models](https://github.com/mistralai/mistral-finetune)
- [Mistral Large 2 Model](https://www.linkedin.com/posts/mistralai_large-enough-activity-7221915921622126593-JjHd?utm_source=share&utm_medium=member_desktop)
- [Mistral Small 3](https://mistral.ai/news/mistral-small-3/)  

### Yi Models:
- [Yi Github](https://github.com/01-ai/Yi)
- [Yi Website](https://01.ai/)
- [Yi-VL-6B HuggingFace](https://huggingface.co/01-ai/Yi-VL-6B)

### Qwen Models:
- [Introducing Qwen1.5 Blog Post](https://qwenlm.github.io/blog/qwen1.5/)
- [Qwen1.5 Linkedin Post](https://www.linkedin.com/posts/andrew-iain-jardine_llm-opensource-llms-activity-7160905982523445248-_t5B?utm_source=share&utm_medium=member_desktop)
- [Qwen1.5 HuggingFace](https://huggingface.co/collections/Qwen/qwen15-65c0a2f577b1ecb76d786524)
- [Qwen2 HuggingFace](https://huggingface.co/docs/transformers/en/model_doc/qwen2)
- [Qwen MoE Model](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_new-moe-alert-qwen15-moe-a27b-just-activity-7179144882668630016-i-l5?utm_source=share&utm_medium=member_android)
- [Qwen2](https://github.com/QwenLM/Qwen2)
- [Qwen 2.5 - Linkedin Post](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_9-new-multilingual-open-llms-released-qwen-activity-7242423229724676097-_9Ea?utm_source=share&utm_medium=member_desktop)
- [Qwen 2.5 - Models](https://huggingface.co/collections/Qwen/qwen25-66e81a666513e518adb90d9e)  

### DeepSeek Models:
- [Huggingface DeepSeek R1 - Linkedin Post](https://www.linkedin.com/posts/qgallouedec_last-moments-of-closed-source-ai-hugging-activity-7288908822079852544-CDgF?utm_source=share&utm_medium=member_android)  

### Gemma LLM Related Links (by Google):
- [Gemma an open Gemini LLM released by Google! - Linkedin Post](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_welcome-gemma-googles-new-open-llm-activity-7166054332914741249-FY2D?utm_source=share&utm_medium=member_desktop)
- [Gemma - another linkedin post](https://www.linkedin.com/posts/andrew-iain-jardine_opensource-llm-llms-activity-7166054662612226048-h0Ap?utm_source=share&utm_medium=member_desktop)
- [Google's Gemma Detailed Notes](https://www.linkedin.com/posts/sebastianraschka_googles-gemma-has-been-the-topic-of-the-activity-7167160406480805888-PSeR?utm_source=share&utm_medium=member_desktop)
- [Gemma usage via TRL](https://www.linkedin.com/posts/younes-belkada-b1a903145_new-release-from-google-gemma-a-state-of-the-art-activity-7166065899978870784-50To?utm_source=share&utm_medium=member_desktop)
- [Gemma usage in Hugging Face via OpenAI SDK](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_yesterday-google-released-gemma-an-open-activity-7166484882917961730-uuFB?utm_source=share&utm_medium=member_desktop)
- [Does Gemma overfit the Open LLM Leaderboard?](https://www.linkedin.com/posts/maxime-labonne_does-gemma-overfit-the-open-llm-leaderboard-activity-7166220798427402242-lJFm?utm_source=share&utm_medium=member_desktop)
- [Zehpyr 7B Gemma](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_zehpyr-7b-gemma-releasedwe-are-excited-activity-7169373526641070080-rTLD?utm_source=share&utm_medium=member_desktop)
- [Gemma 2](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_gemma-2-releasedgoogle-just-released-the-activity-7212108484920651776-BR8s?utm_source=share&utm_medium=member_desktop)
- [Gemma2 Detailed Notes](https://www.linkedin.com/posts/sebastianraschka_whats-new-and-noteworthy-in-googles-newly-activity-7213528822384611329-sKv0?utm_source=share&utm_medium=member_desktop)
- [Gemma 2-2b](https://huggingface.co/google/gemma-2-2b)  

### Jamba (SSM-Transformer Model):
- [AI21 Labs Jamba Model](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_jamba-released-ai21-labs-just-released-the-activity-7179121093482315776-xbmX?utm_source=share&utm_medium=member_desktop)  
- [Fine-tune jamba with TRL](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_yesterday-ai21-labs-released-jamba-the-first-activity-7179395299679858688-xiP9?utm_source=share&utm_medium=member_desktop)
- [Fine-tune jamba code](https://www.linkedin.com/posts/maxime-labonne_jambatypus-v01-i-fine-tuned-a-jamba-activity-7181277758876962816-Z4zt?utm_source=share&utm_medium=member_desktop)  

### 1-bit LLMs:
- [1-bit LLMs (AlphaSignal Post)](https://www.linkedin.com/posts/liorsinclair_new-breakthrough-from-microsoft-1-bit-llms-activity-7168680301064384512-UeNv?utm_source=share&utm_medium=member_desktop)  
- [1-bit Quantization](https://www.linkedin.com/posts/a-roucher_%3F-%3F%3F%3F-%3F%3F%3F%3F%3F%3F%3F%3F%3F%3F%3F%3F-%3F%3F%3F%3F%3F%3F%3F%3F%3F%3F%3F%3F-activity-7168987208228540416-uhcm?utm_source=share&utm_medium=member_desktop)
- [Some Notes about 1-bit LLMs (Their benefits and drawbacks)](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_the-era-of-1-bit-llms-what-does-that-mean-activity-7171533076668362753-Nl-F?utm_source=share&utm_medium=member_desktop)
- [AutoBitnet (Train your 1.58-bit LLM based on LLama Architecture for free on Colab T4 GPU)](https://www.linkedin.com/posts/zaiinulabideen_autobitnet-train-your-158-bit-llm-based-activity-7182019658135326720-_qRp?utm_source=share&utm_medium=member_desktop)
- [Llama2 7b in 1-bit precision](https://www.linkedin.com/posts/maxime-labonne_1-bit-quantization-activity-7179068277548032000-I8gR?utm_source=share&utm_medium=member_desktop)
- [Microsoft 1-Bit LLM](https://github.com/microsoft/BitNet)  

### Long Context Window LLMs (e.g., 100K Tokens LLMs):
- [Claude LLM](https://www.linkedin.com/posts/itamar-g1_anthropic-openais-biggest-rivalry-just-activity-7063773334831775744-cQ4L/?utm_source=share&utm_medium=member_android)
- [Some Notes about the 100K Claude LLM Model](https://www.linkedin.com/posts/sahar-mor_claude-a-gpt-competitor-from-anthropic-activity-7062811160168841216-z4u9/?utm_source=share&utm_medium=member_android)
- [Anthropic's Claude-2](https://www.anthropic.com/index/claude-2)  
- [Claude-2, Anthropic's ChatGPT competitor](https://www.linkedin.com/posts/ugcPost-7084607703137857537-K9Ln?utm_source=share&utm_medium=member_desktop)
- [Some Information about Claude 3](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_claude-3-is-here-anthropic-just-released-activity-7170424839529295872-Qp_S?utm_source=share&utm_medium=member_desktop)  
- [LongNet: Scaling Transformers to 1B Tokens](https://arxiv.org/abs/2307.02486)
- [Lost in the Middle: How Language Models Use Long Contexts](https://arxiv.org/abs//2307.03172)  
- [Notes about How Language Models Use Long Contexts](https://www.linkedin.com/posts/sebastianraschka_llm-ai-machinelearning-activity-7083427280605089792-MS_N/?utm_source=share&utm_medium=member_android)
- [Scaling LLaMA and GPTNeoX to >8k input context](https://www.linkedin.com/posts/gante_scaling-llama-and-gptneox-to-8k-input-context-activity-7085545793050320896-8OKi/?utm_source=share&utm_medium=member_android)
- [Unofficial Claude-API](https://github.com/KoushikNavuluri/Claude-API)
- [Claude Unofficial API](https://github.com/Explosion-Scratch/claude-unofficial-api)
- [YARN & LongLlaMa](https://www.linkedin.com/posts/pramodith_generativeai-llm-gpt-activity-7104772654313656321-QC5D?utm_source=share&utm_medium=member_desktop)
- [YaRN: Efficient Context Window Extension of LLMs](https://github.com/jquesnelle/yarn)
- [LLMs get lost when the context becomes too long: Lost in the Middle: How Language Models Use Long Contexts](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_are-vector-databases-here-to-stay-yes-activity-7085908435686285312-QVfB?utm_source=share&utm_medium=member_desktop) [**Very Important**]
- [LongLoRA: Efficient Fine-tuning of Long-Context LLMs](https://www.linkedin.com/posts/omarsar_longlora-efficient-fine-tuning-of-long-context-activity-7111000280615325699-SVEE?utm_source=share&utm_medium=member_desktop)  
- [LongLoRA: Efficient Fine-tuning of Long-Context LLMs (another post)](https://www.linkedin.com/posts/haotian-tang_expanding-the-context-size-of-large-language-activity-7110806911775641600-nShH?utm_source=share&utm_medium=member_desktop)
- [Efficient Streaming LLMs with Attention Sinks for infinite-length inputs](https://github.com/mit-han-lab/streaming-llm)
- [MemGPT: Teaching LLMs memory management for unbounded context](https://github.com/cpacker/MemGPT)
- [LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs](https://github.com/THUDM/LongWriter) [Interesting]  
- [Llmlingua Prompt Compress](https://www.linkedin.com/posts/sahar-mor_microsoft-recently-published-a-new-technique-activity-7151596182379597825-7ego?utm_source=share&utm_medium=member_desktop) [Interesting]  

### Small Language Models (SLMs):
- [Microsoft Phi-2 Model (with 2.7B Parameters)](https://www.microsoft.com/en-us/research/blog/phi-2-the-surprising-power-of-small-language-models/)
- [Can "small" finetuned LLMs with less than 2B parameters outperform larger openly available LLMs (Mixtral, Llama 2 Chat) and proprietary LLMs (ChatGPT)?](https://www.linkedin.com/posts/sebastianraschka_can-small-finetuned-llms-with-less-than-activity-7162082013674500096-FuYV?utm_source=share&utm_medium=member_desktop)
- [Smol LM](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_in-just-a-few-years-the-majority-of-ai-usage-activity-7219027139352801281-aMYy?utm_source=share&utm_medium=member_desktop)
- [Hymba Small LM](https://www.linkedin.com/posts/pavlo-molchanov-08738a63_excited-to-share-a-new-efficient-small-language-activity-7265581836582424576-4Mqp?utm_source=share&utm_medium=member_desktop)  

### Frameworks for Training & Using Large Language Models (LLMs):
- [ColossalAI: Library for LLMs](https://github.com/hpcaitech/ColossalAI)  
- [LangChain: Library for Building applications with LLMs](https://github.com/hwchase17/langchain)  
- [LangChain Chat](https://github.com/hwchase17/chat-langchain) 
- [LangChain Crash Course](https://www.youtube.com/watch?v=LbT1yp6quS8)  
- [LangChain 101](https://www.linkedin.com/posts/munjal-patel_llm-chatgpt-machinelearning-activity-7049757220300800000-hH7I/?utm_source=share&utm_medium=member_android)  
- [LangChain Resources](https://www.linkedin.com/posts/sonali-pattnaik_generativeai-ai-activity-7063160223967973376-3K0P/?utm_source=share&utm_medium=member_android)  
- [LangChain & Vector Databases in Production Course](https://learn.activeloop.ai/courses/langchain)
- [Building LLM Powered Apps via LangChain Course](https://www.wandb.courses/courses/building-llm-powered-apps)  
- [OpenFlamingo](https://github.com/mlfoundations/open_flamingo)  
- [Deepset Haystack Framework](https://github.com/deepset-ai/haystack)  
- [LMQL: A query language for programming LLMs](https://github.com/eth-sri/lmql)  
- [LLM Training Frameworks List](https://www.linkedin.com/posts/aboniasojasingarayar_llm-gpt3-framework-activity-7047449940192591872-3VYc/?utm_source=share&utm_medium=member_android)  
- [NeMo Guardrails](https://github.com/NVIDIA/NeMo-Guardrails)  
- [Lamini: The LLM engine for rapidly customizing models](https://github.com/lamini-ai/lamini)  
- [Scikit-LLM: Sklearn Meets Large Language Models](https://github.com/iryna-kondr/scikit-llm)  
- [Chainlit](https://github.com/Chainlit/chainlit)
- [ChatUI](https://github.com/alibaba/ChatUI)
- [Streamlit-Chat](https://github.com/AI-Yash/st-chat)
- [Gradio: Creating a Streaming chatbot fast](https://www.gradio.app/guides/creating-a-chatbot-fast#streaming-chatbots)  
- [Streamlit-Weaviate Connection: provides a custom streamlit connection to query data from weaviate](https://github.com/weaviate/st-weaviate-connection/tree/main)  
- [LangKit: an open-source text metrics toolkit for monitoring language models](https://github.com/whylabs/langkit)  
- [HuggingFace Transformers Agents](https://huggingface.co/docs/transformers/transformers_agents)
- [privateGPT: Ask questions to your documents using the power of LLMs](https://github.com/imartinez/privateGPT)  
- [Spacy LLM](https://github.com/explosion/spacy-llm)  
- [Lit-GPT](https://github.com/Lightning-AI/lit-gpt)
- [Zero to LitGPT Tutorial: Getting Started with Pretraining, Finetuning, and Using LLMs](https://github.com/Lightning-AI/litgpt/blob/main/tutorials/0_to_litgpt.md) [Great]  
- [GPTCache: A Library for Creating Semantic Cache for LLM Queries](https://github.com/zilliztech/GPTCache/tree/main)
- [AutoTrain-Advanced](https://github.com/huggingface/autotrain-advanced)
- [Monster API: API for using & fine-tuning LLMs](https://monsterapi.ai/)
- [AnythingLLM: A full-stack personalized AI assistant](https://github.com/Mintplex-Labs/anything-llm)
- [EasyLLM: helpful tools and methods for working with LLMs](https://github.com/philschmid/easyllm)
- [gpt-llm-trainer: input a description of your task, and fine-tune a LLaMA 2 model for you](https://github.com/mshumer/gpt-llm-trainer)
- [Embedchain: a framework to easily create LLM powered bots](https://github.com/embedchain/embedchain)
- [PandasAI](https://github.com/gventuri/pandas-ai) [It is not related strictly in this section, but it is interesting]  
- [GPT Engineer: Specify what you want it to build, the AI asks for clarification, and then builds it](https://github.com/AntonOsika/gpt-engineer)
- [Ludwig: a low-code framework for building custom AI models like LLMs](https://github.com/ludwig-ai/ludwig)
- [open-interpreter](https://github.com/KillianLucas/open-interpreter)
- [kani: is a lightweight and highly hackable framework for chat-based language models with tool usage/function calling](https://github.com/zhudotexe/kani)
- [Kani colab samples](https://colab.research.google.com/github/zhudotexe/kani/blob/main/examples/colab_examples.ipynb)
- [Kani Linkedin Post](https://www.linkedin.com/posts/chris-callison-burch-40bb87b7_my-phd-students-have-build-a-really-great-activity-7110728026971115520-T16F?utm_source=share&utm_medium=member_desktop)
- [Argilla: the open-source data curation platform for LLMs](https://github.com/argilla-io/argilla)
- [LiteLLM: Call all LLM APIs using the OpenAI format](https://github.com/BerriAI/litellm)
- [LLM Finetuning with PEFT](https://github.com/ashishpatel26/LLM-Finetuning)
- [ChatGPT-AutoExpert: Supercharged Custom Instructions for ChatGPT](https://github.com/spdustin/ChatGPT-AutoExpert)
- [PyTorch thunder (pytorch compiler for speed up training of LLMs) - Linkedin Post](https://www.linkedin.com/posts/sebastianraschka_we-just-open-sourced-thunder-a-new-compiler-activity-7176571765639245824-srIZ?utm_source=share&utm_medium=member_desktop)
- [PyTorch Lightning Thunder](https://github.com/Lightning-AI/lightning-thunder)  
- [unsloth library: 2-5X faster 70% less memory QLoRA & LoRA finetuning](https://github.com/unslothai/unsloth) [**Great for fine-tuning LLMs**]
- [TorchTune: A Native-PyTorch Library for LLM Fine-tuning](https://github.com/pytorch/torchtune)  

### Notes and Codes for Training and fine-tuning LLMs:
- [LLM Finetuning with PEFT Colab Notebooks](https://github.com/ashishpatel26/LLM-Finetuning)
- [Self Instruct TRL for LLMs](https://github.com/yizhongw/self-instruct)
- [Self Instruct TRL for LLMs - Link2](https://huggingface.co/docs/trl/sft_trainer)
- [How to Fine-Tune LLMs in 2024 with Hugging Face](https://www.philschmid.de/fine-tune-llms-in-2024-with-trl)
- [How to fine-tune open LLMs in 2025 with Hugging Face](https://www.philschmid.de/fine-tune-llms-in-2025)  
- [Fine tune LLMs in your own hardware via PyTorch team (great)](https://pytorch.org/blog/finetune-llms/?utm_content=278057355&utm_medium=social&utm_source=linkedin&hss_channel=lcp-78618366)
- [RLHF in 2024 with DPO & Hugging Face](https://www.philschmid.de/dpo-align-llms-in-2024-with-trl)
- [A little guide to building Large Language Models in 2024 (PPT by HuggingFace Team)](https://docs.google.com/presentation/d/1IkzESdOwdmwvPxIELYJi8--K3EZ98_cL6c5ZcLKSyVg/edit?usp=sharing) [**Great**]
- [Video Link1 of A little guide to building Large Language Models in 2024 (PPT by HuggingFace Team)](https://www.linkedin.com/posts/thom-wolf_75min-talk-i-finally-recorded-this-lecture-activity-7179106246505967617-0nzC?utm_source=share&utm_medium=member_desktop)
- [Video Link2 of A little guide to building Large Language Models in 2024 (PPT by HuggingFace Team)](https://www.youtube.com/watch?v=2-SPH9hIKT8)
- [Understanding the instruction fine-tuning process in LLMs](https://www.linkedin.com/posts/sebastianraschka_if-you-are-looking-for-a-resource-to-understand-activity-7208093607122145280-6wFF?utm_source=share&utm_medium=member_desktop)
- [Top 5 Tips and Tricks for LLM Fine-Tuning and Inference from Intel Experts](https://www.intel.com/content/www/us/en/developer/articles/technical/top-tricks-for-llm-fine-tuning-and-inference.html)  

### Reflection-Tuning of LLMs:
- [Reflection-Tuning of LLMs](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_mindblowing-a-70b-open-meta-llama-3-better-activity-7237712642339926016-Cfm6?utm_source=share&utm_medium=member_desktop)  

### Memory Layer for LLMs:
- [Memory layer for LLMs](https://www.linkedin.com/posts/liorsinclair_mem0-gained-20000-stars-on-github-in-30-activity-7237475167822585857-4Jbu?utm_source=share&utm_medium=member_desktop)
- [Memory layer for LLMs - GitHub Repo](https://github.com/mem0ai/mem0)  

### LLMs for Coding:
- [CodeGen](https://github.com/salesforce/CodeGen)
- [Code Llama](https://github.com/facebookresearch/codellama)
- [Code Llama Notes](https://www.linkedin.com/posts/aleksagordic_nice-meta-ai-just-announced-code-llama-activity-7100559934764810240-Un2i/?utm_source=share&utm_medium=member_android)  

### LLMs as Front-End Engineers:
- [Design2Code: How Far Are We From Automating Front-End Engineering?](https://arxiv.org/abs/2403.03163)
- [Llama Coder: Can generate full React apps](https://llamacoder.together.ai/)  

### LLMs Courses & Tutorials:
- [LLM Bootcamp - Spring 2023](https://fullstackdeeplearning.com/llm-bootcamp/spring-2023/)  
- [LLM University](https://docs.cohere.com/docs/llmu)  
- [List of LLM Courses](https://www.linkedin.com/posts/srijankr_ai-llm-activity-7080929772523966464-Le4u/?utm_source=share&utm_medium=member_android)
- [Anti-hype LLM reading list](https://gist.github.com/veekaybee/be375ab33085102f9027853128dc5f0e)
- [Microsoft Generative AI Course](https://github.com/microsoft/generative-ai-for-beginners)
- [Google and Kaggle five-day generative AI course](https://blog.google/technology/developers/google-kaggle-genai-intensive/) [Good]  
- [Best Resources for learning to work with LLMs](https://www.linkedin.com/posts/whats-ai_github-louisfb01start-llms-a-complete-activity-7133590058229456896-WEf0?utm_source=share&utm_medium=member_desktop)  
- [Start with Large Language Models (LLMs) - Become an expert for free!](https://github.com/louisfb01/start-llms) [Interesting]
- [Intro to LLMs: Andrej Karpathy 1 Hour Lecture](https://www.youtube.com/watch?v=zjkBMFhNj_g)
- [LLM Course](https://github.com/mlabonne/llm-course) [**good**]
- [LLM Course in ChatGPT Plus](https://www.linkedin.com/posts/maria-vechtomova_llm-gpt-activity-7160567161856360448-IFjd?utm_source=share&utm_medium=member_desktop)  
- [Build a Large Language Model (From Scratch) great Course and Book Tutorial](https://github.com/rasbt/LLMs-from-scratch) [**Great**]  
- [Learning Resources about LLMs](https://www.linkedin.com/posts/pauliusztin_machinelearning-mlops-datascience-activity-7135530424767819777-ui-5?utm_source=share&utm_medium=member_desktop)
- [The Transformer Layer by Layer Course](https://mlbootcamp.ai/course.html?guid=d105240a-94e1-405b-be80-60056659c24c)
- [The Transformer Layer by Layer Course: Linkedin](https://www.linkedin.com/posts/juan-olano-b9a330112_artificialintelligence-transformers-onlinelearning-activity-7137158122715897856-cneV?utm_source=share&utm_medium=member_desktop)
- [Hands-on LLMs Course](https://github.com/iusztinpaul/hands-on-llms)
- [Direct Preference Optimization (DPO) Method for LLMs Tutorial](https://huggingface.co/blog/pref-tuning)
- [CS25: Transformers United V3 Courses - Autumn 2023](https://web.stanford.edu/class/cs25/)
- [CS336: Language Modeling from Scratch](https://stanford-cs336.github.io/spring2024/)
- [Visual and Animated Lecture about LLMs and Transformers and Deep Learning](https://www.youtube.com/playlist?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi)  
- [LLMs Roadmap](https://www.linkedin.com/posts/ba%C5%9Fak-tu%C4%9F%C3%A7e-eskili-61511b58_nlp-llms-gpt3-activity-7168168071356997632-V8yL?utm_source=share&utm_medium=member_desktop) [Great]  
- [Brev.dev Notebooks: Fine-tuning mistral, mixtral, phi-2 and etc](https://github.com/brevdev/notebooks/tree/main) [**Excellent**]
- [LLM Summer School](https://www.linkedin.com/posts/sebastianraschka_a-suggestion-for-an-effective-11-step-llm-activity-7195778889384693762-2TB_?utm_source=share&utm_medium=member_android)
- [LLM Engineer's Handbook](https://www.linkedin.com/posts/maxime-labonne_super-proud-to-announce-my-new-book-the-activity-7219253497559425024-IVkc?utm_source=share&utm_medium=member_desktop)
- [LLM Twin Course: Building Your Production-Ready AI Replica](https://github.com/decodingml/llm-twin-course)
- [Hands-On Large Language Models Book](https://www.linkedin.com/posts/jalammar_our-newly-released-llm-book-hands-on-large-activity-7242207044533948417-_i2R?utm_source=share&utm_medium=member_desktop)
- [Foundations of LLMs Book](https://arxiv.org/abs/2501.09223)  

### LLMs Ranking:
- [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)  
- [Chatbot Arena Leaderboard](https://lmsys.org/blog/2023-05-10-leaderboard/)
- [AlpacaEval Leaderboard](https://tatsu-lab.github.io/alpaca_eval/)
- [CanAiCode Leaderboard](https://huggingface.co/spaces/mike-ravkine/can-ai-code-results)
- [Small LLMs Performance Ranking](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_how-big-do-llms-need-to-be-able-to-reason-activity-7134108036473741312-2jxI?utm_source=share&utm_medium=member_desktop)
- [Chatbot Arena: Benchmarking LLMs in the Wild](https://huggingface.co/spaces/lmsys/chatbot-arena) [**Great**]
- [Chatbot Arena Leaderboard](https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard)
- [AI2 WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild](https://huggingface.co/spaces/allenai/WildBench) [**Great**]
- [AI2 WildBench Linkedin Post](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_new-evaluation-benchmark-leaderboard-by-activity-7171853629325316096-67sr?utm_source=share&utm_medium=member_desktop)
- [Persian LLM Leaderboard (via Part AI)](https://huggingface.co/spaces/PartAI/persian-llm-leaderboard)  

### Building NLP Applications Powered by LLMs (Different Methods for Augmenting Knowledge to LLMs (or Retrieval-Augmented Generation (RAG) applications)):
- [Ask a Book Questions with LangChain OpenAI](https://bennycheung.github.io/ask-a-book-questions-with-langchain-openai) [Great]  
- [OpenAI Web QA Embeddings](https://platform.openai.com/docs/tutorials/web-qa-embeddings)  
- [Deepset Haystack Framework](https://github.com/deepset-ai/haystack)  
- [Stanford Retrieval-based NLP](https://ai.stanford.edu/blog/retrieval-based-NLP/)  
- [Hypothetical Document Embeddings (HyDE)](https://www.linkedin.com/posts/activity-7048838677438861312-8MFD/?utm_source=share&utm_medium=member_android)  
- [ChatDB: Augmenting LLMs with Databases](https://chatdatabase.github.io/)
- [ChatNode](https://www.chatnode.ai/)  
- [Emerging Architectures for LLM Applications](https://a16z.com/2023/06/20/emerging-architectures-for-llm-applications/)
- [Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines](https://github.com/explodinggradients/ragas)
- [Fine tuning vs. RAG for LLMs](https://www.linkedin.com/posts/alexander-ratner-038ba239_lots-of-debate-on-fine-tuning-vs-rag-for-activity-7103836027957506048-AjoJ?utm_source=share&utm_medium=member_desktop)
- [Building RAG-based LLM Applications for Production (Part 1)](https://www.anyscale.com/blog/a-comprehensive-guide-for-building-rag-based-llm-applications-part-1) [Good]
- [Verba: The Golden RAGtriever, user-friendly interface for Retrieval-Augmented Generation (RAG) applications](https://github.com/weaviate/Verba)
- [DocsGPT: GPT-powered chat for documentation, chat with your documents](https://github.com/arc53/DocsGPT)
- [RAFT: Retrieval Augmented Fine Tuning - Post1](https://www.linkedin.com/posts/pascalbiese_raft-the-best-of-rag-and-fine-tuning-combined-activity-7175089937036283904-ltQI?utm_source=share&utm_medium=member_desktop)
- [RAFT: Retrieval Augmented Fine Tuning - Post2](https://www.linkedin.com/posts/tianjun-zhang-333bb2126_raft-a-new-way-to-teach-llms-to-be-better-activity-7174525633291587584-CO-h?utm_source=share&utm_medium=member_desktop)
- [RAFT: Retrieval Augmented Fine Tuning - Microsoft Blog](https://techcommunity.microsoft.com/t5/ai-ai-platform-blog/raft-a-new-way-to-teach-llms-to-be-better-at-rag/ba-p/4084674)
- [RAFT: Retrieval Augmented Fine Tuning - Berkeley Blog](https://gorilla.cs.berkeley.edu/blogs/9_raft.html)
- [RAFT Code](https://github.com/ShishirPatil/gorilla/tree/main/raft)
- [Long context LLMs vs RAG](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_how-good-are-llms-in-a-long-context-and-activity-7214185350959689728-cnfp?utm_source=share&utm_medium=member_android) [Interesting]
- [RAGFlow: an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding](https://github.com/infiniflow/ragflow)
- [Two Step RAG: Speculative RAG: Enhancing retrieval augmented generation through drafting](https://research.google/blog/speculative-rag-enhancing-retrieval-augmented-generation-through-drafting/)
- [Exploring Multimodal RAG with LlamaIndex and GPT-4 or the New Anthropic Sonnet Model](https://levelup.gitconnected.com/exploring-multimodal-rag-with-llamaindex-and-gpt-4-or-the-new-anthropic-sonnet-model-96705c877dbb)
- [PaperQA2: High accuracy RAG for answering questions from scientific documents with citations](https://github.com/Future-House/paper-qa)
- [Sophisticated Controllable Agent for Complex RAG Tasks](https://github.com/NirDiamant/Controllable-RAG-Agent)  
- [Anthropic's Cluade Introducing Contextual Retrieval RAG](https://www.anthropic.com/news/contextual-retrieval)
- [Docling: Get your docs ready for gen AI](https://github.com/DS4SD/docling)
- [Lecture of RAG and Prompt Engineering](https://www.linkedin.com/posts/tom-yeh_i-just-edited-my-lecture-beginners-guide-activity-7284242137091620864-6MBy?utm_source=share&utm_medium=member_desktop)  
- [Recent RAG Research from Google](https://www.linkedin.com/posts/jihoo-kim_rag-research-from-google-2024-ugcPost-7266537405904498689-wrac?utm_source=share&utm_medium=member_android)
- [zoekt: Fast trigram based code search --> great tool for RAG of codes](https://github.com/sourcegraph/zoekt) [**important**]  

### Graph RAG & Its Related Data Bases:
- [ArangoDB: The Most Complete And Scalable Platform For Graph-Powered GenAI](https://arangodb.com/)  
- [Microsoft GraphRAG](https://microsoft.github.io/graphrag/)  
- [llamaindex Graph RAG](https://docs.llamaindex.ai/en/stable/examples/query_engine/knowledge_graph_rag_query_engine/)  
- [Gephi: The Open Graph Viz Platform](https://gephi.org/)  
- [JanusGraph: is a scalable graph database optimized for storing and querying graphs](https://janusgraph.org/)  
- [cayley: Open Source Graph Data Base](https://cayley.io/)
- [Retrieval-Augmented Generation with Knowledge Graphs for Customer Service Question Answering (Paper)](https://arxiv.org/abs/2404.17723)
- [The GraphRAG Manifesto: Adding Knowledge to GenAI](https://neo4j.com/blog/graphrag-manifesto/)  
- [Neo4j for GenAI](https://neo4j.com/generativeai/)  

### Cache-Augmented Generation (CAG):
- [Cache-Augmented Generation (CAG) - Linkedin Post1](https://www.linkedin.com/posts/maryammiradi_dont-do-rag-cag-is-40x-faster-than-activity-7281655697086287872-c35Q?utm_source=share&utm_medium=member_desktop)
- [Cache-Augmented Generation (CAG) - Linkedin Post2](https://www.linkedin.com/posts/bhavishya-pandit_rag-vs-cag-activity-7282615153852862464-ES23?utm_source=share&utm_medium=member_desktop)
- [Cache-Augmented Generation (CAG) - Linkedin Post3](https://www.linkedin.com/posts/francoisvanderseypen_dont-do-rag-when-cache-augmented-generation-activity-7279725990342193152-8P82?utm_source=share&utm_medium=member_desktop)  

### Vector Database Libraries:
- [weaviate](https://weaviate.io/)  
- [weaviate GitHub](https://github.com/weaviate/weaviate)  
- [chroma](https://github.com/chroma-core/chroma)  
- [Qdrant: Vector Database for AI Applications](https://github.com/qdrant/qdrant)  
- [pinecone](https://www.pinecone.io/)  
- [rektor-db](https://github.com/codediodeio/rektor-db)  
- [pgvector](https://github.com/pgvector/pgvector)  
- [LlamaIndex: comprehensive toolkit to perform data augmentation for LLMs](https://github.com/jerryjliu/llama_index)
- [jina-ai VectorDB](https://github.com/jina-ai/vectordb)
- [sqlite-vec: A vector search SQLite extension](https://github.com/asg017/sqlite-vec)  

### Great Embedding Models for Search (for Augmenting External Knowledge into ChatBot Vector DB) [Retrieval Augmented Generation (RAG)]:
- [Massive Text Embedding Benchmark (MTEB) Leaderboard](https://huggingface.co/spaces/mteb/leaderboard)
- [Word and sentence embeddings is how LLMs understand text](https://www.linkedin.com/posts/sahar-mor_word-and-sentence-embeddings-is-how-llms-activity-7105921473978015744-R0Nm?utm_source=share&utm_medium=member_desktop)  
- [FlagEmbedding](https://github.com/FlagOpen/FlagEmbedding)
- [E5 embedding vs OpenAI Ada](https://www.linkedin.com/posts/andrew-iain-jardine_hosting-a-text-embedding-model-that-is-better-activity-7106338837479510016-zvBW?utm_source=share&utm_medium=member_desktop)
- [M2-BERT-80M-32k-Retrieval](https://huggingface.co/togethercomputer/m2-bert-80M-32k-retrieval)
- [Embedding Quantization - Post1](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_introducing-embedding-quantization-a-new-activity-7176971093646159872-hp9z?utm_source=share&utm_medium=member_desktop)  
- [Embedding Quantization - Post2](https://www.linkedin.com/posts/tomaarsen_binary-and-scalar-embedding-quantization-activity-7176966403332132864-lJzH?utm_source=share&utm_medium=member_desktop)  
- [Embedding Quantization - HuggingFace Blog Post](https://huggingface.co/blog/embedding-quantization)
- [Quantization Fundamentals with Hugging Face Course](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_quantization-fundamentals-with-hugging-face-activity-7186335433843167232-sKV2?utm_source=share&utm_medium=member_desktop)
- [Is Cosine-Similarity of Embeddings Really About Similarity?](https://www.linkedin.com/posts/alphasignal_is-cosine-similarity-of-embeddings-really-activity-7175543620651880449-ZoKw?utm_source=share&utm_medium=member_desktop)
- [LLM2Vec](https://www.linkedin.com/posts/zaiinulabideen_lazy-llm2vec-convert-your-favorite-llm-activity-7193618083448553472-_Q2e?utm_source=share&utm_medium=member_desktop) [**Great**]
- [Fine tuning embedding models for RAG (Linkedin post)](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_fine-tune-embedding-models-for-retrieval-activity-7203760204579028992-g7eW?utm_source=share&utm_medium=member_desktop)
- [Fine tuning embedding models for RAG (Original Post)](https://www.philschmid.de/fine-tune-embedding-model-for-rag)
- [`all-MiniLM-L6-v2` --> Sentence-Transformers Model for Embedding](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)
- [Learn How to Fine-tuning Embedding Models Course](https://marqo.ai/courses/fine-tuning-embedding-models) [**Great**]
- [LLMs Embedding Course - Link1](https://github.com/anishiisc/Build_LLM_from_Scratch/tree/main)
- [LLMs Embedding Course - Link2](https://www.linkedin.com/posts/ugcPost-7228118123390902272-oVu4/?utm_source=share&utm_medium=member_android)
- [txtai: All-in-one embeddings database](https://github.com/neuml/txtai)
- [NVIDIA NV-emb-2 embeddings](https://www.linkedin.com/posts/tunguz_ok-nvidia-nv-emb-2-embeddings-are-really-activity-7262862383885213696-MWVv?utm_source=share&utm_medium=member_desktop)  
- [jina-embeddings-v3: Multilingual Embeddings With Task LoRA](https://huggingface.co/papers/2409.10173)
- [ModernBert: Linkedin Post1](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_modernbert-bert-revisited-in-the-age-of-activity-7275551060302131201-dr3c?utm_source=share&utm_medium=member_desktop)
- [ModernBert: Linkedin Post2](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_want-to-replace-bert-in-2025-the-time-has-activity-7277616689859444737-iRUe?utm_source=share&utm_medium=member_desktop)
- [Nomic-embed-text-v2-moe model](https://huggingface.co/nomic-ai/nomic-embed-text-v2-moe)
- [Nomic Embed Text V2: An Open Source, Multilingual, Mixture-of-Experts Embedding Model (Blog Post)](https://www.nomic.ai/blog/posts/nomic-embed-text-v2)
- [Gemeni models for text embedding (original link)](https://developers.googleblog.com/en/gemini-embedding-text-model-now-available-gemini-api/)
- [Gemeni models for text embedding (useful linkedin post)](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_gemini-models-for-embeddings-yes-google-activity-7303840326933176320-Tg0C?utm_source=share&utm_medium=member_android&rcm=ACoAAAgksdYBFu3_vG0bwXWdh93rSqV1J1ghMP4)  

### Prevent Hallucinations from LLMs & Controling their outputs:
- [Deep Dive Into LLM Hallucinations Across Generative Tasks](https://www.rungalileo.io/blog/deep-dive-into-llm-hallucinations-across-generative-tasks)
- [Controlled Generation Tools](https://www.linkedin.com/posts/pascalbiese_genai-llms-opensource-activity-7097185067885576192-Uv8Z/?utm_source=share&utm_medium=member_android)
- [Guidance: Controlling LLMs](https://github.com/guidance-ai/guidance)
- [NeMo Guardrails](https://github.com/NVIDIA/NeMo-Guardrails)
- [Minimising Hallucinations in LLM Applications: NeMo Guradrails Video Tutorial](https://www.linkedin.com/posts/sanyambhutani_minimising-hallucinations-in-llm-applications-activity-7104810583304077312-w983?utm_source=share&utm_medium=member_desktop)
- [Mitigate Hallucination in LLMs](https://www.linkedin.com/posts/vinija_mitigate-hallucination-in-llms-as-activity-7114468991330390016-O0BZ?utm_source=share&utm_medium=member_desktop)
- [LLMs Hallucinations Benchmark](https://www.linkedin.com/posts/drjimfan_please-see-update-below-a-recent-llm-hallucination-activity-7130230516246593536-mxAY?utm_source=share&utm_medium=member_desktop)  
- [Mitigating LLM Hallucinations: a multifaceted approach](https://amatriain.net/blog/hallucinations) [Great]

### Training & Using Large Language Models (LLMs) on Low Resource Machines:
- [Cramming: Training a Language Model on a Single GPU in One Day](https://github.com/jonasgeiping/cramming) [**Great**]  
- [Fine-tuning 20B LLMs with RLHF on a 24GB consumer GPU](https://huggingface.co/blog/trl-peft) [**Great**]    
- [PEFT: State-of-the-art Parameter-Efficient Fine-Tuning](https://github.com/huggingface/peft) [**Great**]   
- [PEFT: Parameter-Efficient Fine-Tuning of Billion-Scale Models on Low-Resource Hardware](https://huggingface.co/blog/peft) [**Great**]  
- [Introduction to 8-bit Matrix Multiplication for transformers at scale using Hugging Face Transformers, Accelerate and bitsandbytes](https://huggingface.co/blog/hf-bitsandbytes-integration)  
- [bitsandbytes: 8-bit CUDA functions for PyTorch](https://github.com/TimDettmers/bitsandbytes)  
- [Alpaca-LoRA: Low-Rank LLaMA Instruct-Tuning on consumer hardware](https://github.com/tloen/alpaca-lora) [Great]  
- [LLaMA & Alpaca Tutorial: “ChatGPT” On Your Local Computer](https://medium.com/@martin-thissen/llama-alpaca-chatgpt-on-your-local-computer-tutorial-17adda704c23)  
- [Dalai: The simplest way to run LLaMA on your local machine](https://github.com/cocktailpeanut/dalai)  
- [pyllama](https://github.com/juncongmoo/pyllama)  
- [Alpaca-LoRA-Serve](https://github.com/deep-diver/Alpaca-LoRA-Serve)  
- [llama.cpp: Port of Facebook's LLaMA model in C/C++](https://github.com/ggerganov/llama.cpp)  
- [alpaca.cpp](https://github.com/antimatter15/alpaca.cpp)  
- [SparseGPT: Remove 100 Billion Parameters of LLMs](https://neuralmagic.com/blog/sparsegpt-remove-100-billion-parameters-for-free/)  
- [xFormers: Toolbox to Accelerate Research on Transformers](https://github.com/facebookresearch/xformers)  
- [LLaMA-Adapter: Efficient Fine-tuning of LLaMA (Fine-tuning LLaMA to follow instructions within 1 Hour and 1.2M Parameters)](https://github.com/ZrrSkywalker/LLaMA-Adapter)  
- [GPT4All](https://github.com/nomic-ai/gpt4all) [Great]  
- [Vicuna web page](https://vicuna.lmsys.org/) [Great]   
- [Vicuna GitHub: FastChat](https://github.com/lm-sys/FastChat)  
- [PetGPT](https://github.com/maziarraissi/PetGPT)  
- [GPT-4-LLM](https://github.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM)  
- [baize Chatbot](https://github.com/project-baize/baize-chatbot)  
- [Koala](https://github.com/young-geng/EasyLM#koala)  
- [Gorilla: An API store for LLMs](https://github.com/ShishirPatil/gorilla)  
- [Lit-LLaMA](https://github.com/Lightning-AI/lit-llama)  
- [Auto-GPT](https://github.com/Torantulino/Auto-GPT)  
- [xTuring](https://github.com/stochasticai/xTuring)  
- [GPTCache](https://github.com/zilliztech/gptcache)  
- [Dolly-v2-12B](https://huggingface.co/databricks/dolly-v2-12b)  
- [Web LLM](https://github.com/mlc-ai/web-llm)
- [P-tuning v2](https://github.com/THUDM/P-tuning-v2)  
- [QLoRA: Efficient Finetuning of Quantized LLMs](https://github.com/artidoro/qlora)  
- [AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration](https://github.com/mit-han-lab/llm-awq)
- [GPTQ Quantization Method in Transformers](https://www.linkedin.com/posts/marc-sun_opensource-llm-quantization-activity-7100102215582797824-td7E?utm_source=share&utm_medium=member_desktop)
- [Optimize open LLMs using GPTQ and Hugging Face Optimum](https://www.linkedin.com/feed/update/urn:li:activity:7103049470908485632/?utm_source=share&utm_medium=member_android)
- [GPTQ vs. bitsandbytes (BNB)](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_quantization-makes-fine-tuning-and-deploying-activity-7104480375841636352-_dgY?utm_source=share&utm_medium=member_desktop)  
- [BNB Blog: Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA](https://huggingface.co/blog/4bit-transformers-bitsandbytes)
- [GPTQ Blog: Making LLMs lighter with AutoGPTQ and transformers](https://huggingface.co/blog/gptq-integration)
- [TensorRT-LLM](https://www.linkedin.com/posts/tunguz_llm-h100-languagemodels-activity-7106253824910139392-WZRM?utm_source=share&utm_medium=member_desktop)
- [Overview of 🤗 Transformers Quantization: GPTQ vs bitsandbytes](https://huggingface.co/blog/overview-quantization-transformers)
- [LoRA Exchange (LoRAX): Serve 100s of Fine-Tuned LLMs for the Cost of 1](https://predibase.com/blog/lora-exchange-lorax-serve-100s-of-fine-tuned-llms-for-the-cost-of-one)  
- [Introducing LoRAX](https://www.linkedin.com/posts/travisaddair_lora-exchange-lorax-serve-100s-of-fine-tuned-activity-7120819275442896896-vlI_?utm_source=share&utm_medium=member_desktop)
- [DeepSparse: Sparsity-aware deep learning inference runtime for CPUs](https://github.com/neuralmagic/deepsparse)
- [Practical Tips for Finetuning LLMs Using LoRA (Low-Rank Adaptation)](https://magazine.sebastianraschka.com/p/practical-tips-for-finetuning-llms) [**Great**]
- [Dare method for improving LLMs performance](https://www.linkedin.com/posts/andrew-iain-jardine_llm-opensource-llms-activity-7134896163698208768-0Gyf?utm_source=share&utm_medium=member_desktop)
- [Small model that surpass the GPT4](https://www.linkedin.com/posts/clementdelangue_open-models-now-starting-to-surpass-gpt4-activity-7137904570898264064-LSmc?utm_source=share&utm_medium=member_desktop) [Interesting]
- [Efficient LLMs Survey](https://github.com/AIoT-MLSys-Lab/Efficient-LLMs-Survey) [Great]
- [LoRAX (LoRA eXchange): Framework that allows users to serve thousands of fine-tuned models on a single GPU](https://github.com/predibase/lorax)
- [PowerInfer: High-speed LLMs Serving on PCs with Consumer-grade GPUs](https://github.com/SJTU-IPADS/PowerInfer)
- [LoRA From Scratch Implementation](https://www.linkedin.com/posts/sebastianraschka_code-lora-from-scratch-a-lightning-studio-activity-7155241298227060736-QRul?utm_source=share&utm_medium=member_desktop)
- [Improving LoRA (DoRA): Implementing Weight-Decomposed Low-Rank Adaptation (DoRA)](https://www.linkedin.com/posts/sebastianraschka_improving-lora-implementing-weight-decomposed-activity-7165053172175024128-bqwu?utm_source=share&utm_medium=member_desktop)
- [DoRA Link2](https://magazine.sebastianraschka.com/p/lora-and-dora-from-scratch)  
- [Proxy-Tuning (new method for fine-tuning LLMs)](https://www.linkedin.com/posts/sebastianraschka_theres-a-new-promising-method-for-finetuning-activity-7153788017017544705-ADC7?utm_source=share&utm_medium=member_desktop)
- [AutoQuantize (GGUF, AWQ, EXL2, GPTQ) Colab Notebook](https://colab.research.google.com/drive/1Li3USnl3yoYctqJLtYux3LAIy4Bnnv3J?usp=sharing) [Great]
- [DoRA: Weight-Decomposed Low-Rank Adaptation - Linkedin Post](https://www.linkedin.com/posts/sebastianraschka_while-everyone-is-talking-about-sora-theres-activity-7164268573756960770-N7Hu?utm_source=share&utm_medium=member_desktop)
- [DoRA: Weight-Decomposed Low-Rank Adaptation - Paper](https://arxiv.org/abs/2402.09353)
- [GaLore: Memory Efficient Fine-tuning Technique](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_galore-is-a-new-memory-efficient-fine-tuning-activity-7177599313294827521-kye2?utm_source=share&utm_medium=member_desktop)  
- [Quanto: a pytorch quantization toolkit](https://huggingface.co/blog/quanto-introduction) [**Great**]  
- [Quanto: Linkedin Post](https://www.linkedin.com/posts/dcorvoysier_quanto-a-pytorch-quantization-toolkit-activity-7175421050808078336-QcEM?utm_source=share&utm_medium=member_desktop)
- [Deleting 40% of LLM Layers Without Drop in Accuracy](https://www.linkedin.com/posts/liorsinclair_researchers-just-developed-a-new-method-to-activity-7180929255411789826-z3TV?utm_source=share&utm_medium=member_desktop)
- [The Unreasonable Ineffectiveness of the Deeper Layers](https://arxiv.org/html/2403.17887v1)
- [Continual Pretraining of LLMs](https://www.linkedin.com/posts/sebastianraschka_we-talk-a-lot-about-finetuning-llms-to-follow-activity-7174395744068464642-jPFI?utm_source=share&utm_medium=member_desktop)
- [NOLA: run 10,000 customized LLaMA2 (70B) (4bit) models on a single 48GB GPU](https://www.linkedin.com/posts/hpirsiav_iclr2024-iclr2024-activity-7192618595405725696-HZXu?utm_source=share&utm_medium=member_desktop)
- [NOLA LLaMA3](https://www.linkedin.com/posts/s-hasan-abbas_syed-hasan-8503llama-3-8b-nola-hugging-activity-7193318944575762434-MD_T?utm_source=share&utm_medium=member_desktop)
- [LoRA Learns Less and Forgets Less in comparision to full finetuning](https://www.linkedin.com/posts/sebastianraschka_lora-learns-less-and-forgets-less-when-i-activity-7197576220585201664-KA4L?utm_source=share&utm_medium=member_desktop)
- [Best Practices for Fine-Tuning & Training LLMs](https://www.linkedin.com/posts/aleksagordic_amazing-list-of-techniques-for-improving-activity-7215624025639645184-496W?utm_source=share&utm_medium=member_android)
- [TorchChat](https://www.linkedin.com/posts/pytorch_llms-mobilellms-localai-activity-7224090140011380737-RHdH?utm_source=share&utm_medium=member_desktop)
- [The Evolution of Extreme LLM Compression: From QuIP to AQLM with PV-Tuning](https://medium.com/yandex/the-evolution-of-extreme-llm-compression-from-quip-to-aqlm-with-pv-tuning-19c44b91af96)  
- [Calculating GPU memory for serving LLMs](https://www.substratus.ai/blog/calculating-gpu-memory-for-llm)
- [How Much GPU Memory is Needed to Serve a Large Language Model (LLM)?](https://masteringllm.medium.com/how-much-gpu-memory-is-needed-to-serve-a-large-languagemodel-llm-b1899bb2ab5d)
- [CUDA-Free Inference for LLMs (PyTorch Blog)](https://pytorch.org/blog/cuda-free-inference-for-llms/?utm_content=306418724&utm_medium=social&utm_source=linkedin&hss_channel=lcp-78618366)
- [The Ultra-Scale Playbook: Training LLMs on GPU Clusters](https://huggingface.co/spaces/nanotron/ultrascale-playbook)  

### Productionizing LLMs:
- [LLM From the Trenches: 10 Lessons Learned Operationalizing Models at GoDaddy](https://www.godaddy.com/resources/news/llm-from-the-trenches-10-lessons-learned-operationalizing-models-at-godaddy)  

### LLMs on Mobile Devices:
- [MLC LLM](https://github.com/mlc-ai/mlc-llm)  

### LLM Applications & APIs:
- [Building LLM applications for production](https://huyenchip.com/2023/04/11/llm-engineering.html)  
- [Bard API](https://github.com/dsdanielpark/Bard-API)
- [Amazon Bedrock: build and scale generative AI applications](https://aws.amazon.com/bedrock/) [**Great**]

### Natural Language to SQL:
- [text to SQL Github Repos](https://github.com/topics/text-to-sql)  
- [vanna](https://github.com/vanna-ai/vanna)  
- [sqlchat](https://github.com/sqlchat/sqlchat)  
- [dataherald](https://github.com/Dataherald/dataherald)  
- [WrenAI](https://github.com/Canner/WrenAI)
- [Practical text-to-SQL for data analytics by Linkedin](https://www.linkedin.com/blog/engineering/ai/practical-text-to-sql-for-data-analytics) [Great]
- [Persian abstract of above Practical text-to-SQL for data analytics by Linkedin - Out of Distribution Telegram Channel](https://t.me/out_of_distribution/1122)  
  
### Prompt Engineering:
- [Different Kinds of Prompt Engineering](https://www.linkedin.com/posts/munjal-patel_generativeai-largelanguagemodels-llm-activity-7051862874935197696-2E_J/?utm_source=share&utm_medium=member_android)  
- [Prompt Engineering Guide](https://www.promptingguide.ai/)
- [PromptTools: tools for prompt testing and experimentation](https://github.com/hegelai/prompttools)
- [Prompt engineering for Claude's long context window](https://www.anthropic.com/index/prompting-long-context)
- [Chain of Verification Prompt engineering method](https://www.linkedin.com/posts/xamatriain_a-week-ago-meta-presented-a-new-prompt-engineering-activity-7114351307183820800-MsgT?utm_source=share&utm_medium=member_desktop)
- [Analogical Prompting](https://huggingface.co/papers/2310.01714)
- [Prompt Flow: Build high-quality LLM apps](https://github.com/microsoft/promptflow)
- [Contrastive Chain-of-Thought Prompting (CCoT)](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_improve-chain-of-thought-prompting-by-adding-activity-7133477395944091648-TKlQ?utm_source=share&utm_medium=member_desktop)  
- [New Prompting Techniques](https://www.linkedin.com/posts/pramodith_promptengineering-llm-activity-7134507333530836992-evPU?utm_source=share&utm_medium=member_desktop)
- [Openai Prompt Engineering Guide - Linkedin Post](https://www.linkedin.com/posts/eric-vyacheslav-156273169_game-changer-open-ai-just-released-their-activity-7141454141683343360-eunF?utm_source=share&utm_medium=member_desktop)  
- [Openai Prompt Engineering Guide](https://platform.openai.com/docs/guides/prompt-engineering)
- [Anthropic Claude Metaprompt Tool](https://www.linkedin.com/posts/sahar-mor_anthropic-released-a-useful-tool-that-turns-activity-7194705248039444480-7KtG?utm_source=share&utm_medium=member_desktop)
- [Anthropic Prompt Improver](https://www.anthropic.com/news/prompt-improver)
- [Anthropic Prompt Improver Linkedin Post](https://www.linkedin.com/posts/anthropicresearch_weve-added-a-new-prompt-improver-to-the-activity-7262874194802036736-Q_RP?utm_source=share&utm_medium=member_desktop)
- [Anthropic Evaluate Prompts Tool](https://www.anthropic.com/news/evaluate-prompts)    
- [Cohere Prompt Tuner: Prompt Optimization at Your Fingertips](https://cohere.com/blog/intro-prompt-tuner?utm_source=bensbites&utm_medium=newsletter&utm_campaign=daily-digest-talk-with-your-ai-besties)
- [Quality Prompts: Use and evaluate prompting techniques quickly](https://github.com/sarthakrastogi/quality-prompts)
- [Prompt Design at Character.AI](https://research.character.ai/prompt-design-at-character-ai/)
- [Structured Prompting](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_structured-prompting-is-a-key-requirement-activity-7235928635633725440-0OG-?utm_source=share&utm_medium=member_desktop)
- [Writing with AI: Five ways professional writers are leveraging ChatGPT](https://openai.com/chatgpt/use-cases/writing-with-ai/)
- [Google Prompt Gallery](https://ai.google.dev/gemini-api/prompts)
- [ell: The Language Model Programming Library](https://docs.ell.so/)  
- [Template prompts of Cursor & VS Code and etc](https://github.com/x1xhlol/system-prompts-and-models-of-ai-tools) [useful]
- [System Prompts Leaks](https://github.com/asgeirtj/system_prompts_leaks/)   

### LLM-based Recommender Systems:
- [ChatGPT-based Recommender Systems](https://blog.reachsumit.com/posts/2023/05/chatgpt-for-recsys/)  

### LLMs for Tabular Data:
- [Unleashing the Potential of Large Language Models for Predictive Tabular Tasks in Data Science](https://arxiv.org/abs/2403.20208)
- [LLMs for Tabular Data - Linkedin post](https://www.linkedin.com/posts/pascalbiese_unleashing-the-potential-of-llms-for-tabular-activity-7180873134743449600-ChWm?utm_source=share&utm_medium=member_desktop) 

### LLMs as Classifiers (finetuning LLMs for classification):
- [LLMs as Classifiers Linkedin Post1](https://www.linkedin.com/posts/sebastianraschka_what-if-you-care-about-finetuning-llms-for-activity-7183808393155944448-CSR1?utm_source=share&utm_medium=member_desktop)
- [Training LLMs for Spam Classification](https://www.linkedin.com/posts/sebastianraschka_training-llms-for-spam-classification-i-activity-7197943692949676034-c6_j?utm_source=share&utm_medium=member_desktop)  

### LLM Data Sets:
- [SlimPajama: A 627B token cleaned and deduplicated version of RedPajama](https://www.cerebras.net/blog/slimpajama-a-627b-token-cleaned-and-deduplicated-version-of-redpajama)

### LLM based Agents:
- [MetaGPT: Multi-Agent Framework](https://github.com/geekan/MetaGPT)
- [DevOpsGPT: AI-Driven Software Development Automation Solution](https://github.com/kuafuai/DevOpsGPT)
- [LLM Agent Survey](https://github.com/Paitesanshi/LLM-Agent-Survey)
- [Microsoft AutoGen development of LLM applications using multiple agents](https://github.com/microsoft/autogen)
- [OpenDevin: autonomous AI software engineer](https://github.com/OpenDevin/OpenDevin)
- [Composio: the best toolset to integrate AI Agents](https://github.com/ComposioHQ/composio)
- [MindSearch: An LLM-based Multi-agent Framework of Web Search Engine](https://github.com/InternLM/MindSearch)
- [OpenAI Swarm Library for Multi-Agent](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_this-came-unexpected-openai-released-swarm-activity-7250841965519368192-oJ35?utm_source=share&utm_medium=member_desktop)
- [Don't Sleep on Single-agent Systems](https://www.all-hands.dev/blog/dont-sleep-on-single-agent-systems)
- [Linkedin post for Don't Sleep on Single-agent Systems](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_the-more-progress-we-make-on-llms-the-more-activity-7246758324912758784-VC3N?utm_source=share&utm_medium=member_desktop)
- [Microsoft TinyTroupe library for simulate human agents with LLMs](https://www.linkedin.com/posts/sahar-mor_a-new-open-source-python-library-called-tinytroupe-activity-7262849272381874176-KFk_?utm_source=share&utm_medium=member_desktop) [Interesting]
- [Google Whitepaper on AI Agents - Linkedin Post](https://www.linkedin.com/posts/eric-vyacheslav-156273169_whitepaper-ai-agents-ugcPost-7286059606814990338-JinO/?utm_source=share&utm_medium=member_desktop)  
- [Google Whitepaper on AI Agents](https://www.kaggle.com/whitepaper-agents)
- [Microsoft ai-agents-for-beginners Course](https://github.com/microsoft/ai-agents-for-beginners)  
- [HuggingFace Smolagent Library blog post](https://huggingface.co/blog/smolagents) [Useful]  

### Structured Output in LLMs:
- [PydanticAI](https://github.com/pydantic/pydantic-ai)
- [PydanticAI Linkedin Post](https://www.linkedin.com/posts/liorsinclair_theres-a-new-ai-agent-framework-that-lets-activity-7270122274408534017-OOQq?utm_source=share&utm_medium=member_desktop)  

### Deploying LLMs:
- [ExecuTorch Post1](https://www.linkedin.com/posts/pytorch_introducing-executorch-alpha-executorch-activity-7191120577749831680-vYzE?utm_source=share&utm_medium=member_desktop)

### LLM Engineering:  
- [Langfuse: Open Source LLM Engineering Platform](https://github.com/langfuse/langfuse)  

### External Tools that Useful for LLMs:
- [Microsoft MarkItDown: Python library that lets you convert any document to Markdown](https://www.linkedin.com/posts/liorsinclair_microsoft-just-open-sourced-markitdown-a-activity-7275201481828454403-c5TX?utm_source=share&utm_medium=member_desktop) [Great]  

### Notes about Cost & Price of Training and Using LLMs:
- [Cost to Deploy LLaMA2 vs. ChatGPT](https://www.linkedin.com/posts/damienbenveniste_machinelearning-datascience-artificialintelligence-activity-7109561666324885504-ySeC?utm_source=share&utm_medium=member_desktop) [Very Important]  
- [Anyscale Training Cost](https://www.linkedin.com/posts/robert-nishihara-b6465444_im-so-proud-of-what-we-launched-last-week-activity-7113021412084219904-WFbP?utm_source=share&utm_medium=member_desktop)
- [LLMs APIs Pricing Benchmark: pricing of AWS Bedrock, OpenAI, Microsoft Azure](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_yesterday-amazon-web-services-aws-released-activity-7113454144216031233-LYuF?utm_source=share&utm_medium=member_desktop)
- [LLM Token-based Price Sheet](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_claude-21-with-200k-context-just-got-released-activity-7132812689369657344-Rk_a?utm_source=share&utm_medium=member_desktop)
- [LLM Pricing Table Sheet](https://docs.google.com/spreadsheets/d/1NX8ZW9Jnfpy88PC2d6Bwla87JRiv3GTeqwXoB4mKU_s/edit#gid=0)
- [LLM Pricing Table Linkedin Post](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_updated-llm-pricing-table-earlier-today-activity-7170527176168042497-YgT4?utm_source=share&utm_medium=member_desktop)
- [Pricibg Sheet for Hosted LLMs](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_just-updated-my-pricing-sheet-for-hosted-activity-7213556290575368196-u71R?utm_source=share&utm_medium=member_desktop)
- [LLM Pricing Comparison Tool in HuggingFace Space](https://huggingface.co/spaces/philschmid/llm-pricing)  

### Excellent & Easy to Learn Resources for Learning Transformers:
- [e2eml transformers from scratch](https://e2eml.school/transformers.html) [**Excellent**]  
- [annotated-transformer: Learning transformers from code](http://nlp.seas.harvard.edu/annotated-transformer/#a-first-example)  
- [Transformers Recipe](https://github.com/dair-ai/Transformers-Recipe)  

### Persian based Transformer Models:
- [ALBERT-Persian](https://github.com/m3hrdadfi/albert-persian)  
- [ALBERT-Persian Demo Page](https://albert-lab.m3hrdadfi.me/)  
- [ALBERT-Farsi-base-v2 in HuggingFace](https://huggingface.co/m3hrdadfi/albert-fa-base-v2)  
- [ParsBERT - Model for Persian Language Understanding](https://github.com/hooshvare/parsbert)  
- [ARMAN](https://github.com/alirezasalemi7/ARMAN) [Great]   
- [ParsBigBird: Persian Bert For Long-Range Sequences](https://github.com/sajjjadayobi/ParsBigBird) [Great]    
- [PersianQA](https://github.com/sajjjadayobi/PersianQA)   
- [Persian (Farsi) Pre-trained Language Models](https://nlpdataset.ir/farsi/pre-trained_lm.html) [Great]
- [Hezar: The all-in-one AI library for Persian, supporting a wide variety of tasks and modalities](https://github.com/hezarai/hezar) [**Great & Important**]
- [XLM-RoBERTa (Multilingual & supports Persian)](https://huggingface.co/FacebookAI/xlm-roberta-base)  
- [TookaBERT by PartAI](https://huggingface.co/PartAI/TookaBERT-Large) [Great]
- [Dorna PartAI LLM](https://www.linkedin.com/posts/partdp-ai_aetaexaesabraeaaeqaepaeuahy-aevaewaecaetaedaeuaewaehahy-activity-7205158585968844800-sqqa/?utm_source=share&utm_medium=member_desktop)  

## Transfer Learning with Transformers:
- [Transfer Learning for NLP via BERT for Text Classification](https://www.analyticsvidhya.com/blog/2020/07/transfer-learning-for-nlp-fine-tuning-bert-for-text-classification/)  
- [Text Classification with BERT Tokenizer](https://stackabuse.com/text-classification-with-bert-tokenizer-and-tf-2-0-in-python/)   
- [Bert Text Classification](https://github.com/Shivampanwar/Bert-text-classification)  
- [Persian Semantic Search](https://github.com/m3hrdadfi/semantic-search)  
- [Toward fine-tuning a state of the art Natural Language Inference (NLI) model for Persian](https://haddadhesam.medium.com/toward-fine-tuning-a-state-of-the-art-natural-language-inference-nli-model-for-persian-4d538ea4525d)  

### Siamese Netowrks and Dual BERT for Multi Text Classification:  
- [Siamese and Dual BERT for Multi-text Classification](https://towardsdatascience.com/siamese-and-dual-bert-for-multi-text-classification-c6552d435533)    
- [Transfer Learning via Siamese Networks](https://www.inovex.de/blog/transfer-learning-siamese-networks/)  

## Attention Mechanism:
- [Attention Mechanism](https://blog.floydhub.com/attention-mechanism/)  
- [Visualizing A Neural Machine Translation Model - Attention Mechanism](https://jalammar.github.io/visualizing-neural-machine-translation-mechanics-of-seq2seq-models-with-attention/)  
- [Intuitive Understanding of Attention Mechanism in Deep Learning](https://towardsdatascience.com/intuitive-understanding-of-attention-mechanism-in-deep-learning-6c9482aecf4f)  
- [Structured Attention Networks](https://medium.com/uci-nlp/summary-structured-attention-networks-f1917dd622af)  

## Sequence Modeling:
- [WaveNet: Increasing reception field using dilated convolution](https://medium.com/@kion.kim/wavenet-a-network-good-to-know-7caaae735435)  
- [Understanding WaveNet architecture](https://medium.com/@satyam.kumar.iiitv/understanding-wavenet-architecture-361cc4c2d623)  
- [WaveNet: A Generative Model for Raw Audio](https://medium.com/a-paper-a-day-will-have-you-screaming-hurray/wavenet-a-generative-model-for-raw-audio-84b2aa5fb4a0)  
- [How WaveNet Works](https://towardsdatascience.com/how-wavenet-works-12e2420ef386)  
- [PyTorch Tutorial to Sequence Labeling](https://github.com/sgrvinod/a-PyTorch-Tutorial-to-Sequence-Labeling)  

## Text Summarization:
- [Bert Extractive Summarizer](https://pypi.org/project/bert-extractive-summarizer/) [**Great**]   
- [Generating Text Summaries Using GPT-2 on PyTorch with Minimal Training](https://blog.paperspace.com/generating-text-summaries-gpt-2/) [_Good_]    
- [A Gentle Introduction to Text Summarization in Machine Learning](https://blog.floydhub.com/gentle-introduction-to-text-summarization-in-machine-learning/)  
- [Taming Recurrent Neural Networks for Better Summarization](http://www.abigailsee.com/2017/04/16/taming-rnns-for-better-summarization.html)  
- [PyTorch implementation of "Get to the point"](https://github.com/mjc92/GetToThePoint)  
- [TensorFlow implementation of "Get to the point"](https://github.com/abisee/pointer-generator)  

## Language Model:
- [A Comprehensive Guide to Build your own Language Model in Python](https://www.analyticsvidhya.com/blog/2019/08/comprehensive-guide-language-model-nlp-python-code/)   
- [D2L: Language Models and Dataset](https://d2l.ai/chapter_recurrent-neural-networks/language-models-and-dataset.html)  
- [Develop a word-level Neural Language Model in Keras](https://machinelearningmastery.com/how-to-develop-a-word-level-neural-language-model-in-keras/)  
- [IBM deep learning language model](https://github.com/IBM/deep-learning-language-model)  
- [BERT language model](https://devopedia.org/bert-language-model)  
- [Facebook AI: GSLM](https://www.marktechpost.com/2021/09/09/facebook-ai-introduces-gslm-generative-spoken-language-model-a-textless-nlp-model-that-breaks-free-completely-of-the-dependence-on-text-for-training/)   
- [Language Modeling Great Tutorial](https://lena-voita.github.io/nlp_course/language_modeling.html)   
- [GALACTICA: general-purpose scientific language model](https://github.com/paperswithcode/galai) [Great]  
- [Distributed Training of Language Models with Reinforcement Learning via Human Feedback (RLHF)](https://github.com/CarperAI/trlx) [**Excellent**]  

## Text & Document Classification:
- [hedwig - PyTorch deep learning models for document classification](https://github.com/castorini/hedwig)  

## Topic Modeling:
- [Topic Modeling with BERT](https://towardsdatascience.com/topic-modeling-with-bert-779f7db187e6)
- [BERTopic: Great Library for Topic Modeling](https://github.com/MaartenGr/BERTopic) [Great]  

## Sentiment Analysis:
- [Introduction to Deep Learning – Sentiment Analysis](https://nlpforhackers.io/deep-learning-introduction/)  

## Co-Reference Resolution:
- [Coreference Resolution for Chatbots](https://medium.com/huggingface/state-of-the-art-neural-coreference-resolution-for-chatbots-3302365dcf30)  
- [Hugging Face - CoRef](https://huggingface.co/coref/)  

## Imbalance Handling in NLP:
- [Over-Sampling using SMOTE](https://imbalanced-learn.readthedocs.io/en/stable/generated/imblearn.over_sampling.SMOTE.html) [_SMOTE for high-dimensional class-imbalanced data_]  
- [Over-sampling via imbalanced-learn library](https://imbalanced-learn.readthedocs.io/en/stable/over_sampling.html)  
- [Imbalanced Data Handling](https://www.jeremyjordan.me/imbalanced-data/)  

## Information Retrieval:
- [PyTerrier: Python API for Terrier](https://github.com/terrier-org/pyterrier)  

## Distance Measures:
- [Edit Distance](https://www.geeksforgeeks.org/edit-distance-dp-5/)  

## Text-based Emotion Recognition:
- [XLM-EMO: Multilingual Emotion Prediction in Social Media Text](https://github.com/MilaNLProc/xlm-emo)

## Machine Translation:
- [Open-NLLB: No Language Left Behind (NLLB), models capable of delivering high-quality translations directly between any pair of 200+ languages](https://github.com/gordicaleksa/Open-NLLB)  

## Chatbot:
- [Rasa Chatbot](https://github.com/RasaHQ/rasa) [**Great**]      
- [Learn how to Build and Deploy a Chatbot in Minutes using Rasa](https://www.analyticsvidhya.com/blog/2019/04/learn-build-chatbot-rasa-nlp-ipl/)   
- [chatbot with DialoGPT](https://www.machinecurve.com/index.php/2021/03/16/easy-chatbot-with-dialogpt-machine-learning-and-huggingface-transformers/)   
- [DialoGPT: huggingface Transformer](https://huggingface.co/transformers/model_doc/dialogpt.html)   
- [deeppavlov](https://github.com/deeppavlov/DeepPavlov) [**Great**]  
- [PyTorch Chatbot Tutorial](https://brsoff.github.io/tutorials/beginner/chatbot_tutorial.html)   
- [Implement a Simple Chat Bot With PyTorch](https://www.python-engineer.com/posts/chatbot-pytorch/)   
- [GPT2 Chatbot PyTorch](https://github.com/devjwsong/gpt2-chatbot-pytorch)   
- [PyTorch Official Chatbot Tutorial](https://pytorch.org/tutorials/beginner/chatbot_tutorial.html)    
- [PaddlePaddle Knover: toolkit for knowledge grounded dialogue generation](https://github.com/PaddlePaddle/Knover)   
- [PaddlePaddle PLATO-2](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/examples/dialogue/plato-2)   
- [ParlAI](https://github.com/facebookresearch/ParlAI) [Great]     
- [huggingface: Transformers](https://github.com/huggingface/transformers) [Great]  
- [huggingface: Blenderbot](https://huggingface.co/transformers/model_doc/blenderbot.html) [**Great**]  
- [huggingface: Blenderbot Small](https://huggingface.co/transformers/model_doc/blenderbot_small.html) [**Great**]  
- [huggingface: GPT-2 Text Generation](https://huggingface.co/gpt2?text=A+long+time+ago%2C) [**Great**]            
- [Seq2seq Chatbot](https://github.com/ricsinaruto/Seq2seqChatbots)   
- [seq2seq Chatbot implemented in Pytorch](https://github.com/khordoo/chatbot-pytorch)   
- [papers with code: chatbot](https://paperswithcode.com/task/chatbot)   
- [Proudly Leading the Chatbot](https://www.analyticsinsight.net/ankush-sabharwal-proudly-leading-the-chatbot-sphere-with-strategical-innovations-and-implementations/)  
- [Real Python: Build a Chatbot with Python ChatterBot](https://realpython.com/build-a-chatbot-python-chatterbot/)  
- [A step-by-step guide to building a chatbot based on your own documents with GPT](https://bootcamp.uxdesign.cc/a-step-by-step-guide-to-building-a-chatbot-based-on-your-own-documents-with-gpt-2d550534eea5)
- [MiniPerplx: an alternative to Perplexity that lets search the web, research papers, youtube videos, movies](https://scira.app/)  
- [GitHub Models](https://github.blog/news-insights/product-news/introducing-github-models/)
- [Git Ingest: Quickly turn a GitHub repository into text for LLMs](https://www.linkedin.com/posts/eric-vyacheslav-156273169_you-can-now-quickly-turn-a-github-repository-activity-7277322180223254528-CRW9?utm_source=share&utm_medium=member_desktop) [**Great**]  
- [Create a Chatbot for any GitHub repo](https://www.linkedin.com/posts/eric-vyacheslav-156273169_game-changer-you-can-now-create-a-chatbot-activity-7226604741261230081-Bthf?utm_source=share&utm_medium=member_desktop) [**Great**]  

### Chatbot & LLMs Evaluation Metrics:
- [Chatbot Analytics: 9 Key Metrics](https://www.tidio.com/blog/chatbot-analytics/)  
- [Chatbot Statistics for 2023](https://www.tidio.com/blog/chatbot-statistics/)  
- [Chatbot Analytics 101: Essential Metrics to Track](https://blog.hootsuite.com/chatbot-analytics/)  
- [12 Metrics For Chatbot Analytics](https://www.kommunicate.io/blog/metrics-for-chatbot-analytics/)  
- [ParlAI Evaluation Metrics for Chatbot](https://github.com/facebookresearch/ParlAI/blob/14a10258bf90218341e0253d1c5a88c9d2cd013f/docs/source/tutorial_metrics.md)  
- [Chatbot Evaluation Metrics](https://github.com/ahkarami/Great-Deep-Learning-Tutorials/blob/master/NLP/Chatbot_Evaluation_Metrics.md) [**Great**]
- [Databricks' report on LLM evaluation methods](https://www.linkedin.com/posts/activity-7107825117379907584-m17h?utm_source=share&utm_medium=member_desktop)
- [AgentBench: Evaluating LLMs as Agents](https://github.com/THUDM/AgentBench)
- [Prometheus: Using GPT4 as SLMs Evaluator](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_using-powerful-llms-gpt-4-as-an-evaluator-activity-7131951255119110145-RH86?utm_source=share&utm_medium=member_desktop)
- [LLM Model Evaluation Metrics - When and How to Use Them](https://www.linkedin.com/posts/amrita-rath-288a071bb_llm-evaluation-metrics-activity-7198262398464503808-Gs6y?utm_source=share&utm_medium=member_desktop)  

### OpenAI ChatGPT & Its Applications:  
- [OpenAI ChatGPT](https://openai.com/blog/chatgpt/) [Amazing]  
- [Description of How OpenAI ChatGPT Works: Illustrating Reinforcement Learning from Human Feedback (RLHF)](https://github.com/huggingface/blog/blob/main/rlhf.md)  
- [How ChatGPT was Trained](https://www.linkedin.com/posts/damienbenveniste_machinelearning-datascience-chatgpt-activity-7007019154666909696-T5WM/?utm_source=share&utm_medium=member_android)  
- [ChatGPT Android SDK](https://github.com/skydoves/chatgpt-android/releases)  
- [ChatGPT awesome apps](https://www.linkedin.com/posts/tarrysingh_chatgpt-activity-7017947289721655296-7-pK/?utm_source=share&utm_medium=member_android)  
- [A Categorical Archive of ChatGPT Failures](https://arxiv.org/abs/2302.03494)  
- [Is ChatGPT a General-Purpose Natural Language Processing Task Solver?](https://arxiv.org/abs/2302.06476)  
- [aman.ai chatGPT Tutorial](https://aman.ai/primers/ai/chatGPT/) [Great]  
- [ChatGPT for customer service](https://www.intercom.com/ai-bot)  
- [Chatgpt Retrieval Plugin](https://github.com/openai/chatgpt-retrieval-plugin)  
- [Trending AI Tools](https://galionaitools.blogspot.com/2023/03/trending-ai-tools.html)  
- [Merlin: OpenAI ChatGPT Plus extension on all websites](https://merlin.foyer.work/)  
- [Adrenaline](https://useadrenaline.com/app)  
- [Using LLMs as agents that orchestrate tools](https://www.linkedin.com/posts/moritz-laurer_augmented-language-models-a-survey-activity-7047924951625953281-0XDj/?utm_source=share&utm_medium=member_android) [Interesting]  
- [ChatGPT API Using Python](https://www.machinelearning-basics.com/2023/04/chatgpt-api-using-python.html?m=1)  
- [parthean: A Startup about Financial Expert via ChatGPT](https://www.parthean.com/)  
- [Notes on the cost of ChatGPT](https://www.linkedin.com/posts/laurencevanelegem_sam-altman-ceo-of-openai-dropped-a-at-activity-7061987804548870144-RF9y/?utm_source=share&utm_medium=member_android)  
- [Ortus - your YouTube AI buddy](https://chrome.google.com/webstore/detail/ortus-your-youtube-ai-bud/jmpepfdhkjkknfpnfohnmnjoceepcbmp)
- [How Is ChatGPT’s Behavior Changing over Time?](https://www.linkedin.com/posts/svpino_gpt-4-is-getting-worse-over-time-not-better-activity-7087379892077481984-uORp?utm_source=share&utm_medium=member_android)
- [LLM Drifts: How Is ChatGPT’s Behavior Changing over Time?](https://github.com/lchen001/LLMDrift)
- [ChatGPT app Builder](https://www.linkedin.com/posts/zainkahn_absolute-madness-openai-ceo-sam-altman-activity-7128011745868050432-Ox5K?utm_source=share&utm_medium=member_desktop)
- [GPT4 Turbo 128k analysis Notes (its price)](https://www.linkedin.com/posts/reuvencohen_i-finally-got-a-chance-to-play-with-the-new-activity-7128179916512104448-SlEX?utm_source=share&utm_medium=member_desktop)
- [Designer GPT: website creator](https://www.linkedin.com/posts/eric-vyacheslav-156273169_this-is-crazy-designergpt-is-a-new-gpt-that-activity-7129833701873438720-lQuN?utm_source=share&utm_medium=member_desktop)
- [OpenAI DevDay Breakout Sessions Videos](https://www.linkedin.com/posts/openai_openai-devday-breakout-sessions-youtube-activity-7130298061599195137-vbyY?utm_source=share&utm_medium=member_desktop)
- [GPT Seed Parameter Notes](https://www.linkedin.com/posts/sahar-mor_openai-released-a-feature-that-mitigates-activity-7130940108974788608-vkDW?utm_source=share&utm_medium=member_desktop)
- [Awesome ChatGPT Prompts](https://github.com/f/awesome-chatgpt-prompts)
- [GPT-4o Full Data Analysis](https://www.linkedin.com/posts/eric-vyacheslav-156273169_gpt-4o-can-do-full-data-analysis-from-a-single-activity-7196162441116860416--yzu?utm_source=share&utm_medium=member_desktop)
- [GPT4-o Architecture](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_is-this-the-architecture-of-openai-gpt-4o-activity-7199759664836739073-gTEz?utm_source=share&utm_medium=member_desktop)
- [Introducing Structured Outputs in the OpenAI API](https://openai.com/index/introducing-structured-outputs-in-the-api/)
- [OpenAI Realtime-api](https://openai.com/index/introducing-the-realtime-api/)
- [OpenAI Model Distillation in the API](https://openai.com/index/api-model-distillation/)
- [OpenAI Prompt Caching](https://platform.openai.com/docs/guides/prompt-caching)  
- [LibreChat: Enhanced ChatGPT Clone](https://github.com/danny-avila/LibreChat) [**Great**]  

### OpenAI Learning to Reason & O1 Models:
- [Learning to Reason with LLMs: OpenAI o1 Model](https://openai.com/index/learning-to-reason-with-llms/)
- [How does OpenAI train the Strawberry (o1) model to spend more time thinking?](https://www.linkedin.com/posts/tom-yeh_openai-strawberry-aibyhand-activity-7240201012697833472-rrzD?utm_source=share&utm_medium=member_desktop)  
- [Learning to Reason before you speak is how OpenAI o1 generates its response](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_learning-to-reason-before-you-speak-is-how-activity-7240629908559785984--wMj?utm_source=share&utm_medium=member_desktop)
- [5 Papers that better understanding Openai o1 models](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_here-are-5-papers-you-want-to-read-to-understand-activity-7241017716214571008-eVba/?utm_source=share&utm_medium=member_android)  

## Google Bard & Gemini:  
- [Google DeepMind Gemini](https://www.linkedin.com/posts/googledeepmind_introducing-gemini-googles-largest-and-activity-7138182085441118208--M-h?utm_source=share&utm_medium=member_desktop)
- [Google released Gemini](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_google-just-released-gemini-their-most-activity-7138191392861757440-djDD?utm_source=share&utm_medium=member_desktop)
- [Google Gemini official released notes](https://blog.google/technology/ai/google-gemini-ai/?utm_source=linkedin&utm_medium=social&utm_campaign=GDMGemini)  

## Anthropic Claude:  
- [Anthropic Claude Tool Use](https://www.linkedin.com/posts/anthropicresearch_tool-use-is-now-available-in-beta-to-all-activity-7201976267171086336-oQ4K?utm_source=share&utm_medium=member_desktop)
- [Anthropic Prompt Generator](https://www.linkedin.com/posts/liorsinclair_anthropic-mightve-just-solved-prompt-engineering-activity-7196911121939795968-yray?utm_source=share&utm_medium=member_desktop)
- [Switched to Claude 3.5](https://www.interconnects.ai/p/switched-to-claude-from-chatgpt)
- [Anthropic Message Batches API](https://www.anthropic.com/news/message-batches-api)
- [Anthropic Message Batches API - Linkdin Post](https://www.linkedin.com/posts/anthropicresearch_introducing-the-message-batches-api-activity-7249461524996440066-xS37?utm_source=share&utm_medium=member_desktop)
- [OpenAI Prompt Caching in GPT 4o and o1: How Does It Compare To Claude Prompt Caching?](https://blog.getbind.co/2024/10/03/openai-prompt-caching-how-does-it-compare-to-claude-prompt-caching/)
- [Anthropic Blog: Transformer Circuits Thread](https://transformer-circuits.pub/)
- [Anthropic MCP (Model Context Protocol)](https://modelcontextprotocol.io/quickstart)  

## How do LLMs think?
- [On the Biology of a Large Language Model](https://transformer-circuits.pub/2025/attribution-graphs/biology.html)  

## NLP Programming Notes:
- [100 Times Faster Natural Language Processing in Python](https://medium.com/huggingface/100-times-faster-natural-language-processing-in-python-ee32033bdced)  
- [Multi-label Text Classification using BERT](https://medium.com/huggingface/multi-label-text-classification-using-bert-the-mighty-transformer-69714fa3fb3d)  
- [Learning Meaning in Natural Language Processing](https://medium.com/huggingface/learning-meaning-in-natural-language-processing-the-semantics-mega-thread-9c0332dfe28e)  
- [Train and Deploy the Mighty Transformer NLP models using FastBert and AWS SageMaker](https://medium.com/@kaushaltrivedi/train-and-deploy-mighty-transformer-nlp-models-using-fastbert-and-aws-sagemaker-cc4303c51cf3)  
- [Distilling knowledge from Neural Networks to build smaller and faster models](https://blog.floydhub.com/knowledge-distillation/)  
- [HarfBuzz - a text shaping library](https://github.com/harfbuzz/harfbuzz) [_Useful_]  
- [PruneBERT - Hugging Face](https://github.com/huggingface/transformers/tree/master/examples/movement-pruning)  
- [spacy-streamlit: spaCy building blocks for Streamlit apps](https://github.com/explosion/spacy-streamlit)  
- [HuggingFace Evaluate Library](https://github.com/huggingface/evaluate)  
- [NeMo - toolkit for Conversational AI](https://github.com/NVIDIA/NeMo) [_Excellent_]  

## Data Annotation Tools:
- [doccano is an open source text annotation tool](https://github.com/doccano/doccano) [**Great**]  
- [doccano-divar](https://doccano.divar.ir/)  

## Dataset Creator Tools:
- [Nvidia create dataset from massive pdf files tool](https://www.linkedin.com/posts/liorsinclair_nvidia-just-released-a-powerful-pdf-extraction-ugcPost-7267580522359336962-GAQv?utm_source=share&utm_medium=member_android)  

## NLP Courses:
- [HuggingFace Course](https://github.com/huggingface/course)  
- [NLP Zero to One: Full Course](https://medium.com/nerd-for-tech/nlp-zero-to-one-full-course-4f8e1902c379)  
- [Stanford CS25: Transformers United](https://web.stanford.edu/class/cs25/)  

## Other NLP Topics & miscellaneous:
- [HybridNLP - Tutorial on Hybrid Techniques for Knowledge-based NLP](https://github.com/hybridnlp/tutorial)  
- [Top 10 GPT-3 Tools Easing Content Creation Work in 2022](https://www.analyticsinsight.net/top-10-gpt-3-tools-easing-content-creation-work-in-2022/) [Interesting]  
- [Inflection-2.5 CahtBot](https://inflection.ai/inflection-2-5)
- [Research Paper Report Generating Agent](https://github.com/run-llama/llamacloud-demo/blob/main/examples/report_generation/research_paper_report_generation.ipynb)
- [Fast Semantic Text Deduplication](https://www.linkedin.com/posts/patrick-fleith_2-lines-of-code-to-deduplicate-a-dataset-activity-7289903818069164032-p2aa?utm_source=share&utm_medium=member_android)