# Great Deep Learning Tutorials for Natural Language Processing (NLP)
A Great Collection of Deep Learning Tutorials and Repositories for Natural Language Processing (NLP)

## General:
- [Great NLP Posts](http://jalammar.github.io/)
- [Awesome NLP Paper Discussions - Hugging Face](https://github.com/huggingface/awesome-papers) [_Excellent_]
- [Ten trends in Deep learning NLP](https://blog.floydhub.com/ten-trends-in-deep-learning-nlp/)
- [Attention in RNNs](https://medium.com/datadriveninvestor/attention-in-rnns-321fbcd64f05)
- [Understanding self-attention and other types of attention mechanisms](https://www.linkedin.com/posts/sebastianraschka_understanding-and-coding-self-attention-activity-7152300807080546304-uu21?utm_source=share&utm_medium=member_desktop)
- [BERT - TensorFlow](https://github.com/google-research/bert)
- [Understanding XLNet](https://www.borealisai.com/en/blog/understanding-xlnet/)
- [XLNet - TensorFlow](https://github.com/zihangdai/xlnet)
- [XLM (PyTorch implementation of Cross-lingual Language Model Pretraining)](https://github.com/facebookresearch/XLM)
- [Pretrained PyTorch models for BERT](https://github.com/huggingface/pytorch-pretrained-BERT)
- [Library of state-of-the-art pretrained models for NLP](https://github.com/huggingface/pytorch-transformers#quick-tour) [_Excellent_]
- [DistilBERT](https://medium.com/huggingface/distilbert-8cf3380435b5)
- [FastBert](https://arxiv.org/abs/2311.10770)
- [FastBert Linkedin Post](https://www.linkedin.com/posts/activity-7132888497119485952-GMsV?utm_source=share&utm_medium=member_desktop)
- [PyTorch Hub - BERT](https://pytorch.org/hub/huggingface_pytorch-pretrained-bert_bert/)
- [A Simple Guide On Using BERT for Binary Text Classification](https://medium.com/swlh/a-simple-guide-on-using-bert-for-text-classification-bbf041ac8d04)
- [Core ML 3 implementation of BERT for Question answering](https://github.com/huggingface/swift-coreml-transformers)
- [NLP - Keras - Intro](https://nlpforhackers.io/keras-intro/)
- [AllenNLP](https://allennlp.org/) [_General NLP_]
- [Stanza - A Python NLP Library for Many Human Languages](https://stanfordnlp.github.io/stanza/)
- [The Best NLP Papers From ICLR 2020](https://www.topbots.com/best-nlp-papers-from-iclr-2020)
- [Deep learning for natural language processing and information retrieval at the University of Waterloo](https://github.com/castorini)
- [Natural Language Processing With spaCy in Python](https://realpython.com/natural-language-processing-spacy-python/) [_Great_] (see the sketch after this list)
- [NLP Papers](https://github.com/AliAkbarBadri/nlp-papers)
- [A Great NLP Course](https://lena-voita.github.io/nlp_course.html)
- [KerasNLP: Modular NLP Workflows for Keras](https://github.com/keras-team/keras-nlp)
- [NLP Test: Deliver Safe & Effective Models](https://github.com/JohnSnowLabs/nlptest)
- [Karpathy minbpe](https://github.com/karpathy/minbpe)
- [Karpathy's 2-Hour Tutorial on Building the GPT Tokenizer](https://www.linkedin.com/posts/liorsinclair_andrej-karpathy-just-uploaded-a-new-2-hour-activity-7165765602492571650-io92?utm_source=share&utm_medium=member_desktop)
- [Learning Core Foundational Concepts in NLP by Examples and by Calculation by Hand](https://www.linkedin.com/posts/alphasignal_can-foundational-concepts-like-transformers-activity-7163890641054232576-B1ai?utm_source=share&utm_medium=member_android)
- [SetFit: Efficient Few-shot Learning with Sentence Transformers](https://github.com/huggingface/setfit)
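A minimal sketch of a classic NLP pipeline with spaCy, in the spirit of the Real Python tutorial above (assumes `pip install spacy` and `python -m spacy download en_core_web_sm`; the sample sentence is just an example):

```python
# Minimal spaCy sketch: tokenization, POS tags, lemmas, and named entities.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Hugging Face released a new transformer model in New York.")

for token in doc[:5]:
    print(token.text, token.pos_, token.lemma_)  # token-level annotations

for ent in doc.ents:
    print(ent.text, ent.label_)  # e.g. "New York" -> GPE
```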

## General Persian-based Libraries & Datasets:
- [Parsivar: library for Persian text preprocessing](https://github.com/ICTRC/Parsivar)
- [Hazm](https://github.com/sobhe/hazm) (see the preprocessing sketch after this list)
- [persianNLP](https://github.com/persiannlp)
- [ParsiNLU: Comprehensive suite of high-level NLP tasks for the Persian language](https://github.com/persiannlp/parsinlu)
- [FarsTail: A Persian Natural Language Inference Dataset](https://github.com/dml-qom/FarsTail)
- [wordfreq: Access a database of word frequencies](https://github.com/rspeer/wordfreq)
- [Persian Stop Words List](https://github.com/kharazi/persian-stopwords)
- [Persian Stop Words List in Hazm Repo](https://github.com/sobhe/hazm/blob/master/hazm/data/stopwords.dat)
- [PCoQA: Persian Conversational Question Answering Dataset](https://github.com/HamedHematian/PCoQA)
- [Khayyam Challenge (PersianMMLU): Is Your LLM Truly Wise to The Persian Language?](https://arxiv.org/html/2404.06644v1) [Good paper & dataset]
- [Basalam Dataset via RadeAI Team](https://www.linkedin.com/posts/rade-ai_datascience-machinelearning-basalam-activity-7193561781280157696-NF8T?utm_source=share&utm_medium=member_desktop)
- [Basalam Datasets for LLM Fine-tuning](https://www.linkedin.com/posts/mohammadreza-esmaeilian-572ba9193_%D8%A7%D9%86%D8%AA%D8%B4%D8%A7%D8%B1-%D8%AF%DB%8C%D8%AA%D8%A7%D8%B3%D8%AA%D9%87%D8%A7-%D9%88-llm%D9%87%D8%A7%DB%8C-%D9%81%D8%A7%DB%8C%D9%86%D8%AA%DB%8C%D9%88%D9%86-%D8%B4%D8%AF%D9%87-%D8%A7%D8%AE%D8%AA%D8%B5%D8%A7%D8%B5%DB%8C-activity-7204220860142989314-VDUO?utm_source=share&utm_medium=member_desktop)
- [ParsBench](https://www.linkedin.com/posts/shahriarshm_llm-dataset-syntheticabrdataset-activity-7278063501909098496-KR0O?utm_source=share&utm_medium=member_desktop)
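A minimal Persian preprocessing sketch with Hazm, as referenced above (assumes `pip install hazm`; the sample sentence and printed forms are illustrative):

```python
# Minimal Hazm sketch: normalize, tokenize, and lemmatize Persian text.
from hazm import Normalizer, Lemmatizer, word_tokenize

normalizer = Normalizer()   # unifies characters and fixes spacing/half-spaces
lemmatizer = Lemmatizer()

text = "کتاب‌های زیادی در کتابخانه بود"
normalized = normalizer.normalize(text)
tokens = word_tokenize(normalized)

print(tokens)
print([lemmatizer.lemmatize(t) for t in tokens])  # verbs come back as "past#present" stems
```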

## Text Representation:
- [Beyond Word Embeddings Part 1](https://towardsdatascience.com/beyond-word-embeddings-part-1-an-overview-of-neural-nlp-milestones-82b97a47977f)
- [Beyond Word Embeddings Part 2](https://towardsdatascience.com/beyond-word-embeddings-part-2-word-vectors-nlp-modeling-from-bow-to-bert-4ebd4711d0ec)
- [Learning Word Embedding](https://lilianweng.github.io/lil-log/2017/10/15/learning-word-embedding.html)
- [Introduction to Word Embedding and Word2Vec](https://towardsdatascience.com/introduction-to-word-embedding-and-word2vec-652d0c2060fa)
- [Word Embedding](https://medium.com/data-science-group-iitr/word-embedding-2d05d270b285)
- [Understanding Word Embeddings](https://hackernoon.com/understanding-word-embeddings-a9ff830403ce)
- [Introduction to Word Vectors](https://medium.com/@jayeshbahire/introduction-to-word-vectors-ea1d4e4b84bf)
- [Word2vec Made Easy](https://towardsdatascience.com/word2vec-made-easy-139a31a4b8ae)
- [What is GloVe? Part I](https://towardsdatascience.com/emnlp-what-is-glove-part-i-3b6ce6a7f970)
- [What is GloVe? Part II](https://towardsdatascience.com/emnlp-what-is-glove-part-ii-9e5ad227ee0)
- [What is GloVe? Part III](https://towardsdatascience.com/emnlp-what-is-glove-part-iii-c6090bed114)
- [What is GloVe? Part IV](https://towardsdatascience.com/emnlp-what-is-glove-part-iv-e605a4c407c8)
- [What is GloVe? Part V](https://towardsdatascience.com/emnlp-what-is-glove-part-v-fa888272c290)
- [ELMo: Deep Contextualized Word Representation](https://allennlp.org/elmo)
- [A Step-by-Step NLP Guide to Learn ELMo](https://www.analyticsvidhya.com/blog/2019/03/learn-to-use-elmo-to-extract-features-from-text/)
- [ELMo: Contextual language embedding](https://towardsdatascience.com/elmo-contextual-language-embedding-335de2268604)
- [Word embeddings with ELMo](https://medium.com/saarthi-ai/elmo-for-contextual-word-embedding-for-text-classification-24c9693b0045)
- [Doc2Vec - Gensim](https://radimrehurek.com/gensim/models/doc2vec.html)
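A minimal gensim sketch of the static-embedding idea these posts explain: train a toy Word2Vec model and inspect its vector space (assumes `pip install gensim`; a real model needs far more text than this tiny corpus):

```python
# Minimal Word2Vec training sketch with gensim (gensim 4.x API).
from gensim.models import Word2Vec

sentences = [
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "queen", "rules", "the", "kingdom"],
    ["a", "word", "is", "known", "by", "the", "company", "it", "keeps"],
]
model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, sg=1, epochs=100)

print(model.wv["king"].shape)                  # (50,) dense vector per word
print(model.wv.most_similar("king", topn=2))   # nearest neighbors in embedding space
```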

## Self-Supervised Learning in NLP:
- [Self-Supervised Learning in NLP (amitness.com)](https://amitness.com/2020/05/self-supervised-learning-nlp/)
- [COSINE: Fine-Tuning Pre-trained Language Model with Weak Supervision](https://github.com/yueyu1030/COSINE)

## RNN, LSTM, and GRU:
- [Understanding LSTM Networks](https://colah.github.io/posts/2015-08-Understanding-LSTMs/)
- [Illustrated Guide to LSTM's and GRU's](https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21)
- [Animated RNN, LSTM and GRU](https://towardsdatascience.com/animated-rnn-lstm-and-gru-ef124d06cf45)
- [Recurrent Neural Networks and LSTM explained](https://medium.com/@purnasaigudikandula/recurrent-neural-networks-and-lstm-explained-7f51c7f6bbb9)
- [Long Short-Term Memory (LSTM): Concept](https://medium.com/@kangeugine/long-short-term-memory-lstm-concept-cb3283934359)
- [Understanding architecture of LSTM cell from scratch](https://hackernoon.com/understanding-architecture-of-lstm-cell-from-scratch-with-code-8da40f0b71f4)
- [Basic understanding of LSTM](https://blog.goodaudience.com/basic-understanding-of-lstm-539f3b013f1e)
- [Taming LSTMs with PyTorch](https://towardsdatascience.com/taming-lstms-variable-sized-mini-batches-and-why-pytorch-is-good-for-your-health-61d35642972e)
- [Introduction to LSTM](https://www.analyticsvidhya.com/blog/2017/12/fundamentals-of-deep-learning-introduction-to-lstm/?utm_medium=ELMoNLParticle&utm_source=blog)
- [Introduction to RNNs](https://www.jeremyjordan.me/introduction-to-recurrent-neural-networks/)
- [xLSTM - Post1](https://www.linkedin.com/posts/liorsinclair_is-this-the-end-of-transformers-the-team-activity-7194350205318701056-8yBr?utm_source=share&utm_medium=member_desktop)
- [Were RNNs All We Needed?](https://arxiv.org/abs/2410.01201) [Interesting Paper]
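To make the moving parts of the LSTM tutorials above concrete, here is a minimal PyTorch sketch of an LSTM text classifier (all shapes and hyperparameters are illustrative):

```python
# Minimal LSTM classifier skeleton: embed token ids, run an LSTM,
# classify from the final hidden state.
import torch
import torch.nn as nn

class LSTMClassifier(nn.Module):
    def __init__(self, vocab_size=1000, embed_dim=64, hidden_dim=128, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_classes)

    def forward(self, token_ids):               # (batch, seq_len)
        embedded = self.embedding(token_ids)    # (batch, seq_len, embed_dim)
        _, (h_n, _) = self.lstm(embedded)       # h_n: (1, batch, hidden_dim)
        return self.fc(h_n[-1])                 # logits: (batch, num_classes)

model = LSTMClassifier()
logits = model(torch.randint(0, 1000, (4, 12)))  # batch of 4 sequences of length 12
print(logits.shape)                              # torch.Size([4, 2])
```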

## Transformers:
- [How Transformers Work](https://towardsdatascience.com/transformers-141e32e69591)
- [The Illustrated Transformer](http://jalammar.github.io/illustrated-transformer/)
- [Transformers from Scratch](https://e2eml.school/transformers.html)
- [What is a Transformer?](https://medium.com/inside-machine-learning/what-is-a-transformer-d07dd1fbec04)
- [How Transformers work in deep learning and NLP](https://theaisummer.com/transformer/)
- [Transformer: A Novel Neural Network Architecture for Language Understanding](https://ai.googleblog.com/2017/08/transformer-novel-neural-network.html)
- [How do Transformers Work in NLP?](https://www.analyticsvidhya.com/blog/2019/06/understanding-transformers-nlp-state-of-the-art-models/)
- [The Essence of Transformers](https://towardsdatascience.com/the-essence-of-transformers-9fb8e14cc465) [Good]
- [Transformers and Multi-Head Attention](https://uvadlc-notebooks.readthedocs.io/en/latest/tutorial_notebooks/tutorial6/Transformers_and_MHAttention.html)
- [Multi-Head Attention](https://d2l.ai/chapter_attention-mechanisms-and-transformers/multihead-attention.html)
- [BERT for Dummies](https://towardsdatascience.com/bert-for-dummies-step-by-step-tutorial-fb90890ffe03)
- [The Dark Secrets of BERT](https://text-machine-lab.github.io/blog/2020/bert-secrets/)
- [A Survey of Long-Term Context in Transformers](https://www.pragmatic.ml/a-survey-of-methods-for-incorporating-long-term-context/) [_Great_]
- [The Transformer Family](https://lilianweng.github.io/lil-log/2020/04/07/the-transformer-family.html)
- [The Transformer Isn't As Hard To Understand As You Might Think](https://towardsdatascience.com/knocking-on-transformers-door-attention-mechanism-explained-intuitively-df5d4fcecdf8)
- [Review of Compact Transformer Architectures](https://medium.com/@jfd2139/review-of-compact-transformer-architectures-c477b797e2d5) [**Great**]
- [REFORMER: The Efficient Transformer](https://arxiv.org/pdf/2001.04451.pdf)
- [GPT-3: Language Models are Few-Shot Learners](https://github.com/openai/gpt-3)
- [GPT-3 Sandbox](https://github.com/shreyashankar/gpt3-sandbox)
- [Microsoft will launch GPT-4](https://medium.com/@yablonassaf/microsoft-will-launch-gpt-4-with-ai-videos-on-wednesday-75d882e0260e)
- [OpenAI GPT-4](https://openai.com/research/gpt-4)
- [Some information about GPT-4](https://www.linkedin.com/posts/damienbenveniste_machinelearning-datascience-artificialintelligence-activity-7041793426530390016-5P-n/?utm_source=share&utm_medium=member_android)
- [Regular Expressions (Regex) Generated by GPT-3](https://losslesshq.com/)
- [Auto Regex: Converting English description to Regex](https://www.autoregex.xyz/) [Good]
- [minGPT](https://github.com/karpathy/minGPT)
- [NVIDIA FasterTransformer: Transformer related optimization, including BERT & GPT](https://github.com/NVIDIA/FasterTransformer)
- [OpenNMT CTranslate2: Fast inference engine for Transformer models](https://github.com/OpenNMT/CTranslate2/)
- [Deploying GPT-J and T5 with FasterTransformer and Triton Inference Server](https://developer.nvidia.com/blog/deploying-gpt-j-and-t5-with-fastertransformer-and-triton-inference-server/?ncid=so-link-499508#cid=dl05_so-link_en-us) [Interesting]
- [MEND: Fast Model Editing at Scale](https://github.com/eric-mitchell/mend) [**Excellent Work**]
- [BorealisAI Transformers I: Introduction](https://www.borealisai.com/research-blogs/tutorial-14-transformers-i-introduction/)
- [OpenAI Best Practices for Deploying Language Models](https://openai.com/blog/best-practices-for-deploying-language-models/)
- [OPT-IML](https://github.com/facebookresearch/metaseq/tree/main/projects/OPT-IML)
- [RetNet: an Alternative to Transformers](https://www.linkedin.com/posts/aleksagordic_an-alternative-to-transformers-whoa-activity-7087790555190980608-66ZM?utm_source=share&utm_medium=member_android)
- [What comes after Transformers?](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_what-comes-after-transformers-neural-memory-activity-7402992391957270528-mj34?utm_source=share&utm_medium=member_desktop&rcm=ACoAAAgksdYBFu3_vG0bwXWdh93rSqV1J1ghMP4)
- [Transformer Taxonomy](https://kipp.ly/blog/transformer-taxonomy/) [Great]
- [Generative AI exists because of the transformer: Great Visual Explanation](https://ig.ft.com/generative-ai/) [Great]
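The tutorials above all center on scaled dot-product attention; a minimal single-head sketch in PyTorch (no masking or batching, weight matrices are random stand-ins):

```python
# Minimal scaled dot-product self-attention: project to Q/K/V, score, softmax, mix.
import math
import torch

def self_attention(x, w_q, w_k, w_v):
    q, k, v = x @ w_q, x @ w_k, x @ w_v                  # project tokens to Q, K, V
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    weights = torch.softmax(scores, dim=-1)              # attention distribution per token
    return weights @ v                                   # weighted sum of value vectors

d = 16
x = torch.randn(8, d)                                    # 8 tokens, d-dim embeddings
w_q, w_k, w_v = (torch.randn(d, d) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)            # torch.Size([8, 16])
```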

### Reinforcement Learning from Human Feedback (RLHF):
- [RLHF Tutorial](https://vinija.ai/concepts/RLHF/)
- [New method instead of RLHF - Direct Preference Optimization: Your Language Model is Secretly a Reward Model](https://www.linkedin.com/posts/yoelzeldes_to-get-llms-as-good-as-openais-gpt-4-is-activity-7078958558519656451-N6Wo/?utm_source=share&utm_medium=member_android)
- [Finetuning an LLM: RLHF and alternatives (Part I)](https://argilla.io/blog/mantisnlp-rlhf-part-1/)
- [Finetuning an LLM: RLHF and alternatives (Part II)](https://argilla.io/blog/mantisnlp-rlhf-part-2/)
- [Finetuning an LLM: RLHF and alternatives (Part III)](https://argilla.io/blog/mantisnlp-rlhf-part-3/)
- [How good is AI feedback?](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_how-good-is-ai-feedback-and-does-it-really-activity-7171174171413102592-eVs9?utm_source=share&utm_medium=member_desktop)
- [Direct Preference Optimization (DPO) for LLM Alignment (From Scratch)](https://github.com/rasbt/LLMs-from-scratch/blob/main/ch07/04_preference-tuning-with-dpo/dpo-from-scratch.ipynb) (a minimal loss sketch follows this list)
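A minimal sketch of the DPO objective referenced above: preference learning against a frozen reference model, with no reward model or PPO loop (the log-probabilities here are per-sequence sums with shape `(batch,)`, and the sample numbers are made up):

```python
# Minimal DPO loss: -log sigmoid(beta * (chosen log-ratio - rejected log-ratio)).
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    chosen_ratio = policy_chosen_logps - ref_chosen_logps        # implicit reward, chosen
    rejected_ratio = policy_rejected_logps - ref_rejected_logps  # implicit reward, rejected
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()

loss = dpo_loss(torch.tensor([-12.3]), torch.tensor([-15.8]),
                torch.tensor([-13.0]), torch.tensor([-14.9]))
print(loss)  # scalar; its gradient pushes the policy toward the preferred answers
```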
[๐—ป๐—ฒ๐˜„ ๐—ฝ๐—ฎ๐—ฝ๐—ฒ๐—ฟ ๐—ฏ๐˜† ๐— ๐—ฒ๐˜๐—ฎ ๐—ฐ๐—น๐—ฎ๐—ถ๐—บ๐˜€ ๐˜๐—ต๐—ฎ๐˜ ๐˜„๐—ฒ ๐—ฐ๐—ฎ๐—ป ๐—ด๐—ฒ๐˜ ๐—ฟ๐—ถ๐—ฑ ๐—ผ๐—ณ ๐˜๐—ผ๐—ธ๐—ฒ๐—ป๐—ถ๐˜‡๐—ฒ๐—ฟ๐˜€: Byte Latent Transformer: Patches Scale Better Than Tokens --> we could get rid of tokenizers](https://www.linkedin.com/posts/a-roucher_%F0%9D%97%A3%F0%9D%97%BC%F0%9D%98%81%F0%9D%97%B2%F0%9D%97%BB%F0%9D%98%81%F0%9D%97%B6%F0%9D%97%AE%F0%9D%97%B9-%F0%9D%97%BD%F0%9D%97%AE%F0%9D%97%BF%F0%9D%97%AE%F0%9D%97%B1%F0%9D%97%B6%F0%9D%97%B4%F0%9D%97%BA-%F0%9D%98%80%F0%9D%97%B5%F0%9D%97%B6%F0%9D%97%B3%F0%9D%98%81-activity-7273382398891810816-QfQo?utm_source=share&utm_medium=member_desktop) - [Byte Latent Transformer: Patches Scale Better Than Tokens (paper)](https://dl.fbaipublicfiles.com/blt/BLT__Patches_Scale_Better_Than_Tokens.pdf) ### Large Language Models (LLMs): - [LLM Reading Papers](https://www.linkedin.com/posts/eric-vyacheslav-156273169_new-must-read-the-anti-hype-llm-reading-activity-7247244292568625152-DQsb?utm_source=share&utm_medium=member_desktop) - [LLaMA](https://github.com/facebookresearch/llama) - [Toolformer: Language Models Can Teach Themselves to Use Tools](https://arxiv.org/abs/2302.04761) [Great] - [Toolformer GitHub](https://github.com/lucidrains/toolformer-pytorch) - [Amazon Multimodal Chain-of-Thought Reasoning in Language Models](https://github.com/amazon-science/mm-cot) - [LLaMA-based ChatGPT Training](https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/chatllama) [Great] - [The Wisdom of Hindsight Makes Language Models Better Instruction Followers](https://arxiv.org/abs/2302.05206) - [Stanford Alpaca: An Instruction-following LLaMA model](https://github.com/tatsu-lab/stanford_alpaca) - [Alpaca: A Strong, Replicable Instruction-Following Model](https://crfm.stanford.edu/2023/03/13/alpaca.html) - [Fine-Tune Alpaca in Arabic](https://www.linkedin.com/posts/yassine-boukhari-006748217_alpaca-a-strong-replicable-instruction-following-activity-7043223149710036992-YUJb?utm_source=share&utm_medium=member_android) - [TRL: Transformer Reinforcement Learning](https://github.com/lvwerra/trl) - [Large Language Model (LLM) Primers Tutorial](https://www.linkedin.com/posts/amanc_artificialintelligence-machinelearning-ai-activity-7045245910850695168-Fp9K/?utm_source=share&utm_medium=member_android) [Great] - [Dolly](https://www.databricks.com/blog/2023/03/24/hello-dolly-democratizing-magic-chatgpt-open-models.html) - [Microsoft JARVIS & HuggingGPT](https://github.com/microsoft/JARVIS) [Interesting] - [open-source LLMs](https://www.linkedin.com/posts/sahar-mor_artificialintelligence-machinelearning-activity-7049789761728770049-QLsv/?utm_source=share&utm_medium=member_android) - [GPT4Free](https://github.com/xtekky/gpt4free) 

### Large Language Models (LLMs):
- [LLM Reading Papers](https://www.linkedin.com/posts/eric-vyacheslav-156273169_new-must-read-the-anti-hype-llm-reading-activity-7247244292568625152-DQsb?utm_source=share&utm_medium=member_desktop)
- [LLaMA](https://github.com/facebookresearch/llama)
- [Toolformer: Language Models Can Teach Themselves to Use Tools](https://arxiv.org/abs/2302.04761) [Great]
- [Toolformer GitHub](https://github.com/lucidrains/toolformer-pytorch)
- [Amazon Multimodal Chain-of-Thought Reasoning in Language Models](https://github.com/amazon-science/mm-cot)
- [LLaMA-based ChatGPT Training](https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/chatllama) [Great]
- [The Wisdom of Hindsight Makes Language Models Better Instruction Followers](https://arxiv.org/abs/2302.05206)
- [Stanford Alpaca: An Instruction-following LLaMA model](https://github.com/tatsu-lab/stanford_alpaca)
- [Alpaca: A Strong, Replicable Instruction-Following Model](https://crfm.stanford.edu/2023/03/13/alpaca.html)
- [Fine-Tune Alpaca in Arabic](https://www.linkedin.com/posts/yassine-boukhari-006748217_alpaca-a-strong-replicable-instruction-following-activity-7043223149710036992-YUJb?utm_source=share&utm_medium=member_android)
- [TRL: Transformer Reinforcement Learning](https://github.com/lvwerra/trl)
- [Large Language Model (LLM) Primers Tutorial](https://www.linkedin.com/posts/amanc_artificialintelligence-machinelearning-ai-activity-7045245910850695168-Fp9K/?utm_source=share&utm_medium=member_android) [Great]
- [Dolly](https://www.databricks.com/blog/2023/03/24/hello-dolly-democratizing-magic-chatgpt-open-models.html)
- [Microsoft JARVIS & HuggingGPT](https://github.com/microsoft/JARVIS) [Interesting]
- [Open-source LLMs](https://www.linkedin.com/posts/sahar-mor_artificialintelligence-machinelearning-activity-7049789761728770049-QLsv/?utm_source=share&utm_medium=member_android)
- [GPT4Free](https://github.com/xtekky/gpt4free)
- [HuggingChat](https://huggingface.co/chat/)
- [LaMini-LM: A Diverse Herd of Distilled Models](https://github.com/mbzuai-nlp/LaMini-LM/)
- [RedPajama-Data: An Open Source Recipe to Reproduce LLaMA training dataset](https://github.com/togethercomputer/RedPajama-Data)
- [BigCode](https://huggingface.co/bigcode)
- [OpenLLaMA](https://github.com/openlm-research/open_llama)
- [Dromedary: towards helpful, ethical and reliable LLMs](https://github.com/IBM/Dromedary)
- [MPT-7B Model with Commercial Licence](https://huggingface.co/mosaicml/mpt-7b/blob/main/README.md)
- [MPT-7B Story Writer](https://huggingface.co/mosaicml/mpt-7b-storywriter)
- [MPT-7B](https://github.com/mosaicml/llm-foundry)
- [MPT-7B Blog](https://www.mosaicml.com/blog/mpt-7b)
- [Open LLMs](https://github.com/eugeneyan/open-llms) [Great]
- [Google PaLM 2](https://ai.google/discover/palm2)
- [BLOOMChat](https://github.com/sambanova/bloomchat)
- [LLMs Practical Guide](https://github.com/Mooler0410/LLMsPracticalGuide)
- [FrugalGPT](https://www.linkedin.com/posts/sanyambhutani_saving-98-llm-usage-costs-stanford-activity-7062420577357037568-t0a8/?utm_source=share&utm_medium=member_android)
- [ChatALL](https://github.com/sunner/ChatALL) [Great]
- [Falcon LLM](https://falconllm.tii.ae/)
- [The Falcon has landed in the Hugging Face ecosystem](https://huggingface.co/blog/falcon) [Great]
- [OpenLLMs: Less is More for Open-source Models](https://github.com/imoneoi/openchat) [Great]
- [LLaMA2](https://www.llama2.ai/)
- [Source code of llama2-chatbot](https://github.com/a16z-infra/llama2-chatbot/tree/main)
- [Notes about OpenAI's GPT-4 Model](https://www.linkedin.com/posts/aleksagordic_openais-gpt-4-details-have-apparently-been-activity-7085226267712614400-T1d3/?utm_source=share&utm_medium=member_android)
- [GPT-4 is getting worse over time](https://www.linkedin.com/posts/svpino_gpt-4-is-getting-worse-over-time-not-better-activity-7087379892077481984-uORp/?utm_source=share&utm_medium=member_android)
- [OpenChat: Less is More for Open-source Models](https://huggingface.co/openchat/openchat)
- [Instruction Tuning Datasets](https://github.com/raunak-agarwal/instruction-datasets)
- [ToolLLM](https://www.linkedin.com/posts/omarsar_enabling-llms-with-tool-use-capabilities-activity-7093299751571320832-1WHU/?utm_source=share&utm_medium=member_android)
- [Falcon 180B](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_falcon-180b-released-tii-just-released-activity-7105166508376367105-P7ws?utm_source=share&utm_medium=member_desktop)
- [Fine-tune Falcon 180B using QLoRA and Flash Attention on Amazon SageMaker](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_fine-tune-falcon-180b-with-qlora-and-flash-activity-7107387875515580416-zhSe?utm_source=share&utm_medium=member_desktop)
- [Large Language Models as Optimizers](https://arxiv.org/abs/2309.03409)
- [Favourite LLM Authors](https://www.linkedin.com/posts/sanyambhutani_curated-list-of-my-favourite-llm-authors-activity-7105896422226423808-Unev?utm_source=share&utm_medium=member_desktop)
- [Open Source LLMs for Commercial Use](https://www.linkedin.com/posts/armand-ruiz_top-open-source-llms-available-for-commercial-activity-7137772625468002304-jkMM?utm_source=share&utm_medium=member_desktop)
- [Optimizing your LLM in production](https://huggingface.co/blog/optimize-llm) [Important]
- [In-Context Vectors (ICV): an alternative to few-shot learning and fine-tuning techniques like LoRA to improve an LLM's performance](https://www.linkedin.com/posts/pramodith_in-context-vectors-icv-is-an-alternative-activity-7131970618467471360-67Z3?utm_source=share&utm_medium=member_desktop)
- [NexusRaven-V2 13B Function Calling LLM Surpassing GPT-4](https://www.linkedin.com/posts/nexusflow-ai_nexusravenv2-opensource-genai-activity-7137805301323362304-U2Pl?utm_source=share&utm_medium=member_desktop)
- [Phixtral model](https://www.linkedin.com/posts/maxime-labonne_phixtral-i-made-the-first-efficient-mixture-activity-7150758415961620481-v0qx?utm_source=share&utm_medium=member_desktop)
- [Eagle-7B LLM: 100% attention-free RNN Model!](https://www.linkedin.com/posts/maxime-labonne_rwkv-released-eagle-7b-its-an-llm-that-activity-7157700712330661888-cdd1?utm_source=share&utm_medium=member_desktop)
- [Eagle-7B LLM: Blog Post](https://blog.rwkv.com/p/eagle-7b-soaring-past-transformers)
- [Can LLMs improve themselves? Self-play fine-tuning (SPIN)](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_can-llms-improve-themselves-self-play-fine-tuning-activity-7150501901665542144-mk4K?utm_source=share&utm_medium=member_desktop)
- [AI2 OLMo Model: Linkedin Post](https://www.linkedin.com/posts/natolambert_allenaiolmo-7b-hugging-face-activity-7158834284689035264-vfu7?utm_source=share&utm_medium=member_desktop)
- [AI2 OLMo Model: HuggingFace](https://huggingface.co/allenai/OLMo-7B)
- [AI2 OLMo Model: Original Blog post](https://www.interconnects.ai/p/olmo)
- [Some Notes about OLMo Model](https://www.linkedin.com/posts/sebastianraschka_ive-been-working-with-the-1b7b-olmo-models-activity-7166067492778360832-kc3T?utm_source=share&utm_medium=member_desktop)
- [Mixtral in colab](https://github.com/dvmazur/mixtral-offloading/blob/master/notebooks/demo.ipynb) [Great]
- [Grok-1 LLM with 314B Size: Post1](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_elon-musk-kept-his-word-and-released-grok-activity-7175221121472983040-F7zS?utm_source=share&utm_medium=member_desktop)
- [Grok-1 LLM: Post2](https://www.linkedin.com/posts/liorsinclair_big-news-grok-is-finally-open-source-with-activity-7175496738948968448--Ewx?utm_source=share&utm_medium=member_desktop)
- [Grok-3 LLM from xAI](https://x.com/lmarena_ai/status/1891706264800936307)
- [Grok-3 LLM from xAI - karpathy](https://x.com/karpathy/status/1891720635363254772)
- [DBRX LLM](https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm)
- [DBRX LLM: Post1](https://www.linkedin.com/posts/mateizaharia_at-databricks-weve-built-an-awesome-model-activity-7178738621099769857-v4X8?utm_source=share&utm_medium=member_desktop)
- [DBRX LLM: Post2](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_new-state-of-the-art-open-llm-databricks-activity-7178748050117451776-Otgg?utm_source=share&utm_medium=member_desktop)
- [LLMs via Multi-Token Prediction](https://www.linkedin.com/posts/aiatmeta_new-research-from-fair-better-faster-large-activity-7194022959609438208-TH1u?utm_source=share&utm_medium=member_android)
- [Test Time Computing for Open LLMs](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_how-we-implemented-test-time-computing-for-activity-7274685354895458304-elNI?utm_source=share&utm_medium=member_desktop)
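Most of the open models listed above can be driven through the same Hugging Face `transformers` interface; a minimal generation sketch (the model id is just an example small enough for a laptop or Colab, and sampling settings are illustrative):

```python
# Minimal causal-LM inference sketch with Hugging Face transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Instruction tuning is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```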

### Merge LLMs:
- [Linkedin Post](https://www.linkedin.com/posts/maxime-labonne_merge-large-language-models-with-mergekit-activity-7150044812337901569-3zIu?utm_source=share&utm_medium=member_android)
- [Colab Notebook](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb)
- [Main Github of Mergekit](https://github.com/cg123/mergekit)
- [Huggingface merge-models blog post](https://huggingface.co/blog/mlabonne/merge-models)
- [Making the NeuralBeagle14-7B LLM Model (via merging models and other methods)](https://www.linkedin.com/posts/maxime-labonne_heres-how-i-made-the-new-best-performing-activity-7153302680780640256-1Sv7?utm_source=share&utm_medium=member_desktop)
- [Merge Large Language Models with mergekit](https://towardsdatascience.com/merge-large-language-models-with-mergekit-2118fb392b54)
- [Fine-tune a Mistral-7b model with Direct Preference Optimization](https://towardsdatascience.com/fine-tune-a-mistral-7b-model-with-direct-preference-optimization-708042745aac)
- [AutoMerger](https://www.linkedin.com/posts/maxime-labonne_automerger-how-i-automated-the-model-merging-activity-7172890188430454786-Djs7?utm_source=share&utm_medium=member_desktop)
- [Evolutionary LLM Merging - Post1](https://www.linkedin.com/posts/maxime-labonne_evolutionary-model-merge-sakana-ai-released-activity-7176527260097597440-52JT?utm_source=share&utm_medium=member_desktop)
- [Evolutionary LLM Merging - Post2](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_the-evolution-of-llms-model-merging-is-activity-7176561819933671424-NNNX?utm_source=share&utm_medium=member_desktop)
- [Mixture of Experts (MoEs) Explained](https://huggingface.co/blog/moe) [Great]
- [Mixture of Experts (MoEs) Papers List](https://huggingface.co/collections/osanseviero/moes-papers-reading-list-65a83f8a9aec16459920ffe0)
- [Mixture of Experts (MoEs) Linkedin Post](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_mixture-of-experts-explained-activity-7179478562398187520-dbzM?utm_source=share&utm_medium=member_desktop)
- [Mixture-of-Depths - Post1](https://www.linkedin.com/posts/zaiinulabideen_crazy-ai-week-mixture-of-depths-qwen15-activity-7182746449921658880-aLVO?utm_source=share&utm_medium=member_desktop)
- [Mixture-of-Depths (MoD) - Post2](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_can-we-train-llms-to-allocate-flops-compute-activity-7182303286429917184-jkOm?utm_source=share&utm_medium=member_desktop)
- [AutoLoRA-Merging Linkedin Post](https://www.linkedin.com/posts/zaiinulabideen_autolora-merging-ties-dare-magnitudeprune-activity-7166081059166662658-OzxA?utm_source=share&utm_medium=member_desktop)
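A minimal sketch of the core idea behind model merging: interpolate the weights of two fine-tunes that share an architecture (mergekit, linked above, implements far more sophisticated methods such as SLERP, TIES, and DARE; the tiny tensors here are only a demonstration):

```python
# Minimal linear weight merge of two compatible state dicts.
import torch

def linear_merge(state_dict_a, state_dict_b, t=0.5):
    """Elementwise weighted average of two state dicts with identical keys/shapes."""
    return {name: (1 - t) * state_dict_a[name] + t * state_dict_b[name]
            for name in state_dict_a}

# toy demonstration with two tiny "models"
a = {"w": torch.ones(2, 2), "b": torch.zeros(2)}
b = {"w": torch.full((2, 2), 3.0), "b": torch.ones(2)}
print(linear_merge(a, b))  # "w" -> all 2.0, "b" -> all 0.5
```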

### LLaMA2 Related Links:
- [A Colab Gradio web UI for running Large Language Models](https://github.com/camenduru/text-generation-webui-colab) [Great]
- [llama-2-7b-chat-GPTQ-4bit](https://colab.research.google.com/github/camenduru/text-generation-webui-colab/blob/main/llama-2-7b-chat-GPTQ-4bit.ipynb)
- [camenduru](https://github.com/camenduru)
- [llama-2 philschmid](https://www.philschmid.de/llama-2)
- [Fine-tuning LLMs with TRL](https://www.linkedin.com/posts/lvwerra_it-crazy-how-far-the-ml-field-has-come-when-activity-7087699813009383425-Sr1y/?utm_source=share&utm_medium=member_android)
- [LoRA tuning with PEFT: fine-tuning the LLaMA2 model](https://huggingface.co/docs/trl/main/en/lora_tuning_peft#finetuning-llama2-model)
- [LLaMA2 with PEFT](https://www.linkedin.com/posts/gante_unleash-the-true-llama-2-potential-from-day-activity-7087363261666328577-38jV/?utm_source=share&utm_medium=member_android)
- [Baby LLaMA2 in C](https://github.com/karpathy/llama2.c)
- [Releasing LLongMA-2 16k](https://www.linkedin.com/posts/enrico-shippole-495521b8_conceptofmindllongma-2-13b-16k-hugging-activity-7090718505183928320-DYtD/?utm_source=share&utm_medium=member_android)
- [LLaMA2 API in Hugging Face Inference](https://www.linkedin.com/feed/update/urn:li:activity:7089986843839979521/?utm_source=share&utm_medium=member_android)
- [LLaMA2 API in Monster API](https://monsterapi.ai/llama-2-7b-chat-api)
- [LLaMA2-Accessory](https://github.com/Alpha-VLLM/LLaMA2-Accessory)
- [Hermes-LLongMA-2 8k](https://www.linkedin.com/posts/enrico-shippole-495521b8_conceptofmindhermes-llongma-2-13b-8k-hugging-activity-7092178977217282049-JZB8/?utm_source=share&utm_medium=member_android)
- [Training Llama 2](https://www.linkedin.com/posts/bhavsarpratik_llama2-finetuning-genai-activity-7092496767870509056-RojZ/?utm_source=share&utm_medium=member_android)
- [Llama-2-7B-32K-Instruct - and fine-tuning for Llama-2 models with Together API](https://together.ai/blog/llama-2-7b-32k-instruct)
- [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory)
- [LLaMA-Factory Notes](https://www.linkedin.com/posts/rorcde_llama-factory-ai-library-of-the-day-llama-activity-7138958059506143234-t5p2?utm_source=share&utm_medium=member_desktop)
- [Purple Llama by Meta - Link1](https://github.com/facebookresearch/PurpleLlama)
- [Purple Llama by Meta - Link2](https://www.linkedin.com/posts/aiatmeta_announcing-purple-llama-towards-open-trust-activity-7138536031858937857-edXE?utm_source=share&utm_medium=member_desktop)
- [Purple Llama by Meta - Link3](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_purple-llama-just-got-released-by-meta-activity-7138538944115200001-WKAR?utm_source=share&utm_medium=member_desktop)
- [TinyLLaMa-1.1B](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0)
- [Can LLaMA learn a new language?](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_is-it-possible-to-teach-llms-a-different-activity-7148653756165812226--l7o?utm_source=share&utm_medium=member_desktop)
- [Persian LLaMA](https://huggingface.co/spaces/mostafaamiri/persianllama)
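Several of the fine-tuning posts above rely on parameter-efficient fine-tuning; a minimal LoRA setup sketch with the PEFT library (the model id and `target_modules` assume a Llama-style attention layout, and all hyperparameters are illustrative):

```python
# Minimal LoRA configuration with PEFT: wrap a causal LM so that only
# small low-rank adapter matrices are trainable.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
lora_config = LoraConfig(
    r=8,                                   # rank of the low-rank update
    lora_alpha=16,                         # scaling factor for the update
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()         # only a tiny fraction is trainable
```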

### LLaMA3 Related Links:
- [LLaMA3 Linkedin Post1](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_welcome-llama-3-metas-new-open-llm-activity-7186762894989012992-SBLe?utm_source=share&utm_medium=member_desktop)
- [Meta LLaMA3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B)
- [Fine-tune LLaMA3](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_efficiently-fine-tune-llama-3-with-pytorch-activity-7188186109363859456-sYSR?utm_source=share&utm_medium=member_desktop)
- [LLaMA3 Long Context](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_llama-3-extended-to-almost-100000-token-activity-7189518531300904963-9Y9V?utm_source=share&utm_medium=member_desktop)
- [LLaMA3.1](https://ollama.com/library/llama3.1)
- [LLaMA 3.1 Some Notes](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_llama-405b-is-here-and-it-comes-with-more-activity-7221533382025822208-K-Zm?utm_source=share&utm_medium=member_desktop)
- [LLaMA 3.1 Model Fine-tuning](https://www.linkedin.com/posts/danielhanchen_google-colab-activity-7221621362417700867-y935/?utm_source=share&utm_medium=member_android)
- [LLaMA 3.1 Detailed Notes](https://www.linkedin.com/posts/sebastianraschka_yesterdays-llama-31-release-marked-a-big-activity-7221861717876645888-wz3H?utm_source=share&utm_medium=member_android)
- [LLaMA 3.2 Detailed Notes](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_llama-can-now-see-and-run-on-your-phone-activity-7244763879690354688-Iaan?utm_source=share&utm_medium=member_android)
- [Mobile LLaMA 3.2](https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices/)
- [Llama-3.3-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct)
- [How an online gifting site is using Llama to help protect customer privacy](https://ai.meta.com/blog/untukmu-built-with-llama/) [Interesting]

### DeepSeek Models Related Links:
- [DeepSeek-V3 Linkedin Post](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_yesterday-the-best-open-model-to-date-was-activity-7278313766679658498-6BCl?utm_source=share&utm_medium=member_desktop)
- [Train your own R1 reasoning model with Unsloth (GRPO)](https://unsloth.ai/blog/r1-reasoning)
- [Huggingface DeepSeek R1 - Linkedin Post](https://www.linkedin.com/posts/qgallouedec_last-moments-of-closed-source-ai-hugging-activity-7288908822079852544-CDgF?utm_source=share&utm_medium=member_android)

### Phi-3 Related Links:
- [Phi-3 Linkedin Post1](https://www.linkedin.com/posts/sebastianraschka_microsoft-just-casually-shared-theirnew-activity-7188544168380510208-AdDG?utm_source=share&utm_medium=member_desktop)
- [Phi-3 Linkedin Post2](https://www.linkedin.com/posts/julienchaumond_in-case-you-missed-it-earlier-this-week-activity-7189273186256003072-91B0?utm_source=share&utm_medium=member_desktop)

### Mistral & Mixtral Models Related Links:
- [Mistral AI models](https://github.com/mistralai/mistral-src)
- [Is Mistral's first model a good replacement for OpenAI?](https://blog.quivr.app/is-mistral-a-good-replacement-for-openai/)
- [Mistral Mixture of Experts (MoE) Model](https://www.linkedin.com/posts/liorsinclair_big-news-mistral-just-released-an-open-source-activity-7139323993253228544-5coS?utm_source=share&utm_medium=member_desktop)
- [Mixtral - a SOTA Mixture of Experts](https://huggingface.co/blog/mixtral)
- [MistralTrix model](https://www.linkedin.com/posts/allen-roush-27721011b_cultrixmistraltrix-v1-hugging-face-activity-7149086757945298944-T7IA?utm_source=share&utm_medium=member_desktop)
- [Nous-Hermes-Mixtral model](https://www.linkedin.com/posts/maxime-labonne_nousresearch-just-released-nous-hermes-activity-7152787405815566337-4aTY?utm_source=share&utm_medium=member_desktop)
- [Mixtral in colab](https://github.com/dvmazur/mixtral-offloading/blob/master/notebooks/demo.ipynb) [Great]
- [Brev.dev Notebooks: fine-tuning Mistral, Mixtral, Phi-2, etc.](https://github.com/brevdev/notebooks/tree/main) [**Excellent**]
- [Optimized LLM inference API for Mistral-7B using vLLM and AWQ](https://lightning.ai/lightning-ai/studios/optimized-llm-inference-api-for-mistral-7b-using-vllm?view=public&section=blogs) [**Excellent**] (see the sketch after this list)
- [Run Mistral7b Quantized for free on any computer (CPU or GPU)](https://medium.com/artificial-corner/run-mistral7b-quantized-for-free-on-any-computer-2cadc18b45a2) [Interesting]
- [Mixtral 8x22B, a 176B MoE Model](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_new-open-model-from-mistral-ai-yesterday-activity-7183816273053523971-Vgse?utm_source=share&utm_medium=member_desktop)
- [Mistral-7B-Instruct-v0.3](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_mistralaimistral-7b-instruct-v03-hugging-activity-7199103875348320256-lJ_A?utm_source=share&utm_medium=member_android)
- [Codestral: A model fluent in 80+ programming languages](https://mistral.ai/news/codestral/)
- [Mistral Finetune: the official repo and guide on how to fine-tune Mistral open-source models](https://github.com/mistralai/mistral-finetune)
- [Mistral Large 2 Model](https://www.linkedin.com/posts/mistralai_large-enough-activity-7221915921622126593-JjHd?utm_source=share&utm_medium=member_desktop)
- [Mistral Small 3](https://mistral.ai/news/mistral-small-3/)
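A minimal vLLM inference sketch, matching the "optimized inference for Mistral-7B with vLLM" entry above (assumes `pip install vllm`, a CUDA GPU with enough VRAM, and access to the model weights; prompt and sampling settings are illustrative):

```python
# Minimal vLLM sketch: load a model and run batched generation.
from vllm import LLM, SamplingParams

llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.3")
params = SamplingParams(temperature=0.7, max_tokens=64)

outputs = llm.generate(["Explain mixture-of-experts in one sentence."], params)
print(outputs[0].outputs[0].text)
```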

### Yi Models:
- [Yi Github](https://github.com/01-ai/Yi)
- [Yi Website](https://01.ai/)
- [Yi-VL-6B HuggingFace](https://huggingface.co/01-ai/Yi-VL-6B)

### Qwen Models:
- [Introducing Qwen1.5 Blog Post](https://qwenlm.github.io/blog/qwen1.5/)
- [Qwen1.5 Linkedin Post](https://www.linkedin.com/posts/andrew-iain-jardine_llm-opensource-llms-activity-7160905982523445248-_t5B?utm_source=share&utm_medium=member_desktop)
- [Qwen1.5 HuggingFace](https://huggingface.co/collections/Qwen/qwen15-65c0a2f577b1ecb76d786524)
- [Qwen2 HuggingFace](https://huggingface.co/docs/transformers/en/model_doc/qwen2)
- [Qwen MoE Model](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_new-moe-alert-qwen15-moe-a27b-just-activity-7179144882668630016-i-l5?utm_source=share&utm_medium=member_android)
- [Qwen2](https://github.com/QwenLM/Qwen2)
- [Qwen 2.5 - Linkedin Post](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_9-new-multilingual-open-llms-released-qwen-activity-7242423229724676097-_9Ea?utm_source=share&utm_medium=member_desktop)
- [Qwen 2.5 - Models](https://huggingface.co/collections/Qwen/qwen25-66e81a666513e518adb90d9e)

### Gemma LLM Related Links (by Google):
- [Gemma, an open Gemini LLM released by Google! - Linkedin Post](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_welcome-gemma-googles-new-open-llm-activity-7166054332914741249-FY2D?utm_source=share&utm_medium=member_desktop)
- [Gemma - another Linkedin post](https://www.linkedin.com/posts/andrew-iain-jardine_opensource-llm-llms-activity-7166054662612226048-h0Ap?utm_source=share&utm_medium=member_desktop)
- [Google's Gemma Detailed Notes](https://www.linkedin.com/posts/sebastianraschka_googles-gemma-has-been-the-topic-of-the-activity-7167160406480805888-PSeR?utm_source=share&utm_medium=member_desktop)
- [Gemma usage via TRL](https://www.linkedin.com/posts/younes-belkada-b1a903145_new-release-from-google-gemma-a-state-of-the-art-activity-7166065899978870784-50To?utm_source=share&utm_medium=member_desktop)
- [Gemma usage in Hugging Face via OpenAI SDK](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_yesterday-google-released-gemma-an-open-activity-7166484882917961730-uuFB?utm_source=share&utm_medium=member_desktop)
- [Does Gemma overfit the Open LLM Leaderboard?](https://www.linkedin.com/posts/maxime-labonne_does-gemma-overfit-the-open-llm-leaderboard-activity-7166220798427402242-lJFm?utm_source=share&utm_medium=member_desktop)
- [Zephyr 7B Gemma](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_zehpyr-7b-gemma-releasedwe-are-excited-activity-7169373526641070080-rTLD?utm_source=share&utm_medium=member_desktop)
- [Gemma 2](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_gemma-2-releasedgoogle-just-released-the-activity-7212108484920651776-BR8s?utm_source=share&utm_medium=member_desktop)
- [Gemma2 Detailed Notes](https://www.linkedin.com/posts/sebastianraschka_whats-new-and-noteworthy-in-googles-newly-activity-7213528822384611329-sKv0?utm_source=share&utm_medium=member_desktop)
- [Gemma 2-2b](https://huggingface.co/google/gemma-2-2b)

### Jamba (SSM-Transformer Model):
- [AI21 Labs Jamba Model](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_jamba-released-ai21-labs-just-released-the-activity-7179121093482315776-xbmX?utm_source=share&utm_medium=member_desktop)
- [Fine-tune Jamba with TRL](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_yesterday-ai21-labs-released-jamba-the-first-activity-7179395299679858688-xiP9?utm_source=share&utm_medium=member_desktop)
- [Fine-tune Jamba code](https://www.linkedin.com/posts/maxime-labonne_jambatypus-v01-i-fine-tuned-a-jamba-activity-7181277758876962816-Z4zt?utm_source=share&utm_medium=member_desktop)

### 1-bit LLMs:
- [1-bit LLMs (AlphaSignal Post)](https://www.linkedin.com/posts/liorsinclair_new-breakthrough-from-microsoft-1-bit-llms-activity-7168680301064384512-UeNv?utm_source=share&utm_medium=member_desktop)
- [1-bit Quantization](https://www.linkedin.com/posts/a-roucher_%3F-%3F%3F%3F-%3F%3F%3F%3F%3F%3F%3F%3F%3F%3F%3F%3F-%3F%3F%3F%3F%3F%3F%3F%3F%3F%3F%3F%3F-activity-7168987208228540416-uhcm?utm_source=share&utm_medium=member_desktop)
- [Some Notes about 1-bit LLMs (their benefits and drawbacks)](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_the-era-of-1-bit-llms-what-does-that-mean-activity-7171533076668362753-Nl-F?utm_source=share&utm_medium=member_desktop)
- [AutoBitnet (train your 1.58-bit LLM based on the LLaMA architecture for free on a Colab T4 GPU)](https://www.linkedin.com/posts/zaiinulabideen_autobitnet-train-your-158-bit-llm-based-activity-7182019658135326720-_qRp?utm_source=share&utm_medium=member_desktop)
- [Llama2 7B in 1-bit precision](https://www.linkedin.com/posts/maxime-labonne_1-bit-quantization-activity-7179068277548032000-I8gR?utm_source=share&utm_medium=member_desktop)
- [Microsoft 1-Bit LLM](https://github.com/microsoft/BitNet)
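A minimal sketch of the "1.58-bit" idea discussed above: BitNet b1.58-style absmean quantization squeezes weights to the ternary set {-1, 0, +1} with a per-tensor scale (illustrative only; real BitNet models are trained under this constraint rather than converted after the fact):

```python
# Minimal ternary (1.58-bit) weight quantization sketch.
import torch

def absmean_ternary_quantize(w, eps=1e-5):
    scale = w.abs().mean().clamp(min=eps)   # per-tensor absmean scale
    q = (w / scale).round().clamp(-1, 1)    # ternary weights in {-1, 0, 1}
    return q, scale                         # reconstruction: w ~ q * scale

w = torch.randn(4, 4)
q, scale = absmean_ternary_quantize(w)
print(q)                                    # entries in {-1, 0, 1}
print((q * scale - w).abs().mean())         # mean quantization error
```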

### Long Context Window LLMs (e.g., 100K Tokens LLMs):
- [Claude LLM](https://www.linkedin.com/posts/itamar-g1_anthropic-openais-biggest-rivalry-just-activity-7063773334831775744-cQ4L/?utm_source=share&utm_medium=member_android)
- [Some Notes about the 100K Claude LLM Model](https://www.linkedin.com/posts/sahar-mor_claude-a-gpt-competitor-from-anthropic-activity-7062811160168841216-z4u9/?utm_source=share&utm_medium=member_android)
- [Anthropic's Claude-2](https://www.anthropic.com/index/claude-2)
- [Claude-2, Anthropic's ChatGPT competitor](https://www.linkedin.com/posts/ugcPost-7084607703137857537-K9Ln?utm_source=share&utm_medium=member_desktop)
- [Some Information about Claude 3](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_claude-3-is-here-anthropic-just-released-activity-7170424839529295872-Qp_S?utm_source=share&utm_medium=member_desktop)
- [LongNet: Scaling Transformers to 1B Tokens](https://arxiv.org/abs/2307.02486)
- [Lost in the Middle: How Language Models Use Long Contexts](https://arxiv.org/abs/2307.03172)
- [Notes about How Language Models Use Long Contexts](https://www.linkedin.com/posts/sebastianraschka_llm-ai-machinelearning-activity-7083427280605089792-MS_N/?utm_source=share&utm_medium=member_android)
- [Scaling LLaMA and GPTNeoX to >8k input context](https://www.linkedin.com/posts/gante_scaling-llama-and-gptneox-to-8k-input-context-activity-7085545793050320896-8OKi/?utm_source=share&utm_medium=member_android)
- [Unofficial Claude-API](https://github.com/KoushikNavuluri/Claude-API)
- [Claude Unofficial API](https://github.com/Explosion-Scratch/claude-unofficial-api)
- [YARN & LongLlaMa](https://www.linkedin.com/posts/pramodith_generativeai-llm-gpt-activity-7104772654313656321-QC5D?utm_source=share&utm_medium=member_desktop)
- [YaRN: Efficient Context Window Extension of LLMs](https://github.com/jquesnelle/yarn) (see the RoPE-scaling sketch after this list)
- [LLMs get lost when the context becomes too long - Lost in the Middle: How Language Models Use Long Contexts](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_are-vector-databases-here-to-stay-yes-activity-7085908435686285312-QVfB?utm_source=share&utm_medium=member_desktop) [**Very Important**]
- [LongLoRA: Efficient Fine-tuning of Long-Context LLMs](https://www.linkedin.com/posts/omarsar_longlora-efficient-fine-tuning-of-long-context-activity-7111000280615325699-SVEE?utm_source=share&utm_medium=member_desktop)
- [LongLoRA: Efficient Fine-tuning of Long-Context LLMs (another post)](https://www.linkedin.com/posts/haotian-tang_expanding-the-context-size-of-large-language-activity-7110806911775641600-nShH?utm_source=share&utm_medium=member_desktop)
- [Efficient Streaming LLMs with Attention Sinks for infinite-length inputs](https://github.com/mit-han-lab/streaming-llm)
- [MemGPT: Teaching LLMs memory management for unbounded context](https://github.com/cpacker/MemGPT)
- [LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs](https://github.com/THUDM/LongWriter) [Interesting]
- [LLMLingua Prompt Compression](https://www.linkedin.com/posts/sahar-mor_microsoft-recently-published-a-new-technique-activity-7151596182379597825-7ego?utm_source=share&utm_medium=member_desktop) [Interesting]
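A minimal sketch of one common context-extension trick behind the entries above: stretching rotary position embeddings (RoPE) at load time. Hugging Face `transformers` exposes this for Llama-family models via `rope_scaling` (the model id and factor here are illustrative assumptions; YaRN itself, from the repo linked above, is a more refined scaling scheme):

```python
# Minimal RoPE-scaling sketch: load a Llama-style model with its rotary
# embeddings stretched to cover a longer context than it was trained on.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    rope_scaling={"type": "dynamic", "factor": 2.0},  # ~2x the trained context
)
print(model.config.rope_scaling)
```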
### Frameworks for Training & Using Large Language Models (LLMs): - [ColossalAI: Library for LLMs](https://github.com/hpcaitech/ColossalAI) - [LangChain: Library for Building applications with LLMs](https://github.com/hwchase17/langchain) - [LangChain Chat](https://github.com/hwchase17/chat-langchain) - [LangChain Crash Course](https://www.youtube.com/watch?v=LbT1yp6quS8) - [LangChain 101](https://www.linkedin.com/posts/munjal-patel_llm-chatgpt-machinelearning-activity-7049757220300800000-hH7I/?utm_source=share&utm_medium=member_android) - [LangChain Resources](https://www.linkedin.com/posts/sonali-pattnaik_generativeai-ai-activity-7063160223967973376-3K0P/?utm_source=share&utm_medium=member_android) - [LangChain & Vector Databases in Production Course](https://learn.activeloop.ai/courses/langchain) - [Building LLM Powered Apps via LangChain Course](https://www.wandb.courses/courses/building-llm-powered-apps) - [OpenFlamingo](https://github.com/mlfoundations/open_flamingo) - [Deepset Haystack Framework](https://github.com/deepset-ai/haystack) - [LMQL: A query language for programming LLMs](https://github.com/eth-sri/lmql) - [LLM Training Frameworks List](https://www.linkedin.com/posts/aboniasojasingarayar_llm-gpt3-framework-activity-7047449940192591872-3VYc/?utm_source=share&utm_medium=member_android) - [NeMo Guardrails](https://github.com/NVIDIA/NeMo-Guardrails) - [Lamini: The LLM engine for rapidly customizing models](https://github.com/lamini-ai/lamini) - [Scikit-LLM: Sklearn Meets Large Language Models](https://github.com/iryna-kondr/scikit-llm) - [Chainlit](https://github.com/Chainlit/chainlit) - [ChatUI](https://github.com/alibaba/ChatUI) - [Streamlit-Chat](https://github.com/AI-Yash/st-chat) - [Gradio: Creating a Streaming chatbot fast](https://www.gradio.app/guides/creating-a-chatbot-fast#streaming-chatbots) - [Streamlit-Weaviate Connection: provides a custom streamlit connection to query data from weaviate](https://github.com/weaviate/st-weaviate-connection/tree/main) - [LangKit: an open-source text metrics toolkit for monitoring language models](https://github.com/whylabs/langkit) - [HuggingFace Transformers Agents](https://huggingface.co/docs/transformers/transformers_agents) - [privateGPT: Ask questions to your documents using the power of
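LLMs](https://github.com/imartinez/privateGPT)

Since several entries in this section are chat-UI tools (Chainlit, ChatUI, Streamlit-Chat, Gradio), here is a minimal sketch of a streaming chatbot UI with Gradio; the word-by-word echo function is a stand-in for a real LLM call:

```python
# Minimal sketch of a streaming chatbot UI with Gradio (pip install gradio).
# The word-by-word echo below is a placeholder for a real LLM call.
import time
import gradio as gr

def stream_reply(message, history):
    reply = f"You said: {message}"
    partial = ""
    for word in reply.split():
        partial += word + " "
        time.sleep(0.05)   # simulate token-by-token generation
        yield partial      # yielding makes ChatInterface stream the answer

gr.ChatInterface(stream_reply).launch()
```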
- [Spacy LLM](https://github.com/explosion/spacy-llm) - [Lit-GPT](https://github.com/Lightning-AI/lit-gpt) - [Zero to LitGPT Tutorial: Getting Started with Pretraining, Finetuning, and Using LLMs](https://github.com/Lightning-AI/litgpt/blob/main/tutorials/0_to_litgpt.md) [Great] - [GPTCache: A Library for Creating Semantic Cache for LLM Queries](https://github.com/zilliztech/GPTCache/tree/main) - [AutoTrain-Advanced](https://github.com/huggingface/autotrain-advanced) - [Monster API: API for using & fine-tuning LLMs](https://monsterapi.ai/) - [AnythingLLM: A full-stack personalized AI assistant](https://github.com/Mintplex-Labs/anything-llm) - [EasyLLM: helpful tools and methods for working with LLMs](https://github.com/philschmid/easyllm) - [gpt-llm-trainer: input a description of your task, and fine-tune a LLaMA 2 model for you](https://github.com/mshumer/gpt-llm-trainer) - [Embedchain: a framework to easily create LLM powered bots](https://github.com/embedchain/embedchain) - [PandasAI](https://github.com/gventuri/pandas-ai) [Not strictly part of this section, but interesting] - [GPT Engineer: Specify what you want it to build, the AI asks for clarification, and then builds it](https://github.com/AntonOsika/gpt-engineer) - [Ludwig: a low-code framework for building custom AI models like LLMs](https://github.com/ludwig-ai/ludwig) - [open-interpreter](https://github.com/KillianLucas/open-interpreter) - [kani: a lightweight and highly hackable framework for chat-based language models with tool usage/function calling](https://github.com/zhudotexe/kani) - [Kani colab samples](https://colab.research.google.com/github/zhudotexe/kani/blob/main/examples/colab_examples.ipynb) - [Kani Linkedin Post](https://www.linkedin.com/posts/chris-callison-burch-40bb87b7_my-phd-students-have-build-a-really-great-activity-7110728026971115520-T16F?utm_source=share&utm_medium=member_desktop) - [Argilla: the open-source data curation platform for LLMs](https://github.com/argilla-io/argilla) - [LiteLLM: Call all LLM APIs using the OpenAI format](https://github.com/BerriAI/litellm) - [LLM Finetuning with PEFT](https://github.com/ashishpatel26/LLM-Finetuning) - [ChatGPT-AutoExpert: Supercharged Custom Instructions for ChatGPT](https://github.com/spdustin/ChatGPT-AutoExpert) - [PyTorch thunder (PyTorch compiler for speeding up training of LLMs) - Linkedin
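Post](https://www.linkedin.com/posts/sebastianraschka_we-just-open-sourced-thunder-a-new-compiler-activity-7176571765639245824-srIZ?utm_source=share&utm_medium=member_desktop) - [PyTorch Lightning Thunder](https://github.com/Lightning-AI/lightning-thunder) - [unsloth library: 2-5X faster, 70% less memory QLoRA & LoRA finetuning](https://github.com/unslothai/unsloth) [**Great for fine-tuning LLMs**] - [TorchTune: A Native-PyTorch Library for LLM Fine-tuning](https://github.com/pytorch/torchtune)

Most of the fine-tuning tools listed here (unsloth, TorchTune, and the PEFT links in the next section) build on LoRA-style adapters; below is a minimal sketch with Hugging Face PEFT, where the `gpt2` base model and the hyperparameters are arbitrary placeholders:

```python
# Minimal sketch: wrap a causal LM with a LoRA adapter using Hugging Face PEFT.
# Assumes: pip install transformers peft; any small causal LM checkpoint works here.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("gpt2")  # stand-in base model

lora_config = LoraConfig(
    r=8,                       # rank of the low-rank update matrices
    lora_alpha=16,             # scaling factor for the LoRA update
    target_modules=["c_attn"], # GPT-2's fused attention projection; varies per architecture
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of the base model's weights
```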
### Notes and Codes for Training and fine-tuning LLMs: - [LLM Finetuning with PEFT Colab Notebooks](https://github.com/ashishpatel26/LLM-Finetuning) - [Self Instruct TRL for LLMs](https://github.com/yizhongw/self-instruct) - [Self Instruct TRL for LLMs - Link2](https://huggingface.co/docs/trl/sft_trainer) - [How to Fine-Tune LLMs in 2024 with Hugging Face](https://www.philschmid.de/fine-tune-llms-in-2024-with-trl) - [How to fine-tune open LLMs in 2025 with Hugging Face](https://www.philschmid.de/fine-tune-llms-in-2025) - [Fine-tune LLMs on your own hardware via PyTorch team (great)](https://pytorch.org/blog/finetune-llms/?utm_content=278057355&utm_medium=social&utm_source=linkedin&hss_channel=lcp-78618366) - [RLHF in 2024 with DPO & Hugging Face](https://www.philschmid.de/dpo-align-llms-in-2024-with-trl) - [A little guide to building Large Language Models in 2024 (PPT by HuggingFace Team)](https://docs.google.com/presentation/d/1IkzESdOwdmwvPxIELYJi8--K3EZ98_cL6c5ZcLKSyVg/edit?usp=sharing) [**Great**] - [Video Link1 of A little guide to building Large Language Models in 2024 (PPT by HuggingFace Team)](https://www.linkedin.com/posts/thom-wolf_75min-talk-i-finally-recorded-this-lecture-activity-7179106246505967617-0nzC?utm_source=share&utm_medium=member_desktop) - [Video Link2 of A little guide to building Large Language Models in 2024 (PPT by HuggingFace Team)](https://www.youtube.com/watch?v=2-SPH9hIKT8) - [Understanding the instruction fine-tuning process in LLMs](https://www.linkedin.com/posts/sebastianraschka_if-you-are-looking-for-a-resource-to-understand-activity-7208093607122145280-6wFF?utm_source=share&utm_medium=member_desktop) - [Top 5 Tips and Tricks for LLM Fine-Tuning and Inference from Intel Experts](https://www.intel.com/content/www/us/en/developer/articles/technical/top-tricks-for-llm-fine-tuning-and-inference.html) ### Reflection-Tuning of LLMs: - [Reflection-Tuning of LLMs](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_mindblowing-a-70b-open-meta-llama-3-better-activity-7237712642339926016-Cfm6?utm_source=share&utm_medium=member_desktop) ### Memory Layer for LLMs: - [Memory layer for
LLMs](https://www.linkedin.com/posts/liorsinclair_mem0-gained-20000-stars-on-github-in-30-activity-7237475167822585857-4Jbu?utm_source=share&utm_medium=member_desktop) - [Memory layer for LLMs - GitHub Repo](https://github.com/mem0ai/mem0) ### LLMs for Coding: - [CodeGen](https://github.com/salesforce/CodeGen) - [Code Llama](https://github.com/facebookresearch/codellama) - [Code Llama Notes](https://www.linkedin.com/posts/aleksagordic_nice-meta-ai-just-announced-code-llama-activity-7100559934764810240-Un2i/?utm_source=share&utm_medium=member_android) ### LLMs as Front-End Engineers: - [Design2Code: How Far Are We From Automating Front-End Engineering?](https://arxiv.org/abs/2403.03163) - [Llama Coder: Can generate full React apps](https://llamacoder.together.ai/) ### LLMs Courses & Tutorials: - [LLM Bootcamp - Spring 2023](https://fullstackdeeplearning.com/llm-bootcamp/spring-2023/) - [LLM University](https://docs.cohere.com/docs/llmu) - [List of LLM Courses](https://www.linkedin.com/posts/srijankr_ai-llm-activity-7080929772523966464-Le4u/?utm_source=share&utm_medium=member_android) - [Anti-hype LLM reading list](https://gist.github.com/veekaybee/be375ab33085102f9027853128dc5f0e) - [Microsoft Generative AI Course](https://github.com/microsoft/generative-ai-for-beginners) - [Google and Kaggle five-day generative AI course](https://blog.google/technology/developers/google-kaggle-genai-intensive/) [Good] - [Best Resources for learning to work with LLMs](https://www.linkedin.com/posts/whats-ai_github-louisfb01start-llms-a-complete-activity-7133590058229456896-WEf0?utm_source=share&utm_medium=member_desktop) - [Start with Large Language Models (LLMs) - Become an expert for free!](https://github.com/louisfb01/start-llms) [Interesting] - [Intro to LLMs: Andrej Karpathy 1 Hour Lecture](https://www.youtube.com/watch?v=zjkBMFhNj_g) - [LLM Course](https://github.com/mlabonne/llm-course) [**good**] - [LLM Course in ChatGPT Plus](https://www.linkedin.com/posts/maria-vechtomova_llm-gpt-activity-7160567161856360448-IFjd?utm_source=share&utm_medium=member_desktop) - [Build a Large Language Model (From Scratch) - Great Course and Book Tutorial](https://github.com/rasbt/LLMs-from-scratch) [**Great**] - [Learning Resources about LLMs](https://www.linkedin.com/posts/pauliusztin_machinelearning-mlops-datascience-activity-7135530424767819777-ui-5?utm_source=share&utm_medium=member_desktop) - [The Transformer Layer by Layer Course](https://mlbootcamp.ai/course.html?guid=d105240a-94e1-405b-be80-60056659c24c) - [The Transformer Layer by Layer Course:
Linkedin](https://www.linkedin.com/posts/juan-olano-b9a330112_artificialintelligence-transformers-onlinelearning-activity-7137158122715897856-cneV?utm_source=share&utm_medium=member_desktop) - [Hands-on LLMs Course](https://github.com/iusztinpaul/hands-on-llms) - [Direct Preference Optimization (DPO) Method for LLMs Tutorial](https://huggingface.co/blog/pref-tuning) - [CS25: Transformers United V3 Courses - Autumn 2023](https://web.stanford.edu/class/cs25/) - [CS336: Language Modeling from Scratch](https://stanford-cs336.github.io/spring2024/) - [Visual and Animated Lecture about LLMs and Transformers and Deep Learning](https://www.youtube.com/playlist?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi) - [LLMs Roadmap](https://www.linkedin.com/posts/ba%C5%9Fak-tu%C4%9F%C3%A7e-eskili-61511b58_nlp-llms-gpt3-activity-7168168071356997632-V8yL?utm_source=share&utm_medium=member_desktop) [Great] - [Brev.dev Notebooks: Fine-tuning Mistral, Mixtral, Phi-2, etc.](https://github.com/brevdev/notebooks/tree/main) [**Excellent**] - [LLM Summer School](https://www.linkedin.com/posts/sebastianraschka_a-suggestion-for-an-effective-11-step-llm-activity-7195778889384693762-2TB_?utm_source=share&utm_medium=member_android) - [LLM Engineer's Handbook](https://www.linkedin.com/posts/maxime-labonne_super-proud-to-announce-my-new-book-the-activity-7219253497559425024-IVkc?utm_source=share&utm_medium=member_desktop) - [LLM Twin Course: Building Your Production-Ready AI Replica](https://github.com/decodingml/llm-twin-course) - [Hands-On Large Language Models Book](https://www.linkedin.com/posts/jalammar_our-newly-released-llm-book-hands-on-large-activity-7242207044533948417-_i2R?utm_source=share&utm_medium=member_desktop) - [Foundations of LLMs Book](https://arxiv.org/abs/2501.09223) ### LLMs Ranking: - [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard) - [Chatbot Arena Leaderboard](https://lmsys.org/blog/2023-05-10-leaderboard/) - [AlpacaEval Leaderboard](https://tatsu-lab.github.io/alpaca_eval/) - [CanAiCode Leaderboard](https://huggingface.co/spaces/mike-ravkine/can-ai-code-results) - [Small LLMs Performance Ranking](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_how-big-do-llms-need-to-be-able-to-reason-activity-7134108036473741312-2jxI?utm_source=share&utm_medium=member_desktop) - [Chatbot Arena: Benchmarking LLMs in the Wild](https://huggingface.co/spaces/lmsys/chatbot-arena) [**Great**] - [Chatbot Arena Leaderboard](https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard) - [AI2 WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the
Wild](https://huggingface.co/spaces/allenai/WildBench) [**Great**] - [AI2 WildBench Linkedin Post](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_new-evaluation-benchmark-leaderboard-by-activity-7171853629325316096-67sr?utm_source=share&utm_medium=member_desktop) - [Persian LLM Leaderboard (via Part AI)](https://huggingface.co/spaces/PartAI/persian-llm-leaderboard) ### Building NLP Applications Powered by LLMs (Methods for Augmenting External Knowledge into LLMs, i.e., Retrieval-Augmented Generation (RAG) Applications): - [Ask a Book Questions with LangChain OpenAI](https://bennycheung.github.io/ask-a-book-questions-with-langchain-openai) [Great] - [OpenAI Web QA Embeddings](https://platform.openai.com/docs/tutorials/web-qa-embeddings) - [Deepset Haystack Framework](https://github.com/deepset-ai/haystack) - [Stanford Retrieval-based NLP](https://ai.stanford.edu/blog/retrieval-based-NLP/) - [Hypothetical Document Embeddings (HyDE)](https://www.linkedin.com/posts/activity-7048838677438861312-8MFD/?utm_source=share&utm_medium=member_android) - [ChatDB: Augmenting LLMs with Databases](https://chatdatabase.github.io/) - [ChatNode](https://www.chatnode.ai/) - [Emerging Architectures for LLM Applications](https://a16z.com/2023/06/20/emerging-architectures-for-llm-applications/) - [Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines](https://github.com/explodinggradients/ragas) - [Fine tuning vs.
RAG for LLMs](https://www.linkedin.com/posts/alexander-ratner-038ba239_lots-of-debate-on-fine-tuning-vs-rag-for-activity-7103836027957506048-AjoJ?utm_source=share&utm_medium=member_desktop) - [Building RAG-based LLM Applications for Production (Part 1)](https://www.anyscale.com/blog/a-comprehensive-guide-for-building-rag-based-llm-applications-part-1) [Good] - [Verba: The Golden RAGtriever, user-friendly interface for Retrieval-Augmented Generation (RAG) applications](https://github.com/weaviate/Verba) - [DocsGPT: GPT-powered chat for documentation, chat with your documents](https://github.com/arc53/DocsGPT) - [RAFT: Retrieval Augmented Fine Tuning - Post1](https://www.linkedin.com/posts/pascalbiese_raft-the-best-of-rag-and-fine-tuning-combined-activity-7175089937036283904-ltQI?utm_source=share&utm_medium=member_desktop) - [RAFT: Retrieval Augmented Fine Tuning - Post2](https://www.linkedin.com/posts/tianjun-zhang-333bb2126_raft-a-new-way-to-teach-llms-to-be-better-activity-7174525633291587584-CO-h?utm_source=share&utm_medium=member_desktop) - [RAFT: Retrieval Augmented Fine Tuning - Microsoft Blog](https://techcommunity.microsoft.com/t5/ai-ai-platform-blog/raft-a-new-way-to-teach-llms-to-be-better-at-rag/ba-p/4084674) - [RAFT: Retrieval Augmented Fine Tuning - Berkeley Blog](https://gorilla.cs.berkeley.edu/blogs/9_raft.html) - [RAFT Code](https://github.com/ShishirPatil/gorilla/tree/main/raft) - [Long context LLMs vs RAG](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_how-good-are-llms-in-a-long-context-and-activity-7214185350959689728-cnfp?utm_source=share&utm_medium=member_android) [Interesting] - [RAGFlow: an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding](https://github.com/infiniflow/ragflow) - [Two Step RAG: Speculative RAG: Enhancing retrieval augmented generation through drafting](https://research.google/blog/speculative-rag-enhancing-retrieval-augmented-generation-through-drafting/) - [Exploring Multimodal RAG with LlamaIndex and GPT-4 or the New Anthropic Sonnet Model](https://levelup.gitconnected.com/exploring-multimodal-rag-with-llamaindex-and-gpt-4-or-the-new-anthropic-sonnet-model-96705c877dbb) - [PaperQA2: High accuracy RAG for answering questions from scientific documents with citations](https://github.com/Future-House/paper-qa) - [Sophisticated Controllable Agent for Complex RAG Tasks](https://github.com/NirDiamant/Controllable-RAG-Agent) - [Anthropic's Claude Introducing Contextual Retrieval RAG](https://www.anthropic.com/news/contextual-retrieval) - [Docling: Get your docs ready for gen AI](https://github.com/DS4SD/docling) - [Lecture of RAG and Prompt
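Engineering](https://www.linkedin.com/posts/tom-yeh_i-just-edited-my-lecture-beginners-guide-activity-7284242137091620864-6MBy?utm_source=share&utm_medium=member_desktop) - [Recent RAG Research from Google](https://www.linkedin.com/posts/jihoo-kim_rag-research-from-google-2024-ugcPost-7266537405904498689-wrac?utm_source=share&utm_medium=member_android) - [zoekt: Fast trigram based code search --> great tool for RAG over code](https://github.com/sourcegraph/zoekt) [**important**]

As a minimal illustration of the retrieval step that all of these RAG links revolve around, here is a sketch using Sentence-Transformers with the `all-MiniLM-L6-v2` model (also linked later in this document); the documents and question are toy examples:

```python
# Minimal RAG sketch: embed documents, retrieve the most similar ones, build a prompt.
# Assumes: pip install sentence-transformers; the prompt then goes to any LLM you like.
from sentence_transformers import SentenceTransformer, util

docs = [
    "The Eiffel Tower is located in Paris.",
    "Python was created by Guido van Rossum.",
    "Transformers use self-attention to model token interactions.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")
doc_embeddings = model.encode(docs, convert_to_tensor=True)

question = "Who created Python?"
q_embedding = model.encode(question, convert_to_tensor=True)

# Retrieve the top-2 most similar documents by cosine similarity.
hits = util.semantic_search(q_embedding, doc_embeddings, top_k=2)[0]
context = "\n".join(docs[hit["corpus_id"]] for hit in hits)

prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(prompt)  # feed this prompt to an LLM to get a grounded answer
```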
### Graph RAG & Its Related Databases: - [ArangoDB: The Most Complete And Scalable Platform For Graph-Powered GenAI](https://arangodb.com/) - [Microsoft GraphRAG](https://microsoft.github.io/graphrag/) - [llamaindex Graph RAG](https://docs.llamaindex.ai/en/stable/examples/query_engine/knowledge_graph_rag_query_engine/) - [Gephi: The Open Graph Viz Platform](https://gephi.org/) - [JanusGraph: a scalable graph database optimized for storing and querying graphs](https://janusgraph.org/) - [cayley: Open Source Graph Data Base](https://cayley.io/) - [Retrieval-Augmented Generation with Knowledge Graphs for Customer Service Question Answering (Paper)](https://arxiv.org/abs/2404.17723) - [The GraphRAG Manifesto: Adding Knowledge to GenAI](https://neo4j.com/blog/graphrag-manifesto/) - [Neo4j for GenAI](https://neo4j.com/generativeai/) ### Cache-Augmented Generation (CAG): - [Cache-Augmented Generation (CAG) - Linkedin Post1](https://www.linkedin.com/posts/maryammiradi_dont-do-rag-cag-is-40x-faster-than-activity-7281655697086287872-c35Q?utm_source=share&utm_medium=member_desktop) - [Cache-Augmented Generation (CAG) - Linkedin Post2](https://www.linkedin.com/posts/bhavishya-pandit_rag-vs-cag-activity-7282615153852862464-ES23?utm_source=share&utm_medium=member_desktop) - [Cache-Augmented Generation (CAG) - Linkedin Post3](https://www.linkedin.com/posts/francoisvanderseypen_dont-do-rag-when-cache-augmented-generation-activity-7279725990342193152-8P82?utm_source=share&utm_medium=member_desktop) ### Vector Database Libraries: - [weaviate](https://weaviate.io/) - [weaviate GitHub](https://github.com/weaviate/weaviate) - [chroma](https://github.com/chroma-core/chroma) - [Qdrant: Vector Database for AI Applications](https://github.com/qdrant/qdrant) - [pinecone](https://www.pinecone.io/) - [rektor-db](https://github.com/codediodeio/rektor-db) - [pgvector](https://github.com/pgvector/pgvector) - [LlamaIndex: comprehensive toolkit to perform data augmentation for LLMs](https://github.com/jerryjliu/llama_index) - [jina-ai
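VectorDB](https://github.com/jina-ai/vectordb) - [sqlite-vec: A vector search SQLite extension](https://github.com/asg017/sqlite-vec)

A minimal end-to-end sketch with one of the libraries above (Chroma); collection and document contents are toy examples:

```python
# Minimal sketch of a vector database workflow with Chroma (pip install chromadb).
# Chroma embeds the documents with its default embedding function here.
import chromadb

client = chromadb.Client()  # in-memory instance; use PersistentClient for on-disk storage
collection = client.create_collection(name="docs")

collection.add(
    ids=["1", "2", "3"],
    documents=[
        "Vector databases store embeddings for similarity search.",
        "Persian is written right-to-left.",
        "RAG retrieves relevant chunks before generation.",
    ],
)

# Nearest-neighbor search over the stored embeddings.
results = collection.query(query_texts=["How does RAG find context?"], n_results=2)
print(results["documents"])
```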
### Great Embedding Models for Search (for Augmenting External Knowledge into ChatBot Vector DB) [Retrieval Augmented Generation (RAG)]: - [Massive Text Embedding Benchmark (MTEB) Leaderboard](https://huggingface.co/spaces/mteb/leaderboard) - [Word and sentence embeddings is how LLMs understand text](https://www.linkedin.com/posts/sahar-mor_word-and-sentence-embeddings-is-how-llms-activity-7105921473978015744-R0Nm?utm_source=share&utm_medium=member_desktop) - [FlagEmbedding](https://github.com/FlagOpen/FlagEmbedding) - [E5 embedding vs OpenAI Ada](https://www.linkedin.com/posts/andrew-iain-jardine_hosting-a-text-embedding-model-that-is-better-activity-7106338837479510016-zvBW?utm_source=share&utm_medium=member_desktop) - [M2-BERT-80M-32k-Retrieval](https://huggingface.co/togethercomputer/m2-bert-80M-32k-retrieval) - [Embedding Quantization - Post1](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_introducing-embedding-quantization-a-new-activity-7176971093646159872-hp9z?utm_source=share&utm_medium=member_desktop) - [Embedding Quantization - Post2](https://www.linkedin.com/posts/tomaarsen_binary-and-scalar-embedding-quantization-activity-7176966403332132864-lJzH?utm_source=share&utm_medium=member_desktop) - [Embedding Quantization - HuggingFace Blog Post](https://huggingface.co/blog/embedding-quantization) - [Quantization Fundamentals with Hugging Face Course](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_quantization-fundamentals-with-hugging-face-activity-7186335433843167232-sKV2?utm_source=share&utm_medium=member_desktop) - [Is Cosine-Similarity of Embeddings Really About Similarity?](https://www.linkedin.com/posts/alphasignal_is-cosine-similarity-of-embeddings-really-activity-7175543620651880449-ZoKw?utm_source=share&utm_medium=member_desktop) - [LLM2Vec](https://www.linkedin.com/posts/zaiinulabideen_lazy-llm2vec-convert-your-favorite-llm-activity-7193618083448553472-_Q2e?utm_source=share&utm_medium=member_desktop) [**Great**] - [Fine tuning embedding models for RAG (Linkedin post)](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_fine-tune-embedding-models-for-retrieval-activity-7203760204579028992-g7eW?utm_source=share&utm_medium=member_desktop) - [Fine tuning embedding models for RAG (Original Post)](https://www.philschmid.de/fine-tune-embedding-model-for-rag) - [`all-MiniLM-L6-v2` --> Sentence-Transformers Model for Embedding](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) - [Learn How to Fine-tune Embedding Models Course](https://marqo.ai/courses/fine-tuning-embedding-models) [**Great**] - [LLMs Embedding Course -
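Link1](https://github.com/anishiisc/Build_LLM_from_Scratch/tree/main) - [LLMs Embedding Course - Link2](https://www.linkedin.com/posts/ugcPost-7228118123390902272-oVu4/?utm_source=share&utm_medium=member_android)

The embedding-quantization posts above describe shrinking float32 vectors down to 1 bit per dimension; the NumPy sketch below illustrates the core idea (sign thresholding plus bit-packing), independent of any particular library:

```python
# Illustrative sketch of binary embedding quantization: keep only the sign of each
# dimension, then pack 8 dimensions per byte. Vectors here are random placeholders.
import numpy as np

rng = np.random.default_rng(0)
embeddings = rng.standard_normal((4, 384)).astype(np.float32)  # e.g., MiniLM-sized vectors

binary = (embeddings > 0).astype(np.uint8)  # 1 bit per dimension instead of 32
packed = np.packbits(binary, axis=1)        # 384 floats (1536 bytes) -> 48 bytes

# Hamming distance on packed codes approximates ranking by cosine similarity.
def hamming(a, b):
    return np.unpackbits(a ^ b).sum()

print(packed.shape, hamming(packed[0], packed[1]))
```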
- [txtai: All-in-one embeddings database](https://github.com/neuml/txtai) - [NVIDIA NV-emb-2 embeddings](https://www.linkedin.com/posts/tunguz_ok-nvidia-nv-emb-2-embeddings-are-really-activity-7262862383885213696-MWVv?utm_source=share&utm_medium=member_desktop) - [jina-embeddings-v3: Multilingual Embeddings With Task LoRA](https://huggingface.co/papers/2409.10173) - [ModernBert: Linkedin Post1](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_modernbert-bert-revisited-in-the-age-of-activity-7275551060302131201-dr3c?utm_source=share&utm_medium=member_desktop) - [ModernBert: Linkedin Post2](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_want-to-replace-bert-in-2025-the-time-has-activity-7277616689859444737-iRUe?utm_source=share&utm_medium=member_desktop) - [Nomic-embed-text-v2-moe model](https://huggingface.co/nomic-ai/nomic-embed-text-v2-moe) - [Nomic Embed Text V2: An Open Source, Multilingual, Mixture-of-Experts Embedding Model (Blog Post)](https://www.nomic.ai/blog/posts/nomic-embed-text-v2) - [Gemini models for text embedding (original link)](https://developers.googleblog.com/en/gemini-embedding-text-model-now-available-gemini-api/) - [Gemini models for text embedding (useful linkedin post)](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_gemini-models-for-embeddings-yes-google-activity-7303840326933176320-Tg0C?utm_source=share&utm_medium=member_android&rcm=ACoAAAgksdYBFu3_vG0bwXWdh93rSqV1J1ghMP4) ### Prevent Hallucinations from LLMs & Controlling their outputs: - [Deep Dive Into LLM Hallucinations Across Generative Tasks](https://www.rungalileo.io/blog/deep-dive-into-llm-hallucinations-across-generative-tasks) - [Controlled Generation Tools](https://www.linkedin.com/posts/pascalbiese_genai-llms-opensource-activity-7097185067885576192-Uv8Z/?utm_source=share&utm_medium=member_android) - [Guidance: Controlling LLMs](https://github.com/guidance-ai/guidance) - [NeMo Guardrails](https://github.com/NVIDIA/NeMo-Guardrails) - [Minimising Hallucinations in LLM Applications: NeMo Guardrails Video Tutorial](https://www.linkedin.com/posts/sanyambhutani_minimising-hallucinations-in-llm-applications-activity-7104810583304077312-w983?utm_source=share&utm_medium=member_desktop) - [Mitigate Hallucination in LLMs](https://www.linkedin.com/posts/vinija_mitigate-hallucination-in-llms-as-activity-7114468991330390016-O0BZ?utm_source=share&utm_medium=member_desktop) - [LLMs Hallucinations
Benchmark](https://www.linkedin.com/posts/drjimfan_please-see-update-below-a-recent-llm-hallucination-activity-7130230516246593536-mxAY?utm_source=share&utm_medium=member_desktop) - [Mitigating LLM Hallucinations: a multifaceted approach](https://amatriain.net/blog/hallucinations) [Great] ### Training & Using Large Language Models (LLMs) on Low Resource Machines: - [Cramming: Training a Language Model on a Single GPU in One Day](https://github.com/jonasgeiping/cramming) [**Great**] - [Fine-tuning 20B LLMs with RLHF on a 24GB consumer GPU](https://huggingface.co/blog/trl-peft) [**Great**] - [PEFT: State-of-the-art Parameter-Efficient Fine-Tuning](https://github.com/huggingface/peft) [**Great**] - [PEFT: Parameter-Efficient Fine-Tuning of Billion-Scale Models on Low-Resource Hardware](https://huggingface.co/blog/peft) [**Great**] - [Introduction to 8-bit Matrix Multiplication for transformers at scale using Hugging Face Transformers, Accelerate and bitsandbytes](https://huggingface.co/blog/hf-bitsandbytes-integration) - [bitsandbytes: 8-bit CUDA functions for PyTorch](https://github.com/TimDettmers/bitsandbytes) - [Alpaca-LoRA: Low-Rank LLaMA Instruct-Tuning on consumer hardware](https://github.com/tloen/alpaca-lora) [Great] - [LLaMA & Alpaca Tutorial: "ChatGPT" On Your Local Computer](https://medium.com/@martin-thissen/llama-alpaca-chatgpt-on-your-local-computer-tutorial-17adda704c23) - [Dalai: The simplest way to run LLaMA on your local machine](https://github.com/cocktailpeanut/dalai) - [pyllama](https://github.com/juncongmoo/pyllama) - [Alpaca-LoRA-Serve](https://github.com/deep-diver/Alpaca-LoRA-Serve) - [llama.cpp: Port of Facebook's LLaMA model in C/C++](https://github.com/ggerganov/llama.cpp) - [alpaca.cpp](https://github.com/antimatter15/alpaca.cpp) - [SparseGPT: Remove 100 Billion Parameters of LLMs](https://neuralmagic.com/blog/sparsegpt-remove-100-billion-parameters-for-free/) - [xFormers: Toolbox to Accelerate Research on Transformers](https://github.com/facebookresearch/xformers) - [LLaMA-Adapter: Efficient Fine-tuning of LLaMA (Fine-tuning LLaMA to follow instructions within 1 Hour and 1.2M Parameters)](https://github.com/ZrrSkywalker/LLaMA-Adapter) - [GPT4All](https://github.com/nomic-ai/gpt4all) [Great] - [Vicuna web page](https://vicuna.lmsys.org/) [Great] - [Vicuna GitHub: FastChat](https://github.com/lm-sys/FastChat) - [PetGPT](https://github.com/maziarraissi/PetGPT) - [GPT-4-LLM](https://github.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM) - [baize
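Chatbot](https://github.com/project-baize/baize-chatbot)

Many entries in this section rely on 4-bit loading via bitsandbytes (see the QLoRA and BNB links); here is a minimal sketch of QLoRA-style NF4 loading with `transformers`, where the `facebook/opt-1.3b` checkpoint is an arbitrary stand-in and a CUDA GPU is assumed:

```python
# Minimal sketch: load a causal LM in 4-bit NF4 precision (QLoRA-style) to cut GPU memory.
# Assumes: pip install transformers bitsandbytes accelerate, and a CUDA GPU.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NormalFloat4, introduced by the QLoRA paper
    bnb_4bit_compute_dtype=torch.bfloat16,  # matmuls run in bf16 for stability
)

model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-1.3b",                    # stand-in checkpoint; larger models benefit more
    quantization_config=bnb_config,
    device_map="auto",
)
print(model.get_memory_footprint() / 1e9, "GB")  # roughly 4x smaller than fp16
```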
- [Koala](https://github.com/young-geng/EasyLM#koala) - [Gorilla: An API store for LLMs](https://github.com/ShishirPatil/gorilla) - [Lit-LLaMA](https://github.com/Lightning-AI/lit-llama) - [Auto-GPT](https://github.com/Torantulino/Auto-GPT) - [xTuring](https://github.com/stochasticai/xTuring) - [GPTCache](https://github.com/zilliztech/gptcache) - [Dolly-v2-12B](https://huggingface.co/databricks/dolly-v2-12b) - [Web LLM](https://github.com/mlc-ai/web-llm) - [P-tuning v2](https://github.com/THUDM/P-tuning-v2) - [QLoRA: Efficient Finetuning of Quantized LLMs](https://github.com/artidoro/qlora) - [AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration](https://github.com/mit-han-lab/llm-awq) - [GPTQ Quantization Method in Transformers](https://www.linkedin.com/posts/marc-sun_opensource-llm-quantization-activity-7100102215582797824-td7E?utm_source=share&utm_medium=member_desktop) - [Optimize open LLMs using GPTQ and Hugging Face Optimum](https://www.linkedin.com/feed/update/urn:li:activity:7103049470908485632/?utm_source=share&utm_medium=member_android) - [GPTQ vs. bitsandbytes (BNB)](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_quantization-makes-fine-tuning-and-deploying-activity-7104480375841636352-_dgY?utm_source=share&utm_medium=member_desktop) - [BNB Blog: Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA](https://huggingface.co/blog/4bit-transformers-bitsandbytes) - [GPTQ Blog: Making LLMs lighter with AutoGPTQ and transformers](https://huggingface.co/blog/gptq-integration) - [TensorRT-LLM](https://www.linkedin.com/posts/tunguz_llm-h100-languagemodels-activity-7106253824910139392-WZRM?utm_source=share&utm_medium=member_desktop) - [Overview of 🤗 Transformers Quantization: GPTQ vs bitsandbytes](https://huggingface.co/blog/overview-quantization-transformers) - [LoRA Exchange (LoRAX): Serve 100s of Fine-Tuned LLMs for the Cost of 1](https://predibase.com/blog/lora-exchange-lorax-serve-100s-of-fine-tuned-llms-for-the-cost-of-one) - [Introducing LoRAX](https://www.linkedin.com/posts/travisaddair_lora-exchange-lorax-serve-100s-of-fine-tuned-activity-7120819275442896896-vlI_?utm_source=share&utm_medium=member_desktop) - [DeepSparse: Sparsity-aware deep learning inference runtime for CPUs](https://github.com/neuralmagic/deepsparse) - [Practical Tips for Finetuning LLMs Using LoRA (Low-Rank Adaptation)](https://magazine.sebastianraschka.com/p/practical-tips-for-finetuning-llms) [**Great**] - [DARE method
for improving LLM performance](https://www.linkedin.com/posts/andrew-iain-jardine_llm-opensource-llms-activity-7134896163698208768-0Gyf?utm_source=share&utm_medium=member_desktop) - [Small models that surpass GPT-4](https://www.linkedin.com/posts/clementdelangue_open-models-now-starting-to-surpass-gpt4-activity-7137904570898264064-LSmc?utm_source=share&utm_medium=member_desktop) [Interesting] - [Efficient LLMs Survey](https://github.com/AIoT-MLSys-Lab/Efficient-LLMs-Survey) [Great] - [LoRAX (LoRA eXchange): Framework that allows users to serve thousands of fine-tuned models on a single GPU](https://github.com/predibase/lorax) - [PowerInfer: High-speed LLMs Serving on PCs with Consumer-grade GPUs](https://github.com/SJTU-IPADS/PowerInfer) - [LoRA From Scratch Implementation](https://www.linkedin.com/posts/sebastianraschka_code-lora-from-scratch-a-lightning-studio-activity-7155241298227060736-QRul?utm_source=share&utm_medium=member_desktop) - [Improving LoRA (DoRA): Implementing Weight-Decomposed Low-Rank Adaptation (DoRA)](https://www.linkedin.com/posts/sebastianraschka_improving-lora-implementing-weight-decomposed-activity-7165053172175024128-bqwu?utm_source=share&utm_medium=member_desktop) - [DoRA Link2](https://magazine.sebastianraschka.com/p/lora-and-dora-from-scratch) - [Proxy-Tuning (new method for fine-tuning LLMs)](https://www.linkedin.com/posts/sebastianraschka_theres-a-new-promising-method-for-finetuning-activity-7153788017017544705-ADC7?utm_source=share&utm_medium=member_desktop) - [AutoQuantize (GGUF, AWQ, EXL2, GPTQ) Colab Notebook](https://colab.research.google.com/drive/1Li3USnl3yoYctqJLtYux3LAIy4Bnnv3J?usp=sharing) [Great] - [DoRA: Weight-Decomposed Low-Rank Adaptation - Linkedin Post](https://www.linkedin.com/posts/sebastianraschka_while-everyone-is-talking-about-sora-theres-activity-7164268573756960770-N7Hu?utm_source=share&utm_medium=member_desktop) - [DoRA: Weight-Decomposed Low-Rank Adaptation - Paper](https://arxiv.org/abs/2402.09353) - [GaLore: Memory Efficient Fine-tuning Technique](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_galore-is-a-new-memory-efficient-fine-tuning-activity-7177599313294827521-kye2?utm_source=share&utm_medium=member_desktop) - [Quanto: a pytorch quantization toolkit](https://huggingface.co/blog/quanto-introduction) [**Great**] - [Quanto: Linkedin Post](https://www.linkedin.com/posts/dcorvoysier_quanto-a-pytorch-quantization-toolkit-activity-7175421050808078336-QcEM?utm_source=share&utm_medium=member_desktop) - [Deleting 40% of LLM Layers Without Drop in Accuracy](https://www.linkedin.com/posts/liorsinclair_researchers-just-developed-a-new-method-to-activity-7180929255411789826-z3TV?utm_source=share&utm_medium=member_desktop) - [The Unreasonable Ineffectiveness of the Deeper
Layers](https://arxiv.org/html/2403.17887v1) - [Continual Pretraining of LLMs](https://www.linkedin.com/posts/sebastianraschka_we-talk-a-lot-about-finetuning-llms-to-follow-activity-7174395744068464642-jPFI?utm_source=share&utm_medium=member_desktop) - [NOLA: run 10,000 customized LLaMA2 (70B) (4bit) models on a single 48GB GPU](https://www.linkedin.com/posts/hpirsiav_iclr2024-iclr2024-activity-7192618595405725696-HZXu?utm_source=share&utm_medium=member_desktop) - [NOLA LLaMA3](https://www.linkedin.com/posts/s-hasan-abbas_syed-hasan-8503llama-3-8b-nola-hugging-activity-7193318944575762434-MD_T?utm_source=share&utm_medium=member_desktop) - [LoRA Learns Less and Forgets Less in comparison to full finetuning](https://www.linkedin.com/posts/sebastianraschka_lora-learns-less-and-forgets-less-when-i-activity-7197576220585201664-KA4L?utm_source=share&utm_medium=member_desktop) - [Best Practices for Fine-Tuning & Training LLMs](https://www.linkedin.com/posts/aleksagordic_amazing-list-of-techniques-for-improving-activity-7215624025639645184-496W?utm_source=share&utm_medium=member_android) - [TorchChat](https://www.linkedin.com/posts/pytorch_llms-mobilellms-localai-activity-7224090140011380737-RHdH?utm_source=share&utm_medium=member_desktop) - [The Evolution of Extreme LLM Compression: From QuIP to AQLM with PV-Tuning](https://medium.com/yandex/the-evolution-of-extreme-llm-compression-from-quip-to-aqlm-with-pv-tuning-19c44b91af96) - [Calculating GPU memory for serving LLMs](https://www.substratus.ai/blog/calculating-gpu-memory-for-llm) - [How Much GPU Memory is Needed to Serve a Large Language Model (LLM)?](https://masteringllm.medium.com/how-much-gpu-memory-is-needed-to-serve-a-large-languagemodel-llm-b1899bb2ab5d) - [CUDA-Free Inference for LLMs (PyTorch Blog)](https://pytorch.org/blog/cuda-free-inference-for-llms/?utm_content=306418724&utm_medium=social&utm_source=linkedin&hss_channel=lcp-78618366) - [The Ultra-Scale Playbook: Training LLMs on GPU Clusters](https://huggingface.co/spaces/nanotron/ultrascale-playbook) ### Productionizing LLMs: - [LLM From the Trenches: 10 Lessons Learned Operationalizing Models at GoDaddy](https://www.godaddy.com/resources/news/llm-from-the-trenches-10-lessons-learned-operationalizing-models-at-godaddy) ### LLMs on Mobile Devices: - [MLC LLM](https://github.com/mlc-ai/mlc-llm) ### LLM Applications & APIs: - [Building LLM applications for production](https://huyenchip.com/2023/04/11/llm-engineering.html) - [Bard API](https://github.com/dsdanielpark/Bard-API) - [Amazon Bedrock: build and scale generative AI applications](https://aws.amazon.com/bedrock/) [**Great**] ### Natural Language to SQL: - [text to SQL Github Repos](https://github.com/topics/text-to-sql) -
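[vanna](https://github.com/vanna-ai/vanna) - [sqlchat](https://github.com/sqlchat/sqlchat) - [dataherald](https://github.com/Dataherald/dataherald) - [WrenAI](https://github.com/Canner/WrenAI) - [Practical text-to-SQL for data analytics by Linkedin](https://www.linkedin.com/blog/engineering/ai/practical-text-to-sql-for-data-analytics) [Great] - [Persian abstract of the above Practical text-to-SQL for data analytics post - Out of Distribution Telegram Channel](https://t.me/out_of_distribution/1122)

The common pattern behind these text-to-SQL tools is: ground the prompt in the real schema, generate SQL, then execute it. A minimal sketch, with a hypothetical `ask_llm` stand-in for whatever model you use:

```python
# Minimal text-to-SQL sketch: ground the prompt in the actual schema, then run the result.
# `ask_llm` is a hypothetical stand-in for any LLM call (OpenAI, a local model, etc.).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, total REAL)")

schema = "\n".join(row[0] for row in
                   conn.execute("SELECT sql FROM sqlite_master WHERE sql IS NOT NULL"))

question = "What is the total revenue per customer?"
prompt = (
    f"Given this SQLite schema:\n{schema}\n\n"
    f"Write one SQL query answering: {question}\nReturn only SQL."
)

def ask_llm(prompt: str) -> str:
    # Stand-in so the sketch runs end-to-end; replace with a real LLM call.
    return "SELECT customer, SUM(total) FROM orders GROUP BY customer"

sql = ask_llm(prompt)
print(conn.execute(sql).fetchall())
```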
### Prompt Engineering: - [Different Kinds of Prompt Engineering](https://www.linkedin.com/posts/munjal-patel_generativeai-largelanguagemodels-llm-activity-7051862874935197696-2E_J/?utm_source=share&utm_medium=member_android) - [Prompt Engineering Guide](https://www.promptingguide.ai/) - [PromptTools: tools for prompt testing and experimentation](https://github.com/hegelai/prompttools) - [Prompt engineering for Claude's long context window](https://www.anthropic.com/index/prompting-long-context) - [Chain of Verification Prompt engineering method](https://www.linkedin.com/posts/xamatriain_a-week-ago-meta-presented-a-new-prompt-engineering-activity-7114351307183820800-MsgT?utm_source=share&utm_medium=member_desktop) - [Analogical Prompting](https://huggingface.co/papers/2310.01714) - [Prompt Flow: Build high-quality LLM apps](https://github.com/microsoft/promptflow) - [Contrastive Chain-of-Thought Prompting (CCoT)](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_improve-chain-of-thought-prompting-by-adding-activity-7133477395944091648-TKlQ?utm_source=share&utm_medium=member_desktop) - [New Prompting Techniques](https://www.linkedin.com/posts/pramodith_promptengineering-llm-activity-7134507333530836992-evPU?utm_source=share&utm_medium=member_desktop) - [OpenAI Prompt Engineering Guide - Linkedin Post](https://www.linkedin.com/posts/eric-vyacheslav-156273169_game-changer-open-ai-just-released-their-activity-7141454141683343360-eunF?utm_source=share&utm_medium=member_desktop) - [OpenAI Prompt Engineering Guide](https://platform.openai.com/docs/guides/prompt-engineering) - [Anthropic Claude Metaprompt Tool](https://www.linkedin.com/posts/sahar-mor_anthropic-released-a-useful-tool-that-turns-activity-7194705248039444480-7KtG?utm_source=share&utm_medium=member_desktop) - [Anthropic Prompt Improver](https://www.anthropic.com/news/prompt-improver) - [Anthropic Prompt Improver Linkedin Post](https://www.linkedin.com/posts/anthropicresearch_weve-added-a-new-prompt-improver-to-the-activity-7262874194802036736-Q_RP?utm_source=share&utm_medium=member_desktop) - [Anthropic Evaluate Prompts
Tool](https://www.anthropic.com/news/evaluate-prompts) - [Cohere Prompt Tuner: Prompt Optimization at Your Fingertips](https://cohere.com/blog/intro-prompt-tuner?utm_source=bensbites&utm_medium=newsletter&utm_campaign=daily-digest-talk-with-your-ai-besties) - [Quality Prompts: Use and evaluate prompting techniques quickly](https://github.com/sarthakrastogi/quality-prompts) - [Prompt Design at Character.AI](https://research.character.ai/prompt-design-at-character-ai/) - [Structured Prompting](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_structured-prompting-is-a-key-requirement-activity-7235928635633725440-0OG-?utm_source=share&utm_medium=member_desktop) - [Writing with AI: Five ways professional writers are leveraging ChatGPT](https://openai.com/chatgpt/use-cases/writing-with-ai/) - [Google Prompt Gallery](https://ai.google.dev/gemini-api/prompts) - [ell: The Language Model Programming Library](https://docs.ell.so/) - [Template prompts of Cursor, VS Code, etc.](https://github.com/x1xhlol/system-prompts-and-models-of-ai-tools) [useful] - [System Prompts Leaks](https://github.com/asgeirtj/system_prompts_leaks/) ### LLM-based Recommender Systems: - [ChatGPT-based Recommender Systems](https://blog.reachsumit.com/posts/2023/05/chatgpt-for-recsys/) ### LLMs for Tabular Data: - [Unleashing the Potential of Large Language Models for Predictive Tabular Tasks in Data Science](https://arxiv.org/abs/2403.20208) - [LLMs for Tabular Data - Linkedin post](https://www.linkedin.com/posts/pascalbiese_unleashing-the-potential-of-llms-for-tabular-activity-7180873134743449600-ChWm?utm_source=share&utm_medium=member_desktop) ### LLMs as Classifiers (finetuning LLMs for classification): - [LLMs as Classifiers Linkedin Post1](https://www.linkedin.com/posts/sebastianraschka_what-if-you-care-about-finetuning-llms-for-activity-7183808393155944448-CSR1?utm_source=share&utm_medium=member_desktop) - [Training LLMs for Spam Classification](https://www.linkedin.com/posts/sebastianraschka_training-llms-for-spam-classification-i-activity-7197943692949676034-c6_j?utm_source=share&utm_medium=member_desktop) ### LLM Data Sets: - [SlimPajama: A 627B token cleaned and deduplicated version of RedPajama](https://www.cerebras.net/blog/slimpajama-a-627b-token-cleaned-and-deduplicated-version-of-redpajama) ### LLM based Agents: - [MetaGPT: Multi-Agent Framework](https://github.com/geekan/MetaGPT) - [DevOpsGPT: AI-Driven Software Development Automation Solution](https://github.com/kuafuai/DevOpsGPT) - [LLM Agent Survey](https://github.com/Paitesanshi/LLM-Agent-Survey) - [Microsoft AutoGen: development of LLM applications using multiple
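agents](https://github.com/microsoft/autogen)

Under the hood, most of these agent frameworks run a think-act-observe loop over tool calls; the dependency-free sketch below illustrates that loop with a hard-coded `fake_llm` stand-in instead of a real model:

```python
# Minimal sketch of the tool-calling loop at the heart of LLM agent frameworks.
# `fake_llm` is a hypothetical stand-in that "decides" which tool to call next.
import json

TOOLS = {"add": lambda a, b: a + b}

def fake_llm(goal, observations):
    # A real agent would prompt an LLM here; we hard-code one tool call, then an answer.
    if not observations:
        return json.dumps({"tool": "add", "args": {"a": 2, "b": 3}})
    return json.dumps({"final_answer": f"The result is {observations[-1]}"})

def run_agent(goal, max_steps=5):
    observations = []
    for _ in range(max_steps):              # the agent loop: think -> act -> observe
        decision = json.loads(fake_llm(goal, observations))
        if "final_answer" in decision:
            return decision["final_answer"]
        result = TOOLS[decision["tool"]](**decision["args"])
        observations.append(result)         # feed the observation back to the "LLM"

print(run_agent("What is 2 + 3?"))
```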
- [OpenDevin: autonomous AI software engineer](https://github.com/OpenDevin/OpenDevin) - [Composio: the best toolset to integrate AI Agents](https://github.com/ComposioHQ/composio) - [MindSearch: An LLM-based Multi-agent Framework of Web Search Engine](https://github.com/InternLM/MindSearch) - [OpenAI Swarm Library for Multi-Agent](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_this-came-unexpected-openai-released-swarm-activity-7250841965519368192-oJ35?utm_source=share&utm_medium=member_desktop) - [Don't Sleep on Single-agent Systems](https://www.all-hands.dev/blog/dont-sleep-on-single-agent-systems) - [Linkedin post for Don't Sleep on Single-agent Systems](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_the-more-progress-we-make-on-llms-the-more-activity-7246758324912758784-VC3N?utm_source=share&utm_medium=member_desktop) - [Microsoft TinyTroupe library to simulate human agents with LLMs](https://www.linkedin.com/posts/sahar-mor_a-new-open-source-python-library-called-tinytroupe-activity-7262849272381874176-KFk_?utm_source=share&utm_medium=member_desktop) [Interesting] - [Google Whitepaper on AI Agents - Linkedin Post](https://www.linkedin.com/posts/eric-vyacheslav-156273169_whitepaper-ai-agents-ugcPost-7286059606814990338-JinO/?utm_source=share&utm_medium=member_desktop) - [Google Whitepaper on AI Agents](https://www.kaggle.com/whitepaper-agents) - [Microsoft ai-agents-for-beginners Course](https://github.com/microsoft/ai-agents-for-beginners) - [HuggingFace smolagents Library blog post](https://huggingface.co/blog/smolagents) [Useful] ### Structured Output in LLMs: - [PydanticAI](https://github.com/pydantic/pydantic-ai) - [PydanticAI Linkedin Post](https://www.linkedin.com/posts/liorsinclair_theres-a-new-ai-agent-framework-that-lets-activity-7270122274408534017-OOQq?utm_source=share&utm_medium=member_desktop) ### Deploying LLMs: - [ExecuTorch Post1](https://www.linkedin.com/posts/pytorch_introducing-executorch-alpha-executorch-activity-7191120577749831680-vYzE?utm_source=share&utm_medium=member_desktop) ### LLM Engineering: - [Langfuse: Open Source LLM Engineering Platform](https://github.com/langfuse/langfuse) ### External Tools that are Useful for LLMs: - [Microsoft MarkItDown: Python library that lets you convert any document to Markdown](https://www.linkedin.com/posts/liorsinclair_microsoft-just-open-sourced-markitdown-a-activity-7275201481828454403-c5TX?utm_source=share&utm_medium=member_desktop) [Great] ### Notes about Cost & Price of Training and Using LLMs: - [Cost to Deploy LLaMA2 vs.
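ChatGPT](https://www.linkedin.com/posts/damienbenveniste_machinelearning-datascience-artificialintelligence-activity-7109561666324885504-ySeC?utm_source=share&utm_medium=member_desktop) [Very Important]

Token-based API pricing makes rough cost projections easy to sanity-check; the sketch below shows the arithmetic with hypothetical per-million-token prices (real rates change often, see the sheets linked in this section):

```python
# Back-of-the-envelope API cost estimate. The prices below are hypothetical placeholders;
# real per-million-token rates change often (see the pricing sheets in this section).
PRICE_PER_M_INPUT = 3.00    # $ per 1M input tokens (assumed)
PRICE_PER_M_OUTPUT = 15.00  # $ per 1M output tokens (assumed)

def monthly_cost(requests_per_day, input_tokens, output_tokens):
    daily = requests_per_day * (
        input_tokens / 1e6 * PRICE_PER_M_INPUT
        + output_tokens / 1e6 * PRICE_PER_M_OUTPUT
    )
    return 30 * daily

# 10k requests/day, ~1k prompt tokens and ~300 completion tokens each:
print(f"${monthly_cost(10_000, 1_000, 300):,.2f} per month")  # -> $2,250.00 per month
```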
### Deploying LLMs:
- [ExecuTorch Post1](https://www.linkedin.com/posts/pytorch_introducing-executorch-alpha-executorch-activity-7191120577749831680-vYzE?utm_source=share&utm_medium=member_desktop)

### LLM Engineering:
- [Langfuse: Open Source LLM Engineering Platform](https://github.com/langfuse/langfuse)

### External Tools that are Useful for LLMs:
- [Microsoft MarkItDown: Python library that lets you convert any document to Markdown](https://www.linkedin.com/posts/liorsinclair_microsoft-just-open-sourced-markitdown-a-activity-7275201481828454403-c5TX?utm_source=share&utm_medium=member_desktop) [Great]

### Notes about Cost & Price of Training and Using LLMs:
- [Cost to Deploy LLaMA2 vs. ChatGPT](https://www.linkedin.com/posts/damienbenveniste_machinelearning-datascience-artificialintelligence-activity-7109561666324885504-ySeC?utm_source=share&utm_medium=member_desktop) [Very Important]
- [Anyscale Training Cost](https://www.linkedin.com/posts/robert-nishihara-b6465444_im-so-proud-of-what-we-launched-last-week-activity-7113021412084219904-WFbP?utm_source=share&utm_medium=member_desktop)
- [LLMs APIs Pricing Benchmark: pricing of AWS Bedrock, OpenAI, Microsoft Azure](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_yesterday-amazon-web-services-aws-released-activity-7113454144216031233-LYuF?utm_source=share&utm_medium=member_desktop)
- [LLM Token-based Price Sheet](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_claude-21-with-200k-context-just-got-released-activity-7132812689369657344-Rk_a?utm_source=share&utm_medium=member_desktop)
- [LLM Pricing Table Sheet](https://docs.google.com/spreadsheets/d/1NX8ZW9Jnfpy88PC2d6Bwla87JRiv3GTeqwXoB4mKU_s/edit#gid=0)
- [LLM Pricing Table Linkedin Post](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_updated-llm-pricing-table-earlier-today-activity-7170527176168042497-YgT4?utm_source=share&utm_medium=member_desktop)
- [Pricing Sheet for Hosted LLMs](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_just-updated-my-pricing-sheet-for-hosted-activity-7213556290575368196-u71R?utm_source=share&utm_medium=member_desktop)
- [LLM Pricing Comparison Tool in HuggingFace Space](https://huggingface.co/spaces/philschmid/llm-pricing)
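Hosted-LLM pricing in the sheets above is quoted per million input/output tokens, so a budget estimate is simple arithmetic. A minimal sketch with made-up rates (the numbers are illustrative placeholders; always check the provider's current price list):

```python
# Minimal sketch: estimate the cost of LLM API usage from token counts.
# The rates below are illustrative placeholders, not real prices.
PRICE_PER_1M = {
    "example-model": {"input": 2.50, "output": 10.00},  # USD per 1M tokens (assumed)
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    rates = PRICE_PER_1M[model]
    return (input_tokens * rates["input"] + output_tokens * rates["output"]) / 1_000_000

# 10,000 requests, each ~1,500 prompt tokens and ~300 completion tokens:
total = 10_000 * estimate_cost("example-model", 1_500, 300)
print(f"~${total:,.2f}")  # ~$67.50 at the assumed rates
```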
### Excellent & Easy to Learn Resources for Learning Transformers:
- [e2eml transformers from scratch](https://e2eml.school/transformers.html) [**Excellent**]
- [annotated-transformer: Learning transformers from code](https://nlp.seas.harvard.edu/annotated-transformer/#a-first-example)
- [Transformers Recipe](https://github.com/dair-ai/Transformers-Recipe)

### Persian based Transformer Models:
- [ALBERT-Persian](https://github.com/m3hrdadfi/albert-persian)
- [ALBERT-Persian Demo Page](https://albert-lab.m3hrdadfi.me/)
- [ALBERT-Farsi-base-v2 in HuggingFace](https://huggingface.co/m3hrdadfi/albert-fa-base-v2)
- [ParsBERT - Model for Persian Language Understanding](https://github.com/hooshvare/parsbert)
- [ARMAN](https://github.com/alirezasalemi7/ARMAN) [Great]
- [ParsBigBird: Persian Bert For Long-Range Sequences](https://github.com/sajjjadayobi/ParsBigBird) [Great]
- [PersianQA](https://github.com/sajjjadayobi/PersianQA)
- [Persian (Farsi) Pre-trained Language Models](https://nlpdataset.ir/farsi/pre-trained_lm.html) [Great]
- [Hezar: The all-in-one AI library for Persian, supporting a wide variety of tasks and modalities](https://github.com/hezarai/hezar) [**Great & Important**]
- [XLM-RoBERTa (Multilingual & supports Persian)](https://huggingface.co/FacebookAI/xlm-roberta-base)
- [TookaBERT by PartAI](https://huggingface.co/PartAI/TookaBERT-Large) [Great]
- [Dorna PartAI LLM](https://www.linkedin.com/posts/partdp-ai_aetaexaesabraeaaeqaepaeuahy-aevaewaecaetaedaeuaewaehahy-activity-7205158585968844800-sqqa/?utm_source=share&utm_medium=member_desktop)

## Transfer Learning with Transformers:
- [Transfer Learning for NLP via BERT for Text Classification](https://www.analyticsvidhya.com/blog/2020/07/transfer-learning-for-nlp-fine-tuning-bert-for-text-classification/)
- [Text Classification with BERT Tokenizer](https://stackabuse.com/text-classification-with-bert-tokenizer-and-tf-2-0-in-python/)
- [Bert Text Classification](https://github.com/Shivampanwar/Bert-text-classification)
- [Persian Semantic Search](https://github.com/m3hrdadfi/semantic-search)
- [Toward fine-tuning a state of the art Natural Language Inference (NLI) model for Persian](https://haddadhesam.medium.com/toward-fine-tuning-a-state-of-the-art-natural-language-inference-nli-model-for-persian-4d538ea4525d)

### Siamese Networks and Dual BERT for Multi Text Classification:
- [Siamese and Dual BERT for Multi-text Classification](https://towardsdatascience.com/siamese-and-dual-bert-for-multi-text-classification-c6552d435533)
- [Transfer Learning via Siamese Networks](https://www.inovex.de/blog/transfer-learning-siamese-networks/)

## Attention Mechanism:
- [Attention Mechanism](https://blog.floydhub.com/attention-mechanism/)
- [Visualizing A Neural Machine Translation Model - Attention Mechanism](https://jalammar.github.io/visualizing-neural-machine-translation-mechanics-of-seq2seq-models-with-attention/)
- [Intuitive Understanding of Attention Mechanism in Deep Learning](https://towardsdatascience.com/intuitive-understanding-of-attention-mechanism-in-deep-learning-6c9482aecf4f)
- [Structured Attention Networks](https://medium.com/uci-nlp/summary-structured-attention-networks-f1917dd622af)
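The common core behind all of these posts is scaled dot-product attention, output = softmax(QK^T / sqrt(d)) V. A minimal self-attention sketch in PyTorch with random weights (sizes are illustrative; a real layer learns the projections):

```python
# Minimal sketch: scaled dot-product self-attention, the core Transformer operation.
import math
import torch
import torch.nn.functional as F

torch.manual_seed(0)
seq_len, d_model = 5, 16           # illustrative sizes
x = torch.randn(seq_len, d_model)  # token embeddings

# In a real layer, Q, K, V come from learned linear projections of x.
W_q, W_k, W_v = (torch.randn(d_model, d_model) for _ in range(3))
Q, K, V = x @ W_q, x @ W_k, x @ W_v

scores = Q @ K.T / math.sqrt(d_model)  # (seq_len, seq_len): query-key similarities
weights = F.softmax(scores, dim=-1)    # each row sums to 1: how much each token attends
output = weights @ V                   # weighted mix of value vectors
print(output.shape)                    # torch.Size([5, 16])
```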
## Sequence Modeling:
- [WaveNet: Increasing receptive field using dilated convolution](https://medium.com/@kion.kim/wavenet-a-network-good-to-know-7caaae735435)
- [Understanding WaveNet architecture](https://medium.com/@satyam.kumar.iiitv/understanding-wavenet-architecture-361cc4c2d623)
- [WaveNet: A Generative Model for Raw Audio](https://medium.com/a-paper-a-day-will-have-you-screaming-hurray/wavenet-a-generative-model-for-raw-audio-84b2aa5fb4a0)
- [How WaveNet Works](https://towardsdatascience.com/how-wavenet-works-12e2420ef386)
- [PyTorch Tutorial to Sequence Labeling](https://github.com/sgrvinod/a-PyTorch-Tutorial-to-Sequence-Labeling)

## Text Summarization:
- [Bert Extractive Summarizer](https://pypi.org/project/bert-extractive-summarizer/) [**Great**]
- [Generating Text Summaries Using GPT-2 on PyTorch with Minimal Training](https://blog.paperspace.com/generating-text-summaries-gpt-2/) [_Good_]
- [A Gentle Introduction to Text Summarization in Machine Learning](https://blog.floydhub.com/gentle-introduction-to-text-summarization-in-machine-learning/)
- [Taming Recurrent Neural Networks for Better Summarization](https://www.abigailsee.com/2017/04/16/taming-rnns-for-better-summarization.html)
- [PyTorch implementation of "Get to the point"](https://github.com/mjc92/GetToThePoint)
- [TensorFlow implementation of "Get to the point"](https://github.com/abisee/pointer-generator)

## Language Model:
- [A Comprehensive Guide to Build your own Language Model in Python](https://www.analyticsvidhya.com/blog/2019/08/comprehensive-guide-language-model-nlp-python-code/)
- [D2L: Language Models and Dataset](https://d2l.ai/chapter_recurrent-neural-networks/language-models-and-dataset.html)
- [Develop a word-level Neural Language Model in Keras](https://machinelearningmastery.com/how-to-develop-a-word-level-neural-language-model-in-keras/)
- [IBM deep learning language model](https://github.com/IBM/deep-learning-language-model)
- [BERT language model](https://devopedia.org/bert-language-model)
- [Facebook AI: GSLM](https://www.marktechpost.com/2021/09/09/facebook-ai-introduces-gslm-generative-spoken-language-model-a-textless-nlp-model-that-breaks-free-completely-of-the-dependence-on-text-for-training/)
- [Language Modeling Great Tutorial](https://lena-voita.github.io/nlp_course/language_modeling.html)
- [GALACTICA: general-purpose scientific language model](https://github.com/paperswithcode/galai) [Great]
- [Distributed Training of Language Models with Reinforcement Learning via Human Feedback (RLHF)](https://github.com/CarperAI/trlx) [**Excellent**]

## Text & Document Classification:
- [hedwig - PyTorch deep learning models for document classification](https://github.com/castorini/hedwig)

## Topic Modeling:
- [Topic Modeling with BERT](https://towardsdatascience.com/topic-modeling-with-bert-779f7db187e6)
- [BERTopic: Great Library for Topic Modeling](https://github.com/MaartenGr/BERTopic) [Great] (see the sketch below)
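BERTopic embeds documents, clusters the embeddings, and extracts keywords per cluster. A minimal usage sketch over a toy corpus (assumes `pip install bertopic`, which pulls in sentence-transformers, UMAP, and HDBSCAN; the documents are illustrative):

```python
# Minimal sketch: topic modeling with BERTopic over a toy corpus.
# A real corpus should be larger and more varied than this repeated toy set.
from bertopic import BERTopic

docs = [
    "The goalkeeper saved a penalty in the final.",
    "The striker scored twice in the league match.",
    "New GPU drivers improve deep learning throughput.",
    "Transformers dominate NLP benchmarks.",
] * 25  # BERTopic needs a reasonably sized corpus to form clusters

topic_model = BERTopic(verbose=False)
topics, probs = topic_model.fit_transform(docs)

print(topic_model.get_topic_info().head())  # one row per discovered topic
```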
## Sentiment Analysis:
- [Introduction to Deep Learning – Sentiment Analysis](https://nlpforhackers.io/deep-learning-introduction/)

## Co-Reference Resolution:
- [Coreference Resolution for Chatbots](https://medium.com/huggingface/state-of-the-art-neural-coreference-resolution-for-chatbots-3302365dcf30)
- [Hugging Face - CoRef](https://huggingface.co/coref/)

## Imbalance Handling in NLP:
- [Over-Sampling using SMOTE](https://imbalanced-learn.readthedocs.io/en/stable/generated/imblearn.over_sampling.SMOTE.html) [_SMOTE for high-dimensional class-imbalanced data_]
- [Over-sampling via imbalanced-learn library](https://imbalanced-learn.readthedocs.io/en/stable/over_sampling.html)
- [Imbalanced Data Handling](https://www.jeremyjordan.me/imbalanced-data/)

## Information Retrieval:
- [PyTerrier: Python API for Terrier](https://github.com/terrier-org/pyterrier)

## Distance Measures:
- [Edit Distance](https://www.geeksforgeeks.org/edit-distance-dp-5/) (see the sketch below)
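Edit (Levenshtein) distance is the minimum number of insertions, deletions, and substitutions needed to turn one string into another; the linked article derives the classic dynamic program, sketched here:

```python
# Minimal sketch: Levenshtein edit distance via dynamic programming.
def edit_distance(a: str, b: str) -> int:
    m, n = len(a), len(b)
    # dp[i][j] = distance between prefixes a[:i] and b[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i  # delete all of a[:i]
    for j in range(n + 1):
        dp[0][j] = j  # insert all of b[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j] + 1,         # deletion
                dp[i][j - 1] + 1,         # insertion
                dp[i - 1][j - 1] + cost,  # substitution (or match)
            )
    return dp[m][n]

print(edit_distance("kitten", "sitting"))  # 3
```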
## Text-based Emotion Recognition:
- [XLM-EMO: Multilingual Emotion Prediction in Social Media Text](https://github.com/MilaNLProc/xlm-emo)

## Machine Translation:
- [Open-NLLB: No Language Left Behind (NLLB), models capable of delivering high-quality translations directly between any pair of 200+ languages](https://github.com/gordicaleksa/Open-NLLB)

## Chatbot:
- [Rasa Chatbot](https://github.com/RasaHQ/rasa) [**Great**]
- [Learn how to Build and Deploy a Chatbot in Minutes using Rasa](https://www.analyticsvidhya.com/blog/2019/04/learn-build-chatbot-rasa-nlp-ipl/)
- [chatbot with DialoGPT](https://www.machinecurve.com/index.php/2021/03/16/easy-chatbot-with-dialogpt-machine-learning-and-huggingface-transformers/) (see the sketch after this list)
- [DialoGPT: huggingface Transformer](https://huggingface.co/transformers/model_doc/dialogpt.html)
- [deeppavlov](https://github.com/deeppavlov/DeepPavlov) [**Great**]
- [PyTorch Chatbot Tutorial](https://brsoff.github.io/tutorials/beginner/chatbot_tutorial.html)
- [Implement a Simple Chat Bot With PyTorch](https://www.python-engineer.com/posts/chatbot-pytorch/)
- [GPT2 Chatbot PyTorch](https://github.com/devjwsong/gpt2-chatbot-pytorch)
- [PyTorch Official Chatbot Tutorial](https://pytorch.org/tutorials/beginner/chatbot_tutorial.html)
- [PaddlePaddle Knover: toolkit for knowledge grounded dialogue generation](https://github.com/PaddlePaddle/Knover)
- [PaddlePaddle PLATO-2](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/examples/dialogue/plato-2)
- [ParlAI](https://github.com/facebookresearch/ParlAI) [Great]
- [huggingface: Transformers](https://github.com/huggingface/transformers) [Great]
- [huggingface: Blenderbot](https://huggingface.co/transformers/model_doc/blenderbot.html) [**Great**]
- [huggingface: Blenderbot Small](https://huggingface.co/transformers/model_doc/blenderbot_small.html) [**Great**]
- [huggingface: GPT-2 Text Generation](https://huggingface.co/gpt2?text=A+long+time+ago%2C) [**Great**]
- [Seq2seq Chatbot](https://github.com/ricsinaruto/Seq2seqChatbots)
- [seq2seq Chatbot implemented in Pytorch](https://github.com/khordoo/chatbot-pytorch)
- [papers with code: chatbot](https://paperswithcode.com/task/chatbot)
- [Proudly Leading the Chatbot](https://www.analyticsinsight.net/ankush-sabharwal-proudly-leading-the-chatbot-sphere-with-strategical-innovations-and-implementations/)
- [Real Python: Build a Chatbot with Python ChatterBot](https://realpython.com/build-a-chatbot-python-chatterbot/)
- [A step-by-step guide to building a chatbot based on your own documents with GPT](https://bootcamp.uxdesign.cc/a-step-by-step-guide-to-building-a-chatbot-based-on-your-own-documents-with-gpt-2d550534eea5)
- [MiniPerplx: an alternative to Perplexity that lets you search the web, research papers, YouTube videos, and movies](https://scira.app/)
- [GitHub Models](https://github.blog/news-insights/product-news/introducing-github-models/)
- [Git Ingest: Quickly turn a GitHub repository into text for LLMs](https://www.linkedin.com/posts/eric-vyacheslav-156273169_you-can-now-quickly-turn-a-github-repository-activity-7277322180223254528-CRW9?utm_source=share&utm_medium=member_desktop) [**Great**]
- [Create a Chatbot for any GitHub repo](https://www.linkedin.com/posts/eric-vyacheslav-156273169_game-changer-you-can-now-create-a-chatbot-activity-7226604741261230081-Bthf?utm_source=share&utm_medium=member_desktop) [**Great**]
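The DialoGPT entries above use the simplest transformer-chatbot pattern: feed the dialogue history to a causal language model and decode the continuation as the reply. A single-turn sketch with `transformers`, following the usage pattern from the DialoGPT model card:

```python
# Minimal sketch: one chat turn with DialoGPT via Hugging Face transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-small")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-small")

# Encode the user's message, terminated by the end-of-sequence token.
user_input = "Hello, how are you?"
input_ids = tokenizer.encode(user_input + tokenizer.eos_token, return_tensors="pt")

# The model continues the dialogue; in a chat loop, prior turns are concatenated here.
reply_ids = model.generate(
    input_ids, max_length=100, pad_token_id=tokenizer.eos_token_id
)
print(tokenizer.decode(reply_ids[0, input_ids.shape[-1]:], skip_special_tokens=True))
```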
### Chatbot & LLMs Evaluation Metrics:
- [Chatbot Analytics: 9 Key Metrics](https://www.tidio.com/blog/chatbot-analytics/)
- [Chatbot Statistics for 2023](https://www.tidio.com/blog/chatbot-statistics/)
- [Chatbot Analytics 101: Essential Metrics to Track](https://blog.hootsuite.com/chatbot-analytics/)
- [12 Metrics For Chatbot Analytics](https://www.kommunicate.io/blog/metrics-for-chatbot-analytics/)
- [ParlAI Evaluation Metrics for Chatbot](https://github.com/facebookresearch/ParlAI/blob/14a10258bf90218341e0253d1c5a88c9d2cd013f/docs/source/tutorial_metrics.md)
- [Chatbot Evaluation Metrics](https://github.com/ahkarami/Great-Deep-Learning-Tutorials/blob/master/NLP/Chatbot_Evaluation_Metrics.md) [**Great**]
- [Databricks' report on LLM evaluation methods](https://www.linkedin.com/posts/activity-7107825117379907584-m17h?utm_source=share&utm_medium=member_desktop)
- [AgentBench: Evaluating LLMs as Agents](https://github.com/THUDM/AgentBench)
- [Prometheus: Using GPT4 as SLMs Evaluator](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_using-powerful-llms-gpt-4-as-an-evaluator-activity-7131951255119110145-RH86?utm_source=share&utm_medium=member_desktop)
- [LLM Model Evaluation Metrics - When and How to Use Them](https://www.linkedin.com/posts/amrita-rath-288a071bb_llm-evaluation-metrics-activity-7198262398464503808-Gs6y?utm_source=share&utm_medium=member_desktop)
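Alongside product-style analytics, reference-based metrics such as BLEU or ROUGE remain a common automatic baseline for scoring generated replies against references. A minimal sketch with the Hugging Face `evaluate` library (toy strings; assumes `pip install evaluate rouge_score`):

```python
# Minimal sketch: score generated replies against references with ROUGE.
import evaluate

rouge = evaluate.load("rouge")
predictions = ["the cat sat on the mat"]
references = ["a cat was sitting on the mat"]

scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # e.g. {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}
```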
### OpenAI ChatGPT & Its Applications:
- [OpenAI ChatGPT](https://openai.com/blog/chatgpt/) [Amazing]
- [Description of How OpenAI ChatGPT Works: Illustrating Reinforcement Learning from Human Feedback (RLHF)](https://github.com/huggingface/blog/blob/main/rlhf.md)
- [How ChatGPT was Trained](https://www.linkedin.com/posts/damienbenveniste_machinelearning-datascience-chatgpt-activity-7007019154666909696-T5WM/?utm_source=share&utm_medium=member_android)
- [ChatGPT Android SDK](https://github.com/skydoves/chatgpt-android/releases)
- [ChatGPT awesome apps](https://www.linkedin.com/posts/tarrysingh_chatgpt-activity-7017947289721655296-7-pK/?utm_source=share&utm_medium=member_android)
- [A Categorical Archive of ChatGPT Failures](https://arxiv.org/abs/2302.03494)
- [Is ChatGPT a General-Purpose Natural Language Processing Task Solver?](https://arxiv.org/abs/2302.06476)
- [aman.ai chatGPT Tutorial](https://aman.ai/primers/ai/chatGPT/) [Great]
- [ChatGPT for customer service](https://www.intercom.com/ai-bot)
- [ChatGPT Retrieval Plugin](https://github.com/openai/chatgpt-retrieval-plugin)
- [Trending AI Tools](https://galionaitools.blogspot.com/2023/03/trending-ai-tools.html)
- [Merlin: OpenAI ChatGPT Plus extension on all websites](https://merlin.foyer.work/)
- [Adrenaline](https://useadrenaline.com/app)
- [Using LLMs as agents that orchestrate tools](https://www.linkedin.com/posts/moritz-laurer_augmented-language-models-a-survey-activity-7047924951625953281-0XDj/?utm_source=share&utm_medium=member_android) [Interesting]
- [ChatGPT API Using Python](https://www.machinelearning-basics.com/2023/04/chatgpt-api-using-python.html?m=1) (see the sketch after this list)
- [parthean: A Startup about a Financial Expert via ChatGPT](https://www.parthean.com/)
- [Notes on the cost of ChatGPT](https://www.linkedin.com/posts/laurencevanelegem_sam-altman-ceo-of-openai-dropped-a-at-activity-7061987804548870144-RF9y/?utm_source=share&utm_medium=member_android)
- [Ortus - your YouTube AI buddy](https://chrome.google.com/webstore/detail/ortus-your-youtube-ai-bud/jmpepfdhkjkknfpnfohnmnjoceepcbmp)
- [How Is ChatGPT's Behavior Changing over Time?](https://www.linkedin.com/posts/svpino_gpt-4-is-getting-worse-over-time-not-better-activity-7087379892077481984-uORp?utm_source=share&utm_medium=member_android)
- [LLM Drifts: How Is ChatGPT's Behavior Changing over Time?](https://github.com/lchen001/LLMDrift)
- [ChatGPT app Builder](https://www.linkedin.com/posts/zainkahn_absolute-madness-openai-ceo-sam-altman-activity-7128011745868050432-Ox5K?utm_source=share&utm_medium=member_desktop)
- [GPT4 Turbo 128k analysis Notes (its price)](https://www.linkedin.com/posts/reuvencohen_i-finally-got-a-chance-to-play-with-the-new-activity-7128179916512104448-SlEX?utm_source=share&utm_medium=member_desktop)
- [Designer GPT: website creator](https://www.linkedin.com/posts/eric-vyacheslav-156273169_this-is-crazy-designergpt-is-a-new-gpt-that-activity-7129833701873438720-lQuN?utm_source=share&utm_medium=member_desktop)
- [OpenAI DevDay Breakout Sessions Videos](https://www.linkedin.com/posts/openai_openai-devday-breakout-sessions-youtube-activity-7130298061599195137-vbyY?utm_source=share&utm_medium=member_desktop)
- [GPT Seed Parameter Notes](https://www.linkedin.com/posts/sahar-mor_openai-released-a-feature-that-mitigates-activity-7130940108974788608-vkDW?utm_source=share&utm_medium=member_desktop)
- [Awesome ChatGPT Prompts](https://github.com/f/awesome-chatgpt-prompts)
- [GPT-4o Full Data Analysis](https://www.linkedin.com/posts/eric-vyacheslav-156273169_gpt-4o-can-do-full-data-analysis-from-a-single-activity-7196162441116860416--yzu?utm_source=share&utm_medium=member_desktop)
- [GPT-4o Architecture](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_is-this-the-architecture-of-openai-gpt-4o-activity-7199759664836739073-gTEz?utm_source=share&utm_medium=member_desktop)
- [Introducing Structured Outputs in the OpenAI API](https://openai.com/index/introducing-structured-outputs-in-the-api/)
- [OpenAI Realtime API](https://openai.com/index/introducing-the-realtime-api/)
- [OpenAI Model Distillation in the API](https://openai.com/index/api-model-distillation/)
- [OpenAI Prompt Caching](https://platform.openai.com/docs/guides/prompt-caching)
- [LibreChat: Enhanced ChatGPT Clone](https://github.com/danny-avila/LibreChat) [**Great**]
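A minimal sketch of calling the ChatGPT API with the official `openai` Python SDK (v1-style client; the model name is illustrative, and `OPENAI_API_KEY` is assumed to be set in the environment):

```python
# Minimal sketch: one ChatGPT API call with the official openai SDK (v1 client).
# Assumes: pip install openai, and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name; pick per the pricing notes above
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize what RLHF is in one sentence."},
    ],
)
print(response.choices[0].message.content)
```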
### OpenAI Learning to Reason & O1 Models:
- [Learning to Reason with LLMs: OpenAI o1 Model](https://openai.com/index/learning-to-reason-with-llms/)
- [How does OpenAI train the Strawberry (o1) model to spend more time thinking?](https://www.linkedin.com/posts/tom-yeh_openai-strawberry-aibyhand-activity-7240201012697833472-rrzD?utm_source=share&utm_medium=member_desktop)
- [Learning to Reason before you speak is how OpenAI o1 generates its response](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_learning-to-reason-before-you-speak-is-how-activity-7240629908559785984--wMj?utm_source=share&utm_medium=member_desktop)
- [5 papers to read for better understanding OpenAI o1 models](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_here-are-5-papers-you-want-to-read-to-understand-activity-7241017716214571008-eVba/?utm_source=share&utm_medium=member_android)

## Google Bard & Gemini:
- [Google DeepMind Gemini](https://www.linkedin.com/posts/googledeepmind_introducing-gemini-googles-largest-and-activity-7138182085441118208--M-h?utm_source=share&utm_medium=member_desktop)
- [Google released Gemini](https://www.linkedin.com/posts/philipp-schmid-a6a2bb196_google-just-released-gemini-their-most-activity-7138191392861757440-djDD?utm_source=share&utm_medium=member_desktop)
- [Google Gemini official release notes](https://blog.google/technology/ai/google-gemini-ai/?utm_source=linkedin&utm_medium=social&utm_campaign=GDMGemini)

## Anthropic Claude:
- [Anthropic Claude Tool Use](https://www.linkedin.com/posts/anthropicresearch_tool-use-is-now-available-in-beta-to-all-activity-7201976267171086336-oQ4K?utm_source=share&utm_medium=member_desktop)
- [Anthropic Prompt Generator](https://www.linkedin.com/posts/liorsinclair_anthropic-mightve-just-solved-prompt-engineering-activity-7196911121939795968-yray?utm_source=share&utm_medium=member_desktop)
- [Switched to Claude 3.5](https://www.interconnects.ai/p/switched-to-claude-from-chatgpt)
- [Anthropic Message Batches API](https://www.anthropic.com/news/message-batches-api)
- [Anthropic Message Batches API - Linkedin Post](https://www.linkedin.com/posts/anthropicresearch_introducing-the-message-batches-api-activity-7249461524996440066-xS37?utm_source=share&utm_medium=member_desktop)
- [OpenAI Prompt Caching in GPT 4o and o1: How Does It Compare To Claude Prompt Caching?](https://blog.getbind.co/2024/10/03/openai-prompt-caching-how-does-it-compare-to-claude-prompt-caching/)
- [Anthropic Blog: Transformer Circuits Thread](https://transformer-circuits.pub/)
- [Anthropic MCP (Model Context Protocol)](https://modelcontextprotocol.io/quickstart)
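The equivalent call against Claude uses the official `anthropic` SDK; note that the Messages API requires `max_tokens`, unlike the OpenAI client above. The model name is illustrative, and `ANTHROPIC_API_KEY` is assumed to be set:

```python
# Minimal sketch: one Claude API call with the official anthropic SDK.
# Assumes: pip install anthropic, and ANTHROPIC_API_KEY set in the environment.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

message = client.messages.create(
    model="claude-3-5-sonnet-latest",  # illustrative model name
    max_tokens=256,  # required by the Messages API
    messages=[
        {"role": "user", "content": "Explain prompt caching in one sentence."},
    ],
)
print(message.content[0].text)
```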
## How do LLMs think?
- [On the Biology of a Large Language Model](https://transformer-circuits.pub/2025/attribution-graphs/biology.html)

## NLP Programming Notes:
- [100 Times Faster Natural Language Processing in Python](https://medium.com/huggingface/100-times-faster-natural-language-processing-in-python-ee32033bdced)
- [Multi-label Text Classification using BERT](https://medium.com/huggingface/multi-label-text-classification-using-bert-the-mighty-transformer-69714fa3fb3d)
- [Learning Meaning in Natural Language Processing](https://medium.com/huggingface/learning-meaning-in-natural-language-processing-the-semantics-mega-thread-9c0332dfe28e)
- [Train and Deploy the Mighty Transformer NLP models using FastBert and AWS SageMaker](https://medium.com/@kaushaltrivedi/train-and-deploy-mighty-transformer-nlp-models-using-fastbert-and-aws-sagemaker-cc4303c51cf3)
- [Distilling knowledge from Neural Networks to build smaller and faster models](https://blog.floydhub.com/knowledge-distillation/) (see the sketch after this list)
- [HarfBuzz - a text shaping library](https://github.com/harfbuzz/harfbuzz) [_Useful_]
- [PruneBERT - Hugging Face](https://github.com/huggingface/transformers/tree/master/examples/movement-pruning)
- [spacy-streamlit: spaCy building blocks for Streamlit apps](https://github.com/explosion/spacy-streamlit)
- [HuggingFace Evaluate Library](https://github.com/huggingface/evaluate)
- [NeMo - toolkit for Conversational AI](https://github.com/NVIDIA/NeMo) [_Excellent_]
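Knowledge distillation, covered in the FloydHub post above and behind models like DistilBERT, trains a small student to match a large teacher's temperature-softened output distribution. A minimal sketch of the combined loss in PyTorch (temperature and mixing weight are illustrative hyperparameters; the logits here are random stand-ins for real model outputs):

```python
# Minimal sketch: knowledge-distillation loss (soft targets + hard labels).
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft part: KL divergence between temperature-softened distributions.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # the usual T^2 factor keeps gradient magnitudes comparable
    # Hard part: ordinary cross-entropy against the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

student = torch.randn(4, 10)  # logits from a small student model (toy)
teacher = torch.randn(4, 10)  # logits from a large frozen teacher model (toy)
labels = torch.randint(0, 10, (4,))
print(distillation_loss(student, teacher, labels))
```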
## Data Annotation Tools:
- [doccano is an open source text annotation tool](https://github.com/doccano/doccano) [**Great**]
- [doccano-divar](https://doccano.divar.ir/)

## Dataset Creator Tools:
- [Nvidia tool for creating datasets from massive PDF files](https://www.linkedin.com/posts/liorsinclair_nvidia-just-released-a-powerful-pdf-extraction-ugcPost-7267580522359336962-GAQv?utm_source=share&utm_medium=member_android)

## NLP Courses:
- [HuggingFace Course](https://github.com/huggingface/course)
- [NLP Zero to One: Full Course](https://medium.com/nerd-for-tech/nlp-zero-to-one-full-course-4f8e1902c379)
- [Stanford CS25: Transformers United](https://web.stanford.edu/class/cs25/)

## Other NLP Topics & miscellaneous:
- [HybridNLP - Tutorial on Hybrid Techniques for Knowledge-based NLP](https://github.com/hybridnlp/tutorial)
- [Top 10 GPT-3 Tools Easing Content Creation Work in 2022](https://www.analyticsinsight.net/top-10-gpt-3-tools-easing-content-creation-work-in-2022/) [Interesting]
- [Inflection-2.5 ChatBot](https://inflection.ai/inflection-2-5)
- [Research Paper Report Generating Agent](https://github.com/run-llama/llamacloud-demo/blob/main/examples/report_generation/research_paper_report_generation.ipynb)
- [Fast Semantic Text Deduplication](https://www.linkedin.com/posts/patrick-fleith_2-lines-of-code-to-deduplicate-a-dataset-activity-7289903818069164032-p2aa?utm_source=share&utm_medium=member_android) (see the sketch below)
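Semantic deduplication drops texts whose embeddings are nearly identical even when the surface strings differ. A minimal sketch with `sentence-transformers` and a greedy similarity threshold (the model name and the 0.85 threshold are illustrative assumptions; the linked post uses a dedicated library for the same idea):

```python
# Minimal sketch: semantic text deduplication via embedding similarity.
# Assumes: pip install sentence-transformers; model and threshold are illustrative.
from sentence_transformers import SentenceTransformer, util

texts = [
    "How do I reset my password?",
    "What is the procedure to reset a password?",
    "Where can I download the invoice?",
]

model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(texts, convert_to_tensor=True, normalize_embeddings=True)
similarity = util.cos_sim(embeddings, embeddings)  # pairwise cosine similarities

keep = []
for i in range(len(texts)):
    # Keep a text only if it is not too similar to an already-kept one.
    if all(similarity[i, j] < 0.85 for j in keep):
        keep.append(i)

print([texts[i] for i in keep])  # the near-duplicate question is dropped
```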