Stars
Reproduction of the ACL 2019 paper "Improving Multi-turn Dialogue Modelling with Utterance ReWriter"
Code for our paper "InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory"
Unofficial PyTorch/🤗 Transformers (Gemma/Llama3) implementation of "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention"
Test code for the Inverse Cloze Task (ICT) in information retrieval
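The Inverse Cloze Task builds retriever pre-training pairs from unlabeled text: one sentence of a passage is treated as a pseudo-query and the surrounding sentences as its positive context. A minimal sketch of that pair construction (the function name is hypothetical; the original ICT recipe also sometimes keeps the query sentence in the context, which is omitted here for brevity):

```python
import random

def make_ict_example(passage_sentences, rng):
    """Build one Inverse Cloze Task training pair: a randomly chosen
    sentence becomes the pseudo-query, and the remaining sentences of the
    passage form the positive context a retriever should match it to."""
    i = rng.randrange(len(passage_sentences))
    query = passage_sentences[i]
    context = passage_sentences[:i] + passage_sentences[i + 1:]
    return query, context

sentences = ["Paris is in France.", "It is a capital city.", "The Seine runs through it."]
query, context = make_ict_example(sentences, random.Random(0))
```

In-batch negatives (contexts of other examples in the same batch) then turn these pairs into a contrastive training signal.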
MNBVC (Massive Never-ending BT Vast Chinese corpus): a super-large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. MNBVC covers not only mainstream culture but also niche subcultures and even "Martian" internet slang. It includes plain-text Chinese data of every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wikis, classical poetry, lyrics, product descriptions, jokes, embarrassing stories, chat logs, and more.
Implementation of paper Data Engineering for Scaling Language Models to 128K Context
[ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
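Self-Extend's tuning-free window extension rests on remapping relative positions: nearby tokens keep their exact distances, while distant tokens share coarser, floor-divided "grouped" positions so that no position exceeds what the model saw in training. A minimal sketch of that remapping idea under my reading of the paper (function and parameter names are hypothetical, not the repo's API):

```python
def self_extend_position(rel_dist, neighbor_window, group_size):
    """Map a relative distance to the position id used in attention:
    exact within the neighbor window, grouped (floor-divided) beyond it,
    continuing contiguously past the window edge."""
    if rel_dist < neighbor_window:
        return rel_dist
    # distant tokens fall into groups of `group_size`, compressing the
    # position range so it stays inside the pretrained window
    return neighbor_window + (rel_dist - neighbor_window) // group_size

# distances 8..11 all collapse to grouped position 8 with group_size=4
positions = [self_extend_position(d, 8, 4) for d in range(14)]
```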
Question and Answer based on Anything.
Collaborative Training of Large Language Models in an Efficient Way
XVERSE-13B: A multilingual large language model developed by XVERSE Technology Inc.
Chinese and English multimodal conversational language model
The official implementation of “Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training”
BELLE: Be Everyone's Large Language model Engine (an open-source Chinese dialogue LLM)
Fast and memory-efficient exact attention
ChatYuan: Large Language Model for Dialogue in Chinese and English
Instruct-tune LLaMA on consumer hardware
EMNLP 2021 - Pre-training architectures for dense retrieval
Sampled Softmax Implementation for PyTorch
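Sampled softmax avoids normalizing over the full vocabulary by computing the softmax over the true class plus a small set of sampled negatives, with each logit corrected by the log of its sampling probability so the estimator stays approximately unbiased. A self-contained sketch of the loss computation (plain Python rather than the repo's PyTorch code; names are illustrative):

```python
import math

def sampled_softmax_loss(logits, true_idx, sampled_idx, q):
    """Sampled-softmax cross-entropy: restrict the partition function to
    the true class and sampled negatives, subtracting log q(class) from
    each logit to correct for the sampling distribution `q`."""
    classes = [true_idx] + list(sampled_idx)
    adjusted = [logits[c] - math.log(q[c]) for c in classes]
    # numerically stable log-sum-exp over the restricted class set
    m = max(adjusted)
    log_z = m + math.log(sum(math.exp(a - m) for a in adjusted))
    return log_z - adjusted[0]  # -log p(true class | restricted set)

# with uniform logits and uniform q, the loss is log(1 + #negatives)
loss = sampled_softmax_loss([0.0] * 10, 0, [1, 2, 3], [0.1] * 10)
```

In training, the negatives would be redrawn every step (e.g. from a log-uniform distribution over the vocabulary), and the full softmax is still used at evaluation time.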
Codes for NeurIPS 2020 paper "Adversarial Weight Perturbation Helps Robust Generalization"
zlh1992 / qlib
Forked from microsoft/qlib. Qlib is an AI-oriented quantitative investment platform, which aims to realize the potential, empower the research, and create the value of AI technologies in quantitative investment. With Qlib, yo…
Implementation of several losses for imbalanced data, such as focal loss, dice loss, DSC loss, and GHM loss
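Focal loss, the first of the losses listed above, down-weights well-classified examples so training focuses on hard ones. A minimal binary sketch following Lin et al.'s formulation FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t) (pure Python for illustration; the repo's versions operate on tensors):

```python
import math

def focal_loss(p, target, alpha=0.25, gamma=2.0):
    """Binary focal loss for one predicted probability `p` of the positive
    class and a 0/1 `target`. The (1 - p_t)**gamma factor shrinks the
    contribution of easy, confidently-correct examples."""
    p_t = p if target == 1 else 1.0 - p
    alpha_t = alpha if target == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

# a confident correct prediction contributes far less than a hard one
easy = focal_loss(0.9, 1)
hard = focal_loss(0.1, 1)
```

With gamma = 0 and alpha = 1 this reduces to ordinary cross-entropy, which is a quick sanity check for any implementation.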