Stars
A hash table with consistent order and fast iteration; access items by key or sequence index
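The key/index duality comes from pairing a hash map with an insertion-order key list. A minimal Python sketch of that pattern (the class and method names are invented for illustration, not the repository's API):

```python
# Minimal sketch of an insertion-ordered hash map with positional access.
# Class and method names are invented for this example.
class IndexedMap:
    def __init__(self):
        self._data = {}   # key -> value: O(1) lookup by key
        self._keys = []   # insertion order: O(1) lookup by index

    def insert(self, key, value):
        if key not in self._data:
            self._keys.append(key)
        self._data[key] = value

    def get(self, key):
        return self._data[key]

    def get_index(self, i):
        key = self._keys[i]
        return key, self._data[key]

m = IndexedMap()
m.insert("a", 1)
m.insert("b", 2)
assert m.get("b") == 2             # access by key
assert m.get_index(0) == ("a", 1)  # access by sequence index
```

Deleting by key is O(n) in this naive sketch; production implementations typically avoid that cost with tricks such as swap-removal.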
Make a cascading timeline from markdown-like text. Supports simple American/European date styles, ISO8601, images, links, locations, and more.
An OpenAI-compatible exllamav2 API that's both lightweight and fast
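Because the server speaks the OpenAI wire protocol, any OpenAI client can talk to it by overriding the base URL. A hedged usage sketch; the port, path, and model name below are assumptions, not documented defaults:

```python
# Hypothetical client usage; the port, path, and model name are assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:5000/v1", api_key="unused")
resp = client.chat.completions.create(
    model="local-model",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)
```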
Jan is an open-source alternative to ChatGPT that runs 100% offline on your computer, with support for multiple engines (llama.cpp, TensorRT-LLM)
Pre-built implicit layer architectures with O(1)-memory backprop, GPU support, and stiff and non-stiff DE solvers, demonstrating scientific machine learning (SciML) and physics-informed machine learning methods
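The repository itself is Julia, but the core idea carries over: a continuous-depth layer trained with the adjoint method, which is what makes backprop O(1) in memory. A sketch of that technique using the torchdiffeq Python package as a stand-in, not this repository's API:

```python
# Continuous-depth layer trained with the adjoint method (O(1)-memory
# backprop). Uses torchdiffeq as a Python stand-in for the Julia library.
import torch
import torch.nn as nn
from torchdiffeq import odeint_adjoint as odeint

class ODEFunc(nn.Module):
    """Defines the dynamics dy/dt = f(t, y)."""
    def __init__(self, dim=2):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 16), nn.Tanh(), nn.Linear(16, dim))

    def forward(self, t, y):
        return self.net(y)

func = ODEFunc()
y0 = torch.randn(8, 2)           # batch of initial states
t = torch.linspace(0.0, 1.0, 2)  # integrate from t=0 to t=1
y1 = odeint(func, y0, t)[-1]     # forward pass = solving the ODE
loss = y1.pow(2).mean()
loss.backward()                  # gradients come from a backward adjoint
                                 # ODE, so activations are not stored
```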
Provides pre-built flash-attention package wheels via GitHub Actions
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
Material for lectures on diffusion models at IE University
Convert PDF to markdown + JSON quickly with high accuracy
A modular graph-based Retrieval-Augmented Generation (RAG) system
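Under the hood, any RAG system follows the same loop: retrieve relevant context, then condition generation on it. A toy, dependency-free sketch of that loop; the corpus, bag-of-words scoring, and stubbed generate function are illustrative, not this repository's pipeline:

```python
# Toy retrieve-then-generate loop; scoring and generation are stubbed.
from collections import Counter
import math

corpus = [
    "GraphRAG builds a knowledge graph from source documents.",
    "Retrieval-augmented generation grounds LLM answers in retrieved text.",
]

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values()))
    norm *= math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, k: int = 1) -> list[str]:
    q = Counter(query.lower().split())
    ranked = sorted(corpus, key=lambda d: cosine(q, Counter(d.lower().split())),
                    reverse=True)
    return ranked[:k]

def generate(query: str) -> str:
    context = "\n".join(retrieve(query))
    # A real system would prompt an LLM with the retrieved context here.
    return f"Answer grounded in:\n{context}"

print(generate("what grounds LLM answers?"))
```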
[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
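SmoothQuant's core trick is a per-channel rescaling that migrates activation outliers into the weights without changing the layer's output, making both sides easier to quantize. A small numpy sketch of that identity (alpha and shapes are illustrative):

```python
# Per-channel scales move activation outliers into the weights while
# leaving X @ W mathematically unchanged.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8)); X[:, 3] *= 50.0   # one outlier channel
W = rng.normal(size=(8, 6))

alpha = 0.5  # balances difficulty between activations and weights
s = np.abs(X).max(axis=0) ** alpha / np.abs(W).max(axis=1) ** (1 - alpha)

X_smooth = X / s            # activations divided per input channel
W_smooth = W * s[:, None]   # weights multiplied by the same scales

assert np.allclose(X @ W, X_smooth @ W_smooth)  # output preserved
print(np.abs(X).max(), np.abs(X_smooth).max())  # outlier magnitude tamed
```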
FlashInfer: Kernel Library for LLM Serving
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:
FP16xINT4 LLM inference kernel that achieves near-ideal ~4x speedups at medium batch sizes of up to 16-32 tokens.
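The ~4x figure follows from memory bandwidth: at small batch sizes the GEMM is bound by reading weights, and INT4 moves a quarter of the bytes FP16 does. A back-of-the-envelope check:

```python
# At low batch sizes the kernel is bound by reading weights from memory,
# so the ideal speedup is the ratio of bytes per weight.
fp16_bytes = 2.0   # one FP16 weight
int4_bytes = 0.5   # one INT4 weight (two per byte)
print(fp16_bytes / int4_bytes)  # -> 4.0, the near-ideal ceiling

# Past roughly 16-32 tokens the GEMM turns compute-bound and the
# advantage of smaller weights fades, hence the batch-size caveat.
```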
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM