
Solution of the NTIRE 2024 Challenge on Efficient Super-Resolution

Python 2 Updated Feb 6, 2025
35 Updated Feb 6, 2025

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 14,989 1,945 Updated Feb 1, 2025

📚 Collection of awesome generation acceleration resources.

115 3 Updated Feb 4, 2025

Official PyTorch Implementation of Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think (ICLR 2025)

Python 822 40 Updated Jan 28, 2025

Towards Unified Deep Image Deraining: A Survey and A New Benchmark

Python 18 Updated Nov 29, 2024

Unified KV Cache Compression Methods for Auto-Regressive Models

Python 862 113 Updated Jan 4, 2025

A paper list of recent works on token compression for ViT and VLM

302 15 Updated Feb 3, 2025

[NeurIPS 2024] official code release for our paper "Revisiting the Integration of Convolution and Attention for Vision Backbone".

Python 31 3 Updated Jan 21, 2025

Sample code for my CUDA programming book

Cuda 1 Updated Jul 27, 2023

Infinity ∞ : Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis

Python 916 35 Updated Jan 21, 2025

This repo contains the code for 1D tokenizer and generator

Jupyter Notebook 679 35 Updated Jan 25, 2025

A method to increase the speed and lower the memory footprint of existing vision transformers.

Python 1,003 71 Updated Jun 17, 2024

📚200+ Tensor/CUDA Cores Kernels, ⚡️flash-attn-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2 🎉🎉).

Cuda 2,219 237 Updated Feb 7, 2025

[NeurIPS 24] PromptFix: You Prompt and We Fix the Photo

Python 708 38 Updated Oct 4, 2024

Code for BLT research paper

Python 1,376 102 Updated Feb 7, 2025

CoDe: Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient

Python 75 1 Updated Jan 24, 2025
Cuda 3 Updated Jul 29, 2024

APOLLO: SGD-like Memory, AdamW-level Performance

Python 136 7 Updated Feb 6, 2025

This is the official PyTorch implementation of "ZipAR: Accelerating Auto-regressive Image Generation through Spatial Locality"

Python 45 2 Updated Jan 17, 2025

Low-bit optimizers for PyTorch

Python 125 9 Updated Oct 9, 2023

[ICLR25] High-performance Image Tokenizers for VAR and AR

Python 187 2 Updated Feb 6, 2025

Solve puzzles. Improve your pytorch.

Jupyter Notebook 3,405 304 Updated Jul 15, 2024

Solve puzzles. Learn CUDA.

Jupyter Notebook 10,453 808 Updated Sep 1, 2024

Puzzles for learning Triton

Jupyter Notebook 1,367 99 Updated Nov 18, 2024

FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.

Python 695 55 Updated Sep 4, 2024

Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".

Python 2,018 164 Updated Mar 27, 2024

An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.

Python 4,671 505 Updated Jan 21, 2025

4-bit quantization of LLaMA using GPTQ

Python 3,033 460 Updated Jul 13, 2024

Solution of the NTIRE 2024 Challenge on Efficient Super-Resolution

Python 73 13 Updated Jul 12, 2024