suchot
🎈 life

Starred repositories

Edit-R1: Reinforce Image Editing with Diffusion Negative-Aware Finetuning and MLLM Implicit Feedback

Python 201 5 Updated Dec 15, 2025

[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL

Python 1,826 107 Updated Nov 4, 2025

The best ChatGPT that $100 can buy.

Python 39,709 5,075 Updated Jan 4, 2026

Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 17,546 1,483 Updated Jan 4, 2026

LLM-Powered Semi-Structured Table Question Answering

Python 289 32 Updated Dec 12, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 18,016 2,943 Updated Jan 4, 2026

The official repo of Pai-Megatron-Patch for LLM & VLM large scale training developed by Alibaba Cloud.

Python 1,494 217 Updated Dec 15, 2025

Ongoing research training transformer models at scale

Python 14,785 3,450 Updated Jan 4, 2026

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, …

Python 11,961 1,103 Updated Jan 3, 2026

A streamlined and customizable framework for efficient large model (LLM, VLM, AIGC) evaluation and performance benchmarking.

Python 2,198 251 Updated Jan 4, 2026

DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation

Python 782 52 Updated Jul 9, 2025

MAGI-1: Autoregressive Video Generation at Scale

Python 3,620 229 Updated Jun 17, 2025
Python 2,327 128 Updated Dec 23, 2025

Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving state-of-the-art performance on 38 out of 60 public benchmarks.

Jupyter Notebook 1,517 58 Updated Jun 14, 2025

The simplest, fastest repository for training/finetuning small-sized VLMs.

Python 4,472 437 Updated Oct 27, 2025

Open Image Curation Tools

Python 47 3 Updated Apr 22, 2025

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 17,662 2,236 Updated Feb 1, 2025
Python 167 19 Updated Jan 3, 2026

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…

Python 36,129 5,094 Updated Dec 31, 2025

Enjoy the magic of Diffusion models!

Python 11,328 1,081 Updated Dec 30, 2025

[ICCV 2025 Highlight] OminiControl: Minimal and Universal Control for Diffusion Transformer

Python 1,874 141 Updated Jul 3, 2025

A unified inference and post-training framework for accelerated video generation.

Python 2,895 233 Updated Jan 4, 2026
Python 4 Updated Feb 3, 2025

Scripts and doc for https://www.dolthub.com/repositories/chenditc/investment_data

Python 825 118 Updated Jan 4, 2026

Valley is a cutting-edge multimodal large model designed to handle a variety of tasks involving text, images, and video data.

Python 267 14 Updated Dec 2, 2025

A PyTorch native platform for training generative AI models

Python 4,921 658 Updated Jan 4, 2026

Official repo for paper "Structured 3D Latents for Scalable and Versatile 3D Generation" (CVPR'25 Spotlight).

Python 11,495 1,061 Updated Nov 5, 2025

You can easily calculate FVD, PSNR, SSIM, LPIPS for evaluating the quality of generated or predicted videos.

Python 531 23 Updated Jan 6, 2025

[CVPR2025] We present StableAnimator, the first end-to-end ID-preserving video diffusion framework, which synthesizes high-quality videos without any post-processing, conditioned on a reference ima…

Python 1,405 95 Updated Sep 21, 2025