Stars
[ICLR 2025] MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation
[CVPR 2024 Highlight] OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model with performance approaching GPT-4o.
Qwen3-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.
📖 A curated list of resources dedicated to hallucination of multimodal large language models (MLLM).
The simplest, fastest repository for training/finetuning small-sized VLMs.
Mechanistic Interpretability in Transformers: This repository explores advanced techniques like Induction Head Detection and QK Circuit Analysis to uncover the inner workings of transformer-based m…
BenchX: A Unified Benchmark Framework for Medical Vision-Language Pretraining on Chest X-Rays
Reconstructs IVIM MRI images using LSTM networks.
Lab Materials for MIT 6.S191: Introduction to Deep Learning
🌀 React library to safely render HTML, filter attributes, autowrap text with matchers, render emoji characters, and much more.
The ultimate preparation strategy for coding interviews
A list of Medical imaging datasets.
Noise Conditional Score Networks (NeurIPS 2019, Oral)
Generating randomized brain MRI images from random noise using a GAN. Additionally translating from one image domain to another with a conditional GAN (pix2pix): Segmenting brain anatomy - Generati…
A collection of machine learning examples and tutorials.
Repo for Udacity's Secure & Private AI course
TensorFlow-based implementation of a deep Siamese LSTM network to capture phrase/sentence similarity using character/word embeddings
Topics (Tutorials) from HackerRank
Auto-suggestion of questions with same meaning as the input question.
Solid - Re-decentralizing the web (project directory)

