DeepSeek-VL2 Public
Forked from deepseek-ai/DeepSeek-VL2
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
Python · MIT License · Updated Jan 16, 2025
mem0 Public
Forked from mem0ai/mem0
The Memory layer for your AI apps
Python · Apache License 2.0 · Updated Dec 8, 2024
SplatFormer Public
Forked from ChenYutongTHU/SplatFormer
SplatFormer: Point Transformer for Robust 3D Gaussian Splatting
Python · Updated Nov 26, 2024
wvp-GB28181-pro Public
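Forked from 648540858/wvp-GB28181-pro
WEB VIDEO PLATFORM is a network video platform built on the GB28181-2016 standard. It supports NAT traversal and access for IPC, NVR, and DVR devices from brands such as Hikvision, Dahua, and Uniview. It also supports national-standard (GB) cascading, forwarding rtsp/rtmp video streams to a GB platform, and forwarding rtsp/rtmp push streams to a GB platform.
Java · MIT License · Updated Oct 17, 2024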
ShareGPT4Video Public
Forked from ShareGPT4Omni/ShareGPT4Video
[NeurIPS 2024] An official implementation of ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
Python · Updated Oct 9, 2024
LLaMA-Omni Public
Forked from ictnlp/LLaMA-Omni
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
Python · Apache License 2.0 · Updated Sep 23, 2024
SlowFast Public
Forked from facebookresearch/SlowFast
PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.
Python · Apache License 2.0 · Updated Aug 13, 2024
pipecat Public
Forked from pipecat-ai/pipecat
Open Source framework for voice and multimodal conversational AI
Python · BSD 2-Clause "Simplified" License · Updated Aug 12, 2024
CosyVoice Public
Forked from FunAudioLLM/CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
Python · Apache License 2.0 · Updated Aug 8, 2024
VastGaussian Public
Forked from kangpeilun/VastGaussian
This is an unofficial Implementation
C++ · Apache License 2.0 · Updated Jul 28, 2024
LW-DETR Public
Forked from Atten4Vis/LW-DETR
This repository is an official implementation of the paper "LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection".
Python · Apache License 2.0 · Updated Jul 25, 2024
2d-gaussian-splatting Public
Forked from hbb1/2d-gaussian-splatting
[SIGGRAPH'24] 2D Gaussian Splatting for Geometrically Accurate Radiance Fields
Python · Other · Updated Jun 5, 2024
GaussianPro Public
Forked from kcheng1021/GaussianPro
[ICML2024] Official code for GaussianPro: 3D Gaussian Splatting with Progressive Propagation
Python · MIT License · Updated May 31, 2024
FunASR Public
Forked from modelscope/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
Python · Other · Updated May 28, 2024
gaussian-splatting Public
Forked from graphdeco-inria/gaussian-splatting
Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"
Python · Other · Updated May 6, 2024
AISP Public
Forked from mv-lab/AISP
AI Image Signal Processing and Computational Photography - Bokeh Rendering, Reversed ISP Challenge, Model-Based Image Signal Processors via Learnable Dictionaries. Official repo for NTIRE and AIM …
Jupyter Notebook · Updated Apr 20, 2024
Person_reID_baseline_pytorch Public
Forked from layumi/Person_reID_baseline_pytorch
⛹️ Pytorch ReID: A tiny, friendly, strong PyTorch implementation of a person re-id / vehicle re-id baseline. Tutorial 👉 https://github.com/layumi/Person_reID_baseline_pytorch/tree/master/tutorial
Python · MIT License · Updated Apr 15, 2024
yolo_tracking Public
Forked from mikel-brostrom/boxmot
BoxMOT: pluggable SOTA tracking modules for segmentation, object detection and pose estimation models
Python · GNU Affero General Public License v3.0 · Updated Apr 3, 2024
projectaria_tools Public
Forked from facebookresearch/projectaria_tools
projectaria_tools is a C++/Python open-source toolkit to interact with Project Aria data
C++ · Apache License 2.0 · Updated Mar 29, 2024
co-tracker Public
Forked from facebookresearch/co-tracker
CoTracker is a model for tracking any point (pixel) on a video.
Jupyter Notebook · Other · Updated Mar 28, 2024
sherpa-onnx Public
Forked from k2-fsa/sherpa-onnx
Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, Raspberry Pi, x86_64 servers, webs…
C++ · Apache License 2.0 · Updated Feb 28, 2024
espnet Public
Forked from espnet/espnet
End-to-End Speech Processing Toolkit
Python · Apache License 2.0 · Updated Feb 27, 2024
Depth-Anything Public
Forked from LiheYoung/Depth-Anything
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
Python · Apache License 2.0 · Updated Feb 21, 2024
edge-tts Public
Forked from rany2/edge-tts
Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
Python · GNU General Public License v3.0 · Updated Feb 16, 2024
build-openwrt Public
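Forked from topak47/build-openwrt
Compile OpenWrt firmware online in the cloud using Actions. Suitable for the official source code as well as the lede, lienol, and immortalwrt sources; supports X86, TV boxes, and many other devices!
Shell · GNU General Public License v2.0 · Updated Feb 4, 2024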
act-plus-plus Public
Forked from MarkFzp/act-plus-plus
Imitation Learning algorithms with Co-training for Mobile ALOHA: ACT, Diffusion Policy, VINN
Python · MIT License · Updated Jan 4, 2024
mobile-aloha Public
Forked from MarkFzp/mobile-aloha
Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation
Jupyter Notebook · MIT License · Updated Jan 3, 2024
sound_distance_estimation Public
Forked from sakshamsingh1/sound_distance_estimation
Official implementation of "sound distance estimation" WASPAA 23
Python · Updated Dec 31, 2023