arctanbell

  • DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding

    Python MIT License Updated Jan 16, 2025
  • mem0 Public

    Forked from mem0ai/mem0

    The Memory layer for your AI apps

    Python Apache License 2.0 Updated Dec 8, 2024
  • SplatFormer: Point Transformer for Robust 3D Gaussian Splatting

    Python Updated Nov 26, 2024
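  • WEB VIDEO PLATFORM is a network video platform based on the GB28181-2016 standard. It supports NAT traversal and accepts IPC, NVR, and DVR devices from brands such as Hikvision, Dahua, and Uniview. It supports GB (national standard) cascading, forwarding rtsp/rtmp video streams to a GB platform, and forwarding rtsp/rtmp pushed streams to a GB platform.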

    Java MIT License Updated Oct 17, 2024
  • LLaVA-NeXT Public

    Forked from LLaVA-VL/LLaVA-NeXT
    Python Apache License 2.0 Updated Oct 16, 2024
  • [NeurIPS 2024] An official implementation of ShareGPT4Video: Improving Video Understanding and Generation with Better Captions

    Python Updated Oct 9, 2024
  • LLaMA-Omni Public

    Forked from ictnlp/LLaMA-Omni

    LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

    Python Apache License 2.0 Updated Sep 23, 2024
  • PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.

    Python Apache License 2.0 Updated Aug 13, 2024
  • pipecat Public

    Forked from pipecat-ai/pipecat

    Open Source framework for voice and multimodal conversational AI

    Python BSD 2-Clause "Simplified" License Updated Aug 12, 2024
  • CosyVoice Public

    Forked from FunAudioLLM/CosyVoice

    Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

    Python Apache License 2.0 Updated Aug 8, 2024
  • This is an unofficial Implementation

    C++ Apache License 2.0 Updated Jul 28, 2024
  • LW-DETR Public

    Forked from Atten4Vis/LW-DETR

    This repository is an official implementation of the paper "LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection".

    Python Apache License 2.0 Updated Jul 25, 2024
  • [SIGGRAPH'24] 2D Gaussian Splatting for Geometrically Accurate Radiance Fields

    Python Other Updated Jun 5, 2024
  • [ICML2024] Official code for GaussianPro: 3D Gaussian Splatting with Progressive Propagation

    Python MIT License Updated May 31, 2024
  • FunASR Public

    Forked from modelscope/FunASR

    A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

    Python Other Updated May 28, 2024
  • discocal Public

    Forked from chaehyeonsong/discocal
    C++ MIT License Updated May 7, 2024
  • Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"

    Python Other Updated May 6, 2024
  • AISP Public

    Forked from mv-lab/AISP

    AI Image Signal Processing and Computational Photography - Bokeh Rendering, Reversed ISP Challenge, Model-Based Image Signal Processors via Learnable Dictionaries. Official repo for NTIRE and AIM …

    Jupyter Notebook Updated Apr 20, 2024
  • ⛹️ Pytorch ReID: A tiny, friendly, strong PyTorch implementation of a person re-id / vehicle re-id baseline. Tutorial 👉 https://github.com/layumi/Person_reID_baseline_pytorch/tree/master/tutorial

    Python MIT License Updated Apr 15, 2024
  • BoxMOT: pluggable SOTA tracking modules for segmentation, object detection and pose estimation models

    Python GNU Affero General Public License v3.0 Updated Apr 3, 2024
  • projectaria_tools is a C++/Python open-source toolkit to interact with Project Aria data

    C++ Apache License 2.0 Updated Mar 29, 2024
  • CoTracker is a model for tracking any point (pixel) on a video.

    Jupyter Notebook Other Updated Mar 28, 2024
  • sherpa-onnx Public

    Forked from k2-fsa/sherpa-onnx

    Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, Raspberry Pi, x86_64 servers, webs…

    C++ Apache License 2.0 Updated Feb 28, 2024
  • espnet Public

    Forked from espnet/espnet

    End-to-End Speech Processing Toolkit

    Python Apache License 2.0 Updated Feb 27, 2024
  • Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation

    Python Apache License 2.0 Updated Feb 21, 2024
  • edge-tts Public

    Forked from rany2/edge-tts

    Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key

    Python GNU General Public License v3.0 Updated Feb 16, 2024
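  • Build OpenWrt firmware online in the cloud with Actions. Suitable for the official source code as well as the lede, lienol, and immortalwrt sources, and supports X86, TV boxes, and many other devices!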

    Shell GNU General Public License v2.0 Updated Feb 4, 2024
  • Imitation Learning algorithms with Co-training for Mobile ALOHA: ACT, Diffusion Policy, VINN

    Python MIT License Updated Jan 4, 2024
  • Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation

    Jupyter Notebook MIT License Updated Jan 3, 2024
  • Official implementation of "sound distance estimation" WASPAA 23

    Python Updated Dec 31, 2023