qiwei-ma/Multi-Agent-Conversational-AI

 
 


Multi-Agent-Spoken

TODO

  • Coze API
  • Scoring system
  • Conversation display
  • Input detection
  • Model
  • Voice (timbre)
  • UI polish
  • Mobile adaptation
  • Deployment

Start

Linux

  1. Environment setup
conda create -n nerfstream python=3.10
conda activate nerfstream
# If the CUDA version is not 11.3 (confirm the version by running nvidia-smi), install the matching PyTorch version listed at <https://pytorch.org/get-started/previous-versions/>
conda install pytorch==1.12.1 torchvision==0.13.1 cudatoolkit=11.3 -c pytorch
pip install -r requirements.txt
# If you need to train the ernerf model, install the following libraries
# pip install "git+https://github.com/facebookresearch/pytorch3d.git"
# pip install tensorflow-gpu==2.8.0
# pip install --upgrade "protobuf<=3.20.1"
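The CUDA version check mentioned in the comments above can be scripted. The following is a minimal sketch, assuming the `nvidia-smi` header contains a `CUDA Version: X.Y` field; `parse_cuda_version` and `needs_custom_pytorch` are hypothetical helper names, not part of this repository.

```python
import re
from typing import Optional, Tuple

# CUDA version targeted by the default conda install command in this README.
DEFAULT_CUDA = (11, 3)


def parse_cuda_version(smi_output: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from nvidia-smi text, or None if not found."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def needs_custom_pytorch(smi_output: str) -> bool:
    """True when a CUDA version is reported and it differs from 11.3."""
    version = parse_cuda_version(smi_output)
    return version is not None and version != DEFAULT_CUDA
```

To use it, pass in the text captured from `subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout`. If `needs_custom_pytorch` returns `True`, pick the install command for your CUDA version from the PyTorch previous-versions page instead of the default one.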
  2. Open ports
firewall-cmd --zone=public --permanent --add-port=8010/tcp
firewall-cmd --zone=public --permanent --add-port=1-65535/udp
# Reload so that the permanent rules take effect
firewall-cmd --reload
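Once the server is running, you can check whether the TCP port is reachable from another machine. This is a minimal sketch using only the standard library; `tcp_port_open` is a hypothetical helper name. It covers TCP only, so it says nothing about the UDP range used by WebRTC.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0
```

For example, `tcp_port_open("127.0.0.1", 8010)` should return `True` while `app.py` is listening locally. Replace the host with the server address to test the firewall rule from outside.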
  3. Run
python app.py --transport webrtc --model wav2lip --avatar_id wav2lip256_avatar5 --tts tencent --REF_FILE 101006 --customvideo_config data/custom_config.json
# Large-model voice (character limit: 100000)
python app.py --transport webrtc --model wav2lip --avatar_id wav2lip256_avatar5 --tts tencent --REF_FILE 501009 --customvideo_config data/custom_config.json
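The two commands above differ only in the `--REF_FILE` value, so a small wrapper can build the argument list. This is a sketch that reuses only the flags shown in this README; `build_launch_command` is a hypothetical helper, and the defaults simply mirror the values above.

```python
import shlex
from typing import List


def build_launch_command(ref_file: str,
                         avatar_id: str = "wav2lip256_avatar5",
                         config: str = "data/custom_config.json") -> List[str]:
    """Assemble the app.py argument list for a given Tencent TTS voice ID."""
    return [
        "python", "app.py",
        "--transport", "webrtc",
        "--model", "wav2lip",
        "--avatar_id", avatar_id,
        "--tts", "tencent",
        "--REF_FILE", ref_file,
        "--customvideo_config", config,
    ]


if __name__ == "__main__":
    # Print the command as a copy-pastable shell line.
    print(shlex.join(build_launch_command("101006")))
```

The returned list can be passed directly to `subprocess.run(...)` from the repository root, which avoids shell quoting issues.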
  4. View
http://127.0.0.1:8010/login.html
http://127.0.0.1:8010/dashboard.html
  5. Video orchestration
  • Asset generation
ffmpeg -i xxx.mp4 -vf fps=25 -qmin 1 -q:v 1 -start_number 0 data/customvideo/image/%08d.png
# Generate silent audio
ffmpeg -f lavfi -i anullsrc=channel_layout=mono:sample_rate=16000 -t 10 -acodec pcm_s16le data/customvideo/audio.wav
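If ffmpeg is not available, the silent audio file can also be produced with Python's standard `wave` module. This is a minimal sketch that mirrors the parameters of the ffmpeg command above (mono, 16000 Hz, 10 seconds, 16-bit PCM); `write_silent_wav` is a hypothetical helper, not part of this repository.

```python
import wave


def write_silent_wav(path: str, seconds: float = 10.0,
                     sample_rate: int = 16000) -> int:
    """Write mono 16-bit PCM silence to path; return the frame count."""
    n_frames = int(seconds * sample_rate)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)            # mono
        wav.setsampwidth(2)            # 2 bytes per sample (pcm_s16le)
        wav.setframerate(sample_rate)
        # All-zero samples are digital silence.
        wav.writeframes(b"\x00\x00" * n_frames)
    return n_frames
```

Calling `write_silent_wav("data/customvideo/audio.wav")` writes the file to the same location the ffmpeg command targets, assuming the directory already exists.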

Acknowledgements

https://github.com/lipku/LiveTalking

About

Digital human interface for EFL speaking practice
