DeepSeek-Coder_pytorch代码生成方向大模型&支持87种编程语言

最新推荐文章于 2025-05-27 14:23:51 发布

技术瘾君子1573

最新推荐文章于 2025-05-27 14:23:51 发布

阅读量972

点赞数 28

CC 4.0 BY-SA版权

分类专栏：人工智能&深度学习&机器学习文章标签： pytorch 人工智能 python DeepSeek

本文链接：https://blog.csdn.net/qq_27815483/article/details/147396047

人工智能&深度学习&机器学习专栏收录该内容

197 篇文章

订阅专栏

DeepSeek-Coder

DeepSeek Coder系列包括1B、5.7B、6.7B及33B多个版本，涵盖广泛的代码和自然语言处理任务。

论文

DeepSeek-Coder: When the Large Language Model Meets Programming - The Rise of Code Intelligence deepseek-coder

模型结构

DeepSeek-Coder LLM架构主要参照了LLama，并建立在与DeepSeek LLM同样的架构之下。每个模型都是一个decoder-only的Transformer架构。在同size的情况下，DeepSeek-Coder在多个代码生成任务上表现出色，包括代码生成、跨文件代码补全以及程序解决数学问题等，其性能超过了多个开源基准模型，如CodeLlama等。

算法原理

其中33B模型使用了GQA模块，能够在带来一定模型表征能力的同时，也能够对提高模型的性能。而6.7B等则使用了MHA,以提高模型的表征能力。并且在该系列的模型中使用了RoPE旋转位置编码，使得模型能够具有更好的外推性。

环境配置

-v 路径、docker_name和imageID根据实际情况修改

Docker（方法一）

docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-centos7.6-dtk24.04-py310
docker run -it -v /path/your_code_data/:/path/your_code_data/ -v /opt/hyhal/:/opt/hyhal/:ro --shm-size=80G --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name docker_name imageID bash

cd /your_code_path/deepseek-coder_pytorch
pip install -r requirements.txt
pip install -U huggingface_hub hf_transfer
export HF_ENDPOINT=https://hf-mirror.com

Dockerfile（方法二）

cd docker
docker build --no-cache -t deepseek_coder:latest .
docker run -it -v /path/your_code_data/:/path/your_code_data/ -v /opt/hyhal/:/opt/hyhal/:ro --shm-size=80G --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name docker_name imageID bash

cd /your_code_path/deepseek-coder_pytorch
pip install -r requirements.txt
pip install -U huggingface_hub hf_transfer
export HF_ENDPOINT=https://hf-mirror.com

Anaconda（方法三）

关于本项目DCU显卡所需的特殊深度学习库可从光合开发者社区下载安装。

DTK驱动: dtk24.04
python: python3.10
torch: 2.1.0

Tips：以上dtk驱动、python、torch等DCU相关工具版本需要严格一一对应

其它非深度学习库安装方式如下：

pip install -r requirements.txt
pip install -U huggingface_hub hf_transfer
export HF_ENDPOINT=https://hf-mirror.com

数据集

finetune训练样例数据采用nickrosh/Evol-Instruct-Code-80k-v1 下载地址

训练

单机四卡

预训练模型下载地址：官方下载地址具体参数更改请在train_ft.sh文件中进行,以下为必要参数
DATA_PATH="{数据集地址}"
OUTPUT_PATH="{训练文件保存地址}"
MODEL_PATH="{预训练模型加载地址}"

cd finetune
./train.sh

推理

基于Huggingface's Transformers进行推理.
模型下载后默认需存放至weights文件夹中
也可自行更改 inference.py文件中的 model_name 参数

HIP_VISIBLE_DEVICES=0 python inference.py

Result

prompt：用verilog写一个读和写的FIFO模块
result：

精度

暂无

应用场景

算法类别

代码生成

热点应用行业

制造,能源,教育

预训练权重

模型目录结构如下：

# deepseek-coder-6.7b-instruct/
├── config.json
├── generation_config.json
├── LICENSE
├── model-00001-of-00002.safetensors
├── model-00002-of-00002.safetensors
├── model.safetensors.index.json
├── pytorch_model-00001-of-00002.bin
├── pytorch_model-00002-of-00002.bin
├── pytorch_model.bin.index.json
├── README.md
├── tokenizer_config.json
└── tokenizer.json