LLaMA-Factory LLM Fine-Tuning: A Collection of Errors and Fixes

Latest recommended article published on 2025-07-15 16:15:50

叁玖27

License: CC 4.0 BY-SA

Tags: llama

Article link: https://blog.csdn.net/weixin_51522849/article/details/148427934

These notes follow the getting-started tutorial recommended in README_zh.md on GitHub: LLaMA-Factory QuickStart

transformers version issue: on an Alibaba Cloud server, the latest transformers release raised an error, so I switched to a stable pinned version:
pip install --upgrade transformers==4.51.2

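During the model download and availability check, running the original inference demo from the official README to verify the model files and the transformers installation may fail (I used Qwen3, so the cause may be that the model differs from the one the demo expects):
TypeError: 'NoneType' object cannot be interpreted as an integer
The error comes from the terminator setting passed as eos_token_id. The eos_token_id parameter specifies the end-of-sequence (EOS) tokens: generation stops as soon as the model emits one of them. Here terminators evaluates to [151645, None], because pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>") returns None (<|eot_id|> is a Llama 3 special token, so it is most likely missing from the Qwen3 vocabulary), and None is not a valid token ID.
To fix it, either delete that line of code or configure a valid, non-None terminator.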
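
A minimal sketch of the second option, filtering out None before the list reaches eos_token_id. The helper build_terminators and the StubTokenizer class are hypothetical names introduced only for illustration; the stub merely mimics the behaviour described above (an EOS ID of 151645 and None for unknown tokens), so the snippet runs without downloading a model. With a real tokenizer you would pass pipeline.tokenizer instead.

```python
from typing import List, Optional, Sequence


def build_terminators(tokenizer, extra_tokens: Sequence[str]) -> List[int]:
    """Collect EOS ids, skipping tokens the tokenizer does not know (None)."""
    candidates: List[Optional[int]] = [tokenizer.eos_token_id]
    candidates += [tokenizer.convert_tokens_to_ids(tok) for tok in extra_tokens]
    # Drop None entries and duplicates while keeping the original order.
    seen, result = set(), []
    for token_id in candidates:
        if token_id is not None and token_id not in seen:
            seen.add(token_id)
            result.append(token_id)
    return result


class StubTokenizer:
    """Hypothetical stand-in for a tokenizer without the <|eot_id|> token."""

    eos_token_id = 151645
    _vocab = {"<|im_end|>": 151645}

    def convert_tokens_to_ids(self, token: str) -> Optional[int]:
        return self._vocab.get(token)  # None for unknown tokens


terminators = build_terminators(StubTokenizer(), ["<|eot_id|>", "<|im_end|>"])
print(terminators)
```

The resulting list contains only integers, so it can be passed as eos_token_id without triggering the TypeError.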

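In the config file /examples/inference/llama3.yaml, the adapter_name_or_path parameter specifies where the LoRA adapter is stored, so that the adapter weights are merged with the base model automatically at load time. If you only want to load the base model, this parameter can be omitted.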
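
As a rough illustration of that difference, the sketch below builds the two variants of an inference config. The key names are the ones used by the llamafactory-cli command in this post; the helper names, the paths, and the choice to set finetuning_type only alongside the adapter are assumptions made for the example, not the project's own code. The renderer handles flat string values only.

```python
from typing import Dict, Optional


def build_inference_config(
    model_path: str, template: str, adapter_path: Optional[str] = None
) -> Dict[str, str]:
    """Return inference settings; adapter keys are added only when needed."""
    config = {"model_name_or_path": model_path, "template": template}
    if adapter_path is not None:
        config["adapter_name_or_path"] = adapter_path
        config["finetuning_type"] = "lora"
    return config


def to_yaml(config: Dict[str, str]) -> str:
    """Render flat string settings as simple `key: value` YAML lines."""
    return "\n".join(f"{key}: {value}" for key, value in config.items()) + "\n"


# Placeholder paths, not real locations.
base_only = build_inference_config("path/to/base-model", "llama3")
with_lora = build_inference_config(
    "path/to/base-model", "llama3", "path/to/lora-adapter"
)
print(to_yaml(base_only))
print(to_yaml(with_lora))
```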

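For dataset configuration and details, see [LLaMA-Factory dataset parameters and configuration](https://github.com/hiyouga/LLaMA-Factory/blob/main/data/README_zh.md).
To fill in the placeholders in data/identity.json, open the file in a text editor, notebook, or IDE, press Ctrl + Shift + H, and run a find-and-replace. Alternatively, on Windows the replacement can be run directly from the command line, although this may be less precise and can break the file's formatting: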

powershell -Command "(gc data/identity.json) -replace '{{name}}','PonyBot' -replace '{{author}}','LLaMA Factory' | sc data/identity.json"

# Or replace the text in the file step by step
(Get-Content "data/identity.json") -replace "{{name}}", "PonyBot" | Set-Content "data/identity.json"
(Get-Content "data/identity.json") -replace "{{author}}", "LLaMA Factory" | Set-Content "data/identity.json"
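
The same substitution can also be sketched in Python, which avoids shell quoting differences and lets the script check that the result is still valid JSON before overwriting the file. This is only a sketch under the assumption that the file is UTF-8 JSON; fill_placeholders is a hypothetical helper, and the demo runs on a temporary file with made-up content rather than the real data/identity.json.

```python
import json
import tempfile
from pathlib import Path


def fill_placeholders(path: Path, values: dict) -> None:
    """Replace {{key}} placeholders in a UTF-8 text file, in place."""
    text = path.read_text(encoding="utf-8")
    for key, value in values.items():
        text = text.replace("{{" + key + "}}", value)
    json.loads(text)  # fail early if the result is no longer valid JSON
    path.write_text(text, encoding="utf-8")


# Demo on a temporary file with made-up content.
demo_path = Path(tempfile.mkdtemp()) / "identity.json"
demo_path.write_text(
    json.dumps(
        [{"instruction": "Who are you?",
          "output": "I am {{name}}, built by {{author}}."}]
    ),
    encoding="utf-8",
)
fill_placeholders(demo_path, {"name": "PonyBot", "author": "LLaMA Factory"})
result = json.loads(demo_path.read_text(encoding="utf-8"))
print(result[0]["output"])
```

To apply it to the actual dataset, you would call fill_placeholders(Path("data/identity.json"), {...}) from the repository root.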

llamafactory commands

For example:

llamafactory-cli webchat \
    --model_name_or_path /media/codingma/LLM/llama3/Meta-Llama-3-8B-Instruct \
    --adapter_name_or_path ./saves/LLaMA3-8B/lora/sft  \
    --template llama3 \
    --finetuning_type lora
# Or use a config file instead of command-line arguments
llamafactory-cli webchat examples/inference/qwen3_lora_sft.yaml

Launching the graphical interface directly: for the detailed steps, see chapters 6 and 7 of the blog.