The Ultimate Guide to Fixing Garbled Chinese Characters in C++

xxxxxxhuan

Published 2025-06-25 15:52:09

Licensed under CC 4.0 BY-SA

Tags: c++, programming languages, garbled Chinese text

Article link: https://blog.csdn.net/xxxxxxhuan/article/details/148900207

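I. Background

While revising a course project recently, I ran into garbled Chinese characters in C++ (for example, �Ż��). The problem mainly comes from three things: inconsistent character encodings, differences in how compilers treat source and string encodings, and limited encoding support in the terminal. This article looks at the root causes and offers several solutions.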

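II. A Closer Look at the Causes

1. Encoding inconsistencies

Source file encoding: the source code is saved as UTF-8, but the compiler decodes it with the default ANSI code page (GBK on Simplified Chinese Windows)
File read/write encoding: a file is written as UTF-8 but read back as ANSI
Console encoding: the Windows Command Prompt defaults to GBK, which does not match the encoding of the program's output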
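To make the mismatch concrete, here is a minimal sketch that dumps the raw bytes of the same text in both encodings. The helper name `toHex` and the two constants are my own for illustration; the Chinese text is written as explicit byte escapes so that the result does not depend on how this source file is saved.

```cpp
#include <cstdio>
#include <string>

// Render each byte of s as two uppercase hex digits separated by spaces.
std::string toHex(const std::string& s) {
    std::string out;
    char buf[4];
    for (unsigned char c : s) {
        std::snprintf(buf, sizeof buf, "%02X", static_cast<unsigned>(c));
        if (!out.empty()) out += ' ';
        out += buf;
    }
    return out;
}

// "你好" spelled out byte by byte in each encoding (illustrative constants).
const std::string kNiHaoUtf8 = "\xE4\xBD\xA0\xE5\xA5\xBD"; // 3 bytes per character in UTF-8
const std::string kNiHaoGbk  = "\xC4\xE3\xBA\xC3";         // 2 bytes per character in GBK
```

The two byte sequences share nothing, so a console or reader that assumes GBK while receiving the UTF-8 bytes (or the other way round) has no choice but to show garbage.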

You can check the console's current code page from cmd: press Win+R and run cmd

Enter: CHCP

936 means the console defaults to GBK
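The same check can be done from inside a program. The sketch below is only an illustration: `codePageName` and `consoleOutputCodePage` are names I made up, and the name table covers just a few common code pages. `GetConsoleOutputCP` is the Win32 call that reports the number CHCP prints; on other platforms the helper simply returns 0.

```cpp
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Map a few well-known Windows code page numbers to readable names.
std::string codePageName(unsigned cp) {
    switch (cp) {
        case 936:   return "GBK (Simplified Chinese)";
        case 65001: return "UTF-8";
        case 437:   return "OEM United States";
        default:    return "code page " + std::to_string(cp);
    }
}

// Console output code page on Windows; 0 elsewhere (no such concept is queried).
unsigned consoleOutputCodePage() {
#ifdef _WIN32
    return GetConsoleOutputCP();
#else
    return 0;
#endif
}
```

Printing `codePageName(consoleOutputCodePage())` at program start is a quick way to confirm which encoding the console expects before chasing the problem elsewhere.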

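III. Complete Solutions

Solution 1: Change the encoding of the file being read (most recommended)

Open the file the program reads in Notepad:

Use Save As (do not change the file name or extension) and choose ANSI as the encoding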

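Then run the program again to test it. The original post shows two screenshots here for comparison: the output before the change and the output after the change.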
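Before re-saving a file, it can help to confirm that it really is UTF-8. The following is a rough sketch under my own assumptions: `readAll` and `looksLikeUtf8` are hypothetical helpers, and the check is only structural (lead and continuation byte patterns), so it does not reject overlong forms and cannot tell encodings apart for pure ASCII text.

```cpp
#include <cstddef>
#include <fstream>
#include <sstream>
#include <string>

// Read a whole file as raw bytes; returns an empty string if it cannot be opened.
std::string readAll(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    std::ostringstream ss;
    ss << in.rdbuf();
    return ss.str();
}

// Heuristic: true if every byte sequence has a valid UTF-8 shape.
bool looksLikeUtf8(const std::string& bytes) {
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char c = static_cast<unsigned char>(bytes[i]);
        std::size_t extra = 0;
        if (c < 0x80)                extra = 0; // ASCII
        else if ((c & 0xE0) == 0xC0) extra = 1; // 110xxxxx
        else if ((c & 0xF0) == 0xE0) extra = 2; // 1110xxxx
        else if ((c & 0xF8) == 0xF0) extra = 3; // 11110xxx
        else return false;                      // stray continuation or invalid lead byte
        if (extra > bytes.size() - i - 1) return false; // sequence cut off at end of data
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char t = static_cast<unsigned char>(bytes[i + k]);
            if ((t & 0xC0) != 0x80) return false;       // must be 10xxxxxx
        }
        i += extra + 1;
    }
    return true;
}
```

If `looksLikeUtf8(readAll(path))` is true for a file containing Chinese text while the console reports 936, the file is most likely the UTF-8 one that needs re-saving (or converting at read time).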

Solution 2: Use one encoding at every stage of the project (UTF-8 recommended)

// Set the console to UTF-8 (Windows only); call this once at the start of main()
#ifdef _WIN32
#include <windows.h>
#endif

void SetConsoleUTF8() {
#ifdef _WIN32
    SetConsoleOutputCP(CP_UTF8); // output code page
    SetConsoleCP(CP_UTF8);       // input code page
#endif
}

# Compiler options (CMake example, goes in CMakeLists.txt)
add_compile_options("$<$<CXX_COMPILER_ID:MSVC>:/utf-8>")
add_compile_options("$<$<NOT:$<CXX_COMPILER_ID:MSVC>>:-fexec-charset=UTF-8>")
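One thing to watch with the console approach: the code page belongs to the console window, so the change can outlive the program in the same cmd session. A possible way to tidy up is a small RAII guard. This is a sketch, not part of the original solution; the class `ConsoleCodePageGuard` is hypothetical and it only touches the output code page.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: switches the console output code page on construction
// and restores the previous one on destruction. Does nothing off Windows.
class ConsoleCodePageGuard {
public:
    explicit ConsoleCodePageGuard(unsigned codePage) : requested_(codePage) {
#ifdef _WIN32
        previous_ = GetConsoleOutputCP();   // remember what the console was using
        SetConsoleOutputCP(requested_);
#endif
    }
    ~ConsoleCodePageGuard() {
#ifdef _WIN32
        if (previous_ != 0) SetConsoleOutputCP(previous_);
#endif
    }
    ConsoleCodePageGuard(const ConsoleCodePageGuard&) = delete;
    ConsoleCodePageGuard& operator=(const ConsoleCodePageGuard&) = delete;

    unsigned requested() const { return requested_; }
    unsigned previous() const { return previous_; }  // 0 if unknown or not on Windows

private:
    unsigned requested_;
    unsigned previous_ = 0;
};
```

Declaring `ConsoleCodePageGuard guard(65001);` as the first line of main() keeps UTF-8 output active for the whole run and puts the console back the way it was on exit.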

Solution 3: Convert the character encoding

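#include <iostream>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Convert UTF-8 to the local ANSI code page (GBK on Simplified Chinese Windows)
std::string UTF8ToANSI(const std::string& utf8) {
#ifdef _WIN32
    if (utf8.empty()) return std::string();
    const int srcLen = static_cast<int>(utf8.size());

    // Step 1: UTF-8 -> UTF-16
    int wlen = MultiByteToWideChar(CP_UTF8, 0, utf8.data(), srcLen, nullptr, 0);
    if (wlen <= 0) return std::string(); // conversion failed
    std::wstring wstr(wlen, L'\0');
    MultiByteToWideChar(CP_UTF8, 0, utf8.data(), srcLen, &wstr[0], wlen);

    // Step 2: UTF-16 -> ANSI code page
    int len = WideCharToMultiByte(CP_ACP, 0, wstr.data(), wlen, nullptr, 0, nullptr, nullptr);
    if (len <= 0) return std::string(); // conversion failed
    std::string result(len, '\0');
    WideCharToMultiByte(CP_ACP, 0, wstr.data(), wlen, &result[0], len, nullptr, nullptr);
    return result;
#else
    return utf8; // Linux/Unix terminals generally default to UTF-8
#endif
}

// Usage example (assumes the string literal is stored as UTF-8 in the executable)
int main() {
    std::cout << UTF8ToANSI("你好，世界！") << std::endl;
}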
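The reverse direction is often needed too, for example when a program reads a GBK file or console input and wants to work in UTF-8 internally. Here is a sketch of a counterpart to UTF8ToANSI. The name `ANSIToUTF8` is my own, and it assumes the input really is in the system ANSI code page; off Windows it returns the input unchanged.

```cpp
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical counterpart to UTF8ToANSI: local ANSI code page -> UTF-8.
std::string ANSIToUTF8(const std::string& ansi) {
#ifdef _WIN32
    if (ansi.empty()) return std::string();
    const int srcLen = static_cast<int>(ansi.size());

    // Step 1: ANSI code page -> UTF-16
    int wlen = MultiByteToWideChar(CP_ACP, 0, ansi.data(), srcLen, nullptr, 0);
    if (wlen <= 0) return std::string(); // conversion failed
    std::wstring wide(wlen, L'\0');
    MultiByteToWideChar(CP_ACP, 0, ansi.data(), srcLen, &wide[0], wlen);

    // Step 2: UTF-16 -> UTF-8
    int len = WideCharToMultiByte(CP_UTF8, 0, wide.data(), wlen, nullptr, 0, nullptr, nullptr);
    if (len <= 0) return std::string(); // conversion failed
    std::string utf8(len, '\0');
    WideCharToMultiByte(CP_UTF8, 0, wide.data(), wlen, &utf8[0], len, nullptr, nullptr);
    return utf8;
#else
    return ansi; // assumed to be UTF-8 already on Linux/Unix
#endif
}
```

Converting once at the program's boundaries (input with ANSIToUTF8, output with UTF8ToANSI) and keeping everything in between as UTF-8 tends to be easier to reason about than converting in the middle of the logic.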