Never use ANSI functions on modern Windows systems; use Unicode functions instead. Converting between UTF-8 and wide characters

⌚Time: 2026-02-07 19:01:00

👨‍💻Author: Jack Ge

In an earlier program I used ANSI functions to get file information, traverse directories, and read files. Paths containing Chinese characters worked on a Chinese-language system, but on an English-language system they could not be handled at all. This is expected behavior.

The "A" version of a function interprets its string arguments in the system's active ANSI code page. On an English system that is typically Windows-1252 (CP1252), which has no encoding for Chinese characters, so a path containing them cannot be represented and the call fails. On a Simplified Chinese system the code page is CP936 (GBK, a superset of GB2312), which does cover those characters.

The encoding used by the ANSI functions is whatever code page the current system is configured with. No legacy ANSI code page can represent English, Chinese, Japanese, Korean, and other languages at the same time. Windows NT-based systems, including Windows 2000 and everything after it, are Unicode internally, and the "A" functions merely convert their arguments to and from UTF-16 around a call to the "W" version. So you should always call the wide-character "W" functions directly: they take UTF-16 and support the complete Unicode character set.
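Because wchar_t is 16 bits wide on Windows, each element of a wide string is one UTF-16 code unit, not necessarily one character. The sketch below shows how a single code point maps to one or two code units; `encode_utf16` is a hypothetical helper, and it assumes its input is a valid Unicode scalar value.

```cpp
#include <string>

// Hypothetical helper: encode one Unicode code point as UTF-16 code units.
// Assumes cp <= 0x10FFFF and cp is not in the surrogate range 0xD800..0xDFFF.
std::u16string encode_utf16(char32_t cp)
{
    std::u16string out;
    if (cp < 0x10000) {
        // Basic Multilingual Plane: a single 16-bit code unit
        out.push_back(static_cast<char16_t>(cp));
    } else {
        // Supplementary planes: split the 20-bit offset into a surrogate pair
        const char32_t v = cp - 0x10000;
        out.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));    // high surrogate
        out.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF)));  // low surrogate
    }
    return out;
}
```

For example, U+4E2D (中) becomes the single unit 0x4E2D, while U+1F600 becomes the pair 0xD83D 0xDE00. This is why the length of a wide string counts code units rather than characters.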

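Additionally, standard C/C++ facilities such as fopen and std::fstream, when given a narrow (char) path, by default interpret it in the ANSI code page on Windows, so they fail on the same paths. Pass a wide path instead: use _wfopen, or construct the stream from a wide string (a Microsoft extension) or from a std::filesystem::path (C++17). Note that std::wifstream changes the character type of the stream's contents, not the encoding of the file name, so switching to it does not solve this problem by itself.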
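One way to keep using the standard streams with Unicode file names is to hand them a std::filesystem::path rather than a narrow string. The following is a minimal sketch, assuming C++17 or later; `path_from_utf8` and `read_file_utf8` are hypothetical names chosen for this example.

```cpp
#include <filesystem>
#include <fstream>
#include <iterator>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: build a path from UTF-8 bytes held in a std::string.
fs::path path_from_utf8(const std::string& utf8)
{
#if defined(__cpp_lib_char8_t)
    // C++20: char8_t input is always treated as UTF-8
    return fs::path(std::u8string(utf8.begin(), utf8.end()));
#else
    // C++17: u8path treats char input as UTF-8
    return fs::u8path(utf8);
#endif
}

// Hypothetical helper: read a whole file whose name is given as UTF-8.
// Returns false if the file cannot be opened.
bool read_file_utf8(const std::string& utf8_path, std::string& contents)
{
    std::ifstream in(path_from_utf8(utf8_path), std::ios::binary);
    if (!in) {
        return false;
    }
    contents.assign(std::istreambuf_iterator<char>(in),
                    std::istreambuf_iterator<char>());
    return true;
}
```

On Windows the path object stores the name as wide characters, so the stream opens the file through the wide-character route instead of the ANSI code page. The file contents are still read as plain bytes.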

In C++ code, keep strings in UTF-8 and convert to wide characters only at the Windows API boundary. The two functions below perform the conversion in each direction using MultiByteToWideChar and WideCharToMultiByte. Both return an empty string on failure.

#include <string>
#include <windows.h>
#include <vector>

std::wstring utf8_to_wide(const std::string& s)
{
    if (s.empty()) {
        return {};
    }

    // The API takes an int length; this assumes s is smaller than 2 GiB
    const int src_len = static_cast<int>(s.size());

    // First call: ask how many wchar_t units the result needs.
    // MB_ERR_INVALID_CHARS makes malformed UTF-8 fail instead of being replaced.
    const int dst_len = MultiByteToWideChar(
        CP_UTF8,
        MB_ERR_INVALID_CHARS,
        s.data(),
        src_len,
        nullptr,
        0
    );

    if (dst_len <= 0) {
        // GetLastError() returns the reason, e.g. ERROR_NO_UNICODE_TRANSLATION.
        // Throw or log here if an empty result is not enough.
        return {};
    }

    // Second call: convert straight into the result string.
    // No terminator is written because an explicit source length was passed.
    std::wstring result(static_cast<size_t>(dst_len), L'\0');

    const int written = MultiByteToWideChar(
        CP_UTF8,
        MB_ERR_INVALID_CHARS,
        s.data(),
        src_len,
        &result[0],
        dst_len
    );

    if (written <= 0) {
        return {};
    }

    return result;
}

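std::string wide_to_utf8(const std::wstring& ws)
{
    if (ws.empty()) {
        return {};
    }

    // The API takes an int length; this assumes ws has fewer than 2^31 elements
    const int src_len = static_cast<int>(ws.size());

    // First call: ask how many bytes the UTF-8 result needs.
    // With flags 0, unpaired surrogates are replaced by U+FFFD;
    // pass WC_ERR_INVALID_CHARS instead to make them an error.
    const int dst_len = WideCharToMultiByte(
        CP_UTF8,
        0,
        ws.data(),
        src_len,
        nullptr,
        0,
        nullptr,   // must be null for CP_UTF8
        nullptr    // must be null for CP_UTF8
    );

    if (dst_len <= 0) {
        // GetLastError() returns the reason; throw or log here if needed
        return {};
    }

    // Second call: convert straight into the result string
    std::string result(static_cast<size_t>(dst_len), '\0');

    const int written = WideCharToMultiByte(
        CP_UTF8,
        0,
        ws.data(),
        src_len,
        &result[0],
        dst_len,
        nullptr,
        nullptr
    );

    if (written <= 0) {
        return {};
    }

    return result;
}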

In short: call utf8_to_wide on a UTF-8 string before passing it to a Windows "W" function, and call wide_to_utf8 on any wide string the API returns, so that the rest of the C++ code keeps working in UTF-8.