Character Encoding Detection With Chardet in Python
Last Updated: 23 Jul, 2025
We are given bytes whose encoding is unknown, such as the contents of a text file, an arbitrary byte string, or the body of a web page, and our task is to detect the character encoding with the chardet library in Python. This article walks through three examples of doing so.
Example:
Input: data = b'\xff\xfe\x41\x00\x42\x00\x43\x00'
Output: UTF-16
Explanation: chardet detects the encoding of the given bytes as UTF-16.
Character Encoding Detection With Chardet in Python
The examples below show how to detect character encoding with chardet in Python:
Installing Chardet in Python
First, install chardet with the following command. The examples that follow assume it is available:
pip install chardet
Example 1: Detecting Encoding of a String
In this example, the Python script uses the chardet library to detect the character encoding of a given byte sequence (data). chardet.detect() returns a dictionary that holds the detected encoding along with a confidence value; the script prints only the encoding.
Python3
import chardet
# Bytes with unknown encoding
data = b'\xff\xfe\x41\x00\x42\x00\x43\x00'
# Detect the encoding
result = chardet.detect(data)
print(result['encoding'])
Output:
UTF-16
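This input is easy to classify because it starts with the bytes \xff\xfe, the UTF-16 little-endian byte order mark (BOM), which is presumably why the detector settles on UTF-16. The following is a minimal sketch that uses only the standard library codecs module, not chardet itself, to check for that marker and decode the bytes with the detected encoding. The names has_bom and text are illustrative.

```python
import codecs

data = b'\xff\xfe\x41\x00\x42\x00\x43\x00'

# codecs.BOM_UTF16_LE is the two-byte marker b'\xff\xfe'
has_bom = data.startswith(codecs.BOM_UTF16_LE)
print(has_bom)

# The 'utf-16' codec reads the BOM to choose the byte order, then strips it
text = data.decode('utf-16')
print(text)
```

Decoding with the reported encoding is the usual next step after detection, since the encoding name on its own is only a means to getting readable text.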
Example 2: Detecting Encoding of a Website Content
In this example, the Python script uses the requests library to fetch the HTML content of the GeeksforGeeks webpage as raw bytes. The chardet library is then used to detect the character encoding of the retrieved content. The result dictionary also carries a confidence value, but the script prints only the detected encoding.
Python3
import requests
import chardet
# Fetch the web page content
response = requests.get('https://www.geeksforgeeks.org/')
html_content = response.content
# Detect the encoding
result = chardet.detect(html_content)
print(result['encoding'])
Output:
utf-8
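For web content, a server often declares a charset in its headers, and a detected encoding is only a guess. The sketch below shows one possible way to combine the two: prefer the declared charset and fall back to detection only when none is given. The function decode_body and its parameters declared, detect, and fallback are hypothetical names introduced for illustration; detect is expected to be any callable that returns a dictionary with an 'encoding' key, as chardet.detect does.

```python
from typing import Callable, Optional


def decode_body(content: bytes,
                declared: Optional[str],
                detect: Callable[[bytes], dict],
                fallback: str = "utf-8") -> str:
    """Decode raw bytes, preferring a declared charset over a detected one."""
    # Detection only runs when no charset was declared
    encoding = declared or detect(content).get("encoding") or fallback
    # errors="replace" substitutes undecodable bytes instead of raising
    return content.decode(encoding, errors="replace")
```

With the script above, a call might look like decode_body(response.content, response.encoding, chardet.detect). Passing the detector in as an argument is a design choice that keeps the helper independent of any one detection library.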
Example 3: Detecting Encoding of a Text File
In this example, the Python script reads the content of a text file ('utf-8.txt') in binary mode by passing 'rb' to open. The chardet library is then used to detect the character encoding of the file's content. As in the earlier examples, the result dictionary includes a confidence value, but the script prints only the detected encoding.
Python3
import chardet
# Read the text file
with open('utf-8.txt', 'rb') as f:
    data = f.read()
# Detect the encoding
result = chardet.detect(data)
print(result['encoding'])
Output:
utf-8
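Reading a whole file into memory works for small inputs. For larger files, chardet also provides an incremental interface, UniversalDetector, which can be fed chunks and may stop early once it is confident. The following is a minimal sketch; the function name detect_file_encoding and the chunk_size default are assumptions made for illustration.

```python
from chardet.universaldetector import UniversalDetector


def detect_file_encoding(path, chunk_size=4096):
    """Feed a file to chardet in chunks and return its result dict."""
    detector = UniversalDetector()
    with open(path, 'rb') as f:
        # iter() with a sentinel keeps reading until f.read() returns b''
        for chunk in iter(lambda: f.read(chunk_size), b''):
            detector.feed(chunk)
            # done becomes True once the detector has settled on an answer
            if detector.done:
                break
    # close() finalizes the guess and fills in detector.result
    detector.close()
    return detector.result
```

For the file used above, this could be called as detect_file_encoding('utf-8.txt'), and the 'encoding' key of the returned dictionary read as before.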