Converting a string to Unicode characters means mapping each character in the string to its Unicode code point. Unicode is a character encoding standard that assigns a unique code point to every character.
For example:
- The string "A" consists of one character with the code point U+0041.
- The string "你好" corresponds to the code points U+4F60 and U+597D.
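The U+XXXX notation in the examples above can be reproduced with a short sketch. The helper name code_points is hypothetical and not part of the original article; it relies only on the built-in ord() and standard format specifiers.

```python
def code_points(text):
    """Return the U+XXXX notation for each character in text (hypothetical helper)."""
    # ord() gives the integer code point; 04X formats it as at least 4 uppercase hex digits
    return [f"U+{ord(ch):04X}" for ch in text]

print(code_points("A"))
print(code_points("你好"))
```

For these inputs, the lists contain U+0041 for "A", and U+4F60 and U+597D for "你好".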
Using List Comprehension
A list comprehension can iterate through each character in the string and call ord() on it, producing a list of the corresponding Unicode code points.
s = "hello"
unicode = [ord(char) for char in s]
print(unicode)
Output
[104, 101, 108, 108, 111]
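As a possible variation, the same result can be obtained with map(), and the approach is not limited to ASCII text. The sketch below assumes a sample string containing "é", written as the escape \u00e9 so the precomposed character is used; the variable names are illustrative only.

```python
s = "h\u00e9llo"  # "héllo", with é as the single code point U+00E9

# List comprehension, as in the example above
with_comprehension = [ord(char) for char in s]

# Equivalent alternative: map() applies ord to every character
with_map = list(map(ord, s))

print(with_comprehension)
print(with_map)
```

Both lists hold the same values, with 233 (the code point of "é") in the second position.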
Using a For Loop
A for loop can iterate through each character in a string, calling ord() on it and appending the result to a list. This produces a list in which every element is the Unicode code point of the corresponding character.
s = "hello"
unicode = []
for char in s:
    unicode.append(ord(char))
print(unicode)
Output
[104, 101, 108, 108, 111]
Explanation:
- The for loop iterates through each character in the string s and uses the ord() function to get the Unicode value of each character.
- These Unicode values are appended to the unicode list, resulting in [104, 101, 108, 108, 111] for the string "hello".
Joining Unicode Representations as a String
In this method, each character in the string is converted to its Unicode code point using ord(char). A generator expression iterates through the string, str() converts each code point to a string, and ' '.join() combines the results into a single space-separated string.
s = "hello"
unicode = ' '.join(str(ord(char)) for char in s)
print(unicode)
Output
104 101 108 108 111
Explanation:
- The generator expression str(ord(char)) for char in s converts each character in s to its Unicode value using ord() and then to a string.
- The ' '.join() method combines these Unicode string values into a single space-separated string, resulting in "104 101 108 108 111" for the input "hello".
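To check that the joined representation keeps all the information, it can be converted back with chr(), the inverse of ord(). This is a minimal sketch; decode_code_points is a hypothetical helper name, and it assumes the input holds decimal code points separated by whitespace.

```python
def decode_code_points(joined):
    """Rebuild text from space-separated decimal code points (hypothetical helper)."""
    # split() breaks on whitespace; int() parses each number; chr() maps it back to a character
    return "".join(chr(int(value)) for value in joined.split())

s = "hello"
encoded = " ".join(str(ord(char)) for char in s)
decoded = decode_code_points(encoded)

print(encoded)
print(decoded)
```

Here decoded equals the original string, which confirms the round trip.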