Python Tokens and Character Sets
Last Updated: 15 Dec, 2022
Python is a general-purpose, high-level programming language. It was designed with an emphasis on code readability, and its syntax lets programmers express ideas in fewer lines of code. Python programs are often called scripts. Every script is written with characters from the Python character set, and those characters are grouped into tokens such as keywords, identifiers, and literals. In this article, we will learn about the character set and each kind of token.
Character set
A character set is the set of valid characters that a programming language accepts in source code. The Python character set is therefore the set of characters that the Python language recognizes, and these are the characters we can use when writing a script. Python 3 source code is Unicode text (UTF-8 by default), so the character set includes:
- Alphabets: All capital (A-Z) and small (a-z) alphabets.
- Digits: All digits 0-9.
- Special Symbols: Python supports special symbols such as " ' | ; : ! ~ @ # $ % ^ ` & * ( ) _ + - = { } [ ] \ .
- White Spaces: White spaces like tab space, blank space, newline, and carriage return.
- Other: All other ASCII and Unicode characters are also part of the Python character set, for example inside string literals and comments.
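As a small illustration of the list above, the following sketch inspects a few characters with the built-in ord() and str.isascii() (the latter needs Python 3.7 or later). The sample characters and the names samples and info are chosen here only for demonstration.

```python
# Sample characters picked for illustration: a letter, a digit,
# a special symbol, a tab (white space), and two non-ASCII characters.
samples = [
    "A",
    "7",
    "#",
    "\t",
    "\u00e9",  # e with acute accent
    "\u2713",  # check mark
]

# Map each character to its Unicode code point and whether it is ASCII.
info = {ch: (ord(ch), ch.isascii()) for ch in samples}

for ch, (code_point, is_ascii) in info.items():
    print(repr(ch), code_point, "ASCII" if is_ascii else "non-ASCII")
```

Every character has a code point, whether or not it falls in the ASCII range, and Python strings can hold any of them.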
Tokens
A token is the smallest individual unit in a Python program. All statements and instructions in a program are built from tokens. The various tokens in Python are:
1. Keywords: Keywords are reserved words that have a special meaning in the language. They can't be used as variable names, function names, or any other identifier. Python 3 has 35 keywords (as of Python 3.7; the exact count varies between versions). Some of them are: try, False, True, class, break, continue, and, as, assert, while, for, in, raise, except, or, not, if, elif, import, etc. Note that print was a keyword in Python 2 but is an ordinary built-in function in Python 3.
Python3
for x in range(1, 9):
    print(x)
    if x < 6:
        continue  # skip to the next iteration
    else:
        break  # stop the loop once x reaches 6
Output:
1
2
3
4
5
6
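The set of keywords depends on the interpreter version, so it can be safer to ask Python itself than to memorize a count. A minimal sketch using the standard keyword module follows; the variable names are illustrative only.

```python
import keyword

# keyword.kwlist holds every reserved word of the running interpreter.
print(len(keyword.kwlist))
print(keyword.kwlist)

# iskeyword() tells whether a given word is reserved.
for_is_reserved = keyword.iskeyword("for")      # True
print_is_reserved = keyword.iskeyword("print")  # False in Python 3

print(for_is_reserved, print_is_reserved)
```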
2. Identifiers: Identifiers are the names given to variables, functions, classes, methods, and other objects so that they can be referred to. Python is a case-sensitive language and has rules for naming identifiers:
- Case matters in identifiers, so geeks and Geeks are two different identifiers.
- An identifier starts with a capital letter (A-Z), a small letter (a-z), or an underscore (_). Python 3 also accepts non-ASCII Unicode letters here. It can't start with a digit or any other character.
- Besides letters and underscores, digits (0-9) can be part of an identifier, but not its first character.
- Special characters such as #, @, and $, as well as whitespace, are not allowed in an identifier.
- An identifier can't be a keyword.
For example, gfg, GeeksforGeeks, _geek, and mega12 are valid identifiers, while 91road, #tweet, and i am are not.
Python3
GFG = 'Hello'
b = "Geeks"
print(GFG)
print(b)
Output:
Hello
Geeks
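The naming rules can be checked programmatically. The sketch below combines the built-in str.isidentifier() with keyword.iskeyword(); the helper name is_valid_identifier is hypothetical and introduced here only for illustration.

```python
import keyword


def is_valid_identifier(name):
    """Return True if name can be used as a Python identifier."""
    # str.isidentifier() checks the character rules;
    # keyword.iskeyword() rejects reserved words such as "for".
    return name.isidentifier() and not keyword.iskeyword(name)


candidates = ["gfg", "GeeksforGeeks", "_geek", "mega12",
              "91road", "#tweet", "i am", "for"]
results = {name: is_valid_identifier(name) for name in candidates}

for name, ok in results.items():
    print(f"{name!r}: {'valid' if ok else 'not valid'}")
```

The first four names pass, while 91road (leading digit), #tweet (special character), i am (whitespace), and for (keyword) are rejected.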
3. Literals or Values: Literals are the fixed values or data items written directly in source code. Python supports different types of literals, such as:
(i) String Literals: Text written in single, double, or triple quotes is a string literal in Python. For example: "Computer Science", 'sam', etc. Triple quotes also allow a string to span multiple lines.
Python3
a = 'Hello'
b = "Geeks"
c = '''Geeks for Geeks is a
learning platform'''
print(a)
print(b)
print(c)
Output
Hello
Geeks
Geeks for Geeks is a
learning platform
(ii) Character Literals: Python has no separate character type. A character literal is simply a string literal that contains a single character, enclosed in single or double quotes.
Python3
a = 'G'
b = "W"
print(a)
print(b)
Output:
G
W
(iii) Numeric Literals: These are literals written in the form of numbers. Python supports the following numeric literals:
- Integer Literal: A whole number, positive, negative, or 0, with no fractional part. It can be written in decimal, binary (0b prefix), octal (0o prefix), or hexadecimal (0x prefix).
- Float Literal: A real number, positive or negative, with a fractional part or an exponent.
- Complex Literal: A number of the form a+bj, where a is the real part and b is the imaginary part. Python uses the suffix j, not i.
Python3
a = 5
b = 10.3
c = -17
print(a)
print(b)
print(c)
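The other numeric forms mentioned in the list (binary, octal, hexadecimal, and complex) can be sketched the same way. The variable names and values below are illustrative only.

```python
binary_value = 0b1010    # binary literal for 10
octal_value = 0o17       # octal literal for 15
hex_value = 0xFF         # hexadecimal literal for 255
small_float = 1.5e-3     # float literal with an exponent, equal to 0.0015
z = 3 + 4j               # complex number: real part 3, imaginary part 4

print(binary_value, octal_value, hex_value)
print(small_float)
print(z.real, z.imag)
```

Whatever base an integer literal is written in, Python stores the same int value, so 0b1010 and 10 compare equal.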
(iv) Boolean Literals: Boolean literals have only two values in Python. These are True and False.
Python3
a = 3
b = (a == 3)   # the comparison yields the Boolean True
c = True + 10  # True counts as 1 in arithmetic, so c is 11
print(a, b, c)
(v) Special Literals: Python has one special literal, None. It denotes nothing, no value, or the absence of a value.
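A short sketch of None in use follows. The names result and status are illustrative; the identity test with is None is the conventional way to check for it.

```python
result = None  # no value assigned yet

# "is None" is the conventional identity check for the None object.
if result is None:
    status = "no value"
else:
    status = "has a value"

print(status)
print(type(None).__name__)  # the type of None is NoneType
```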
(vi) Literal Collections: Literal collections in Python include lists, tuples, dictionaries, and sets.
- List: A sequence of elements written in square brackets and separated by commas. The elements can be of any data type, and the list can be changed after creation.
- Tuple: A sequence of comma-separated elements written in round brackets. The elements can be of any data type, but the tuple can't be changed after creation.
- Dictionary: A collection of key-value pairs written in curly braces. Since Python 3.7, a dictionary preserves the insertion order of its keys.
- Set: An unordered collection of unique elements written in curly braces. Note that empty braces {} create an empty dictionary; use set() for an empty set.
Python3
my_list = [23, "geek", 1.2, 'data']
my_tuple = (1, 2, 3, 'hello')
my_dict = {1: 'one', 2: 'two', 3: 'three'}
my_set = {1, 2, 3, 4}
print(my_list)
print(my_tuple)
print(my_dict)
print(my_set)
Output
[23, 'geek', 1.2, 'data']
(1, 2, 3, 'hello')
{1: 'one', 2: 'two', 3: 'three'}
{1, 2, 3, 4}
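The difference between a changeable list and an unchangeable tuple, and a common pitfall with empty braces, can be sketched as follows. All names here are illustrative.

```python
my_list = [1, 2, 3]
my_list[0] = 99            # lists are mutable, so this works

my_tuple = (1, 2, 3)
try:
    my_tuple[0] = 99       # tuples are immutable, so this raises TypeError
    tuple_changed = True
except TypeError:
    tuple_changed = False

empty_braces = {}          # this is an empty dict, not a set
empty_set = set()          # the way to write an empty set
unique = {1, 2, 2, 3}      # duplicates are dropped

print(my_list, tuple_changed)
print(type(empty_braces).__name__, type(empty_set).__name__)
print(unique)
```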
4. Operators: These are the tokens that perform an operation in an expression. The values on which an operation is applied are called operands. Operators can be unary or binary. Unary operators act on a single operand, like the bitwise complement operator (~). Binary operators need two operands.
Python3
a = 12
b = ~a      # unary bitwise complement: ~12 is -13
c = a + b   # binary addition: 12 + (-13) is -1
print(b)
print(c)
5. Punctuators: These are the symbols used in Python to organize structures, statements, and expressions. Some of the punctuators are: [ ] { } ( ) @ -= += *= //= **= = , etc.
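To see how Python itself splits a line of source into tokens, the standard tokenize module can be used. This is a minimal sketch; the sample statement and variable names are illustrative. Note that the tokenizer's categories are coarser than the five described in this article: it reports keywords and identifiers alike as NAME, and operators and punctuators alike as OP.

```python
import io
import token
import tokenize

source = "total = price + 42\n"

# generate_tokens() expects a readline callable that returns str lines.
tokens = [
    (token.tok_name[tok.type], tok.string)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    # Skip layout tokens such as NEWLINE and ENDMARKER.
    if tok.type in (token.NAME, token.OP, token.NUMBER, token.STRING)
]

for kind, text in tokens:
    print(f"{kind:<8}{text}")
```

The statement breaks down into two names, two operator tokens, and one number, which shows that even a short line is made of several tokens.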