Lecture 3 C Character Set and Tokens
C Character set
A character set denotes a collection of symbols, such as letters, digits or special
symbols, used to represent information.
C TOKENS
• A token is the smallest meaningful unit of a program. Tokens in C are commonly grouped
into six types: keywords, identifiers, constants, strings, special symbols and operators.
C KEYWORDS
• There are some reserved words in C, called keywords.
• All the keywords have standard pre-defined meanings and can be used only
for the purpose intended.
• ANSI C (C89) defines 32 keywords, and later standards add a few more. All of them must
be written in lowercase.
IDENTIFIERS
• Identifiers refer to the names of program elements such as variables, functions and
arrays.
• Identifiers are sequences of characters chosen from the set A–Z, a–z, 0–9, and _
(underscore).
• C is a case-sensitive language, so ALFA and Alfa are different identifiers.
• Identifiers are usually kept to a reasonable length, often 8–10 characters. C89 guarantees
that at least the first 31 characters of an internal identifier are significant, and C99
raises this limit to 63.
• Identifier names must start with a letter or an underscore, followed by any combination
of letters, digits and underscores.
• We cannot use any keyword as an identifier.
• All the identifiers should have a unique name in the same scope.
• The special characters such as '*','#','@','$' are not allowed within an identifier.
CONSTANTS
• Any fixed value that does not change during the execution of a program is
known as a constant.
Integer Constants
• An integer constant is a sequence of digits that represents a whole number. There are
3 types of integer constants depending on the number system: decimal (base 10), octal
(base 8, written with a leading 0) and hexadecimal (base 16, written with a leading 0x or 0X).
Real Constants
• Quantities which are represented by numbers with fractional part are
called real or floating point constants.
• A real constant must have at least one digit.
• It must have a decimal point, an exponent, or both (for example 2.5, 1e3 or 1.5e3).
• It can be preceded by a plus or a minus sign.
• No commas or blanks are allowed within a real constant.
Character Constants
• A character constant is a single character within single quotes.
• The maximum length of a character constant is 1.
• Arithmetic operations are possible on character constants since they
too represent integer values.
• C also recognizes the backslash character constants (escape
sequences) such as '\n' and '\t'.
String Constants
• A string constant consists of zero or more characters
enclosed within double quotes.
• The characters within quotes could be letters, digits, special
characters and blank spaces.
Special Symbols
• Brackets []: Opening and closing brackets are used as array element references. These
indicate single and multidimensional subscripts.
• Parentheses (): These special symbols are used to indicate function calls and function
parameters.
• Braces {}: These opening and closing curly braces mark the start and end of a block of code,
such as a function body or a group of statements treated as one unit.
• Comma (,): It is used to separate the items in a list, such as the arguments in a
function call.
• Colon (:): It follows a label or a case label in a switch statement, forms part of the
conditional operator (?:), and introduces the width of a bit-field.
• Semicolon (;): It is known as a statement terminator. It indicates the end of one logical
entity. That’s why each individual statement must be ended with a semicolon.
• Asterisk (*): It is used to declare a pointer variable, to dereference a pointer, and to
multiply two values.
• Assignment operator (=): It is used to assign a value to a variable. It does not test for
equality; the comparison operator is ==.
• Hash (#): It introduces a preprocessor directive such as #include or #define. The
preprocessor is a macro processor that the compiler runs automatically to transform your
program before actual compilation.
• Period (.): Used to access members of a structure or union.
• Tilde (~): In C this is the bitwise complement operator, which inverts every bit of its
integer operand. Its use to mark a destructor belongs to C++, not C.