Describing Syntax and Semantics: CSE 325/CSE 425: Concepts of Programming Language
03 DESCRIBING SYNTAX AND SEMANTICS
CSE 325/CSE 425: CONCEPTS OF PROGRAMMING LANGUAGE
• Intro
• The General Problem of Describing Syntax
• Formal Method of Describing Syntax
• Attribute Grammars
• Describing the Meanings of Programs: Dynamic Semantics
INTRODUCTION
• Syntax:
• The form or structure of the expressions, statements, and program units.
• Semantics:
• The meaning of the expressions, statements, and program units.
• E.g.:
• The syntax of a Java while statement is:
while (boolean_expr) statement
• The semantics of this statement form is:
• If boolean_expr is true, the statement is executed and boolean_expr is tested again.
• If boolean_expr is false, the statement is skipped and control continues after the while statement.
• Syntax and semantics provide a language’s definition and they
are closely related.
THE GENERAL PROBLEM OF DESCRIBING SYNTAX:
TERMINOLOGY
• A sentence is a string of characters over some alphabet
• A language is a set of meaningful sentences
• A lexeme is the lowest level syntactic unit of a language (e.g.
*, sum, begin)
• A token is an object that describes a lexeme.
• A token has a type (e.g. keyword, identifier, number, or operator) and a value (the actual characters of the described lexeme).
• Lexemes are partitioned into groups; for example, the names of variables, methods, classes, and so forth form a group called identifiers.
EXAMPLE
• index = 2 * count + 17;

Lexemes    Tokens
index      identifier
=          equal_sign
2          int_literal
*          mult_op
count      identifier
+          plus_op
17         int_literal
;          semicolon
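The lexeme/token pairing for this statement can be reproduced with a small scanner. The following is a minimal sketch, not a production lexer: the token names are taken from the example, while the regular expressions, the `TOKEN_SPEC` table, and the `tokenize` function are illustrative assumptions.

```python
import re

# Token categories, tried in order. Names follow the example above;
# the patterns themselves are assumptions made for this sketch.
TOKEN_SPEC = [
    ("int_literal", r"\d+"),
    ("identifier",  r"[A-Za-z_]\w*"),
    ("equal_sign",  r"="),
    ("mult_op",     r"\*"),
    ("plus_op",     r"\+"),
    ("semicolon",   r";"),
    ("skip",        r"\s+"),   # whitespace separates lexemes but is not a token
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (lexeme, token) pairs; raise on an unknown character."""
    pairs = []
    pos = 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise ValueError(f"unexpected character {source[pos]!r} at {pos}")
        if m.lastgroup != "skip":
            pairs.append((m.group(), m.lastgroup))
        pos = m.end()
    return pairs


for lexeme, token in tokenize("index = 2 * count + 17;"):
    print(f"{lexeme:<8}{token}")
```

Each pair keeps both parts of a token as described earlier: the type (the category name) and the value (the lexeme's characters).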
TERMINOLOGY
• Recognizers-
• A recognition device reads input strings over the alphabet of the language and decides whether the input strings belong to the language.
• Ex: the syntax analysis part of a compiler
• Generators
• A device that generates sentences of a language
• One can determine whether a particular sentence is syntactically correct by comparing it to the structure of the generator.
BNF AND
CFG
• Noam Chomsky and John Backus, working independently, developed the same syntax description formalism, which became the most widely used method for describing programming language syntax.
• Context-Free Grammars
• Developed by Noam Chomsky in the mid-1950s
• Language generators, meant to describe the syntax of natural
languages
• Define a class of languages called context-free languages
• Backus-Naur Form (1959)
• Invented by John Backus to describe the syntax of ALGOL 58
• BNF is equivalent to context-free grammars.
BNF
FUNDAMENTALS
• A metalanguage is a language that is used to
describe another language.
• BNF is a metalanguage for programming languages.
• BNF uses abstractions for syntactic structures.
• <assign> → <var> = <expression>
• In BNF, abstractions are used to represent classes of
syntactic structures – they act like syntactic variables (also
called nonterminal symbols, or just nonterminals).
• Terminals are lexemes or tokens
• A rule has a left-hand side (LHS), which is a nonterminal, and a right-hand side (RHS), which is a string of terminals and/or nonterminals.
BNF
FUNDAMENTALS
• Nonterminals are often enclosed in angle brackets.
• Example-
• <ident_list> → identifier | identifier, <ident_list>
• <if_stmt> → if <logic_expr> then <stmt>
• Grammar: a finite non-empty set of rules
• A start symbol is a special element of the nonterminals of a
grammar
• BNF Rules-
• An abstraction (or nonterminal symbol) can have more
than one RHS
• <stmt> → <single_stmt>
| begin <stmt_list> end
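A grammar acts as a generator: starting from the start symbol, nonterminals are repeatedly replaced by one of their RHSs until only terminals remain. The sketch below applies this to the <ident_list> rule shown earlier, always expanding the leftmost nonterminal. The `GRAMMAR` dictionary layout and the `generate` function are illustrative assumptions, and the `depth` limit exists only to keep the recursive rule from generating forever.

```python
# Grammar as a dict: nonterminal -> list of alternative RHSs.
# Only the <ident_list> rule from the slide is encoded here.
GRAMMAR = {
    "<ident_list>": [["identifier"], ["identifier", ",", "<ident_list>"]],
}


def generate(symbols, depth):
    """Yield every sentence derivable from `symbols` using at most `depth`
    rule applications, always expanding the leftmost nonterminal."""
    for i, sym in enumerate(symbols):
        if sym in GRAMMAR:
            if depth == 0:
                return  # out of budget: abandon this sentential form
            for rhs in GRAMMAR[sym]:
                expanded = symbols[:i] + rhs + symbols[i + 1:]
                yield from generate(expanded, depth - 1)
            return
    # No nonterminal left: the sentential form is a sentence.
    yield " ".join(symbols)


for sentence in generate(["<ident_list>"], 3):
    print(sentence)
```

Because the rule is recursive, a larger `depth` yields longer lists; this is how a finite grammar describes an unbounded set of sentences.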
DESCRIBING
LISTS
OPERATOR PRECEDENCE
[Figure: leftmost derivation (LMD)]
PARSE TREE (UNIQUE)
• If we use the parse tree to indicate precedence levels of the operators, we cannot have ambiguity.
ASSOCIATIVITY
OF OPERATORS
• When an expression includes two operators that have the same precedence (e.g. * and /), a semantic rule is required to specify which should have precedence.
• This rule is called associativity.
• Example:
• A = B + C + A
ASSOCIATIVITY
• Operator associativity can also be indicated by a grammar.
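How the shape of a rule fixes associativity can be seen by building the tree for B + C + A in two ways. This is a minimal sketch under assumptions: the two grammar rules in the docstrings and the function names `parse_left` and `parse_right` are illustrative, and the input is assumed to be an already tokenized list that alternates variables and "+".

```python
def parse_left(tokens):
    """<expr> -> <expr> + <var> | <var>
    The rule is left recursive, so + groups to the left. The left recursion
    is implemented as a loop that folds each new operand into the tree."""
    tree = tokens[0]
    for i in range(1, len(tokens), 2):
        if tokens[i] != "+" or i + 1 >= len(tokens):
            raise ValueError("expected '+' followed by an operand")
        tree = (tree, "+", tokens[i + 1])
    return tree


def parse_right(tokens):
    """<expr> -> <var> + <expr> | <var>
    The rule is right recursive, so + groups to the right."""
    if len(tokens) == 1:
        return tokens[0]
    if len(tokens) < 3 or tokens[1] != "+":
        raise ValueError("expected '+' followed by an operand")
    return (tokens[0], "+", parse_right(tokens[2:]))


tokens = ["B", "+", "C", "+", "A"]
print(parse_left(tokens))   # (('B', '+', 'C'), '+', 'A')
print(parse_right(tokens))  # ('B', '+', ('C', '+', 'A'))
```

The nesting of the tuples mirrors the parse tree: the operator deepest in the tree is applied first.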
EXTENDED
BNF
• Introduced to address a few minor inconveniences in BNF.
• Most extended versions are called Extended BNF, or simply EBNF.
• The extensions do not enhance the descriptive power of BNF;
they only increase its readability and writability.
• Optional parts are placed in brackets [ ].
<proc_call> → ident [ (<expr_list>) ]
• Alternative parts of RHSs are placed inside parentheses and separated via vertical bars.
<term> → <term> (+|-) const
• Repetitions (0 or more) are placed inside braces { }.
<ident> → letter { letter | digit }
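EBNF repetition maps directly onto a loop in a hand-written recognizer. The sketch below recognizes the <ident> rule; the function name `is_ident` is an illustrative assumption, and "letter" and "digit" are taken to mean whatever Python's `str.isalpha` and `str.isdigit` accept.

```python
def is_ident(text):
    """Recognize <ident> -> letter { letter | digit }."""
    # The mandatory leading letter.
    if not text or not text[0].isalpha():
        return False
    # The braces { } become a loop: zero or more letters or digits.
    pos = 1
    while pos < len(text) and (text[pos].isalpha() or text[pos].isdigit()):
        pos += 1
    # Accept only if the whole input was consumed.
    return pos == len(text)


for candidate in ["sum1", "x", "1sum", "a_b"]:
    print(candidate, is_ident(candidate))
```

In plain BNF the same rule would need recursion; the loop shows why the EBNF form is easier to read and write while describing the same language.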
EXAMPLE
VARIATIONS ON
BNF AND EBNF
• In place of the arrow, a colon is used and the RHS is placed on the next
line.
• Instead of a vertical bar to separate alternative RHS, they are simply
placed on separate lines.
• In place of square brackets to indicate something being optional, the
subscript opt is used.
• Ex: ConstructorDeclarator → SimpleName (FormalParameterList_opt)
• Rather than using the | symbol in a parenthesized list of elements to
indicate a choice, the words ‘one of’ are used.
• Ex: AssignmentOperator → one of = *= /=
ATTRIBUTE
GRAMMARS
(AGs)
• An attribute grammar is a device used to describe more of the structure of a programming language than can be described with a context-free grammar.
• It is an extension to a context-free grammar.
• AGs have additions to CFGs to carry some semantic info on parse tree nodes.
• Primary value of AGs:
• Static semantics specification
• Compiler design (static semantics checking)
STATIC
SEMANTICS
• Nothing to do with meaning
• CFGs cannot describe all of the syntax of programming
languages
• Two categories of language rules (called static semantic rules) are hard to describe with a CFG:
• Context-free, but cumbersome (e.g., types of operands in
expressions)
• Non-context-free (e.g., variables must be declared before
they are used)
ATTRIBUTE
GRAMMARS:
DEFINITION
• Def: An attribute grammar is a context-free grammar G = (S, N, T, P) with the following additions:
• For each grammar symbol x there is a set A(x) of attribute
values
• Each rule has a set of functions that define certain
attributes of the nonterminals in the rule
• Each rule has a (Possibly empty) set of predicates to
check for attribute consistency.
ATTRIBUTE
GRAMMARS:
(CONT.)
• Let X0 → X1 … Xn be a rule, and let A(X) be the set of attributes of symbol X
• Functions of the form S(X0) = f(A(X1), …, A(Xn)) define synthesized attributes
• Functions of the form I(Xj) = f(A(X0), …, A(Xn)), for 1 <= j <= n, define inherited attributes
• Initially, there are intrinsic attributes on the leaves.
• Intrinsic attributes are synthesized attributes of
leaf nodes whose values are determined outside the
parse tree.
• Ex- the type of an instance of a variable in a program could
come from the symbol table.
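The three kinds of attributes and a predicate can be seen together in a small type check for an assignment. This is a sketch under stated assumptions, not the example from the slides: the rules <assign> → <var> = <expr> and <expr> → <var> + <var> | <var>, the attribute names `actual_type` and `expected_type`, the int/real typing rule, and the contents of `SYMBOL_TABLE` are all chosen for illustration.

```python
# Hypothetical symbol table supplying intrinsic attributes (variable types).
SYMBOL_TABLE = {"A": "int", "B": "int", "C": "real"}


def var_type(name):
    """Intrinsic attribute: actual_type of a <var> leaf, determined
    outside the parse tree by a symbol table lookup."""
    return SYMBOL_TABLE[name]


def expr_type(operands):
    """Synthesized attribute: actual_type of <expr> -> <var> + <var> | <var>,
    computed from the actual_type values of its children."""
    types = [var_type(v) for v in operands]
    return "int" if all(t == "int" for t in types) else "real"


def check_assign(target, operands):
    """<assign> -> <var> = <expr>.
    expected_type is inherited by <expr> from the target <var>; the predicate
    requires actual_type == expected_type."""
    expected_type = var_type(target)     # passed down the tree (inherited)
    actual_type = expr_type(operands)    # passed up the tree (synthesized)
    return actual_type == expected_type  # predicate: attribute consistency


print(check_assign("A", ["A", "B"]))  # int = int + int
print(check_assign("A", ["A", "C"]))  # int = int + real
```

A CFG alone accepts both statements; the predicate is what rejects the second one.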
ATTRIBUTE
GRAMMAR EXAMPLE
STATEMENTS
• An assignment statement is an expression evaluation plus the setting of the target variable to the expression's value.
• Its meaning maps state sets to state sets ∪ {error}.
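Read as a function, this mapping takes a program state and returns either a new state or error. The sketch below makes that concrete under several assumptions: a state is modeled as a dict from variable names to integers, an undefined variable is one missing from the dict, expressions are limited to constants, variables, and +, and the names `ERROR`, `eval_expr`, and `assign` are illustrative.

```python
ERROR = "error"  # stand-in for the special error value


def eval_expr(expr, state):
    """Evaluate an int constant, a variable name, or a tuple (left, '+', right).
    Returns ERROR if a variable is undefined or the operator is unknown."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return state.get(expr, ERROR)
    left, op, right = expr
    lval = eval_expr(left, state)
    rval = eval_expr(right, state)
    if lval == ERROR or rval == ERROR or op != "+":
        return ERROR
    return lval + rval


def assign(target, expr, state):
    """Map a state to a new state, or to ERROR if evaluation fails."""
    if state == ERROR:
        return ERROR  # error propagates through later statements
    value = eval_expr(expr, state)
    if value == ERROR:
        return ERROR
    new_state = dict(state)  # the input state itself is left unchanged
    new_state[target] = value
    return new_state


print(assign("x", ("y", "+", 1), {"y": 2}))
print(assign("x", "z", {"y": 2}))
```

Returning a fresh dict rather than mutating the argument keeps `assign` a pure mapping from states to states.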
Checking of names
THANKS