APP Unit I
Programming Language
A programming language is a notational system for describing computations in
both machine-readable and human-readable form. It allows the programmer to
specify a task and the computer to execute it. Programming languages are
usually text-based formal languages, but they can also be graphical.
Syntax
The lexical analyser converts the high-level input program into a sequence of
tokens. A lexical token is a sequence of characters that can be treated as a unit
in the grammar of the programming language.
First, a lexer turns the linear sequence of characters into a linear sequence of
tokens; this is known as "lexical analysis" or "lexing". Second, the parser turns
the linear sequence of tokens into a hierarchical syntax tree; this is known as
"parsing" narrowly speaking. Third, contextual analysis resolves names and
checks types.
The parsing stage itself can be divided into two parts: the parse tree, or
"concrete syntax tree", which is determined by the grammar, but is generally far
too detailed for practical use, and the abstract syntax tree (AST), which
simplifies this into a usable form.
Words belong to a regular language, phrases to a context-free language (CFL),
and context to a context-sensitive language.
The syntax of a computer language is the set of rules that define the
combinations of symbols that are considered to be correctly structured
statements or expressions in that language. Computer language syntax is
generally distinguished into three levels:
• Words – the lexical level, determining how characters form tokens;
• Phrases – the grammar level, narrowly speaking, determining how tokens
form phrases;
• Context – determining what objects or variables the names refer to, whether
types are valid, etc.
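A small illustration of the word and phrase levels, using CPython's built-in
tokenize and ast modules as stand-ins for a compiler's lexer and parser (the
source string is only analysed, never executed):

import ast
import io
import tokenize

source = "total = price * quantity\n"

# Lexical level: the character stream becomes a stream of tokens.
for tok in tokenize.generate_tokens(io.StringIO(source).readline):
    print(tokenize.tok_name[tok.type], repr(tok.string))

# Phrase level: the token stream becomes an abstract syntax tree.
print(ast.dump(ast.parse(source), indent=2))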
Semantics
Semantics assigns computational meaning to valid strings in a programming
language syntax. It describes the processes a computer follows when executing
a program in that specific language. Syntax therefore refers to the valid form of
the code, and is contrasted with semantics – the meaning.
Both the syntax tree from the previous phase and the symbol table are used to
check the consistency of the given code.
Type checking is an important part of semantic analysis, where the compiler
makes sure that each operator has matching operands.
Semantic errors:
• Type mismatch
• Undeclared variables
• Reserved identifier misuse
• Use of break outside a loop
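For illustration, here is how some of these surface in Python (a type mismatch
and an undeclared name are reported at run time, while a stray break is rejected
when the code is compiled):

# Type mismatch: the + operator gets operands of incompatible types.
try:
    "3" + 4
except TypeError as err:
    print("type mismatch:", err)

# Undeclared variable: the name was never bound anywhere.
try:
    print(undeclared_total)   # 'undeclared_total' is deliberately undefined
except NameError as err:
    print("undeclared variable:", err)

# break outside a loop is rejected before the code runs at all.
try:
    compile("break", "<demo>", "exec")
except SyntaxError as err:
    print("break outside loop:", err)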
Structured programming
Corrado Böhm and Giuseppe Jacopini showed in 1966 that any non-structured
program can be rewritten using just three control structures: sequence, selection,
and repetition (loop).
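For illustration, the three building blocks in Python (a toy function summing
the odd numbers up to n):

def sum_of_odds(n):
    total = 0                  # sequence: statements run one after another
    for i in range(1, n + 1):  # repetition: the loop body repeats
        if i % 2 == 1:         # selection: choose between two paths
            total += i
    return total

print(sum_of_odds(10))         # 1 + 3 + 5 + 7 + 9 = 25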
Parallel programming
• Large problems are divided into smaller ones, which can then be solved at the
same time.
• For example, a program that needs to calculate the sum of a large set of
numbers can be parallelized by dividing the numbers into smaller groups and
assigning each group to a different processor (see the sketch after this list).
• Data parallelism divides a program's data into smaller pieces and assigns
each piece to a different processor. This approach is well suited to problems
that can be split into independent tasks.
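A minimal sketch of the data-parallel sum described above, using Python's
standard multiprocessing module (the group size of 25 and the worker count of 4
are arbitrary choices for illustration):

from multiprocessing import Pool

def partial_sum(chunk):
    # Each worker sums its own group of numbers independently.
    return sum(chunk)

if __name__ == "__main__":
    numbers = list(range(1, 101))
    chunks = [numbers[i:i + 25] for i in range(0, len(numbers), 25)]
    with Pool(processes=4) as pool:
        partials = pool.map(partial_sum, chunks)   # one chunk per worker
    print(sum(partials))   # 5050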
Object-oriented programming
• Groups instructions together with the part of the program state they operate on.
Declarative programming
• The programmer merely declares properties of the desired result, but not how
to compute it.
Logic programming
• Rules: rules are statements that describe how facts can be deduced from
other facts.
Example in Python:
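Below is a minimal sketch in plain Python (no logic-programming library is
assumed; the facts and the grandparent rule are invented for illustration)
showing how a rule deduces new facts from stored facts:

facts = {("parent", "alice", "bob"), ("parent", "bob", "carol")}

def grandparent_rule(facts):
    # Rule: grandparent(X, Z) can be deduced from parent(X, Y) and parent(Y, Z).
    derived = set()
    for (pred1, x, y) in facts:
        for (pred2, y2, z) in facts:
            if pred1 == "parent" and pred2 == "parent" and y == y2:
                derived.add(("grandparent", x, z))
    return derived

print(grandparent_rule(facts))   # {('grandparent', 'alice', 'carol')}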
Functional programming
• Features: it uses recursion and pure functions (functions that do not have
side effects; a side effect is a change to the state of the program outside
the function).
• Examples: Lisp, Standard ML
A pure function produces no side effects, depends only on its input, and is
always deterministic.
def is_even(n):
    # Pure helper: no side effects, depends only on its input.
    return n % 2 == 0

def factorial(n):
    # Pure and recursive: always returns the same result for the same n.
    if n == 0:
        return 1
    else:
        return n * factorial(n - 1)

def even_factorial(n):
    # Sums the factorials of all even numbers from 0 to n.
    result = 0
    for i in range(n + 1):
        if is_even(i):
            result += factorial(i)
    return result
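A quick check of these functions:

print(factorial(5))        # 120
print(even_factorial(4))   # factorial(0) + factorial(2) + factorial(4) = 1 + 2 + 24 = 27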
Machine codes
In the 1940s, John von Neumann had the idea that a computer should be
permanently hardwired with a small set of general-purpose operations. The
operator could then input into the computer a series of binary codes that would
organize the basic hardware operations to solve more-specific problems.
In this program, each line contains 16 bits or binary digits. A line of 16 bits
represents either a single machine language instruction or a single data value.
Program execution begins with the first line of code, which is fetched from
memory, interpreted, and executed. Control then moves to the next line of code,
and the process is repeated, until a special halt instruction is reached.
The first 4 bits are the opcode (the type of operation) and the remaining 12 bits
are the operands. Operands can be registers or memory locations. Entering
programs as raw 0s and 1s was error-prone.
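A sketch of how such a 16-bit word splits into a 4-bit opcode and 12 bits of
operands (the opcode table below is invented for illustration, not an actual
instruction set):

OPCODES = {0b0001: "ADD", 0b0010: "LD", 0b0011: "ST", 0b1111: "HALT"}

def decode(word):
    opcode  = (word >> 12) & 0xF    # top 4 bits: the type of operation
    operand = word & 0x0FFF         # remaining 12 bits: registers / memory address
    return OPCODES.get(opcode, "UNKNOWN"), operand

print(decode(0b0010_0000_0000_0101))   # ('LD', 5)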
1950s, Assembly language: mnemonic symbols were used in place of binary
codes. An assembler translates the mnemonics into machine code.
Example:
.ORIG x3000
LD R1, FIRST
LD R2, SECOND
ADD R3, R2, R1
ST R3, SUM
HALT
Each computer architecture has its own machine code, and therefore each
required its own assembly language.
1960, ALGOL: ALGOL was released, removing machine-specific codes. ALGOL
compilers translated standard ALGOL programs for any type of machine.
Function call overhead
• Overhead refers to the extra time and resources consumed by the function call
itself.
• The arguments of the function have to be pushed onto the stack, time is taken
to transfer control to the function, and time is taken to pop the arguments off
the stack afterwards.
• Reducing the overhead:
• Use inline functions (the compiler expands the function body at the call site,
so no stack push/pop is needed).
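A rough way to see this overhead in Python (a sketch using the standard timeit
module; Python has no inline keyword, so this only shows the cost of the call
itself, and absolute timings vary from machine to machine):

import timeit

def square(x):
    return x * x

# One million evaluations with a function call vs. the expression written inline.
call_time   = timeit.timeit("square(x)", setup="x = 10", globals=globals(), number=1_000_000)
inline_time = timeit.timeit("x * x", setup="x = 10", number=1_000_000)
print(f"with call: {call_time:.3f}s   inlined: {inline_time:.3f}s")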