LEX Notes

The document provides an overview of LEX, a lexical analyzer generator used to create scanners that break input into tokens. It outlines the structure of a LEX specification, including declarations, transition rules, and auxiliary functions, and provides several code examples demonstrating pattern matching, counting words and numbers, and analyzing C code components. Additionally, it covers basic pattern matching techniques and counting string lengths, vowels, and consonants.

Uploaded by

aatreyeedev05

CODING PART

Basics:
• LEX = LEXical analyzer generator
• It is a tool used to create lexical analyzers (scanners)
• LEX generates programs that read input and break it into tokens, which are
meaningful sequences of characters (like keywords, numbers, or identifiers)
• Alternatives to LEX: FLEX (Fast LEX) and JLEX (a LEX-style generator for Java)
• Structure of a LEX specification (specification ≈ program): there are three
sections, namely Declarations, Transition rules and Auxiliary functions

%{
Declarations
%}
%%
Transition rules
%%
Auxiliary functions

• Here, %% acts as a separator between two adjacent sections
• %{ and %} each go at the start of their own line; the C code between them is
copied as-is into the generated scanner


Some code examples
1. Pattern matching:
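%{
#include <stdio.h>
%}

%%

[0-9]+      { printf("NUMBER: %s\n", yytext); }
[a-zA-Z]+   { printf("WORD: %s\n", yytext); }
[ \t\n]+    { /* ignore whitespace */ }
.           { printf("UNKNOWN: %s\n", yytext); }

%%

/* Tells the scanner to stop at end of input (needed unless linking with -ll or -lfl) */
int yywrap(void) { return 1; }

int main(void) {
    yylex(); // Call the lexer
    return 0;
}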
• Here,
a. [0-9]+ matches a sequence of one or more digits from 0 to 9 (it is a regular
expression) in the input, and categorises it as a number
b. [a-zA-Z]+ matches a sequence of one or more lowercase or uppercase letters
c. [ \t\n]+ matches a sequence of spaces, tabs and newline characters, which is
ignored
d. Any other single character (indicated by ‘.’) is reported as UNKNOWN
• yytext is a built-in variable that contains the text matched by the current rule.
For instance, in “April 1st 2025”, the sequence “April” is matched by the
second rule as we scan from left to right ⟶ “April” is stored in the variable
yytext temporarily, and that is printed
• After that, the space is matched by the third rule and ignored
• Then the sequence “1” is matched by the first rule ⟶ “1” is stored in the
variable yytext temporarily (replacing “April”), and that is printed. This goes
on: “st” is matched as a word and “2025” as a number
• yylex() is the function that starts the lexical analysis from left to right; it
must be called in the main function
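
To make the left-to-right scan concrete, here is a hand-written C sketch of roughly what the generated scanner does for the four rules of example 1. It is not the code LEX produces; scan_tokens and its output buffer are hypothetical names chosen for illustration, and the span start..in stands in for yytext.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of LEX): scans `in` left to right and
   appends one "LABEL: text" line per token to `out`, mimicking the four
   rules of example 1. `out_size` must be at least 1. */
void scan_tokens(const char *in, char *out, size_t out_size)
{
    size_t used = 0;
    out[0] = '\0';
    while (*in != '\0') {
        const char *start = in;   /* start..in plays the role of yytext */
        const char *label;
        if (isdigit((unsigned char)*in)) {                      /* [0-9]+ */
            while (isdigit((unsigned char)*in)) in++;
            label = "NUMBER";
        } else if (isalpha((unsigned char)*in)) {               /* [a-zA-Z]+ */
            while (isalpha((unsigned char)*in)) in++;
            label = "WORD";
        } else if (*in == ' ' || *in == '\t' || *in == '\n') {  /* [ \t\n]+ */
            while (*in == ' ' || *in == '\t' || *in == '\n') in++;
            continue;             /* whitespace produces no output */
        } else {                                                /* . */
            in++;
            label = "UNKNOWN";
        }
        int n = snprintf(out + used, out_size - used, "%s: %.*s\n",
                         label, (int)(in - start), start);
        if (n < 0 || (size_t)n >= out_size - used)
            break;                /* output buffer full: stop early */
        used += (size_t)n;
    }
}
```

Calling scan_tokens("April 1st 2025", buf, sizeof buf) fills buf with one line each for “April”, “1”, “st” and “2025”, in that order. The sketch assumes the default C locale, where isalpha covers only a-z and A-Z.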
2. Counting the number of words and numbers:

%{
#include <stdio.h>
int words = 0, numbers = 0;
%}

%%

[0-9]+      { numbers++; }
[a-zA-Z]+   { words++; }
[ \t\n]+    { /* do nothing & skip spaces */ }

%%

/* Tells the scanner to stop at end of input (needed unless linking with -ll or -lfl) */
int yywrap(void) { return 1; }

int main(void) {
    yylex();
    printf("Total Words: %d\n", words);
    printf("Total Numbers: %d\n", numbers);
    return 0;
}

• We declare the variables words = 0 and numbers = 0 initially
• When the lexer scans from left to right and identifies a sequence of digits (0 to
9), that is considered a number, and the variable ‘numbers’ is incremented
• Similarly, when a sequence of letters is encountered, it is considered a
word and the variable ‘words’ is incremented
• Here, we don’t really have to make use of yytext
• When we run yylex(), it reads the input and, for each match, executes the
action of the rule that matched
• Characters that match no rule (punctuation, for example) are copied to the
output unchanged, which is LEX’s default action
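
The counting logic can also be sketched in plain C, as a rough picture of what the rules do. This is an illustration under assumptions, not LEX output: count_tokens and struct counts are hypothetical names, and unlike LEX’s default action this sketch silently skips characters that match no rule instead of echoing them.

```c
#include <ctype.h>

/* Hypothetical result type for this sketch. */
struct counts {
    int words;
    int numbers;
};

/* Counts maximal runs of letters (words) and of digits (numbers),
   the same way the [a-zA-Z]+ and [0-9]+ rules do. */
struct counts count_tokens(const char *in)
{
    struct counts c = {0, 0};
    while (*in != '\0') {
        if (isdigit((unsigned char)*in)) {
            while (isdigit((unsigned char)*in)) in++;   /* consume the whole run */
            c.numbers++;
        } else if (isalpha((unsigned char)*in)) {
            while (isalpha((unsigned char)*in)) in++;   /* consume the whole run */
            c.words++;
        } else {
            in++;   /* whitespace and anything else: skipped in this sketch */
        }
    }
    return c;
}
```

Note that “1st” counts as one number and one word, because the digit run ends where the letter run begins.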
3. Breaking down the components of a C Code:

%{
#include <stdio.h>
%}

%%

"if"        { printf("IF keyword\n"); }
"else"      { printf("ELSE keyword\n"); }
"while"     { printf("WHILE keyword\n"); }
"return"    { printf("RETURN keyword\n"); }
[a-zA-Z_][a-zA-Z0-9_]*  { printf("ID: %s\n", yytext); }
[0-9]+      { printf("NUMBER: %s\n", yytext); }
.|\n        { /* ignore other characters, including newlines */ }

%%

/* Tells the scanner to stop at end of input (needed unless linking with -ll or -lfl) */
int yywrap(void) { return 1; }

int main(void) {
    yylex();
    return 0;
}

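• Patterns within “” are treated as literal strings: they match only that exact
sequence of characters, with the same length and the same lower/upper case
• LEX always takes the longest match, and when two rules match the same
length, the rule listed first wins. So “if” is reported as a keyword (its rule
comes before the identifier rule), while “iffy” is reported as an ID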
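
The keyword-versus-identifier decision can be sketched in C as follows. This is a hand-written approximation, not the table-driven code LEX generates: classify_word is a hypothetical name, and it gets the same result by first taking the longest identifier-shaped run and only then checking it against the keyword list.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: classifies the identifier-shaped lexeme at the
   start of `s`. Stores the lexeme length in *len and returns a label,
   or NULL (with *len = 0) if `s` does not start with a letter or '_'. */
const char *classify_word(const char *s, size_t *len)
{
    static const char *const keywords[] = { "if", "else", "while", "return" };
    static const char *const labels[] = {
        "IF keyword", "ELSE keyword", "WHILE keyword", "RETURN keyword"
    };
    size_t n = 0;

    if (!(isalpha((unsigned char)s[0]) || s[0] == '_')) {
        *len = 0;
        return NULL;
    }
    /* Longest match: consume every letter, digit or underscore. */
    while (isalnum((unsigned char)s[n]) || s[n] == '_')
        n++;
    *len = n;

    /* Keywords win only when the whole lexeme equals the keyword exactly. */
    for (size_t i = 0; i < sizeof keywords / sizeof keywords[0]; i++) {
        if (strlen(keywords[i]) == n && strncmp(s, keywords[i], n) == 0)
            return labels[i];
    }
    return "ID";
}
```

With this sketch, “if (x)” yields the IF keyword label with length 2, “iffy” yields ID with length 4, and “If” yields ID because the comparison is case-sensitive.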
Basic pattern matches:
Pattern              Matches
a                    Only the character ‘a’
a|b                  Either the character ‘a’ or ‘b’
.                    Any single character except newline (as the last rule, it
                     acts as a catch-all, equivalent to ‘else’)
\n                   Newline character
\t                   Tab character
\r                   Carriage return
\\                   A single backslash ⟶ \
\"                   A single double quote ⟶ "
[abc]                Any one of a, b, or c
[^abc]               Any character except a, b, c
[a-z]                Any lowercase letter
[A-Z]                Any uppercase letter
[0-9]                Any digit
[a-zA-Z]             Any letter
[a-zA-Z0-9_]         Any letter, digit, or underscore
a*                   Zero or more a’s
a+                   One or more a’s
a?                   Zero or one a
a{3}                 Exactly three a’s
a{2,4}               Between 2 and 4 a’s
ab                   a followed by b
"something"          The exact word “something”
^                    Start of line (outside brackets, at the start of a pattern)
$                    End of line (at the end of a pattern)
\                    Escape next character
[aA][a-zA-Z0-9_]*    Words starting with a or A
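
Several of the patterns above can be tried outside LEX with the POSIX <regex.h> functions regcomp, regexec and regfree, whose extended syntax shares the basic operators in the table (|, [...], *, +, ?, {m,n}). This is only an approximation for experimenting: LEX-specific forms such as "something" in double quotes are not POSIX syntax, and matches_whole is a hypothetical helper name.

```c
#include <regex.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if the whole of `text` matches the POSIX
   extended regular expression `pattern`, 0 if it does not, and -1 if the
   pattern is too long or fails to compile. */
int matches_whole(const char *pattern, const char *text)
{
    char anchored[256];
    regex_t re;

    /* Wrap the pattern so it must cover the entire text. */
    int n = snprintf(anchored, sizeof anchored, "^(%s)$", pattern);
    if (n < 0 || (size_t)n >= sizeof anchored)
        return -1;
    if (regcomp(&re, anchored, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    int rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

For example, matches_whole("a{2,4}", "aaa") reports a match while matches_whole("a{2,4}", "aaaaa") does not, and matches_whole("[aA][a-zA-Z0-9_]*", "Apple_1") reports a match.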

4. Counting the length of a string:

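%{
#include <stdio.h>
#include <string.h>
%}

%%

[a-zA-Z0-9]+ {
    /* strlen() returns size_t, so it is printed with %zu */
    printf("Length of input: %zu\n", strlen(yytext));
}

.|\n { /* ignore everything else */ }

%%

/* Tells the scanner to stop at end of input (needed unless linking with -ll or -lfl) */
int yywrap(void) { return 1; }

int main(void) {
    yylex();
    return 0;
}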
5. Counting the number of vowels and consonants:

%{
#include <stdio.h>

int v_count = 0;
int c_count = 0;
%}

%%

[aAeEiIoOuU]                        { v_count++; }
[b-df-hj-np-tv-zB-DF-HJ-NP-TV-Z]    { c_count++; }
.|\n                                { /* ignore everything else */ }

%%

/* Tells the scanner to stop at end of input (needed unless linking with -ll or -lfl) */
int yywrap(void) { return 1; }

int main(void) {
    yylex(); // Start scanning input
    printf("Vowels: %d\n", v_count);
    printf("Consonants: %d\n", c_count);
    return 0;
}
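
For comparison, the same count can be sketched in plain C with the <ctype.h> functions isalpha and tolower. This is an illustration only: count_letters and struct letter_counts are hypothetical names, and the sketch assumes the default C locale, where isalpha covers the same a-z and A-Z letters as the two character classes above.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical result type for this sketch. */
struct letter_counts {
    int vowels;
    int consonants;
};

/* Counts vowels and consonants in `in`; every non-letter is ignored,
   like the catch-all rule in the LEX version. */
struct letter_counts count_letters(const char *in)
{
    struct letter_counts c = {0, 0};
    for (; *in != '\0'; in++) {
        unsigned char ch = (unsigned char)*in;
        if (!isalpha(ch))
            continue;                       /* digits, spaces, punctuation */
        if (strchr("aeiou", tolower(ch)) != NULL)
            c.vowels++;                     /* a, e, i, o, u in either case */
        else
            c.consonants++;                 /* any other letter */
    }
    return c;
}
```

For the input “Hello, World 42” this gives 3 vowels (e, o, o) and 7 consonants (H, l, l, W, r, l, d).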
