Using Lex

This document provides an overview of the structure and components of a Lex program, including:

1. A Lex program consists of three main sections: definition, rules, and user subroutines.
2. The rules section contains patterns and actions. Patterns are matched against the input and when a match occurs, the corresponding action is performed.
3. Regular expressions are used to define patterns in Lex rules. Common regular expression operators include ., *, [], ^, $, {}, \, +, ?, |.
4. Examples demonstrate how to write Lex programs for tasks like word counting, number recognition, and lexical analysis for a simple programming language.

Uploaded by: mdhuq1
Copyright: © Attribution Non-Commercial (BY-NC)
The Structure of a Lex Program


(Definition section)
%%
(Rules section)
%%
(User subroutines section)

Example 1-1: Word recognizer ch1-02.l


%{
/*
 * this sample demonstrates (very) simple recognition:
 * a verb/not a verb.
 */
%}
%%

[\t ]+        /* ignore white space */ ;

is |
am |
are |
were |
was |
be |
being |
been |
do |
does |
did |
will |
would |
should |
can |
could |
has |
have |
had |
go            { printf("%s: is a verb\n", yytext); }

[a-zA-Z]+     { printf("%s: is not a verb\n", yytext); }

.|\n          { ECHO; /* normal default anyway */ }
%%

int main(void)
{
    yylex();
    return 0;
}

The definition section


Lex copies the material between %{ and %} directly to the generated C file, so you may write any valid C code here.

Rules section
Each rule is made up of two parts:
A pattern
An action

E.g.
[\t ]+ /* ignore white space */ ;

Rules section (Contd)


E.g.
is |
am |
are |
were |
was |
be |
being |
been |
do |
does |
did |
will |
would |
should |
can |
could |
has |
have |
had |
go            { printf("%s: is a verb\n", yytext); }

Rules section (Contd)


E.g.
[a-zA-Z]+     { printf("%s: is not a verb\n", yytext); }
.|\n          { ECHO; /* normal default anyway */ }

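Lex has a set of simple disambiguating rules:

1. Lex patterns only match a given input character or string once.
2. Lex executes the action for the longest possible match for the current input.

For example, "island" is reported as not a verb: [a-zA-Z]+ matches all six characters, which is longer than the two characters matched by "is". When two patterns match the same number of characters, Lex uses the one that appears first in the rules section, which is why the verb list is placed before [a-zA-Z]+.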
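
The disambiguating rules can be illustrated outside Lex. The C sketch below is not how Lex is implemented (Lex compiles all patterns into a single state machine). It simply tries a small, hypothetical two-rule table against the start of the input, keeps the longest match, and prefers the earlier rule when two matches have the same length, which is how Lex resolves such ties. The names `matcher_fn`, `match_is`, `match_word`, and `pick_rule` are invented for this illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical matcher type: returns how many leading characters of s
 * the pattern matches, or 0 for no match. */
typedef size_t (*matcher_fn)(const char *s);

/* Pattern: the literal keyword "is". */
static size_t match_is(const char *s)
{
    return strncmp(s, "is", 2) == 0 ? 2 : 0;
}

/* Pattern: [a-zA-Z]+ */
static size_t match_word(const char *s)
{
    size_t n = 0;
    while (isalpha((unsigned char)s[n]))
        n++;
    return n;
}

/* Rules in the order they would appear in the Lex file. */
static const matcher_fn rules[] = { match_is, match_word };

/* Returns the index of the winning rule, or -1 if nothing matches.
 * *len receives the length of the winning match. */
static int pick_rule(const char *input, size_t *len)
{
    int best = -1;
    size_t best_len = 0;

    assert(input != NULL && len != NULL);
    for (size_t i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        size_t n = rules[i](input);
        if (n > best_len) {   /* strictly longer, so the earlier rule wins ties */
            best = (int)i;
            best_len = n;
        }
    }
    *len = best_len;
    return best;
}
```

With this table, `pick_rule("is", &n)` selects rule 0 (the keyword), while `pick_rule("island", &n)` selects rule 1, because `[a-zA-Z]+` matches six characters and the keyword only two.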

User subroutines section


It can consist of any legal C code. Lex copies it to the C file after the end of the Lex-generated code.

%%
int main(void)
{
    yylex();
    return 0;
}

Regular Expressions
Regular expressions used by Lex
. * [] ^ $ {} \ + ? | / ()

Examples of Regular Expressions


[0-9]
[0-9]+
[0-9]*
-?[0-9]+
[0-9]*\.[0-9]+
([0-9]+)|([0-9]*\.[0-9]+)
-?(([0-9]+)|([0-9]*\.[0-9]+))
[eE][-+]?[0-9]+
-?(([0-9]+)|([0-9]*\.[0-9]+))([eE][-+]?[0-9]+)?
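
One way to experiment with these patterns outside Lex is POSIX `regcomp`/`regexec` with `REG_EXTENDED`, whose syntax is close to Lex's for the operators used here (Lex treats a few characters, such as `"` and `/`, differently). The sketch below checks the final pattern, a signed number with an optional exponent, written with balanced parentheses. `is_number` is a name invented for the example, and the pattern is anchored with `^` and `$` so that the whole string must match.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Returns 1 if the whole string s is a signed number with an optional
 * exponent, 0 if it is not, and -1 if the pattern fails to compile. */
static int is_number(const char *s)
{
    static const char *pattern =
        "^-?(([0-9]+)|([0-9]*\\.[0-9]+))([eE][-+]?[0-9]+)?$";
    regex_t re;
    int rc;

    assert(s != NULL);
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, s, 0, NULL, 0);   /* 0 means the string matched */
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

Under these assumptions, strings such as `42`, `-3.14`, and `.5e-10` are accepted, while `1.` is rejected because the decimal alternative requires at least one digit after the point.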

Example 2-1
%%
[\n\t ]    ;

-?(([0-9]+)|([0-9]*\.[0-9]+))([eE][-+]?[0-9]+)?    { printf("number\n"); }

.          ECHO;
%%
int main(void)
{
    yylex();
    return 0;
}


A Word Counting Program


The definition section
%{
#include <stdlib.h>   /* exit(), used in main() */

unsigned charCount = 0, wordCount = 0, lineCount = 0;
%}

word [^ \t\n]+
eol  \n


A Word Counting Program (Contd)


The rules section
{word}    { wordCount++; charCount += yyleng; }
{eol}     { charCount++; lineCount++; }
.         charCount++;


A Word Counting Program (Contd)


The user subroutines section
int main(int argc, char **argv)
{
    if (argc > 1) {
        FILE *file;

        file = fopen(argv[1], "r");
        if (!file) {
            fprintf(stderr, "could not open %s\n", argv[1]);
            exit(1);
        }
        yyin = file;    /* scan the named file instead of standard input */
    }
    yylex();
    /* the counters are unsigned, so print them with %u */
    printf("%u %u %u\n", charCount, wordCount, lineCount);
    return 0;
}
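
For comparison, here is roughly what the three rules of the word counting program compute, written as plain C without Lex. It is only a sketch: `struct counts` and `count_text` are hypothetical names, and the function scans a string in memory rather than reading from `yyin`.

```c
#include <assert.h>
#include <stddef.h>

struct counts {
    unsigned chars, words, lines;
};

/* Mirrors the three rules: a run of characters other than space, tab,
 * and newline is one word, '\n' ends a line, and every character is
 * counted. */
static struct counts count_text(const char *text)
{
    struct counts c = { 0, 0, 0 };
    int in_word = 0;

    assert(text != NULL);
    for (const char *p = text; *p != '\0'; p++) {
        c.chars++;
        if (*p == '\n')
            c.lines++;
        if (*p == ' ' || *p == '\t' || *p == '\n') {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;   /* first character of a new word */
            c.words++;
        }
    }
    return c;
}
```

The Lex version reaches the same totals with far less bookkeeping, because the `{word}` pattern consumes a whole word at once and `yyleng` supplies its length.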


Another Problem
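%{
/* Token codes (KW_BEGIN, END, ASGOP, ID, NUM) and yylval are assumed to be
 * defined elsewhere, e.g. in a yacc-generated header. The first token is
 * spelled KW_BEGIN here because BEGIN is a macro reserved by Lex. */
int enter_id(void);
int enter_num(void);
%}

letter  [A-Za-z]
digit   [0-9]

%%
begin                         { return (KW_BEGIN); }
end                           { return (END); }
:=                            { return (ASGOP); }
{letter}({letter}|{digit})*   { yylval = enter_id(); return (ID); }
{digit}+                      { yylval = enter_num(); return (NUM); }
%%

int enter_id(void)
{
    /* enter the id in the symbol table and return the entry number */
}

int enter_num(void)
{
    /* enter the number in the constant table and return the entry number */
}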
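
The slides leave the body of `enter_id()` open. As one possible shape, the sketch below keeps identifiers in a small fixed-size array and returns the index of the entry, adding the name if it is new. Everything here is an assumption made for illustration: the names `symtab`, `symcount`, and `enter_name`, the table limits, and the linear search (a real compiler would more likely use a hash table).

```c
#include <assert.h>
#include <string.h>

#define MAX_SYMBOLS 256   /* hypothetical table capacity */
#define MAX_NAME    32    /* hypothetical name buffer size, including '\0' */

static char symtab[MAX_SYMBOLS][MAX_NAME];
static int  symcount = 0;

/* Returns the entry number for name, adding it if it is new.
 * Returns -1 if the table is full or the name is too long. */
static int enter_name(const char *name)
{
    assert(name != NULL);
    for (int i = 0; i < symcount; i++)
        if (strcmp(symtab[i], name) == 0)
            return i;             /* already present */
    if (symcount == MAX_SYMBOLS || strlen(name) >= MAX_NAME)
        return -1;
    strcpy(symtab[symcount], name);
    return symcount++;
}
```

In the scanner, `enter_id()` could then be little more than `return enter_name(yytext);`, since Lex leaves the matched text in `yytext`.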
