Lex & Yacc

  • View
    48

  • Download
    0

Embed Size (px)

DESCRIPTION

Lex & Yacc. ... . UNIX (Language Development Tools) lex ( LEXical analyzer) yacc (Yet Another Compiler Compiler). Lex (Pattern matching) - PowerPoint PPT Presentation

Transcript

  • Lex & Yacc...

  • UNIX (Language Development Tools)

    lex ( LEXical analyzer) yacc (Yet Another Compiler Compiler)

  • Lex (Pattern matching) Input Tokens lex yylex() Scanner

    Yacc (Syntax Analysis) yacc yyparse() Parser

  • 3 (Parser) Yacc Lex

    Lex specif.

    Yacc specif.

    lex

    yacc

    Lexical

    analyzer

    parser

    Lexical errors

    Parsing errors

    Parser

    output

    Program

    input

    tokens

  • Lex Input Lex specifications file Output C

    Lex Specifications file definitions%%lex regular expressions and actions%%User functions

  • Lex specifications file 1%%[0-9]+{ printf(INTEGER\n); }[-*/+]{ printf(OPERATOR\n); } 2%%[0-9]+{ return (INT); }[-*/+]{ return (OPR); }

  • Lex Definitions

    INTEGER [0-9]+%%{ INTEGER }printf(" Integer encountered\n");{ INTEGER } \. { INTEGER } {printf(" Real number encountered\n");}

  • Yacc specifications file Lex specifications file declarations%%yacc rules and actions%%User functions

  • Yacc specifications program Parser

  • %tokenINT OPR%start expr%%expt: INT OPR INT{printf("The input expression was syntactically correct\n");}|error{printf("The input expression was syntactically incorrect\n");}%%

  • Main program

    #include main( ){yyparse( );}#include "y.tab.c"#include "lex.yy.c"

  • Run Parser

    lex file-nameFile-name File Lex specifications Execute File lex.yy.c C program function yylex

    yacc file-nameFile-name File yacc specifications Execute File y.tab.c C program function yyparse

  • Run Parser

    a.out parser

    cc main.c -ly

    Compile Standard CC command Yacc library -ly option Lex library -ll option CC command line

    Yacc Lex libraries

    -ly Yacc library Error option

  • Table 1: Pattern Matching Primitives

    Metacharacter

    Matches

    .

    \n

    *

    +

    ?

    ^

    $

    a|b

    (ab)+

    "a+b"

    [ ]

    any character except newline

    newline

    zero or more copies of preceding expression

    one or more copies of preceding expression

    zero or one copy of preceding expression

    beginning of line

    end of line

    a or b

    one or more copies of ab (grouping)

    literal a+b

    character class

  • Table 2: Pattern Matching Examples

    Expression

    Matches

    abc

    abc*

    abc+

    a(bc)+

    a(bc)?

    [abc]

    [a-z]

    [a\-z]

    [-az]

    [A-Za-z0-9]+

    [ \t\n]+

    [^ab]

    [a^b]

    [a|b]

    a|b

    abc

    ab, abc, abcc, abccc,

    abc, abcc, abccc,

    abc, abcbc, abcbcbc,

    a, abc

    a, b, c

    any letter, a through z

    a, -, z

    -, a, z

    one or more alphanumeric characters

    whitespace

    anything except: a, b

    a, ^, b

    a, |, b

    a or b

  • Context Sensitivity

    Lex (Start condition) Left context sensitivity

    Left context sensitivity Token Token

    start condition Set Unset Token

    Start conditions Lex

    %start start1 start2 start3

  • start1, start2 start3 Start conditions

    Set start condition Statement BIGIN start1

    start1 Start condition

    Unset start condition Statement BEGIN 0;

    Match Regular expression Start condition < start1 > reg_expr{ actions }

  • Start Condition

    input $ (Dollar sign) Input $23 output Twenty three dollars

    Lex specifications file

  • %startTT ZZ DD TN%%\ $ { BEGIN TT; }< TT > [ 0-9 ] {tenth = yytext[0];BEGIN 0;Switch (yytext[0]){case '0': BEGIN ZZ; break;case '1' : BEGIN TN; break;default: BEGIN DD; break;}}

  • < ZZ > [ 0-9 ]{ BEGIN 0;Switch (yytext[0]){case '0': printf("Zero dollars"); break;case '1': printf("One dollars"); break;case '2': printf("Two dollars"); break;case '3': printf("Three dollars"); break;case '4': printf("Four dollars"); break;case '5': printf("Five dollars"); break;case '6': printf("Six dollars"); break;case '7': printf("Seven dollars"); break;case '8': printf("Eight dollars"); break;case '9': printf("Nine dollars"); break;}}

  • < TN > [ 0-9]{ BEGIN 0;Switch (yytext[0]){case '0': printf("Ten dollars"); break;case '1': printf("Eleven dollars"); break;case '2': printf("Twelve dollars"); break;case '3': printf("Thirteen dollars"); break;case '4': printf("Fourteen dollars"); break;case '5': printf("Fifteen dollars"); break;case '6': printf("Sixteen dollars"); break;case '7': printf("Seventeen dollars"); break;case '8': printf("Eighteen dollars"); break;case '9': printf("Nineteen dollars"); break;}}

  • < DD > [ 0-9 ]{BEGIN 0;Switch (tenth){case '2': printf("Twenty "); break;case '3': printf("Thirty "); break;case '4': printf("Forty "); break;case '5': printf("Fifty "); break;case '6': printf("Sixty "); break;case '7': printf("Seventy "); break;case '8': printf("Eighty "); break;case '9': printf("Ninety "); break;}

  • Switch (yytext[0]){case '0': printf("dollars"); break;case '1': printf("One dollars"); break;case '2': printf("Two dollars"); break;case '3': printf("Three dollars"); break;case '4': printf("Four dollars"); break;case '5': printf("Five dollars"); break;case '6': printf("Six dollars"); break;case '7': printf("Seven dollars"); break;case '8': printf("Eight dollars"); break;case '9': printf("Nine dollars"); break;}}

  • main function C

    #include < stdio.h>chartenth;main( ){while (1)yylex( );}yywrap( ){exit (0);}#include " lex.yy.c "

  • Predefined Pseudovariables

    Token Lexical analyzer Yacc

    Precefined pseudovariables Token $ (Dollar Sign) (Integer) Integer Terminal symbol Rule $$ Left-hand nontermal symbol result: VAR1 '+' VAR2 '+' VAR3{ $$ = $1 + $3+ $5; };

  • $1 VAR1, $2 Terminal symbol +,...$5 VAR3

    Nonterminal result

    Predefined pseudovariables (Parsing) (Arithmetic expressions)

  • Calculator

  • Lex specifications file> cat cal.l

    %{#include "y.tab.h"extern char yytext[];extern int yylval;%}%%[0-9]+ { yylval = atoi(yytext); return INTEGER; }()

  • ()[- +\n] return *yytext;[\t] ;. { exit(0); }%%yywrap(){ return 1; }

  • Yacc specifications file> cat cal.y | more

    %token INTEGER%%program:program expr '\n' {printf("%d\n",$2);} | ;expr: INTEGER {$$ = $1;} |expr '+' expr {$$ = $1 + $3 ;} |expr '-' expr {$$ = $1 - $3 ;} ;()

  • ()%%yyerror(){ exit(0);}main(){ yyparse(); return 0;}

  • Compile Run >yacc d cal.y> lex cal.l> cc lex.yy.c y.tab.c -o cal> ./cal3-1/* input */2/* */4-2+11/* */5+3+2-19/* */./* exit */>

  • REFERENCEUNIX UTILITIES, International Edition, R.S. Tare, McGraw-Hill, 1988

    A Compact Guide to LEX and YACC http://epaperpress.com/y_man.html,Access date Feb,4 2001