Programming Languages (Udacity, CS-262)

Student

Asim Ihsan

Unit 1

1.4: Breaking Up Strings

1.5: Selecting Substrings

"hello"[1:3] = "el"
"hello"[1:] = "ello"

1.6: Split

"Jane Eyre".split() = ["Jane", "Eyre"]

1.7: Regular Expressions

1.9: Import Re

import re
re.findall(r"[0-9]", "1+2==3") = ["1", "2", "3"]

1.12: Concatenation

r"[a-c][1-2]"

"a1", "a2", "b1", …

1.14: One or more
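
A sketch of + (one or more of the preceding) with re.findall; the input is made up:

```python
import re

# + is greedy: each maximal run of digits becomes one match.
runs = re.findall(r"[0-9]+", "13 in 1903")
print(runs)  # ['13', '1903']
```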

1.15: Finite State Machines

1.18: Disjunction

import re
r = r"[a-z]+|[0-9]+"
re.findall(r, "Goethe 1749") = ["oethe", "1749"]

Note that:

[0-2] = "0|1|2"

1.21: Options

import re
r = r"-?[0-9]+"
re.findall(r, "1861-1941 R. Tagore") = ["1861", "-1941"]

1.22: Escape Sequences

1.23: Hyphenation

r = r"[a-z]+-?[a-z]+"

1.26: Quoted Strings

1.27: Structure

(?:xyz)+
r = r"do+|re+|mi+"
r = r"(?:do|re|mi)+"

1.28: Escaping the escape

regexp = r'"(?:(?:\\.)*|[^\\])*"'

1.29: Representing a FSM

edges[(1, 'a')] = 2
accepting = [3]

1.30: FSM simulator

edges = {(1, 'a') : 2,
         (2, 'a') : 2,
         (2, '1') : 3,
         (3, '1') : 3}

accepting = [3]

def fsmsim(string, current, edges, accepting):
    if len(string) == 0:
        return current in accepting
    letter = string[0]
    next_state = edges.get((current, letter), None)
    if next_state is None:
        return False
    return fsmsim(string[1:], next_state, edges, accepting)
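
As a check, the machine above accepts one or more 'a's followed by one or more '1's (i.e. a+1+). The definitions are restated so this snippet runs on its own:

```python
def fsmsim(string, current, edges, accepting):
    # Simulate a deterministic FSM: consume one letter per step.
    if len(string) == 0:
        return current in accepting
    next_state = edges.get((current, string[0]), None)
    if next_state is None:
        return False
    return fsmsim(string[1:], next_state, edges, accepting)

edges = {(1, 'a'): 2,
         (2, 'a'): 2,
         (2, '1'): 3,
         (3, '1'): 3}
accepting = [3]

print(fsmsim("aa111", 1, edges, accepting))  # True
print(fsmsim("a1a", 1, edges, accepting))    # False
```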

1.31: FSM Interpretation

edges = {(1, 'q'): 1}        
accepting = [1]

1.32: More FSM Encoding

r"[a-b][c-d]?"

edges = {(1, 'a'): 2,
         (1, 'b'): 2,
         (2, 'c'): 3,
         (2, 'd'): 3}

accepting = [2, 3]

1.34: Epsilon and Ambiguity

1.35: Phone It In

regexp = r'[0-9]+(?:[0-9]|-[0-9])*'

1.37: Non-deterministic FSM

1.38: Save the world

Problem Set 1

s1 = "12+34"
fsmsim() for '[0-9]+'.

call fsmsim("1"), it matches.
call fsmsim("2"), it matches.
call fsmsim("12+"), it doesn't match. Hence one 'token' is '12', and advance input to '3'.

call fsmsim("3"), it matches.
call fsmsim("4"), it matches.
end of string.

result is ["12", "34"].
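
The walkthrough above — grow the match until fsmsim fails, emit the longest accepted prefix, repeat — can be sketched as follows. tokenize is a hypothetical helper, not from the course; fsmsim is restated so the snippet runs on its own:

```python
def fsmsim(string, current, edges, accepting):
    if len(string) == 0:
        return current in accepting
    next_state = edges.get((current, string[0]), None)
    if next_state is None:
        return False
    return fsmsim(string[1:], next_state, edges, accepting)

# FSM for [0-9]+.
digit_edges = dict([((1, d), 2) for d in "0123456789"] +
                   [((2, d), 2) for d in "0123456789"])
digit_accepting = [2]

def tokenize(s):
    tokens = []
    while s:
        best = None
        for end in range(1, len(s) + 1):
            if fsmsim(s[:end], 1, digit_edges, digit_accepting):
                best = end  # remember the longest accepted prefix
        if best is None:
            s = s[1:]  # no token starts here (e.g. '+'); skip it
        else:
            tokens.append(s[:best])
            s = s[best:]
    return tokens

print(tokenize("12+34"))  # ['12', '34']
```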

Unit 2

2.1: Introduction

2.3: Specification

2.8: Taking HTML Apart

2.9: HTML Structure

2.10: Specifying tokens

def t_RANGLE(token):
    r'>' # I am a regexp!
    return token # return text unchanged, but can transform it.
     
def t_LANGLESLASH(token):
    r'</'
    return token

2.11: Token values

def t_NUMBER(token):
    r'[0-9]+'
    token.value = int(token.value)
    return token

2.12: Quoted Strings

def t_STRING(token):
    r'"[^"]*"'
    return token

2.13: Whitespace

def t_WHITESPACE(token):
    r' '
    pass

And if we define a word as any number of characters except <, >, or space, leaving the value unchanged:

def t_WORD(token):
    r'[^<> ]+'
    return token

2.14: Lexical Analyzer

2.15: Ambiguity

2.17: String snipping

def t_STRING(token):
    r'"[^"]*"'
    token.value = token.value[1:-1]
    return token

Making a lexer

import ply.lex as lex

tokens = (
    'LANGLE',        # <
    'LANGLESLASH',   # </
    'RANGLE',        # >
    'EQUAL',         # =
    'STRING',        # ".."
    'WORD'           # dada
)   

t_ignore = ' ' # shortcut for whitespace

# note this is before t_LANGLE, want it to win
def t_LANGLESLASH(token):
    r'</'
    return token
    
def t_LANGLE(token):
    r'<'
    return token
    
def t_RANGLE(token):
    r'>'
    return token
    
def t_EQUAL(token):
    r'='
    return token
   
def t_STRING(token):
    r'"[^"]*"'
    token.value = token.value[1:-1]
    return token
    
def t_WORD(token):
    r'[^ <>]+'
    return token
    
webpage = "This is <b>my</b> webpage!"
htmllexer = lex.lex()
htmllexer.input(webpage)
while True:
    tok = htmllexer.token()
    if not tok: break
    print tok

2.19: Tracking line numbers

def t_newline(token):
    r'\n'
    token.lexer.lineno += 1
    pass

2.21: Commented HTML

How to add to lexer.

states = (
    ('htmlcomment', 'exclusive'),
)

If we are in the state htmlcomment we cannot be doing anything else at the same time, like looking for strings or words.

def t_htmlcomment(token):
    r'<!--'
    token.lexer.begin('htmlcomment')
    
def t_htmlcomment_end(token):
    r'-->'
    token.lexer.lineno += token.value.count('\n')
    token.lexer.begin('INITIAL')
    
def t_htmlcomment_error(token):
    token.lexer.skip(1)

2.26: Identifier

def t_identifier(token):
    r'[A-Za-z][A-Za-z0-9_]*' # * not +, so single-letter identifiers match
    return token

2.27: Number

def t_NUMBER(token):
    r'-?[0-9]+(?:\.[0-9]*)?'
    token.value = float(token.value)
    return token

2.28: The End Of The Line

Comments to the end of the line in JavaScript.

def t_eolcomment(token):
    r'//[^\n]*'
    pass

2.30: Wrap Up

Office Hours 2

Unit 3: Syntactic Analysis

3.2: Bags of Words

3.3: Syntactic Structure

3.5: Infinity and Beyond

3.7: An Arithmetic Grammar

3.8: Syntactical analysis

3.9: Statements

    Stmt -> identifier = Exp
    Exp -> Exp + Exp
    Exp -> Exp - Exp
    Exp -> number

    lata = 1, good
    lata = lata + 1, bad (Exp never derives an identifier)

3.10: Optional Parts

3.11: More Digits

3.12: Grammars and Regexps

3.13: Context-Free Languages

3.14: Parentheses

3.16: Intuition

3.18: Extracting Information

3.20: Ambiguity

3.21: To The Rescue

3.23: Grammar for HTML

    <b>Welcome to <i>my</i> webpage!</b>
    
    Html -> Element Html
    Html -> \epsilon
    Element -> word
    Element -> TagOpen Html TagClose
    TagOpen -> < word >
    TagClose -> </ word >

3.25: Revenge of JavaScript

def absval(x):
    if x < 0:
        return 0 - x
    else:
        return x
function absval(x) {
    if (x < 0) {
        return 0 - x;
    } else {
        return x;
    }
}
print "hello" + "!"

3.27: Universal Meaning

Partial grammar for JavaScript:

    Exp -> identifier
    Exp -> TRUE
    Exp -> FALSE
    Exp -> number
    Exp -> string
    Exp -> Exp + Exp
    Exp -> Exp - Exp
    Exp -> Exp * Exp
    Exp -> Exp / Exp
    Exp -> Exp < Exp
    Exp -> Exp == Exp
    Exp -> Exp && Exp
    Exp -> Exp || Exp
    

3.29: JavaScript Grammar

3.31: JavaScript Functions

3.33: Lambda

3.34: List Power

def mysquare(x): return x*x
map(mysquare, [1,2,3,4,5]) # = [1,4,9,16,25]

map(lambda x: x*x, [1,2,3,4,5]) # same!

[x*x for x in [1,2,3,4,5]] # same!

3.37: Generators

def odds_only(numbers):
    for n in numbers:
        if n % 2 == 1:
            yield n
[x for x in [1,2,3,4,5] if x % 2 == 1]
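
For example (odds_only restated so this runs on its own):

```python
def odds_only(numbers):
    # A generator: yields values lazily instead of building a list.
    for n in numbers:
        if n % 2 == 1:
            yield n

print(list(odds_only([1, 2, 3, 4, 5])))            # [1, 3, 5]
print([x for x in [1, 2, 3, 4, 5] if x % 2 == 1])  # the same list
```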

3.39: Checking Valid Strings

grammar = … (as above)

def expand(tokens, grammar):
    for i, token in enumerate(tokens):
        for (rule_lhs, rule_rhs) in grammar:
            if token == rule_lhs:
                result = tokens[0:i] + rule_rhs + tokens[i+1:]
                yield result

depth = 2
utterances = [["exp"]]
for x in xrange(depth):
    for sentence in utterances:
        utterances = utterances + [ i for i in expand(sentence, grammar)]

for sentence in utterances:
    print sentence

Office Hours 3

Unit 4

4.1: Introduction

4.2: Time Flies

4.3: Brute Force

4.4: Fibonacci numbers

def memofibo(n, chart = None):
    if chart is None:
        chart = {}
    if n <= 2:
        chart[n] = 1
    if n not in chart:
        chart[n] = memofibo(n-1, chart) + memofibo(n-2, chart)
    return chart[n]
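
A quick sanity check (memofibo restated so the snippet runs on its own; the chart means each n is computed only once, so large n stays fast):

```python
def memofibo(n, chart=None):
    if chart is None:
        chart = {}
    if n <= 2:
        chart[n] = 1
    if n not in chart:
        chart[n] = memofibo(n - 1, chart) + memofibo(n - 2, chart)
    return chart[n]

print(memofibo(10))   # 55
print(memofibo(100))  # fast, thanks to memoization
```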

4.8: Memoization for Parsing

4.9: Parsing state

4.10: Possible States

    Input: 1 +
    State: E -> 1 + <dot> E
    

4.11: Charting Parse States

4.14: Magical Power

4.16: Building the chart

4.17: Closure



4.18: Computing the Closure

Suppose:

    E -> E - E
    E -> (F)
    E -> int
    F -> string
    
    Input: int - int
    Seen 2 tokens so far
 
    chart[2] has E -> E - <dot> E, from 0

Then the result of computing the closure:

    E -> <dot> int from 2
    E -> <dot> (F) from 2
    E -> <dot> E - E from 2
    

The following are not in the result:

    E -> <dot> E - E from 0 # wrong from
    F -> <dot> string from 2 # wrong LHS

4.19: Consuming the Input

4.21: Reduction

    x -> ab <dot> cd
    

4.23: Reduction Walkthrough

    E -> E + E <dot> from B in chart[A]
    

4.25: Addtochart

Adding state to chart:

def addtochart(chart, index, state):
    if state not in chart[index]:
        chart[index] = [state] + chart[index]
        return True
    else:
        return False

4.26: Revenge of List Comprehensions

Grammar:

    S -> P
    P -> (P)
    P ->
    

In Python:

grammar = [
    ("S", ["P"]),
    ("P", ["(", "P", ")"]),
    ("P", []),
] 

Parser state:

    X -> ab<dot>cd from j
    

In Python:

state = ("x", ["a", "b"], ["c", "d"], j)

4.27: Writing the closure

def closure(grammar, i, x, ab, cd, j):
    next_states = [
        (rule[0], [], rule[1], i)
        for rule in grammar
        if len(cd) > 0 and
           rule[0] == cd[0]
    ]
    return next_states
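
Applying this to the 4.18 example, with the grammar and state encoded as in 4.26 (closure restated so the snippet runs on its own):

```python
def closure(grammar, i, x, ab, cd, j):
    # Bring in fresh <dot>-at-the-start states for every rule whose
    # left-hand side is the symbol just after the dot.
    return [(rule[0], [], rule[1], i)
            for rule in grammar
            if len(cd) > 0 and rule[0] == cd[0]]

grammar = [("E", ["E", "-", "E"]),
           ("E", ["(", "F", ")"]),
           ("E", ["int"]),
           ("F", ["string"])]

# chart[2] has E -> E - <dot> E, from 0
new_states = closure(grammar, 2, "E", ["E", "-"], ["E"], 0)
print(new_states)
# Three E rules come in "from 2"; the F rule does not (wrong LHS).
```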

4.29: Writing shift

next_state = shift(tokens, i, x, ab, cd, j)
if next_state is not None:
    any_changes = addtochart(chart, i+1, next_state) \
                  or any_changes
def shift(tokens, i, x, ab, cd, j):
    if len(cd) > 0 and tokens[i] == cd[0]:
        return (x, ab + [cd[0]], cd[1:], j)
    else:
        return None

4.30: Writing reductions

next_states = reductions(chart, i, x, ab, cd, j)
for next_state in next_states:
    any_changes = addtochart(chart, i, next_state) \
                  or any_changes
                  

def reductions(chart, i, x, ab, cd, j):
    # x -> ab<dot> from j
    # chart[j] has y -> ... <dot>x ... from k
    return [
        (jstate[0],
         jstate[1] + [x],
         jstate[2][1:],
         jstate[3])
        for jstate in chart[j]
        if len(cd) == 0 and
           len(jstate[2]) > 0 and
           jstate[2][0] == x
    ]

4.31: Putting it together

# see notes/src/programming_languages/ps4_parser.py
# above has closure, shift, and reductions in-lined.

def parse(tokens, grammar):
    tokens = tokens + ["end_of_input_marker"]
    chart = {}
    start_rule = grammar[0]
    for i in xrange(len(tokens) + 1):
        chart[i] = []
    start_state = (start_rule[0], [], start_rule[1], 0)
    chart[0] = [start_state]
    for i in xrange(len(tokens)):
        while True:
            changes = False
            for state in chart[i]:
                # State === x -> ab<dot>cd, j
                (x, ab, cd, j) = state
                
                # Current state == x -> ab<dot>cd, j
                # Option 1: For each grammar rule
                # c -> pqr (where the c's match)
                # make a next state:
                #
                # c -> <dot>pqr, i
                #
                # English: We're about to start
                # parsing a "c", but "c" may be
                # something like "exp" with its
                # own production rules. We'll bring
                # those production rules in.
                next_states = closure(grammar, i, x, ab, cd, j)
                for next_state in next_states:
                    changes = addtochart(chart, i, next_state) or changes
                    

                # Current State == x -> ab<dot>cd, j
                # Option 2: If tokens[i] == c,
                # make a next state:
                #
                # x -> abc<dot>d, j
                #
                # English: We're looking for a parse
                # token c next and the current token
                # is exactly c! Aren't we lucky!
                # So we can parse over it and move
                # to j+1.
                next_state = shift(tokens, i, x, ab, cd, j)
                if next_state is not None:
                    changes = addtochart(chart, i+1, next_state) or changes
                    
                # Current state == x -> ab<dot>cd, j
                # Option 3: if cd is [], the state is
                # just x -> ab<dot>, j
                # For each p -> q<dot>xr, l in chart[j]
                # Make a new state:
                #
                # p -> qx<dot>r, l
                #
                # in chart[i].
                #
                # English: We've just finished parsing
                # an "x" with this token, but that
                # may have been a sub-step (like
                # matching "exp->2" in "2+3"). We
                # should update the higher-level
                # rules as well.
                next_states = reductions(chart, i, x, ab, cd, j)
                for next_state in next_states:
                    changes = addtochart(chart, i, next_state) or changes
                    
            if not changes:
                break

    accepting_state = (start_rule[0], start_rule[1], [], 0)
    return accepting_state in chart[len(tokens)-1]

result = parse(tokens, grammar)
print result

4.33: Parse Trees

# tokens
def t_STRING(t):
    r'"[^"]*"'
    t.value = t.value[1:-1]
    return t
    
# parsing rules
def p_exp_number(p):
    'exp : NUMBER' # exp -> NUMBER
    p[0] = ("number", p[1])
    # p[0] is returned parse tree
    # p[0] refers to exp
    # p[1] refers to NUMBER.
    
def p_exp_not(p):
    'exp : NOT exp' # exp -> NOT exp
    p[0] = ("not", p[2])
    # p[0] refers to exp
    # p[1] refers to NOT
    # p[2] refers to exp

4.34: Parsing HTML

def p_html(p):
    'html : elt html'
    p[0] = [p[1]] + p[2]
    
def p_html_empty(p):
    'html : '
    p[0] = []
    
def p_elt_word(p):
    'elt : WORD'
    p[0] = ("word-element", p[1])

4.35: Parsing tags

def p_elt_tag(p):
    # <span color="red">Text!</span>:
    'elt : LANGLE WORD tag_args RANGLE html LANGLESLASH WORD RANGLE'
    p[0] = ("tag-element", p[2], p[3], p[5], p[7])

4.36: Parsing JavaScript

def p_exp_binop(p):
    """exp : exp PLUS exp
           | exp MINUS exp
           | exp TIMES exp"""
    p[0] = ("binop", p[1], p[2], p[3])
def p_exp_call(p):
    'exp : IDENTIFIER LPAREN optargs RPAREN'
    p[0] = ("call", p[1], p[3])
def p_exp_number(p):     
    'exp : NUMBER'
    p[0] = ("number", p[1])

4.38: Precedence

precedence = (
    # lower precedence at the top
    ('left', 'PLUS', 'MINUS'),
    ('left', 'TIMES', 'DIVIDE'),
    # higher precedence at the bottom 
)

4.41: Optional Arguments

def p_exp_call(p):
    'exp : IDENTIFIER LPAREN optargs RPAREN'
    p[0] = ("call", p[1], p[3])
    
def p_exp_number(p):
    'exp : NUMBER'
    p[0] = ("number", p[1])
    
def p_optargs(p):
    """optargs : exp COMMA optargs 
               | exp
               | """
    if len(p) == 1:
        p[0] = []
    elif len(p) == 2:
        p[0] = [p[1]]
    else:
        p[0] = [p[1]] + p[3]
        
# or can separate out parsing rules in OR statement
# into its own function. separate rules give better
# performance, as the parser has done all of your 
# len() work for you.

Office Hours 4

Problem Set 4

Problem 1: Parsing States

What is in chart[2], given:

    S -> id(OPTARGS)
    OPTARGS ->
    OPTARGS -> ARGS
    ARGS -> exp,ARGS
    ARGS -> exp

    input: id(exp,exp)

    chart[0]
        S -> <dot>id(OPTARGS)$, from 0
        
    chart[1]
        # shift
        S -> id<dot>(OPTARGS)$, from 0
        
    chart[2]
        # shift
        S -> id(<dot>OPTARGS)$, from 0
        
        # OPTARGS could be epsilon, hence
        # in one world:
        S -> id(OPTARGS<dot>)$, from 0
        
        # In another world we see OPTARGS
        # and it isn't epsilon, so we closure.
        OPTARGS -> <dot>ARGS, from 2
        OPTARGS -> <dot>, from 2
        
        # !!AI I think by recursion we apply closure to ARGS; reminiscent of epsilon-closure during NFA->DFA conversion.
        ARGS -> <dot>exp,ARGS from 2
        ARGS -> <dot>exp from 2

Unit 5

5.1: Formal Semantics

5.2: Interpreters

5.3: Syntax vs. Semantics

5.4: Bad Programs

5.5: Types

5.7: HTML interpreter

5.8: Graphics

graphics.word(string)
# draw on screen

graphics.begintag(string, dictionary)
# doesn't draw, just makes a note. like changing pen colours.
# dictionary passes in attributes, e.g. href.

graphics.endtag()
# most recent tag.

graphics.warning(string)
# debugging, in bold red color.
import graphics

def interpret(trees): # Hello, friend
    for tree in trees: # Hello,
        # ("word-element","Hello")
        nodetype=tree[0] # "word-element"
        if nodetype == "word-element":
            graphics.word(tree[1])
        elif nodetype == "tag-element":
            # <b>Strong text</b>
            tagname = tree[1] # b
            tagargs = tree[2] # []
            subtrees = tree[3] # ...Strong Text!...
            closetagname = tree[4] # b
            # QUIZ: (1) check that the tags match
            # if not use graphics.warning()
            if tagname != closetagname:
                graphics.warning("Mismatched tag. start: '%s', end: '%s'" % (tagname, closetagname))
            else:
                #  (2): Interpret the subtree
                # HINT: Call interpret recursively
                graphics.begintag(tagname, {})
                interpret(subtrees)
                graphics.endtag()

5.10: Arithmetic

def eval_exp(tree):
    # ("number" , "5")
    # ("binop" , ... , "+", ... )
    nodetype = tree[0]
    if nodetype == "number":
        return int(tree[1])
    elif nodetype == "binop":
        left_child = tree[1]
        operator = tree[2]
        right_child = tree[3]
        # QUIZ: (1) evaluate left and right child
        left_value = eval_exp(left_child)
        right_value = eval_exp(right_child)
        
        # (2) perform "operator"'s work
        assert(operator in ["+", "-"])
        if operator == "+":
            return left_value + right_value
        elif operator == "-":
            return left_value - right_value

5.12: Context

def env_lookup(environment, variable_name):
    ...
def eval_exp(tree, environment):
    nodetype = tree[0]
    if nodetype == "number":
        return int(tree[1])
    elif nodetype == "binop":
        # ...
    elif nodetype == "identifier":
        # ("binop", ("identifier","x"), "+", ("number","2"))
        # QUIZ: (1) find the identifier name
        # (2) look it up in the environment and return it
        return env_lookup(environment, tree[1])

5.15: Control Flow

def eval_stmts(tree, environment):
    stmttype = tree[0]
    if stmttype == "assign":
        # ("assign", "x", ("binop", ..., "+",  ...)) <=== x = ... + ...
        variable_name = tree[1]
        right_child = tree[2]
        new_value = eval_exp(right_child, environment)
        env_update(environment, variable_name, new_value)
    elif stmttype == "if-then-else": # if x < 5 then A;B; else C;D;
        conditional_exp = tree[1] # x < 5
        then_stmts = tree[2] # A;B;
        else_stmts = tree[3] # C;D;
        # QUIZ: Complete this code
        # Assume "eval_stmts(stmts, environment)" exists
        if eval_exp(conditional_exp, environment):
            return eval_stmts(then_stmts, environment)
        else:
            return eval_stmts(else_stmts, environment)

5.17: Creating an Environment

    Python:
        x = 0
        print x + 1

    JavaScript:
        var x = 0
        write(x+1)

5.18: Scope

5.19: Identifiers and storage

5.20: Environments

5.22: Chained environments

  1. Create a new environment.
  2. Create storage places in the new environment for each formal parameter.
  3. Fill in these places with the values of the actual arguments.
  4. Evaluate the function body in the new environment.

5.23: Greetings

5.24: Environment Needs

def env_lookup(var_name, env):
    # env = (parent, dictionary)
    if var_name in env[1]:
        # do we have it?
        return (env[1])[var_name]
    elif env[0] is None:
        # am global?
        return None
    else:
        # ask parents
        return env_lookup(var_name, env[0])

def env_update(var_name, value, env):
    if var_name in env[1]:
        # do we have it?
        (env[1])[var_name] = value
    elif not (env[0] is None):
        # if not global, ask parents.
        env_update(var_name, value, env[0])
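
A quick demonstration of the chained environments (both helpers restated so the snippet runs on its own; the variable names are made up):

```python
def env_lookup(var_name, env):
    # env = (parent, dictionary)
    if var_name in env[1]:
        return (env[1])[var_name]
    elif env[0] is None:
        return None            # global frame, not found
    else:
        return env_lookup(var_name, env[0])

def env_update(var_name, value, env):
    if var_name in env[1]:
        (env[1])[var_name] = value
    elif not (env[0] is None):
        env_update(var_name, value, env[0])

global_env = (None, {"a": 1})
local_env = (global_env, {"b": 2})

print(env_lookup("a", local_env))   # 1, found via the parent pointer
env_update("a", 10, local_env)      # updates the global binding
print(env_lookup("a", global_env))  # 10
```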

5.25: Declaring and Calling Functions

def mean(x):
    return x
    print "one thousand and one nights"

5.26: Catching Errors

def eval_stmt(tree, environment):
    stmttype = tree[0]
    if stmttype == "return":
        return_exp = tree[1] # return 1 + 2
        retval = eval_exp(return_exp, environment)
        raise Exception(retval)
def eval_stmt(tree,environment):
    stmttype = tree[0]
    if stmttype == "call": # ("call", "sqrt", [("number","2")])
        fname = tree[1] # "sqrt"
        args = tree[2] # [ ("number", "2") ]
        fvalue = env_lookup(fname, environment)
        if fvalue[0] == "function":
            # We'll make a promise to ourselves:
            # ("function", params, body, env)
            fparams = fvalue[1] # ["x"]
            fbody = fvalue[2]
            fenv = fvalue[3]
            if len(fparams) <> len(args):
                print "ERROR: wrong number of args"
            else:
                #QUIZ: Make a new environment frame
                newfenv = (fenv, {})
                for param, value in zip(fparams, args):
                    newfenv[1][param] = None
                    eval_value = eval_exp(value, environment)
                    env_update(param, eval_value, newfenv)
                try:
                    # QUIZ : Evaluate the body
                    eval_stmts(fbody, newfenv)
                    return None
                except Exception as return_value:
                    return return_value
        else:
            print  "ERROR: call to non-function"
    elif stmttype == "return": 
        retval = eval_exp(tree[1],environment) 
        raise Exception(retval) 
    elif stmttype == "exp": 
        eval_exp(tree[1],environment) 

5.29: Calling functions

def eval_elt(tree, env):
    elttype = tree[0]
    if elttype == "function":
        fname = tree[1]
        fparams = tree[2]
        fbody = tree[3]
        fvalue = ("function", fparams, fbody, env)
        add_to_env(env, fname, fvalue)

5.31: Double-edged sword

5.33: Comparing Languages

x = 0
while True:
    x = x + 1
print x

5.34: Infinite Loop

def tsif():
    if halts(tsif):
        x = 0
        while True:
            x = x + 1
    else:
        return 0

Office Hours 5

Unit 6

Fitting Them Together

def t_javascript(token):
    r'\<script\ type=\"text\/javascript\"\>'
    token.lexer.code_start = token.lexer.lexpos
    token.lexer.begin('javascript')

    # note that lexpos is such that we've already
    # stripped off the initial text/javascript part.

def t_javascript_end(token):
    r'\<\/script\>' # </script>
    token.value = token.lexer.lexdata[token.lexer.code_start:token.lexer.lexpos-9]
    token.type = 'JAVASCRIPT'
    token.lexer.lineno += token.value.count('\n')
    token.lexer.begin('INITIAL')
    return token

    # note that lexdata is such that we need to
    # manually strip off </script> 

Extending our HTML grammar

def p_element_word(p):
    'element : WORD'
    p[0] = ("word-element", p[1])

    # p[0] is the parse tree
    # p[1] is the child parse tree
def p_element_javascript(p):
    'element : JAVASCRIPT'
    p[0] = ("javascript-element", p[1])
HTML input:

    hello my
    <script type="text/javascript">document.write(99);</script>
    luftballons

Parse tree:

[("word-element", "hello"),
 ("word-element", "my"),
 ("javascript-element", "document.write(99)"),
 ("word-element", "luftballons")]

Calling the Interpreter

def interpret(trees):
    for tree in trees:
        treetype = tree[0]
        if treetype == "word-element":
            graphics.word(tree[1])

        # covered HTML tags in another quiz...

        elif treetype == "javascript-element":
            jstext = tree[1] # "document.write(55);"

            # jstokens is an external module
            jslexer = lex.lex(module=jstokens)

            # jsgrammar is another external module
            jsparser = yacc.yacc(module=jsgrammar)

            # jstree is a parse tree for JavaScript
            jstree = jsparser.parse(jstext, lexer=jslexer)

            # We want to call the interpreter on our AST
            result = jsinterp.interpret(jstree)
            graphics.word(result) 

Evil problem

JavaScript output

def interpret(trees):
    # recall env = (parent, dictionary), and as this is the global environment the parent pointer is None
    global_env = (None, {"javascript output": ""})
    for elt in trees:
        eval_elt(elt, global_env)
    return (global_env[1])["javascript output"]

Updating output

def eval_exp(tree, env):
    exptype = tree[0]
    if exptype == "call":
        fname = tree[1] # myfun in myfun(a,3+4)
        fargs = tree[2] # [a,3+4] in myfun(a,3+4)
        fvalue = env_lookup(fname,env) # None for "write"; built-in
        if fname == "write":
            argval = eval_exp(fargs[0],env)
            output_sofar = env_lookup("javascript output",env)
            env_update("javascript output", \
                output_sofar + str(argval), env)
            return None

Counting Frames

function factorial(n) {
    if (n == 0) {
        return 1;
    }
    return n * factorial(n-1);
}
document.write(1260 + factorial(6));

Debugging

Testing

Testing in depth

def env_lookup(vname,env):
    # env = (parent-poiner, {"x": 22, "y": 33})
    if vname in env[1]:
        return (env[1])[vname]
    else: # BUG
        return None # BUG 
var a = 1;
function mistletoe(baldr) {
    baldr = baldr + 1;
    a = a + 2;
    baldr = baldr + a;
    return baldr;
}
write(mistletoe(5));

Testing at Mozilla

Anonymous functions

greeting = "hola"
def makegreeter(greeting):
    def greeter(person):
        print greeting + " " + person
    return greeter
sayhello = makegreeter("hello")
sayhello("gracie")
var greeting = "hola";
function makegreeter(greeting) {
    var greeter = function(person) {
        write(greeting + " " + person);
    }
    return greeter;
}
var sayhello = makegreeter("hello");
def eval_exp(tree,env):
    exptype = tree[0]
    # function(x,y) { return x+y }
    if exptype == "function":
        # ("function", ["x","y"], [ ("return", ("binop", ...) ])
        fparams = tree[1]
        fbody = tree[2]
        return ("function", fparams, fbody, env)
        # "env" allows local functions to see local variables
        # can see variables that were in scope *when the function was defined*

Mistakes in anonymous functions

return ("function", fparams, fbody, global_env)

Optimization

function factorial(n) {
    if (n == 0) { return 1; }
    return 1 * n * factorial(n-1);
}

Implementing Optimizations

  1. Think of optimizations

    x * 1 == x
    x + 0 == x
  2. Transform parse tree

Optimizing Timing

def optimize(tree):
    etype = tree[0]
    if etype == "binop": # a * 1 = a
        a = tree[1]
        op = tree[2]
        b = tree[3]
        if op == "*" and b == ("number","1"):
            return a
        elif op == "*" and b == ("number","0"):
            return ("number","0")
        elif op == "+" and b == ("number","0"):
            return a
    return tree

i.e. this:

("binop",
    ("number", "5"),
    ("*"),
    ("number", "1")
)

becomes:

("number", "5")

Rebuilding the Parse Tree

  1. Recursive calls
  2. Look for patterns.
  3. Done
def optimize(tree): # Expression trees only
    etype = tree[0]
    if etype == "binop":
        # Fix this code so that it handles a + ( 5 * 0 )
        # recursively! QUIZ!
        a = optimize(tree[1])
        op = tree[2]
        b = optimize(tree[3])
        if op == "*" and b == ("number","1"):
            return a
        elif op == "*" and b == ("number","0"):
            return ("number","0")
        elif op == "+" and b == ("number","0"):
            return a
        return ("binop", a, op, b) # return optimized tree, not original
    return tree

Bending the Rules

Wrap up

Unit 7 - The Final unit

The List

Regular expressions

Context-Free Grammars

Security

Parsing states

S -> aSb
S -> \epsilon
S -> c

Input: acb

What parsing states are in chart[2]?

Interpretation and Evaluation

Optimization

What next?

References

Unit 1