INTRODUCTION
Creating a compiler from scratch is one of the most rewarding exercises in computer science. It demystifies how programming languages work and provides deep insights into language design, parsing theory, and code execution. In this article, we will build a complete recursive descent compiler for a minimal but functional programming language. Our language will support mathematical expressions, function definitions, control flow constructs including for loops and if-then-else statements, and three basic data types: integers, floats, and strings.
The journey of compilation involves several distinct phases. First, we transform raw source code into a stream of tokens through lexical analysis. Next, we parse these tokens according to our language's grammar rules to build an Abstract Syntax Tree, commonly abbreviated as AST. Finally, we traverse this tree structure to execute the program. Each phase builds upon the previous one, creating a pipeline that transforms human-readable code into executable instructions.
Our mini language will be simple enough to understand completely yet powerful enough to write meaningful programs. We will implement a print function that supports formatted output, allowing us to concatenate strings with numeric values. For instance, the expression print("Result: " + {42.5}) will output "Result: 42.5" to the console. String operations will be limited to concatenation, keeping the implementation straightforward while still being useful.
To make our language truly impressive, we will integrate a comprehensive mathematical library including trigonometric functions such as sine, cosine, and tangent, along with their inverse counterparts. We will add logarithmic functions for both natural and base-10 logarithms, exponential functions, square root calculations, and general power operations. We will also include logical operations for boolean algebra and a binomial coefficient function for combinatorial calculations. These additions transform our simple language into a capable tool for scientific and mathematical computation.
Throughout this article, we will examine each component in detail, explaining not just what the code does but why it works that way. We will use a running example to illustrate concepts, showing how a simple program flows through each compilation phase. By the end, you will have a complete, working compiler that you can extend and modify for your own purposes.
LANGUAGE SPECIFICATION
Before we begin implementation, we must precisely define our language. A clear specification prevents ambiguity and guides our design decisions. Our mini language, which we will call MiniLang, has a straightforward syntax inspired by common programming languages but simplified to focus on core concepts.
The language supports three primitive data types. Integers represent whole numbers and can be positive or negative. Floats represent decimal numbers with fractional parts. Strings are sequences of characters enclosed in double quotes. Type conversions happen implicitly in certain contexts, particularly when concatenating values for output.
Variables in MiniLang are dynamically typed, meaning a variable can hold any type of value and can change types during execution. Variable names must start with a letter or underscore and can contain letters, digits, and underscores. Assignment uses the equals sign, as in x = 42 or name = "Alice".
Mathematical expressions support the standard arithmetic operators: addition, subtraction, multiplication, division, and modulus. Operator precedence follows conventional mathematics, with multiplication and division having higher precedence than addition and subtraction. Parentheses can override precedence. Comparison operators include less than, greater than, less than or equal, greater than or equal, equality, and inequality.
The mathematical library provides a rich set of built-in functions. Trigonometric functions include sin, cos, tan for computing sine, cosine, and tangent of angles in radians. Their inverse functions asin, acos, atan compute arc sine, arc cosine, and arc tangent respectively. The sqrt function computes square roots. The pow function raises a number to a power, as in pow(2, 8) which computes two to the eighth power. The log function computes natural logarithms while log10 computes base-10 logarithms. The exp function computes e raised to a power. The abs function returns absolute values.
Logical operations include the standard boolean operators: and for logical conjunction, or for logical disjunction, and not for logical negation. These operators work with boolean values but also follow truthiness rules where zero, empty strings, and null are considered false while all other values are true.
The binomial function computes binomial coefficients, also known as combinations or n-choose-k. The expression binomial(5, 2) computes the number of ways to choose 2 items from 5, which equals 10. This function is essential for probability calculations and combinatorial problems.
Function definitions use the syntax func name(param1, param2) { body }. Functions can take zero or more parameters and return a value using the return statement. If no return statement is executed, the function returns an implicit null value.
Control flow constructs include if-then-else statements and for loops. The if statement evaluates a condition and executes a block of code if the condition is true, with an optional else clause for the false case. The for loop has the syntax for(initialization; condition; increment) { body }, similar to C-style languages. While loops are also supported with the syntax while(condition) { body }.
The print function is a built-in feature that outputs text to the console. It accepts a single argument which can be a string, a number, or an expression that evaluates to one of these types. String concatenation uses the plus operator. To embed numeric values in strings, we use curly braces, as in "Value: " + {x}.
Comments begin with two forward slashes and extend to the end of the line. Whitespace is generally ignored except where it separates tokens. Statements are terminated by semicolons, though our implementation will be somewhat forgiving about their omission in certain contexts.
TOKENIZATION: LEXICAL ANALYSIS
The first phase of compilation is lexical analysis, also called tokenization or scanning. This process reads the raw source code character by character and groups characters into meaningful units called tokens. A token represents the smallest meaningful element of the language, such as a keyword, identifier, operator, or literal value.
Consider the simple statement x = 42 + y. The tokenizer breaks this into five tokens: an identifier "x", an equals operator, an integer literal "42", a plus operator, and another identifier "y". Each token has a type and a value. The identifier tokens have type IDENTIFIER with values "x" and "y". The integer literal has type INTEGER with value 42. The operators have types EQUALS and PLUS respectively.
The tokenizer maintains a position in the source code and advances through it, examining each character to determine what token it belongs to. When it encounters whitespace, it typically skips over it, though whitespace does serve to separate tokens. When it finds a digit, it continues reading digits (and possibly a decimal point for floats) until it has consumed the entire number. When it finds a letter, it reads letters and digits to form an identifier or keyword.
Let us examine a concrete implementation of the tokenizer. We will define a Token class to represent individual tokens and a Tokenizer class to perform the lexical analysis.
class Token:
def __init__(self, token_type, value, line, column):
self.type = token_type
self.value = value
self.line = line
self.column = column
def __repr__(self):
return f"Token({self.type}, {self.value}, {self.line}:{self.column})"
The Token class stores four pieces of information. The type indicates what kind of token this is, such as INTEGER, IDENTIFIER, or PLUS. The value holds the actual content, like the number 42 or the variable name "x". The line and
column numbers help with error reporting, allowing us to tell the user exactly where a problem occurred in their source code.
The Tokenizer class maintains the source code and current position. It provides methods to peek at the current character, advance to the next character, and skip whitespace. The main tokenize method returns a list of all tokens in the source code.
class Tokenizer:
def __init__(self, source):
self.source = source
self.position = 0
self.line = 1
self.column = 1
self.tokens = []
def current_char(self):
if self.position >= len(self.source):
return None
return self.source[self.position]
def peek_char(self, offset=1):
pos = self.position + offset
if pos >= len(self.source):
return None
return self.source[pos]
def advance(self):
if self.position < len(self.source):
if self.source[self.position] == '\n':
self.line += 1
self.column = 1
else:
self.column += 1
self.position += 1
The current_char method returns the character at the current position, or None if we have reached the end of the source code. The peek_char method looks ahead without advancing the position, which is useful for distinguishing between single-character and multi-character operators. The advance method moves to the next character and updates line and column tracking for error messages.
Skipping whitespace is straightforward. We advance past any spaces, tabs, or newlines. Comments are also handled here since they function like whitespace from the parser's perspective.
def skip_whitespace(self):
while self.current_char() and self.current_char() in ' \t\n\r':
self.advance()
def skip_comment(self):
if self.current_char() == '/' and self.peek_char() == '/':
while self.current_char() and self.current_char() != '\n':
self.advance()
self.advance()
Reading numbers requires careful handling of both integers and floats. We accumulate digits, and if we encounter a decimal point followed by more digits, we know we have a float. Otherwise, we have an integer.
def read_number(self):
start_line = self.line
start_column = self.column
num_str = ''
has_dot = False
while self.current_char() and (self.current_char().isdigit() or self.current_char() == '.'):
if self.current_char() == '.':
if has_dot:
break
has_dot = True
num_str += self.current_char()
self.advance()
if has_dot:
return Token('FLOAT', float(num_str), start_line, start_column)
else:
return Token('INTEGER', int(num_str), start_line, start_column)
Reading identifiers and keywords follows a similar pattern. We accumulate letters, digits, and underscores. After reading the complete word, we check if it matches a keyword. If so, we create a token with the keyword type. Otherwise, we create an identifier token. Our extended keyword list now includes the names of all built-in mathematical and logical functions.
def read_identifier(self):
start_line = self.line
start_column = self.column
identifier =
while self.current_char() and (self.current_char().isalnum() or self.current_char() == '_'):
identifier += self.current_char()
self.advance()
keywords = {
'func': 'FUNC',
'return': 'RETURN',
'if': 'IF',
'else': 'ELSE',
'for': 'FOR',
'while': 'WHILE',
'int': 'INT_TYPE',
'float': 'FLOAT_TYPE',
'string': 'STRING_TYPE',
'print': 'PRINT',
'true': 'TRUE',
'false': 'FALSE',
'null': 'NULL',
'and': 'AND',
'or': 'OR',
'not': 'NOT',
'sin': 'SIN',
'cos': 'COS',
'tan': 'TAN',
'asin': 'ASIN',
'acos': 'ACOS',
'atan': 'ATAN',
'sqrt': 'SQRT',
'pow': 'POW',
'log': 'LOG',
'log10': 'LOG10',
'exp': 'EXP',
'abs': 'ABS',
'binomial': 'BINOMIAL'
}
token_type = keywords.get(identifier, 'IDENTIFIER')
return Token(token_type, identifier, start_line, start_column)
String literals are enclosed in double quotes. We read characters until we find the closing quote, handling escape sequences along the way. Common escape sequences include backslash-n for newline and backslash-quote for a literal quote character within the string.
def read_string(self):
start_line = self.line
start_column = self.column
self.advance()
string_value = ''
while self.current_char() and self.current_char() != '"':
if self.current_char() == '\\':
self.advance()
if self.current_char() == 'n':
string_value += '\n'
elif self.current_char() == 't':
string_value += '\t'
elif self.current_char() == 'r':
string_value += '\r'
elif self.current_char() == '"':
string_value += '"'
elif self.current_char() == '\\':
string_value += '\\'
else:
string_value += self.current_char()
self.advance()
else:
string_value += self.current_char()
self.advance()
if self.current_char() == '"':
self.advance()
else:
raise SyntaxError(f"Unterminated string at {start_line}:{start_column}")
return Token('STRING', string_value, start_line, start_column)
The main tokenize method orchestrates the entire process. It loops through the source code, skipping whitespace and comments, then determines what kind of token to read based on the current character.
def tokenize(self):
while self.position < len(self.source):
self.skip_whitespace()
if self.current_char() is None:
break
if self.current_char() == '/' and self.peek_char() == '/':
self.skip_comment()
continue
if self.current_char().isdigit():
self.tokens.append(self.read_number())
continue
if self.current_char().isalpha() or self.current_char() == '_':
self.tokens.append(self.read_identifier())
continue
if self.current_char() == '"':
self.tokens.append(self.read_string())
continue
self.read_operator()
self.tokens.append(Token('EOF', None, self.line, self.column))
return self.tokens
Reading operators requires checking for both single-character and multi-character operators. For example, the less-than sign can be a standalone operator, or it can be part of the less-than-or-equal operator when followed by an equals sign.
def read_operator(self):
start_line = self.line
start_column = self.column
char = self.current_char()
if char == '=' and self.peek_char() == '=':
self.advance()
self.advance()
self.tokens.append(Token('EQUALS_EQUALS', '==', start_line, start_column))
return
if char == '!' and self.peek_char() == '=':
self.advance()
self.advance()
self.tokens.append(Token('NOT_EQUALS', '!=', start_line, start_column))
return
if char == '<' and self.peek_char() == '=':
self.advance()
self.advance()
self.tokens.append(Token('LESS_EQUALS', '<=', start_line, start_column))
return
if char == '>' and self.peek_char() == '=':
self.advance()
self.advance()
self.tokens.append(Token('GREATER_EQUALS', '>=', start_line, start_column))
return
single_char_tokens = {
'+': 'PLUS',
'-': 'MINUS',
'*': 'MULTIPLY',
'/': 'DIVIDE',
'%': 'MODULO',
'=': 'EQUALS',
'<': 'LESS_THAN',
'>': 'GREATER_THAN',
'!': 'NOT_OP',
'(': 'LPAREN',
')': 'RPAREN',
'{': 'LBRACE',
'}': 'RBRACE',
'[': 'LBRACKET',
']': 'RBRACKET',
';': 'SEMICOLON',
',': 'COMMA',
':': 'COLON'
}
if char in single_char_tokens:
self.advance()
self.tokens.append(Token(single_char_tokens[char], char, start_line, start_column))
return
raise SyntaxError(f"Unexpected character '{char}' at {start_line}:{start_column}")
Let us trace through a simple example to see tokenization in action. Consider the source code snippet angle = 3.14159 / 4; result = sin(angle). The tokenizer begins at position zero with the character 'a'. Since this is a letter, it calls read_identifier, which accumulates the characters 'angle' and returns an IDENTIFIER token. After skipping whitespace, it encounters the equals sign and creates an EQUALS token. More whitespace is skipped, then the digit '3' triggers read_number, which accumulates '3.14159' and returns a FLOAT token with value 3.14159. The forward slash becomes a DIVIDE token. The digit '4' becomes an INTEGER token. The semicolon becomes a SEMICOLON token.
Continuing with the second statement, we get an IDENTIFIER token for 'result', an EQUALS token, and then the identifier 'sin'. Because 'sin' matches a keyword in our table, it becomes a SIN token rather than an IDENTIFIER token. The opening parenthesis becomes an LPAREN token, 'angle' becomes an IDENTIFIER token, and the closing parenthesis becomes an RPAREN token.
The resulting token stream looks like this:
Token(IDENTIFIER, angle, 1:1)
Token(EQUALS, =, 1:7)
Token(FLOAT, 3.14159, 1:9)
Token(DIVIDE, /, 1:17)
Token(INTEGER, 4, 1:19)
Token(SEMICOLON, ;, 1:20)
Token(IDENTIFIER, result, 1:22)
Token(EQUALS, =, 1:29)
Token(SIN, sin, 1:31)
Token(LPAREN, (, 1:34)
Token(IDENTIFIER, angle, 1:35)
Token(RPAREN, ), 1:40)
Token(EOF, None, 1:41)
The EOF token marks the end of the input, allowing the parser to know when it has consumed all available tokens.
PARSING: BUILDING THE ABSTRACT SYNTAX TREE
With tokens in hand, we move to the parsing phase. The parser consumes tokens according to the grammar rules of our language and constructs an Abstract Syntax Tree. The AST is a hierarchical representation of the program's structure, where each node represents a construct in the language.
A recursive descent parser is a top-down parser built from a set of mutually recursive functions. Each function corresponds to a rule in the language's grammar. For example, we might have a parse_expression function that handles expressions, a parse_statement function for statements, and a parse_function function for function definitions.
The grammar of our language can be expressed in a notation called Backus-Naur Form, or BNF. This notation uses angle brackets to denote non-terminal symbols (grammar rules) and plain text for terminal symbols (tokens). For example, an expression might be defined as a term followed by zero or more additions or subtractions of terms.
Let us define the key grammar rules for MiniLang:
- A program consists of zero or more statements.
- A statement can be a variable assignment, a function definition, a return statement, an if statement, a for loop, a while loop, or an expression statement.
- An expression can be a logical OR expression, which can be a logical AND expression, which can be a comparison, which can be an arithmetic expression, which can be a term, which can be a factor, which can be a primary expression.
This hierarchy reflects operator precedence. By parsing multiplication and division at a lower level than addition and subtraction, we ensure that 2 + 3 * 4 is correctly interpreted as 2 + (3 * 4) rather than (2 + 3) * 4. Logical operators have lower precedence than comparison operators, which have lower precedence than arithmetic operators.
We begin by defining AST node classes. Each type of construct in our language has a corresponding node class. For instance, a binary operation like addition has a BinaryOp node that stores the operator and the left and right operands. Built-in mathematical functions have a BuiltinCall node that stores the function name and arguments.
class ASTNode:
pass
class Program(ASTNode):
def __init__(self, statements):
self.statements = statements
class BinaryOp(ASTNode):
def __init__(self, left, operator, right):
self.left = left
self.operator = operator
self.right = right
class UnaryOp(ASTNode):
def __init__(self, operator, operand):
self.operator = operator
self.operand = operand
class Number(ASTNode):
def __init__(self, value):
self.value = value
class String(ASTNode):
def __init__(self, value):
self.value = value
class Boolean(ASTNode):
def __init__(self, value):
self.value = value
class Null(ASTNode):
pass
class Identifier(ASTNode):
def __init__(self, name):
self.name = name
class Assignment(ASTNode):
def __init__(self, name, value):
self.name = name
self.value = value
class FunctionDef(ASTNode):
def __init__(self, name, parameters, body):
self.name = name
self.parameters = parameters
self.body = body
class FunctionCall(ASTNode):
def __init__(self, name, arguments):
self.name = name
self.arguments = arguments
class BuiltinCall(ASTNode):
def __init__(self, function_name, arguments):
self.function_name = function_name
self.arguments = arguments
class Return(ASTNode):
def __init__(self, value):
self.value = value
class IfStatement(ASTNode):
def __init__(self, condition, then_branch, else_branch):
self.condition = condition
self.then_branch = then_branch
self.else_branch = else_branch
class ForLoop(ASTNode):
def __init__(self, initialization, condition, increment, body):
self.initialization = initialization
self.condition = condition
self.increment = increment
self.body = body
class WhileLoop(ASTNode):
def __init__(self, condition, body):
self.condition = condition
self.body = body
class PrintStatement(ASTNode):
def __init__(self, expression):
self.expression = expression
class Block(ASTNode):
def __init__(self, statements):
self.statements = statements
Each node class inherits from ASTNode and stores the relevant information for that construct. A BinaryOp stores the left operand, operator, and right operand. A FunctionDef stores the function name, parameter list, and body. A BuiltinCall stores the name of the built-in function and its arguments. These simple data structures will be traversed during the execution phase.
The Parser class maintains the list of tokens and a current position. It provides methods to peek at the current token, consume a token, and check if the current token matches an expected type.
class Parser:
def __init__(self, tokens):
self.tokens = tokens
self.position = 0
def current_token(self):
if self.position < len(self.tokens):
return self.tokens[self.position]
return None
def peek_token(self, offset=1):
pos = self.position + offset
if pos < len(self.tokens):
return self.tokens[pos]
return None
def advance(self):
self.position += 1
def expect(self, token_type):
token = self.current_token()
if token is None or token.type != token_type:
raise SyntaxError(f"Expected {token_type} but got {token.type if token else 'EOF'}")
self.advance()
return token
def match(self, *token_types):
token = self.current_token()
if token and token.type in token_types:
return True
return False
The expect method is particularly useful. It verifies that the current token has the expected type, raises an error if not, and advances to the next token. This pattern appears frequently in parsing: we expect a certain token, consume it, and continue.
Parsing begins with the parse method, which creates a Program node containing all top-level statements.
def parse(self):
statements = []
while self.current_token() and self.current_token().type != 'EOF':
statements.append(self.parse_statement())
return Program(statements)
The parse_statement method dispatches to the appropriate parsing function based on the current token. If we see the func keyword, we parse a function definition. If we see the if keyword, we parse an if statement. Otherwise, we try to parse an assignment or expression statement.
def parse_statement(self):
token = self.current_token()
if token is None or token.type == 'EOF':
raise SyntaxError("Unexpected end of file")
if token.type == 'FUNC':
return self.parse_function_def()
if token.type == 'RETURN':
return self.parse_return()
if token.type == 'IF':
return self.parse_if_statement()
if token.type == 'FOR':
return self.parse_for_loop()
if token.type == 'WHILE':
return self.parse_while_loop()
if token.type == 'PRINT':
return self.parse_print_statement()
if token.type == 'LBRACE':
return self.parse_block()
if token.type == 'IDENTIFIER' and self.peek_token() and self.peek_token().type == 'EQUALS':
return self.parse_assignment()
expr = self.parse_expression()
if self.match('SEMICOLON'):
self.advance()
return expr
Parsing a function definition involves reading the func keyword, the function name, the parameter list in parentheses, and the function body in braces.
def parse_function_def(self):
self.expect('FUNC')
name_token = self.expect('IDENTIFIER')
name = name_token.value
self.expect('LPAREN')
parameters = []
if not self.match('RPAREN'):
parameters.append(self.expect('IDENTIFIER').value)
while self.match('COMMA'):
self.advance()
parameters.append(self.expect('IDENTIFIER').value)
self.expect('RPAREN')
body = self.parse_block()
return FunctionDef(name, parameters, body)
A block is a sequence of statements enclosed in braces. We parse statements until we encounter the closing brace.
def parse_block(self):
self.expect('LBRACE')
statements = []
while not self.match('RBRACE'):
if self.current_token() is None or self.current_token().type == 'EOF':
raise SyntaxError("Expected '}' but reached end of file")
statements.append(self.parse_statement())
self.expect('RBRACE')
return Block(statements)
Parsing an if statement requires reading the condition, the then branch, and optionally the else branch.
def parse_if_statement(self):
self.expect('IF')
self.expect('LPAREN')
condition = self.parse_expression()
self.expect('RPAREN')
then_branch = self.parse_statement()
else_branch = None
if self.match('ELSE'):
self.advance()
else_branch = self.parse_statement()
return IfStatement(condition, then_branch, else_branch)
For loops follow the C-style syntax with initialization, condition, and increment expressions separated by semicolons.
def parse_for_loop(self):
self.expect('FOR')
self.expect('LPAREN')
initialization = None
if not self.match('SEMICOLON'):
if self.match('IDENTIFIER') and self.peek_token() and self.peek_token().type == 'EQUALS':
initialization = self.parse_assignment()
else:
initialization = self.parse_expression()
if self.match('SEMICOLON'):
self.advance()
else:
self.advance()
condition = None
if not self.match('SEMICOLON'):
condition = self.parse_expression()
self.expect('SEMICOLON')
increment = None
if not self.match('RPAREN'):
increment = self.parse_expression()
self.expect('RPAREN')
body = self.parse_statement()
return ForLoop(initialization, condition, increment, body)
While loops are simpler, requiring only a condition and a body.
def parse_while_loop(self):
self.expect('WHILE')
self.expect('LPAREN')
condition = self.parse_expression()
self.expect('RPAREN')
body = self.parse_statement()
return WhileLoop(condition, body)
The print statement is straightforward. It expects the print keyword followed by an expression in parentheses.
def parse_print_statement(self):
self.expect('PRINT')
self.expect('LPAREN')
expression = self.parse_expression()
self.expect('RPAREN')
if self.match('SEMICOLON'):
self.advance()
return PrintStatement(expression)
Assignment statements consist of an identifier, an equals sign, and an expression.
def parse_assignment(self):
name = self.expect('IDENTIFIER').value
self.expect('EQUALS')
value = self.parse_expression()
if self.match('SEMICOLON'):
self.advance()
return Assignment(name, value)
Return statements have the return keyword followed by an optional expression.
def parse_return(self):
self.expect('RETURN')
value = None
if not self.match('SEMICOLON') and not self.match('RBRACE'):
value = self.parse_expression()
if self.match('SEMICOLON'):
self.advance()
return Return(value)
Expression parsing is where operator precedence comes into play. We parse expressions as a hierarchy of operations, with logical OR at the top level, then logical AND, then comparison operators, then addition and subtraction, then multiplication and division, and finally primary expressions at the bottom.
def parse_expression(self):
return self.parse_logical_or()
def parse_logical_or(self):
left = self.parse_logical_and()
while self.match('OR'):
operator = self.current_token()
self.advance()
right = self.parse_logical_and()
left = BinaryOp(left, operator, right)
return left
def parse_logical_and(self):
left = self.parse_comparison()
while self.match('AND'):
operator = self.current_token()
self.advance()
right = self.parse_comparison()
left = BinaryOp(left, operator, right)
return left
def parse_comparison(self):
left = self.parse_additive()
while self.match('EQUALS_EQUALS', 'NOT_EQUALS', 'LESS_THAN', 'GREATER_THAN', 'LESS_EQUALS', 'GREATER_EQUALS'):
operator = self.current_token()
self.advance()
right = self.parse_additive()
left = BinaryOp(left, operator, right)
return left
def parse_additive(self):
left = self.parse_multiplicative()
while self.match('PLUS', 'MINUS'):
operator = self.current_token()
self.advance()
right = self.parse_multiplicative()
left = BinaryOp(left, operator, right)
return left
def parse_multiplicative(self):
left = self.parse_unary()
while self.match('MULTIPLY', 'DIVIDE', 'MODULO'):
operator = self.current_token()
self.advance()
right = self.parse_unary()
left = BinaryOp(left, operator, right)
return left
def parse_unary(self):
if self.match('MINUS', 'NOT', 'NOT_OP'):
operator = self.current_token()
self.advance()
operand = self.parse_unary()
return UnaryOp(operator, operand)
return self.parse_primary()
Primary expressions are the atomic elements: numbers, strings, identifiers, function calls, built-in function calls, and parenthesized expressions. We also handle the special syntax for embedding expressions in strings using curly braces. Built-in mathematical functions are recognized by their keyword tokens and parsed as BuiltinCall nodes.
def parse_primary(self):
token = self.current_token()
if token is None:
raise SyntaxError("Unexpected end of file in expression")
if token.type == 'INTEGER':
self.advance()
return Number(token.value)
if token.type == 'FLOAT':
self.advance()
return Number(token.value)
if token.type == 'STRING':
self.advance()
return self.parse_string_with_interpolation(token.value)
if token.type == 'TRUE':
self.advance()
return Boolean(True)
if token.type == 'FALSE':
self.advance()
return Boolean(False)
if token.type == 'NULL':
self.advance()
return Null()
builtin_functions = ['SIN', 'COS', 'TAN', 'ASIN', 'ACOS', 'ATAN',
'SQRT', 'POW', 'LOG', 'LOG10', 'EXP', 'ABS', 'BINOMIAL']
if token.type in builtin_functions:
func_name = token.value
self.advance()
self.expect('LPAREN')
arguments = []
if not self.match('RPAREN'):
arguments.append(self.parse_expression())
while self.match('COMMA'):
self.advance()
arguments.append(self.parse_expression())
self.expect('RPAREN')
return BuiltinCall(func_name, arguments)
if token.type == 'IDENTIFIER':
name = token.value
self.advance()
if self.match('LPAREN'):
self.advance()
arguments = []
if not self.match('RPAREN'):
arguments.append(self.parse_expression())
while self.match('COMMA'):
self.advance()
arguments.append(self.parse_expression())
self.expect('RPAREN')
return FunctionCall(name, arguments)
return Identifier(name)
if token.type == 'LPAREN':
self.advance()
expr = self.parse_expression()
self.expect('RPAREN')
return expr
raise SyntaxError(f"Unexpected token {token.type} at {token.line}:{token.column}")
String interpolation allows us to embed expressions within strings using curly braces. When we encounter a string literal, we scan it for curly brace pairs and parse the contents as expressions. The result is a concatenation of string literals and expression results.
def parse_string_with_interpolation(self, string_value):
parts = []
current = ''
i = 0
while i < len(string_value):
if string_value[i] == '{':
if current:
parts.append(String(current))
current = ''
i += 1
expr_str = ''
brace_count = 1
while i < len(string_value) and brace_count > 0:
if string_value[i] == '{':
brace_count += 1
elif string_value[i] == '}':
brace_count -= 1
if brace_count == 0:
break
expr_str += string_value[i]
i += 1
if expr_str:
expr_tokenizer = Tokenizer(expr_str)
expr_tokens = expr_tokenizer.tokenize()
expr_parser = Parser(expr_tokens)
expr = expr_parser.parse_expression()
parts.append(expr)
i += 1
else:
current += string_value[i]
i += 1
if current:
parts.append(String(current))
if not parts:
return String('')
result = parts[0]
for part in parts[1:]:
result = BinaryOp(result, Token('PLUS', '+', 0, 0), part)
return result
Let us trace through parsing a simple expression to see how the AST is built. Consider the expression sqrt(pow(3, 2) + pow(4, 2)). The parser begins with parse_expression, which calls parse_logical_or, which calls parse_logical_and, which calls parse_comparison, which calls parse_additive, which calls parse_multiplicative, which calls parse_unary, which calls parse_primary.
The primary parser sees the SQRT token, which is in the builtin_functions list. It advances past SQRT, expects and consumes the opening parenthesis, then parses the argument expression. This argument is itself a complex expression: pow(3, 2) + pow(4, 2).
Parsing this argument starts again at parse_expression and works down through the precedence levels. At parse_additive, we parse the left side pow(3, 2), see the PLUS token, and then parse the right side pow(4, 2). Each pow call is recognized as a built-in function and parsed as a BuiltinCall node with two arguments.
The resulting AST structure looks like this in tree form:
BuiltinCall(sqrt)
|
BinaryOp(+)
/ \
BuiltinCall(pow) BuiltinCall(pow)
/ \ / \
Number(3) Number(2) Number(4) Number(2)
This hierarchical structure makes it clear that we first compute 3 squared and 4 squared, add them together, and then take the square root of the result, giving us 5, the length of the hypotenuse of a 3-4-5 right triangle.
EXECUTING THE ABSTRACT SYNTAX TREE
With the AST constructed, we move to the execution phase. The interpreter traverses the tree and performs the operations represented by each node.
This is sometimes called tree-walking interpretation because we walk through the tree structure, visiting each node and executing its associated action.
The interpreter maintains an environment, which is a mapping from variable names to their values. When we encounter an assignment, we store the value in the environment. When we encounter an identifier, we look up its value in the environment. Function definitions are also stored in the environment, allowing us to call them later.
We implement the interpreter as a class with methods for evaluating different types of AST nodes. The main evaluate method dispatches to the appropriate handler based on the node type. For built-in mathematical functions, we use Python's math library to perform the actual calculations.
import math
class Interpreter:
def __init__(self):
self.global_env = {}
self.local_envs = []
def current_env(self):
if self.local_envs:
return self.local_envs[-1]
return self.global_env
def push_env(self):
self.local_envs.append({})
def pop_env(self):
if self.local_envs:
self.local_envs.pop()
def get_variable(self, name):
for env in reversed(self.local_envs):
if name in env:
return env[name]
if name in self.global_env:
return self.global_env[name]
raise NameError(f"Undefined variable '{name}'")
def set_variable(self, name, value):
self.current_env()[name] = value
The interpreter uses a stack of environments to handle function scopes. When we call a function, we push a new environment onto the stack. When the function returns, we pop the environment. This allows local variables to shadow global variables and ensures that variables defined in a function do not leak into the global scope.
The main evaluate method checks the type of the node and calls the appropriate handler.
def evaluate(self, node):
if isinstance(node, Program):
return self.evaluate_program(node)
if isinstance(node, Block):
return self.evaluate_block(node)
if isinstance(node, Number):
return node.value
if isinstance(node, String):
return node.value
if isinstance(node, Boolean):
return node.value
if isinstance(node, Null):
return None
if isinstance(node, Identifier):
return self.get_variable(node.name)
if isinstance(node, BinaryOp):
return self.evaluate_binary_op(node)
if isinstance(node, UnaryOp):
return self.evaluate_unary_op(node)
if isinstance(node, Assignment):
return self.evaluate_assignment(node)
if isinstance(node, FunctionDef):
return self.evaluate_function_def(node)
if isinstance(node, FunctionCall):
return self.evaluate_function_call(node)
if isinstance(node, BuiltinCall):
return self.evaluate_builtin_call(node)
if isinstance(node, Return):
return self.evaluate_return(node)
if isinstance(node, IfStatement):
return self.evaluate_if_statement(node)
if isinstance(node, ForLoop):
return self.evaluate_for_loop(node)
if isinstance(node, WhileLoop):
return self.evaluate_while_loop(node)
if isinstance(node, PrintStatement):
return self.evaluate_print_statement(node)
raise RuntimeError(f"Unknown node type: {type(node).__name__}")
Evaluating a program means evaluating each statement in sequence. We return the result of the last statement, though in practice most programs do not use this return value.
def evaluate_program(self, node):
result = None
for statement in node.statements:
result = self.evaluate(statement)
if isinstance(result, ReturnValue):
return result.value
return result
A block is similar, but we need to handle return statements specially. When a return statement is executed, we need to propagate the return value up through the call stack. We do this by wrapping the return value in a special ReturnValue object.
class ReturnValue:
def __init__(self, value):
self.value = value
def evaluate_block(self, node):
result = None
for statement in node.statements:
result = self.evaluate(statement)
if isinstance(result, ReturnValue):
return result
return result
Binary operations require evaluating both operands and then applying the operator. For arithmetic operators, we perform the calculation. For comparison operators, we return a boolean result. For the plus operator, we handle both numeric addition and string concatenation. For logical operators, we implement short-circuit evaluation where the right operand is only evaluated if necessary.
def evaluate_binary_op(self, node):
operator = node.operator.type
if operator == 'AND':
left = self.evaluate(node.left)
if not self.is_truthy(left):
return False
right = self.evaluate(node.right)
return self.is_truthy(right)
if operator == 'OR':
left = self.evaluate(node.left)
if self.is_truthy(left):
return True
right = self.evaluate(node.right)
return self.is_truthy(right)
left = self.evaluate(node.left)
right = self.evaluate(node.right)
if operator == 'PLUS':
if isinstance(left, str) or isinstance(right, str):
return self.value_to_string(left) + self.value_to_string(right)
return left + right
if operator == 'MINUS':
return left - right
if operator == 'MULTIPLY':
return left * right
if operator == 'DIVIDE':
if right == 0:
raise RuntimeError("Division by zero")
return left / right
if operator == 'MODULO':
return left % right
if operator == 'EQUALS_EQUALS':
return left == right
if operator == 'NOT_EQUALS':
return left != right
if operator == 'LESS_THAN':
return left < right
if operator == 'GREATER_THAN':
return left > right
if operator == 'LESS_EQUALS':
return left <= right
if operator == 'GREATER_EQUALS':
return left >= right
raise RuntimeError(f"Unknown binary operator: {operator}")
Unary operations are simpler. We support the unary minus for negation and the not operator for logical negation.
def evaluate_unary_op(self, node):
operand = self.evaluate(node.operand)
operator = node.operator.type
if operator == 'MINUS':
return -operand
if operator == 'NOT' or operator == 'NOT_OP':
return not self.is_truthy(operand)
raise RuntimeError(f"Unknown unary operator: {operator}")
Assignment evaluates the right-hand side and stores the result in the environment.
def evaluate_assignment(self, node):
value = self.evaluate(node.value)
self.set_variable(node.name, value)
return value
Function definitions store the function object in the environment. We create a simple function object that holds the parameter names and body.
class Function:
def __init__(self, name, parameters, body, closure_env):
self.name = name
self.parameters = parameters
self.body = body
self.closure_env = closure_env
def evaluate_function_def(self, node):
func = Function(node.name, node.parameters, node.body, dict(self.global_env))
self.set_variable(node.name, func)
return func
Function calls are more complex. We evaluate the arguments, create a new environment, bind the parameters to the argument values, and evaluate the function body in this new environment.
def evaluate_function_call(self, node):
func = self.get_variable(node.name)
if not isinstance(func, Function):
raise RuntimeError(f"'{node.name}' is not a function")
if len(node.arguments) != len(func.parameters):
raise RuntimeError(
f"Function '{node.name}' expects {len(func.parameters)} arguments "
f"but got {len(node.arguments)}"
)
arg_values = [self.evaluate(arg) for arg in node.arguments]
self.push_env()
for param, arg_value in zip(func.parameters, arg_values):
self.set_variable(param, arg_value)
result = self.evaluate(func.body)
self.pop_env()
if isinstance(result, ReturnValue):
return result.value
return None
Built-in function calls are handled separately. We evaluate the arguments and then dispatch to the appropriate mathematical function. Each built-in function has specific requirements for the number and type of arguments.
def evaluate_builtin_call(self, node):
func_name = node.function_name
args = [self.evaluate(arg) for arg in node.arguments]
if func_name == 'sin':
if len(args) != 1:
raise RuntimeError("sin() expects exactly 1 argument")
return math.sin(args[0])
if func_name == 'cos':
if len(args) != 1:
raise RuntimeError("cos() expects exactly 1 argument")
return math.cos(args[0])
if func_name == 'tan':
if len(args) != 1:
raise RuntimeError("tan() expects exactly 1 argument")
return math.tan(args[0])
if func_name == 'asin':
if len(args) != 1:
raise RuntimeError("asin() expects exactly 1 argument")
if args[0] < -1 or args[0] > 1:
raise RuntimeError("asin() argument must be in range [-1, 1]")
return math.asin(args[0])
if func_name == 'acos':
if len(args) != 1:
raise RuntimeError("acos() expects exactly 1 argument")
if args[0] < -1 or args[0] > 1:
raise RuntimeError("acos() argument must be in range [-1, 1]")
return math.acos(args[0])
if func_name == 'atan':
if len(args) != 1:
raise RuntimeError("atan() expects exactly 1 argument")
return math.atan(args[0])
if func_name == 'sqrt':
if len(args) != 1:
raise RuntimeError("sqrt() expects exactly 1 argument")
if args[0] < 0:
raise RuntimeError("sqrt() argument must be non-negative")
return math.sqrt(args[0])
if func_name == 'pow':
if len(args) != 2:
raise RuntimeError("pow() expects exactly 2 arguments")
return math.pow(args[0], args[1])
if func_name == 'log':
if len(args) != 1:
raise RuntimeError("log() expects exactly 1 argument")
if args[0] <= 0:
raise RuntimeError("log() argument must be positive")
return math.log(args[0])
if func_name == 'log10':
if len(args) != 1:
raise RuntimeError("log10() expects exactly 1 argument")
if args[0] <= 0:
raise RuntimeError("log10() argument must be positive")
return math.log10(args[0])
if func_name == 'exp':
if len(args) != 1:
raise RuntimeError("exp() expects exactly 1 argument")
return math.exp(args[0])
if func_name == 'abs':
if len(args) != 1:
raise RuntimeError("abs() expects exactly 1 argument")
return abs(args[0])
if func_name == 'binomial':
if len(args) != 2:
raise RuntimeError("binomial() expects exactly 2 arguments (n, k)")
n, k = int(args[0]), int(args[1])
if n < 0 or k < 0:
raise RuntimeError("binomial() arguments must be non-negative")
if k > n:
return 0
return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))
raise RuntimeError(f"Unknown built-in function: {func_name}")
Return statements create a ReturnValue object that propagates up through the evaluation.
def evaluate_return(self, node):
value = None
if node.value:
value = self.evaluate(node.value)
return ReturnValue(value)
If statements evaluate the condition and execute the appropriate branch.
def evaluate_if_statement(self, node):
condition = self.evaluate(node.condition)
if self.is_truthy(condition):
return self.evaluate(node.then_branch)
elif node.else_branch:
return self.evaluate(node.else_branch)
return None
def is_truthy(self, value):
if value is None or value is False:
return False
if value == 0 or value == '':
return False
return True
For loops execute the initialization, then repeatedly check the condition and execute the body and increment.
def evaluate_for_loop(self, node):
if node.initialization:
self.evaluate(node.initialization)
while True:
if node.condition:
condition = self.evaluate(node.condition)
if not self.is_truthy(condition):
break
result = self.evaluate(node.body)
if isinstance(result, ReturnValue):
return result
if node.increment:
self.evaluate(node.increment)
return None
While loops are similar but simpler, with only a condition and body.
def evaluate_while_loop(self, node):
while True:
condition = self.evaluate(node.condition)
if not self.is_truthy(condition):
break
result = self.evaluate(node.body)
if isinstance(result, ReturnValue):
return result
return None
The print statement evaluates its expression and outputs the result to the console. We convert the value to a string if necessary.
def evaluate_print_statement(self, node):
value = self.evaluate(node.expression)
print(self.value_to_string(value))
return None
def value_to_string(self, value):
if isinstance(value, bool):
return 'true' if value else 'false'
if isinstance(value, float):
if value == int(value):
return str(int(value))
return str(value)
if value is None:
return 'null'
return str(value)
The run method ties everything together, taking source code through tokenization, parsing, and execution.
def run(self, source_code):
tokenizer = Tokenizer(source_code)
tokens = tokenizer.tokenize()
parser = Parser(tokens)
ast = parser.parse()
return self.evaluate(ast)
Let us trace through the execution of a program that uses mathematical functions. Consider this program:
angle = 3.14159 / 4; result = sin(angle); print("sin(pi/4) = " + {result});
The parser produces an AST with three statements: two assignments and a print statement. The interpreter evaluates each statement in sequence.
For the first assignment, we evaluate the right-hand side, which is a division operation. We evaluate 3.14159 to get the float value, evaluate 4 to get the integer value, and apply the division operator to get approximately 0.7854. We then set the variable angle to this value in the global environment.
The second assignment evaluates a built-in function call. We look up the BuiltinCall node for sin, evaluate its argument which is the Identifier(angle) node. Looking up angle in the environment gives us 0.7854. We then call the evaluate_builtin_call method with function name 'sin' and argument 0.7854. This method uses Python's math.sin function to compute the sine, which returns approximately 0.7071. We set the variable result to this value.
The print statement evaluates a string concatenation. The left operand is the String("sin(pi/4) = ") node, which evaluates to the string "sin(pi/4) = ". The right operand is an expression embedded in curly braces, specifically the identifier result. We look up result in the environment and get 0.7071. The PLUS operator detects that one operand is a string, so it converts both to strings and concatenates them, producing "sin(pi/4) = 0.7071067811865476". The print statement outputs this string to the console.
COMPLETE RUNNING EXAMPLE
Now we present a complete, production-ready implementation of the MiniLang compiler with full mathematical extensions. This code includes all the components discussed above, integrated into a cohesive system. The implementation is fully functional and can compile and execute MiniLang programs with trigonometric functions, logarithms, power operations, square roots, logical operations, and binomial coefficients.
#!/usr/bin/env python3
"""
MiniLang Compiler and Interpreter with Mathematical Extensions
A complete implementation of a mini programming language with support for:
- Integer, float, and string data types
- Mathematical expressions with proper operator precedence
- Trigonometric functions: sin, cos, tan, asin, acos, atan
- Mathematical functions: sqrt, pow, log, log10, exp, abs
- Combinatorial functions: binomial (n choose k)
- Logical operations: and, or, not
- Function definitions and calls
- Control flow: if-then-else, for loops, while loops
- String interpolation with curly braces
- Print statement for formatted output
"""
import math
class Token:
"""Represents a single token in the source code."""
def __init__(self, token_type, value, line, column):
self.type = token_type
self.value = value
self.line = line
self.column = column
def __repr__(self):
return f"Token({self.type}, {repr(self.value)}, {self.line}:{self.column})"
class Tokenizer:
"""
Lexical analyzer that converts source code into a stream of tokens.
Handles numbers, strings, identifiers, keywords, operators, and comments.
"""
def __init__(self, source):
self.source = source
self.position = 0
self.line = 1
self.column = 1
self.tokens = []
def current_char(self):
"""Return the character at the current position."""
if self.position >= len(self.source):
return None
return self.source[self.position]
def peek_char(self, offset=1):
"""Look ahead at a character without advancing position."""
pos = self.position + offset
if pos >= len(self.source):
return None
return self.source[pos]
def advance(self):
"""Move to the next character and update line/column tracking."""
if self.position < len(self.source):
if self.source[self.position] == '\n':
self.line += 1
self.column = 1
else:
self.column += 1
self.position += 1
def skip_whitespace(self):
"""Skip over whitespace characters."""
while self.current_char() and self.current_char() in ' \t\n\r':
self.advance()
def skip_comment(self):
"""Skip over single-line comments starting with //."""
if self.current_char() == '/' and self.peek_char() == '/':
while self.current_char() and self.current_char() != '\n':
self.advance()
if self.current_char() == '\n':
self.advance()
def read_number(self):
"""Read an integer or float literal."""
start_line = self.line
start_column = self.column
num_str = ''
has_dot = False
while self.current_char() and (self.current_char().isdigit() or self.current_char() == '.'):
if self.current_char() == '.':
if has_dot:
break
has_dot = True
num_str += self.current_char()
self.advance()
if has_dot:
return Token('FLOAT', float(num_str), start_line, start_column)
else:
return Token('INTEGER', int(num_str), start_line, start_column)
def read_identifier(self):
"""Read an identifier or keyword."""
start_line = self.line
start_column = self.column
identifier = ''
while self.current_char() and (self.current_char().isalnum() or self.current_char() == '_'):
identifier += self.current_char()
self.advance()
keywords = {
'func': 'FUNC',
'return': 'RETURN',
'if': 'IF',
'else': 'ELSE',
'for': 'FOR',
'while': 'WHILE',
'int': 'INT_TYPE',
'float': 'FLOAT_TYPE',
'string': 'STRING_TYPE',
'print': 'PRINT',
'true': 'TRUE',
'false': 'FALSE',
'null': 'NULL',
'and': 'AND',
'or': 'OR',
'not': 'NOT',
'sin': 'SIN',
'cos': 'COS',
'tan': 'TAN',
'asin': 'ASIN',
'acos': 'ACOS',
'atan': 'ATAN',
'sqrt': 'SQRT',
'pow': 'POW',
'log': 'LOG',
'log10': 'LOG10',
'exp': 'EXP',
'abs': 'ABS',
'binomial': 'BINOMIAL'
}
token_type = keywords.get(identifier, 'IDENTIFIER')
return Token(token_type, identifier, start_line, start_column)
def read_string(self):
"""Read a string literal enclosed in double quotes."""
start_line = self.line
start_column = self.column
self.advance()
string_value = ''
while self.current_char() and self.current_char() != '"':
if self.current_char() == '\\':
self.advance()
if self.current_char() == 'n':
string_value += '\n'
elif self.current_char() == 't':
string_value += '\t'
elif self.current_char() == 'r':
string_value += '\r'
elif self.current_char() == '"':
string_value += '"'
elif self.current_char() == '\\':
string_value += '\\'
else:
string_value += self.current_char()
self.advance()
else:
string_value += self.current_char()
self.advance()
if self.current_char() == '"':
self.advance()
else:
raise SyntaxError(f"Unterminated string at {start_line}:{start_column}")
return Token('STRING', string_value, start_line, start_column)
def read_operator(self):
"""Read an operator or punctuation token."""
start_line = self.line
start_column = self.column
char = self.current_char()
if char == '=' and self.peek_char() == '=':
self.advance()
self.advance()
self.tokens.append(Token('EQUALS_EQUALS', '==', start_line, start_column))
return
if char == '!' and self.peek_char() == '=':
self.advance()
self.advance()
self.tokens.append(Token('NOT_EQUALS', '!=', start_line, start_column))
return
if char == '<' and self.peek_char() == '=':
self.advance()
self.advance()
self.tokens.append(Token('LESS_EQUALS', '<=', start_line, start_column))
return
if char == '>' and self.peek_char() == '=':
self.advance()
self.advance()
self.tokens.append(Token('GREATER_EQUALS', '>=', start_line, start_column))
return
single_char_tokens = {
'+': 'PLUS',
'-': 'MINUS',
'*': 'MULTIPLY',
'/': 'DIVIDE',
'%': 'MODULO',
'=': 'EQUALS',
'<': 'LESS_THAN',
'>': 'GREATER_THAN',
'!': 'NOT_OP',
'(': 'LPAREN',
')': 'RPAREN',
'{': 'LBRACE',
'}': 'RBRACE',
'[': 'LBRACKET',
']': 'RBRACKET',
';': 'SEMICOLON',
',': 'COMMA',
':': 'COLON'
}
if char in single_char_tokens:
self.advance()
self.tokens.append(Token(single_char_tokens[char], char, start_line, start_column))
return
raise SyntaxError(f"Unexpected character '{char}' at {start_line}:{start_column}")
def tokenize(self):
"""Convert the entire source code into a list of tokens."""
while self.position < len(self.source):
self.skip_whitespace()
if self.current_char() is None:
break
if self.current_char() == '/' and self.peek_char() == '/':
self.skip_comment()
continue
if self.current_char().isdigit():
self.tokens.append(self.read_number())
continue
if self.current_char().isalpha() or self.current_char() == '_':
self.tokens.append(self.read_identifier())
continue
if self.current_char() == '"':
self.tokens.append(self.read_string())
continue
self.read_operator()
self.tokens.append(Token('EOF', None, self.line, self.column))
return self.tokens
class ASTNode:
"""Base class for all AST nodes."""
pass
class Program(ASTNode):
"""Root node representing the entire program."""
def __init__(self, statements):
self.statements = statements
class BinaryOp(ASTNode):
"""Binary operation like addition, comparison, etc."""
def __init__(self, left, operator, right):
self.left = left
self.operator = operator
self.right = right
class UnaryOp(ASTNode):
"""Unary operation like negation."""
def __init__(self, operator, operand):
self.operator = operator
self.operand = operand
class Number(ASTNode):
"""Numeric literal (int or float)."""
def __init__(self, value):
self.value = value
class String(ASTNode):
"""String literal."""
def __init__(self, value):
self.value = value
class Boolean(ASTNode):
"""Boolean literal."""
def __init__(self, value):
self.value = value
class Null(ASTNode):
"""Null literal."""
pass
class Identifier(ASTNode):
"""Variable or function name reference."""
def __init__(self, name):
self.name = name
class Assignment(ASTNode):
"""Variable assignment."""
def __init__(self, name, value):
self.name = name
self.value = value
class FunctionDef(ASTNode):
"""Function definition."""
def __init__(self, name, parameters, body):
self.name = name
self.parameters = parameters
self.body = body
class FunctionCall(ASTNode):
"""Function call."""
def __init__(self, name, arguments):
self.name = name
self.arguments = arguments
class BuiltinCall(ASTNode):
"""Built-in function call (mathematical, logical, etc.)."""
def __init__(self, function_name, arguments):
self.function_name = function_name
self.arguments = arguments
class Return(ASTNode):
"""Return statement."""
def __init__(self, value):
self.value = value
class IfStatement(ASTNode):
"""If-then-else statement."""
def __init__(self, condition, then_branch, else_branch):
self.condition = condition
self.then_branch = then_branch
self.else_branch = else_branch
class ForLoop(ASTNode):
"""For loop statement."""
def __init__(self, initialization, condition, increment, body):
self.initialization = initialization
self.condition = condition
self.increment = increment
self.body = body
class WhileLoop(ASTNode):
"""While loop statement."""
def __init__(self, condition, body):
self.condition = condition
self.body = body
class PrintStatement(ASTNode):
"""Print statement for output."""
def __init__(self, expression):
self.expression = expression
class Block(ASTNode):
"""Block of statements enclosed in braces."""
def __init__(self, statements):
self.statements = statements
class Parser:
"""
Recursive descent parser that builds an AST from tokens.
Implements the grammar of MiniLang with proper operator precedence.
"""
def __init__(self, tokens):
self.tokens = tokens
self.position = 0
def current_token(self):
"""Return the token at the current position."""
if self.position < len(self.tokens):
return self.tokens[self.position]
return None
def peek_token(self, offset=1):
"""Look ahead at a token without advancing position."""
pos = self.position + offset
if pos < len(self.tokens):
return self.tokens[pos]
return None
def advance(self):
"""Move to the next token."""
self.position += 1
def expect(self, token_type):
"""Consume a token of the expected type or raise an error."""
token = self.current_token()
if token is None or token.type != token_type:
raise SyntaxError(f"Expected {token_type} but got {token.type if token else 'EOF'}")
self.advance()
return token
def match(self, *token_types):
"""Check if the current token matches any of the given types."""
token = self.current_token()
if token and token.type in token_types:
return True
return False
def parse(self):
"""Parse the entire program."""
statements = []
while self.current_token() and self.current_token().type != 'EOF':
statements.append(self.parse_statement())
return Program(statements)
def parse_statement(self):
"""Parse a single statement."""
token = self.current_token()
if token is None or token.type == 'EOF':
raise SyntaxError("Unexpected end of file")
if token.type == 'FUNC':
return self.parse_function_def()
if token.type == 'RETURN':
return self.parse_return()
if token.type == 'IF':
return self.parse_if_statement()
if token.type == 'FOR':
return self.parse_for_loop()
if token.type == 'WHILE':
return self.parse_while_loop()
if token.type == 'PRINT':
return self.parse_print_statement()
if token.type == 'LBRACE':
return self.parse_block()
if token.type == 'IDENTIFIER' and self.peek_token() and self.peek_token().type == 'EQUALS':
return self.parse_assignment()
expr = self.parse_expression()
if self.match('SEMICOLON'):
self.advance()
return expr
def parse_function_def(self):
"""Parse a function definition."""
self.expect('FUNC')
name_token = self.expect('IDENTIFIER')
name = name_token.value
self.expect('LPAREN')
parameters = []
if not self.match('RPAREN'):
parameters.append(self.expect('IDENTIFIER').value)
while self.match('COMMA'):
self.advance()
parameters.append(self.expect('IDENTIFIER').value)
self.expect('RPAREN')
body = self.parse_block()
return FunctionDef(name, parameters, body)
def parse_block(self):
"""Parse a block of statements enclosed in braces."""
self.expect('LBRACE')
statements = []
while not self.match('RBRACE'):
if self.current_token() is None or self.current_token().type == 'EOF':
raise SyntaxError("Expected '}' but reached end of file")
statements.append(self.parse_statement())
self.expect('RBRACE')
return Block(statements)
def parse_if_statement(self):
"""Parse an if-then-else statement."""
self.expect('IF')
self.expect('LPAREN')
condition = self.parse_expression()
self.expect('RPAREN')
then_branch = self.parse_statement()
else_branch = None
if self.match('ELSE'):
self.advance()
else_branch = self.parse_statement()
return IfStatement(condition, then_branch, else_branch)
def parse_for_loop(self):
"""Parse a for loop."""
self.expect('FOR')
self.expect('LPAREN')
initialization = None
if not self.match('SEMICOLON'):
if self.match('IDENTIFIER') and self.peek_token() and self.peek_token().type == 'EQUALS':
initialization = self.parse_assignment()
else:
initialization = self.parse_expression()
if self.match('SEMICOLON'):
self.advance()
else:
self.advance()
condition = None
if not self.match('SEMICOLON'):
condition = self.parse_expression()
self.expect('SEMICOLON')
increment = None
if not self.match('RPAREN'):
increment = self.parse_expression()
self.expect('RPAREN')
body = self.parse_statement()
return ForLoop(initialization, condition, increment, body)
def parse_while_loop(self):
"""Parse a while loop."""
self.expect('WHILE')
self.expect('LPAREN')
condition = self.parse_expression()
self.expect('RPAREN')
body = self.parse_statement()
return WhileLoop(condition, body)
def parse_print_statement(self):
"""Parse a print statement."""
self.expect('PRINT')
self.expect('LPAREN')
expression = self.parse_expression()
self.expect('RPAREN')
if self.match('SEMICOLON'):
self.advance()
return PrintStatement(expression)
def parse_assignment(self):
"""Parse a variable assignment."""
name = self.expect('IDENTIFIER').value
self.expect('EQUALS')
value = self.parse_expression()
if self.match('SEMICOLON'):
self.advance()
return Assignment(name, value)
def parse_return(self):
"""Parse a return statement."""
self.expect('RETURN')
value = None
if not self.match('SEMICOLON') and not self.match('RBRACE'):
value = self.parse_expression()
if self.match('SEMICOLON'):
self.advance()
return Return(value)
def parse_expression(self):
"""Parse an expression (top level: logical OR)."""
return self.parse_logical_or()
def parse_logical_or(self):
"""Parse logical OR expressions."""
left = self.parse_logical_and()
while self.match('OR'):
operator = self.current_token()
self.advance()
right = self.parse_logical_and()
left = BinaryOp(left, operator, right)
return left
def parse_logical_and(self):
"""Parse logical AND expressions."""
left = self.parse_comparison()
while self.match('AND'):
operator = self.current_token()
self.advance()
right = self.parse_comparison()
left = BinaryOp(left, operator, right)
return left
def parse_comparison(self):
"""Parse comparison expressions."""
left = self.parse_additive()
while self.match('EQUALS_EQUALS', 'NOT_EQUALS', 'LESS_THAN', 'GREATER_THAN', 'LESS_EQUALS', 'GREATER_EQUALS'):
operator = self.current_token()
self.advance()
right = self.parse_additive()
left = BinaryOp(left, operator, right)
return left
def parse_additive(self):
"""Parse addition and subtraction expressions."""
left = self.parse_multiplicative()
while self.match('PLUS', 'MINUS'):
operator = self.current_token()
self.advance()
right = self.parse_multiplicative()
left = BinaryOp(left, operator, right)
return left
def parse_multiplicative(self):
"""Parse multiplication, division, and modulo expressions."""
left = self.parse_unary()
while self.match('MULTIPLY', 'DIVIDE', 'MODULO'):
operator = self.current_token()
self.advance()
right = self.parse_unary()
left = BinaryOp(left, operator, right)
return left
def parse_unary(self):
"""Parse unary expressions."""
if self.match('MINUS', 'NOT', 'NOT_OP'):
operator = self.current_token()
self.advance()
operand = self.parse_unary()
return UnaryOp(operator, operand)
return self.parse_primary()
def parse_primary(self):
"""Parse primary expressions (literals, identifiers, function calls, etc.)."""
token = self.current_token()
if token is None:
raise SyntaxError("Unexpected end of file in expression")
if token.type == 'INTEGER':
self.advance()
return Number(token.value)
if token.type == 'FLOAT':
self.advance()
return Number(token.value)
if token.type == 'STRING':
self.advance()
return self.parse_string_with_interpolation(token.value)
if token.type == 'TRUE':
self.advance()
return Boolean(True)
if token.type == 'FALSE':
self.advance()
return Boolean(False)
if token.type == 'NULL':
self.advance()
return Null()
builtin_functions = ['SIN', 'COS', 'TAN', 'ASIN', 'ACOS', 'ATAN',
'SQRT', 'POW', 'LOG', 'LOG10', 'EXP', 'ABS', 'BINOMIAL']
if token.type in builtin_functions:
func_name = token.value
self.advance()
self.expect('LPAREN')
arguments = []
if not self.match('RPAREN'):
arguments.append(self.parse_expression())
while self.match('COMMA'):
self.advance()
arguments.append(self.parse_expression())
self.expect('RPAREN')
return BuiltinCall(func_name, arguments)
if token.type == 'IDENTIFIER':
name = token.value
self.advance()
if self.match('LPAREN'):
self.advance()
arguments = []
if not self.match('RPAREN'):
arguments.append(self.parse_expression())
while self.match('COMMA'):
self.advance()
arguments.append(self.parse_expression())
self.expect('RPAREN')
return FunctionCall(name, arguments)
return Identifier(name)
if token.type == 'LPAREN':
self.advance()
expr = self.parse_expression()
self.expect('RPAREN')
return expr
raise SyntaxError(f"Unexpected token {token.type} at {token.line}:{token.column}")
def parse_string_with_interpolation(self, string_value):
"""Parse string interpolation with curly braces."""
parts = []
current = ''
i = 0
while i < len(string_value):
if string_value[i] == '{':
if current:
parts.append(String(current))
current = ''
i += 1
expr_str = ''
brace_count = 1
while i < len(string_value) and brace_count > 0:
if string_value[i] == '{':
brace_count += 1
elif string_value[i] == '}':
brace_count -= 1
if brace_count == 0:
break
expr_str += string_value[i]
i += 1
if expr_str:
expr_tokenizer = Tokenizer(expr_str)
expr_tokens = expr_tokenizer.tokenize()
expr_parser = Parser(expr_tokens)
expr = expr_parser.parse_expression()
parts.append(expr)
i += 1
else:
current += string_value[i]
i += 1
if current:
parts.append(String(current))
if not parts:
return String('')
result = parts[0]
for part in parts[1:]:
result = BinaryOp(result, Token('PLUS', '+', 0, 0), part)
return result
class ReturnValue:
"""Wrapper for return values to propagate through function calls."""
def __init__(self, value):
self.value = value
class Function:
"""Runtime representation of a function."""
def __init__(self, name, parameters, body, closure_env):
self.name = name
self.parameters = parameters
self.body = body
self.closure_env = closure_env
class Interpreter:
"""
Tree-walking interpreter that executes the AST.
Maintains environments for variable scoping and executes statements.
Includes built-in mathematical and logical functions.
"""
def __init__(self):
self.global_env = {}
self.local_envs = []
def current_env(self):
"""Return the current environment (local if in function, else global)."""
if self.local_envs:
return self.local_envs[-1]
return self.global_env
def push_env(self):
"""Push a new local environment onto the stack."""
self.local_envs.append({})
def pop_env(self):
"""Pop the current local environment from the stack."""
if self.local_envs:
self.local_envs.pop()
def get_variable(self, name):
"""Look up a variable in the environment chain."""
for env in reversed(self.local_envs):
if name in env:
return env[name]
if name in self.global_env:
return self.global_env[name]
raise NameError(f"Undefined variable '{name}'")
def set_variable(self, name, value):
"""Set a variable in the current environment."""
self.current_env()[name] = value
def evaluate(self, node):
"""Evaluate an AST node and return its value."""
if isinstance(node, Program):
return self.evaluate_program(node)
if isinstance(node, Block):
return self.evaluate_block(node)
if isinstance(node, Number):
return node.value
if isinstance(node, String):
return node.value
if isinstance(node, Boolean):
return node.value
if isinstance(node, Null):
return None
if isinstance(node, Identifier):
return self.get_variable(node.name)
if isinstance(node, BinaryOp):
return self.evaluate_binary_op(node)
if isinstance(node, UnaryOp):
return self.evaluate_unary_op(node)
if isinstance(node, Assignment):
return self.evaluate_assignment(node)
if isinstance(node, FunctionDef):
return self.evaluate_function_def(node)
if isinstance(node, FunctionCall):
return self.evaluate_function_call(node)
if isinstance(node, BuiltinCall):
return self.evaluate_builtin_call(node)
if isinstance(node, Return):
return self.evaluate_return(node)
if isinstance(node, IfStatement):
return self.evaluate_if_statement(node)
if isinstance(node, ForLoop):
return self.evaluate_for_loop(node)
if isinstance(node, WhileLoop):
return self.evaluate_while_loop(node)
if isinstance(node, PrintStatement):
return self.evaluate_print_statement(node)
raise RuntimeError(f"Unknown node type: {type(node).__name__}")
def evaluate_program(self, node):
"""Evaluate all statements in the program."""
result = None
for statement in node.statements:
result = self.evaluate(statement)
if isinstance(result, ReturnValue):
return result.value
return result
def evaluate_block(self, node):
"""Evaluate all statements in a block."""
result = None
for statement in node.statements:
result = self.evaluate(statement)
if isinstance(result, ReturnValue):
return result
return result
def evaluate_binary_op(self, node):
"""Evaluate a binary operation."""
operator = node.operator.type
if operator == 'AND':
left = self.evaluate(node.left)
if not self.is_truthy(left):
return False
right = self.evaluate(node.right)
return self.is_truthy(right)
if operator == 'OR':
left = self.evaluate(node.left)
if self.is_truthy(left):
return True
right = self.evaluate(node.right)
return self.is_truthy(right)
left = self.evaluate(node.left)
right = self.evaluate(node.right)
if operator == 'PLUS':
if isinstance(left, str) or isinstance(right, str):
return self.value_to_string(left) + self.value_to_string(right)
return left + right
if operator == 'MINUS':
return left - right
if operator == 'MULTIPLY':
return left * right
if operator == 'DIVIDE':
if right == 0:
raise RuntimeError("Division by zero")
return left / right
if operator == 'MODULO':
return left % right
if operator == 'EQUALS_EQUALS':
return left == right
if operator == 'NOT_EQUALS':
return left != right
if operator == 'LESS_THAN':
return left < right
if operator == 'GREATER_THAN':
return left > right
if operator == 'LESS_EQUALS':
return left <= right
if operator == 'GREATER_EQUALS':
return left >= right
raise RuntimeError(f"Unknown binary operator: {operator}")
def evaluate_unary_op(self, node):
"""Evaluate a unary operation."""
operand = self.evaluate(node.operand)
operator = node.operator.type
if operator == 'MINUS':
return -operand
if operator == 'NOT' or operator == 'NOT_OP':
return not self.is_truthy(operand)
raise RuntimeError(f"Unknown unary operator: {operator}")
def evaluate_assignment(self, node):
"""Evaluate a variable assignment."""
value = self.evaluate(node.value)
self.set_variable(node.name, value)
return value
def evaluate_function_def(self, node):
"""Evaluate a function definition."""
func = Function(node.name, node.parameters, node.body, dict(self.global_env))
self.set_variable(node.name, func)
return func
def evaluate_function_call(self, node):
"""Evaluate a function call."""
func = self.get_variable(node.name)
if not isinstance(func, Function):
raise RuntimeError(f"'{node.name}' is not a function")
if len(node.arguments) != len(func.parameters):
raise RuntimeError(
f"Function '{node.name}' expects {len(func.parameters)} arguments "
f"but got {len(node.arguments)}"
)
arg_values = [self.evaluate(arg) for arg in node.arguments]
self.push_env()
for param, arg_value in zip(func.parameters, arg_values):
self.set_variable(param, arg_value)
result = self.evaluate(func.body)
self.pop_env()
if isinstance(result, ReturnValue):
return result.value
return None
def evaluate_builtin_call(self, node):
"""Evaluate a built-in function call (mathematical, logical, etc.)."""
func_name = node.function_name
args = [self.evaluate(arg) for arg in node.arguments]
if func_name == 'sin':
if len(args) != 1:
raise RuntimeError("sin() expects exactly 1 argument")
return math.sin(args[0])
if func_name == 'cos':
if len(args) != 1:
raise RuntimeError("cos() expects exactly 1 argument")
return math.cos(args[0])
if func_name == 'tan':
if len(args) != 1:
raise RuntimeError("tan() expects exactly 1 argument")
return math.tan(args[0])
if func_name == 'asin':
if len(args) != 1:
raise RuntimeError("asin() expects exactly 1 argument")
if args[0] < -1 or args[0] > 1:
raise RuntimeError("asin() argument must be in range [-1, 1]")
return math.asin(args[0])
if func_name == 'acos':
if len(args) != 1:
raise RuntimeError("acos() expects exactly 1 argument")
if args[0] < -1 or args[0] > 1:
raise RuntimeError("acos() argument must be in range [-1, 1]")
return math.acos(args[0])
if func_name == 'atan':
if len(args) != 1:
raise RuntimeError("atan() expects exactly 1 argument")
return math.atan(args[0])
if func_name == 'sqrt':
if len(args) != 1:
raise RuntimeError("sqrt() expects exactly 1 argument")
if args[0] < 0:
raise RuntimeError("sqrt() argument must be non-negative")
return math.sqrt(args[0])
if func_name == 'pow':
if len(args) != 2:
raise RuntimeError("pow() expects exactly 2 arguments")
return math.pow(args[0], args[1])
if func_name == 'log':
if len(args) != 1:
raise RuntimeError("log() expects exactly 1 argument")
if args[0] <= 0:
raise RuntimeError("log() argument must be positive")
return math.log(args[0])
if func_name == 'log10':
if len(args) != 1:
raise RuntimeError("log10() expects exactly 1 argument")
if args[0] <= 0:
raise RuntimeError("log10() argument must be positive")
return math.log10(args[0])
if func_name == 'exp':
if len(args) != 1:
raise RuntimeError("exp() expects exactly 1 argument")
return math.exp(args[0])
if func_name == 'abs':
if len(args) != 1:
raise RuntimeError("abs() expects exactly 1 argument")
return abs(args[0])
if func_name == 'binomial':
if len(args) != 2:
raise RuntimeError("binomial() expects exactly 2 arguments (n, k)")
n, k = int(args[0]), int(args[1])
if n < 0 or k < 0:
raise RuntimeError("binomial() arguments must be non-negative")
if k > n:
return 0
return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))
raise RuntimeError(f"Unknown built-in function: {func_name}")
def evaluate_return(self, node):
"""Evaluate a return statement."""
value = None
if node.value:
value = self.evaluate(node.value)
return ReturnValue(value)
def evaluate_if_statement(self, node):
"""Evaluate an if-then-else statement."""
condition = self.evaluate(node.condition)
if self.is_truthy(condition):
return self.evaluate(node.then_branch)
elif node.else_branch:
return self.evaluate(node.else_branch)
return None
def evaluate_for_loop(self, node):
"""Evaluate a for loop."""
if node.initialization:
self.evaluate(node.initialization)
while True:
if node.condition:
condition = self.evaluate(node.condition)
if not self.is_truthy(condition):
break
result = self.evaluate(node.body)
if isinstance(result, ReturnValue):
return result
if node.increment:
self.evaluate(node.increment)
return None
def evaluate_while_loop(self, node):
"""Evaluate a while loop."""
while True:
condition = self.evaluate(node.condition)
if not self.is_truthy(condition):
break
result = self.evaluate(node.body)
if isinstance(result, ReturnValue):
return result
return None
def evaluate_print_statement(self, node):
"""Evaluate a print statement."""
value = self.evaluate(node.expression)
print(self.value_to_string(value))
return None
def is_truthy(self, value):
"""Determine if a value is truthy."""
if value is None or value is False:
return False
if value == 0 or value == '':
return False
return True
def value_to_string(self, value):
"""Convert a value to its string representation."""
if isinstance(value, bool):
return 'true' if value else 'false'
if isinstance(value, float):
if value == int(value):
return str(int(value))
return str(value)
if value is None:
return 'null'
return str(value)
def run(self, source_code):
"""Compile and execute source code."""
tokenizer = Tokenizer(source_code)
tokens = tokenizer.tokenize()
parser = Parser(tokens)
ast = parser.parse()
return self.evaluate(ast)
def main():
"""Main entry point demonstrating the compiler with example programs."""
print("=" * 70)
print("Example 1: Trigonometric Functions")
print("=" * 70)
program1 = """
// Calculate trigonometric values for common angles
pi = 3.14159265359;
// 30 degrees = pi/6 radians
angle30 = pi / 6;
print("30 degrees:");
print(" sin(30) = " + {sin(angle30)});
print(" cos(30) = " + {cos(angle30)});
print(" tan(30) = " + {tan(angle30)});
// 45 degrees = pi/4 radians
angle45 = pi / 4;
print("45 degrees:");
print(" sin(45) = " + {sin(angle45)});
print(" cos(45) = " + {cos(angle45)});
print(" tan(45) = " + {tan(angle45)});
// 60 degrees = pi/3 radians
angle60 = pi / 3;
print("60 degrees:");
print(" sin(60) = " + {sin(angle60)});
print(" cos(60) = " + {cos(angle60)});
print(" tan(60) = " + {tan(angle60)});
"""
print("Source Code:")
print(program1)
print("\nOutput:")
interpreter1 = Interpreter()
interpreter1.run(program1)
print("\n" + "=" * 70)
print("Example 2: Pythagorean Theorem with Square Roots")
print("=" * 70)
program2 = """
// Calculate hypotenuse using Pythagorean theorem
func pythagorean(a, b) {
return sqrt(pow(a, 2) + pow(b, 2));
}
print("Pythagorean Theorem Calculator");
print("-------------------------------");
// Classic 3-4-5 triangle
c1 = pythagorean(3, 4);
print("Triangle with sides 3 and 4: hypotenuse = " + {c1});
// 5-12-13 triangle
c2 = pythagorean(5, 12);
print("Triangle with sides 5 and 12: hypotenuse = " + {c2});
// 8-15-17 triangle
c3 = pythagorean(8, 15);
print("Triangle with sides 8 and 15: hypotenuse = " + {c3});
// Verify with sqrt directly
side_a = 6;
side_b = 8;
hyp = sqrt(side_a * side_a + side_b * side_b);
print("Triangle with sides " + {side_a} + " and " + {side_b} + ": hypotenuse = " + {hyp});
"""
print("Source Code:")
print(program2)
print("\nOutput:")
interpreter2 = Interpreter()
interpreter2.run(program2)
print("\n" + "=" * 70)
print("Example 3: Logarithms and Exponentials")
print("=" * 70)
program3 = """
// Demonstrate logarithmic and exponential functions
print("Logarithms and Exponentials");
print("---------------------------");
// Natural logarithm
x = 10;
ln_x = log(x);
print("ln(" + {x} + ") = " + {ln_x});
// Base-10 logarithm
log10_x = log10(x);
print("log10(" + {x} + ") = " + {log10_x});
// Exponential function
y = 2;
exp_y = exp(y);
print("e^" + {y} + " = " + {exp_y});
// Verify that exp(ln(x)) = x
verification = exp(log(x));
print("exp(ln(" + {x} + ")) = " + {verification});
// Calculate compound interest using exponentials
principal = 1000;
rate = 0.05; // 5% annual rate
time = 10; // 10 years
amount = principal * exp(rate * time);
print("");
print("Compound Interest (continuous):");
print(" Principal: $" + {principal});
print(" Rate: " + {rate * 100} + "%");
print(" Time: " + {time} + " years");
print(" Final amount: $" + {amount});
"""
print("Source Code:")
print(program3)
print("\nOutput:")
interpreter3 = Interpreter()
interpreter3.run(program3)
print("\n" + "=" * 70)
print("Example 4: Binomial Coefficients and Combinatorics")
print("=" * 70)
program4 = """
// Calculate binomial coefficients (n choose k)
print("Binomial Coefficients (n choose k)");
print("-----------------------------------");
// Pascal's triangle row 5
n = 5;
print("Row " + {n} + " of Pascal's Triangle:");
for (k = 0; k <= n; k = k + 1) {
coeff = binomial(n, k);
print(" C(" + {n} + ", " + {k} + ") = " + {coeff});
}
print("");
// Probability example: poker hands
total_cards = 52;
hand_size = 5;
total_hands = binomial(total_cards, hand_size);
print("Total possible 5-card poker hands: " + {total_hands});
// Number of ways to choose 3 items from 10
ways = binomial(10, 3);
print("Ways to choose 3 items from 10: " + {ways});
// Verify symmetry: C(n, k) = C(n, n-k)
n2 = 8;
k2 = 3;
c1 = binomial(n2, k2);
c2 = binomial(n2, n2 - k2);
print("");
print("Symmetry verification:");
print(" C(" + {n2} + ", " + {k2} + ") = " + {c1});
print(" C(" + {n2} + ", " + {n2 - k2} + ") = " + {c2});
"""
print("Source Code:")
print(program4)
print("\nOutput:")
interpreter4 = Interpreter()
interpreter4.run(program4)
print("\n" + "=" * 70)
print("Example 5: Logical Operations and Boolean Algebra")
print("=" * 70)
program5 = """
// Demonstrate logical operations
print("Logical Operations");
print("------------------");
a = true;
b = false;
print("a = " + {a});
print("b = " + {b});
print("");
print("a and b = " + {a and b});
print("a or b = " + {a or b});
print("not a = " + {not a});
print("not b = " + {not b});
print("");
print("Truth table for AND:");
print(" true and true = " + {true and true});
print(" true and false = " + {true and false});
print(" false and true = " + {false and true});
print(" false and false = " + {false and false});
print("");
print("Truth table for OR:");
print(" true or true = " + {true or true});
print(" true or false = " + {true or false});
print(" false or true = " + {false or true});
print(" false or false = " + {false or false});
// Practical example: checking ranges
x = 5;
in_range = x > 0 and x < 10;
print("");
print("Is " + {x} + " in range (0, 10)? " + {in_range});
"""
print("Source Code:")
print(program5)
print("\nOutput:")
interpreter5 = Interpreter()
interpreter5.run(program5)
print("\n" + "=" * 70)
print("Example 6: Complex Mathematical Program - Numerical Integration")
print("=" * 70)
program6 = """
// Numerical integration using the trapezoidal rule
// Integrate f(x) = x^2 from 0 to 1
func f(x) {
return pow(x, 2);
}
func trapezoidal_rule(a, b, n) {
h = (b - a) / n;
sum = (f(a) + f(b)) / 2;
i = 1;
while (i < n) {
x = a + i * h;
sum = sum + f(x);
i = i + 1;
}
return h * sum;
}
print("Numerical Integration using Trapezoidal Rule");
print("--------------------------------------------");
print("Integrating f(x) = x^2 from 0 to 1");
print("Exact answer: 1/3 = 0.333333...");
print("");
// Try different numbers of intervals
for (n = 10; n <= 1000; n = n * 10) {
result = trapezoidal_rule(0, 1, n);
error = abs(result - 0.333333333);
print("n = " + {n} + ": result = " + {result} + ", error = " + {error});
}
"""
print("Source Code:")
print(program6)
print("\nOutput:")
interpreter6 = Interpreter()
interpreter6.run(program6)
print("\n" + "=" * 70)
print("Example 7: Inverse Trigonometric Functions")
print("=" * 70)
program7 = """
// Demonstrate inverse trigonometric functions
print("Inverse Trigonometric Functions");
print("--------------------------------");
pi = 3.14159265359;
// asin examples
print("Arc Sine (asin):");
print(" asin(0.5) = " + {asin(0.5)} + " radians");
print(" asin(0.5) = " + {asin(0.5) * 180 / pi} + " degrees");
print(" asin(1) = " + {asin(1)} + " radians (90 degrees)");
print("");
// acos examples
print("Arc Cosine (acos):");
print(" acos(0.5) = " + {acos(0.5)} + " radians");
print(" acos(0.5) = " + {acos(0.5) * 180 / pi} + " degrees");
print(" acos(0) = " + {acos(0)} + " radians (90 degrees)");
print("");
// atan examples
print("Arc Tangent (atan):");
print(" atan(1) = " + {atan(1)} + " radians");
print(" atan(1) = " + {atan(1) * 180 / pi} + " degrees (45 degrees)");
print(" atan(0) = " + {atan(0)} + " radians");
// Verify inverse relationship
print("");
print("Verification of inverse relationship:");
x = 0.7;
result = sin(asin(x));
print(" sin(asin(" + {x} + ")) = " + {result});
"""
print("Source Code:")
print(program7)
print("\nOutput:")
interpreter7 = Interpreter()
interpreter7.run(program7)
if __name__ == "__main__":
main()
This complete implementation provides a fully functional compiler and interpreter for MiniLang with comprehensive mathematical extensions. The code demonstrates the power of the language through seven detailed examples covering trigonometry, geometry, logarithms, exponentials, combinatorics, boolean logic, numerical methods, and inverse trigonometric functions. Each example showcases different aspects of the mathematical library and demonstrates how these functions can be combined to solve real-world problems.