Skip to content

TTSS0529/lox-interpreter-cpp

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

26 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Lox Interpreter in Modern C++

A complete implementation of the Lox programming language from Crafting Interpreters, rewritten in Modern C++ (C++17). Note: The REPL (interactive prompt) is not yet implemented. This interpreter includes a full compiler pipeline, a static resolver, a runtime object model, and a comprehensive automated test suite. The project focuses on:

  • Correctness
  • Memory safety
  • Clear modular architecture
  • Engineering quality suitable for technical portfolios

Table of Contents


Features

  • Language Features
    • Dynamic typing with std::variant
    • Closures capturing lexical environments
    • Functions, return values, higher-order functions
    • Classes, inheritance, fields, methods, this, super
    • Control flow statements (if, while, for)
    • Block scoping with nested environments
    • Native functions (e.g., clock())
  • Implementation Features
    • AST + Visitor Pattern architecture
    • Two-phase execution: parsing → resolution → interpretation
    • Syntactic error recovery in parser
    • Shared/weak pointer ownership model avoiding cyclic references
    • 90 automated tests with GoogleTest
    • CTest + Valgrind memory safety integration
    • Clean modular code across scanner/parser/resolver/interpreter

Architecture Overview

The interpreter is structured as a modular pipeline, where each component has a clear responsibility. This design ensures maintainability, memory safety, and easy testability.

  • Scanner (Lexer)
    • Converts source code into a sequence of tokens.
    • Handles literals, identifiers, keywords, operators, and punctuation.
    • Reports lexical errors for invalid characters.
  • Parser
    • Implements a recursive descent parser based on the Lox grammar.
    • Builds an Abstract Syntax Tree (AST) representing the program structure.
    • Performs basic error recovery to continue parsing after syntax errors.
  • Resolver
    • Performs static analysis of variable and function scopes.
    • Resolves variable bindings for closures and nested functions.
    • Handles this and super references for classes.
    • Ensures references are correctly linked before execution.
  • Interpreter
    • Walks the AST to execute statements and evaluate expressions.
    • Uses a Value System based on std::variant to support dynamic typing.
    • Supports:
      • Functions, closures, and higher-order functions
      • Classes, inheritance, fields, methods
      • Lexical scoping and block environments
      • Native functions such as clock()
  • Memory & Ownership
    • AST nodes are managed with std::unique_ptr, enforcing a single-owner hierarchy.
    • Environments use std::shared_ptr/std::weak_ptr to allow shared access without creating cycles.
    • Closures capture their defining environments safely, referencing functions stored in a global pool.
  • Testing & Safety
    • Unit tests cover each module separately: scanner, parser, resolver, interpreter.
    • Integrated Valgrind checks ensure memory safety.
    • The ownership model guarantees that no dangling pointers or memory leaks occur during execution.

Interpreter Pipeline

flowchart LR
    A[Source Code] --> B[Scanner<br/>Tokenization]
    B --> C[Parser<br/>AST]
    C --> D[Resolver<br/>Static Scope Analysis]
    D --> E[Interpreter<br/>Runtime Execution]
    E --> F[Program Output]
Loading

Object, Class, Closure Model

flowchart TD
    A[LoxFunction] --> B[Closure Environment]
    B --> C[Parent Environment]
    C --> D[...]

    E[LoxClass] --> F[LoxInstance]
    F --> G[Fields]
    E --> H[Methods]
Loading

Directory Structure

.
├── includes/        # Header files
├── srcs/            # Implementation
├── test/            # Test cases + GoogleTest integration
├── grammar.md       # Lox grammar reference
└── CMakeLists.txt

Test categories include:

  • scanner tests
  • parser tests
  • resolver tests
  • interpreter semantic tests (functions, closures, classes, errors, etc.)

Build & Run

Requirements

  • CMake ≥ 3.14
  • C++17 compatible compiler

Build

mkdir build && cd build
cmake ..
cmake --build .

Run a script

./jlox ../examples/demo.lox

Running Tests

The project uses GoogleTest, downloaded via CMake FetchContent. Run the full suite:

cd build
ctest --verbose

The test suite covers:

  • lexical analysis
  • grammar parsing
  • scope resolution
  • runtime semantics
  • class/object model

Memory Leak Checking

Valgrind is integrated via CTest:

cd build
ctest -T memcheck

The interpreter is tested for:

  • memory leaks
  • potential leaks
  • invalid read/write
  • use-after-free All tests pass without any errors, except for some still reachable memory. Still reachable means that when the program exits, some memory has not been freed. However, this memory is still accessible during the program's lifetime, often allocated for global or static objects, and is not considered a true memory leak.

Language Examples

Lox has Python-like dynamic features for variabls functions and objects, but uses C-style {} blocks and explicit variable declarations.

The example folder contains a simple demo.lox script to illustrate basic usage of the interpreter.
For more comprehensive examples and test cases, refer to the files under test/interpreter_case.


Design Notes

Value System All runtime values use std::variant, enabling a clean dynamic type system:

using   literal_value = std::variant<std::monostate, double, std::string, bool,
            std::weak_ptr<LoxFunction>, std::shared_ptr<LoxClass>,
            std::shared_ptr<LoxInstance>, std::shared_ptr<LoxCallable>,
            std::shared_ptr<LoxFunction>>;

Ownership Model This project uses explicit ownership models to ensure memory safety, prevent leaks, and simplify resource management across the AST, environments, and closures.

The Abstract Syntax Tree (AST) Designed with strict ownership semantics to ensure memory safety and avoid leaks. Each node in the AST is owned exclusively by its parent node, and the ownership is managed using std::unique_ptr. This means:

  • Every AST node is dynamically allocated, but its lifetime is tied to the parent node that contains it.
  • When a parent node is destroyed, all its child nodes are automatically destroyed as well.
  • Copying AST nodes is not done via default copy operations to prevent accidental shared ownership, but each node class provides a clone() method to explicitly create a deep copy when needed. Nodes can also be moved to transfer ownership efficiently.
  • This design guarantees that each node has a single, clear owner, simplifying memory management and avoiding dangling pointers. Using std::unique_ptr aligns with modern C++ best practices and provides a clear, maintainable, and safe way to manage the AST hierarchy.

Environment Represents a variable scope and optionally links to a parent environment. Its ownership is carefully managed to ensure memory safety:

  • Each environment is typically held in a std::shared_ptr, allowing safe shared ownership.
  • The parent environment (_enclosing) is stored as a std::weak_ptr to avoid reference cycles.
  • When an environment is no longer referenced by any closures or other structures, it is automatically destroyed.
  • Each environment has a unique _env_id, which helps distinguish scopes and enables correct function shadowing in the global function pool. This design guarantees that environments can be safely shared while preventing cyclic references and dangling pointers.

Closures Closures capture their defining environment to support lexical scoping, using a combination of shared_ptr and weak_ptr to manage ownership safely:

  • During execution, each closure holds a std::shared_ptr to its defining environment.
  • Internally, std::weak_ptr<LoxFunction> is used in _value to avoid cyclic references with the environment.
  • To ensure that weak_ptrs remain valid, all function objects are stored in a global function pool as shared_ptr, keyed by (env_id, name) to handle shadowing. This ownership model ensures closures always reference valid environments and functions while preventing memory leaks and dangling pointers.

Future Work

  • Bytecode VM version
  • Mark-Sweep or incremental GC
  • Standard library
  • Performance optimization via arena allocation
  • Interactive debugger (breakpoints, stepping, variable inspection)

License

This project is licensed under the MIT License. See the LICENSE file for details.

About

An end-to-end implementation of the Lox language from “Crafting Interpreters,” rewritten in modern C++ with strong memory-safety guarantees and clean compiler architecture.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors