The Python code generator (PythonCodeGenerator) is the primary backend for multilingual. It takes a CoreIRProgram and emits executable Python source code.
Pipeline Position
1
2
3
4
5
6
7
8
9
10
11
12
13
| Source Language (.ml)
│
▼ [Lexer + Parser + SurfaceNormalizer]
Core AST
│
▼ [SemanticAnalyzer]
Validated CoreIRProgram
│
▼ [PythonCodeGenerator]
Python Source Code (str)
│
▼ [ProgramExecutor / exec()]
Runtime Output
|
Quick Start
Transpile to Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
| from multilingualprogramming import ProgramExecutor
executor = ProgramExecutor()
# Transpile without executing
python_code = executor.transpile("""
let items = [1, 2, 3, 4, 5]
let squares = [x**2 for x in items]
print(squares)
""", language="en")
print(python_code)
# Output:
# items = [1, 2, 3, 4, 5]
# squares = [x ** 2 for x in items]
# print(squares)
|
Execute Directly
1
2
3
4
5
6
| executor.execute("""
let items = [1, 2, 3, 4, 5]
let squares = [x**2 for x in items]
print(squares)
""", language="en")
# Prints: [1, 4, 9, 16, 25]
|
Multi-Language Examples
The same Python output is generated regardless of source language:
1
2
3
4
5
6
7
8
9
10
11
| # English source
executor.transpile("let x = 42\nprint(x)", language="en")
# → "x = 42\nprint(x)"
# French source — identical output
executor.transpile("soit x = 42\nafficher(x)", language="fr")
# → "x = 42\nprint(x)"
# Japanese source — identical output
executor.transpile("変数 x = 42\n表示(x)", language="ja")
# → "x = 42\nprint(x)"
|
Code Generation Rules
Variables
| multilingual |
Generated Python |
let x = 42 |
x = 42 |
let x: int = 42 |
x: int = 42 |
const PI = 3.14 |
PI = 3.14 |
Functions
| multilingual |
Generated Python |
def f(x): |
def f(x): |
def f(x: int) -> str: |
def f(x: int) -> str: |
def f(a, /, *, b): |
def f(a, /, *, b): |
async def f(x): |
async def f(x): |
Control Flow
| multilingual |
Generated Python |
if x > 0: |
if x > 0: |
elif x < 0: |
elif x < 0: |
else: |
else: |
for i in range(5): |
for i in range(5): |
while x > 0: |
while x > 0: |
Classes
| multilingual |
Generated Python |
class Foo: |
class Foo: |
class Bar(Foo): |
class Bar(Foo): |
Expressions
| multilingual |
Generated Python |
f"Hello {name}" |
f"Hello {name}" |
x if cond else y |
x if cond else y |
[x for x in items] |
[x for x in items] |
{k: v for k, v in d.items()} |
{k: v for k, v in d.items()} |
lambda x: x**2 |
lambda x: x**2 |
(x := 10) |
(x := 10) |
Imports
Localized import keywords generate standard Python imports:
| multilingual (any language) |
Generated Python |
import math |
import math |
from math import sqrt |
from math import sqrt |
from math import sqrt as root |
from math import sqrt as root |
Built-in Aliases
Localized built-in names are resolved to their Python equivalents at code generation:
| multilingual (French) |
Generated Python |
afficher(x) |
print(x) |
intervalle(10) |
range(10) |
longueur(lst) |
len(lst) |
Using the Generator Directly
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
| from multilingualprogramming import (
Lexer, Parser, SemanticAnalyzer,
PythonCodeGenerator
)
from multilingualprogramming.core.lowering import lower_to_core_ir
source = """
let nums = [1, 2, 3]
let total = sum(n**2 for n in nums)
print(f"Sum of squares: {total}")
"""
language = "en"
# Pipeline
lexer = Lexer(language=language)
parser = Parser(language=language)
analyzer = SemanticAnalyzer(language=language)
codegen = PythonCodeGenerator()
tokens = lexer.tokenize(source)
ast = parser.parse(tokens)
core = lower_to_core_ir(ast, source_language=language)
analyzer.analyze(core.ast)
python_code = codegen.generate(core.ast)
print(python_code)
# Output:
# nums = [1, 2, 3]
# total = sum(n ** 2 for n in nums)
# print(f"Sum of squares: {total}")
|
Runtime Builtins
The Python executor injects a runtime namespace with all multilingual builtins:
1
2
3
4
5
6
7
8
9
| from multilingualprogramming.codegen import RuntimeBuiltins
# Get namespace for a language
builtins = RuntimeBuiltins.for_language("fr")
# Contains:
# - All Python built-in functions
# - Localized aliases for all 41 concepts
# - e.g., "afficher" → print, "intervalle" → range, "longueur" → len
|
When the generated Python code is executed, it runs in a namespace that includes both the universal Python names and the localized aliases. This means both print and afficher work in a French program.
Code Generation Configuration
1
2
3
4
5
6
7
8
9
10
| from multilingualprogramming.codegen import PythonCodeGenerator
# Default settings
codegen = PythonCodeGenerator(
indent_spaces=4, # indentation width
emit_type_annotations=True, # preserve type annotations in output
normalize_whitespace=True, # normalize spacing
)
python_src = codegen.generate(ast)
|
WASM Code Generation
The WASM code generator follows the same interface:
1
2
3
4
5
6
7
| from multilingualprogramming.codegen import WasmGenerator
wasm_gen = WasmGenerator()
wasm_bytes = wasm_gen.generate_wasm(ast, function_name="my_func")
# Or: get the Rust intermediate code
rust_code = wasm_gen.generate_rust(ast, function_name="my_func")
|
See WASM Architecture for details.
Example: Complete Transpilation
Full end-to-end transpile from Japanese to Python:
Input (Japanese, program.ml):
1
2
3
4
5
6
7
8
9
10
11
12
| 取込 math
変数 数値 = [1, 2, 3, 4, 5]
関数 計算(リスト):
変数 合計 = 0
毎 n 中 リスト:
合計 = 合計 + n * n
戻る 合計
変数 結果 = 計算(数値)
表示(f"二乗和: {結果}")
|
Generated Python:
1
2
3
4
5
6
7
8
9
10
11
12
| import math
数値 = [1, 2, 3, 4, 5]
def 計算(リスト):
合計 = 0
for n in リスト:
合計 = 合計 + n * n
return 合計
結果 = 計算(数値)
print(f"二乗和: {結果}")
|
Notes:
- Keywords translated:
取込 → import, 変数 → variable declaration, 関数 → def, 毎/中 → for/in, 戻る → return, 表示 → print
- Identifiers preserved:
数値, 計算, リスト, 合計, 結果 (Japanese identifiers remain as-is)
- Output:
二乗和: 55