
samlang Language Specification

This is the spec document for samlang based on the latest trunk. It might not match the currently released version of the language, but it should reflect the latest status well.

1. Introduction

samlang is a statically-typed programming language with bidirectional type inference and functional programming features. The language emphasizes type safety through its strong type system while providing developer ergonomics via automatic type deduction.

Key Features

The language is designed to provide a concise, expressive syntax while maintaining strong type safety and good performance characteristics.


2. Lexical Structure

A samlang program is a sequence of tokens formed from Unicode characters. Whitespace (spaces, tabs, newlines) is ignored except to separate tokens.

2.1 Identifiers

Identifiers in samlang are case-sensitive and follow two patterns:

2.2 Literals

Integer Literals: Decimal integers matching the pattern `0` or `[1-9][0-9]*`. Integers are 32-bit signed values ranging from `-2147483648` to `2147483647`. The special case `-2147483648` is recognized as the minimum 32-bit integer value.
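
The literal rule can be sketched in Python as a rough model, not the compiler's actual code. The function name `parse_int_literal` and the `negated` flag are hypothetical; the flag stands for a leading minus sign, which is what lets the digits `2147483648` be accepted only in the special case of the minimum value.

```python
import re

INT_MIN = -2147483648
INT_MAX = 2147483647

# Pattern from the text: `0` or `[1-9][0-9]*` (no leading zeros).
DECIMAL = re.compile(r"0|[1-9][0-9]*")


def parse_int_literal(digits: str, negated: bool = False) -> int:
    """Return the value of a decimal literal, or raise ValueError.

    `negated` models a minus sign directly in front of the digits
    (an assumption made for this sketch).
    """
    if DECIMAL.fullmatch(digits) is None:
        raise ValueError(f"not a decimal integer literal: {digits!r}")
    value = -int(digits) if negated else int(digits)
    if not INT_MIN <= value <= INT_MAX:
        raise ValueError(f"literal out of 32-bit range: {value}")
    return value
```

Under this model `2147483648` is rejected on its own but accepted when negated, which matches the special case described above.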

Boolean Literals: The reserved words `true` and `false`.

String Literals: Double-quoted character sequences. Strings can contain escape sequences preceded by a backslash:

Unit Literal: The keyword `unit` represents the unit type and its sole value.

2.3 Comments

Three forms of comments are supported:

  1. Line Comment: Starts with `//` and continues to the end of the line.

    • Example:
      // This is a comment
  2. Block Comment: Starts with `/*` and ends with `*/`. Can span multiple lines.

    • Example:
      /* This is a
         multi-line
         comment */
  3. Doc Comment: Starts with `/**` and ends with `*/`. Used for documentation.

    • Example:
      /**
       * This is a documentation comment
       * that spans multiple lines
       */

2.4 Keywords

The following keywords are reserved and cannot be used as identifiers:

Import and Module Keywords:

Declaration Keywords:

Visibility Modifiers:

Control Flow Keywords:

Type Keywords:

Literal Keywords:

Forbidden Keywords: These are reserved but not used in the language. Using them as identifiers will result in an error:

2.5 Operators and Punctuation

Parentheses and Braces:

Separators:

Arrow:

Assignment Operator:

Unary Operators:

Binary Operators:

Arithmetic operators:

Comparison operators:

Logical operators:

Ellipsis:

2.6 Tokenization Rules

Tokens are minimal lexical units. The lexer processes source text sequentially, recognizing the longest possible token at each position. Comments and whitespace are ignored during tokenization and do not affect the syntactic structure of the program.
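
The longest-match rule can be illustrated with a small Python sketch. It is a simplified model: the `SYMBOLS` list is a hypothetical subset of operators and punctuation chosen for the example, identifiers and integers are lumped together as "word" tokens, and string literals are not handled.

```python
# Hypothetical subset of operators and punctuation, for illustration only.
SYMBOLS = [
    "->", "::", "<=", ">=", "==", "!=", "&&", "||",
    "<", ">", "=", "!", "+", "-", "*", "/", "%",
    "(", ")", "{", "}", ",", ";", ":", ".",
]


def tokenize(source: str) -> list[str]:
    """Split source text into tokens, skipping whitespace and comments."""
    tokens = []
    i = 0
    while i < len(source):
        ch = source[i]
        if ch.isspace():
            i += 1
        elif source.startswith("//", i):
            # Line comment: skip to the end of the line.
            end = source.find("\n", i)
            i = len(source) if end == -1 else end
        elif source.startswith("/*", i):
            # Block or doc comment: skip to the closing delimiter.
            end = source.find("*/", i + 2)
            if end == -1:
                raise ValueError("unterminated block comment")
            i = end + 2
        elif ch.isalnum() or ch == "_":
            j = i
            while j < len(source) and (source[j].isalnum() or source[j] == "_"):
                j += 1
            tokens.append(source[i:j])
            i = j
        else:
            # Maximal munch: pick the longest symbol that matches here.
            candidates = (s for s in SYMBOLS if source.startswith(s, i))
            symbol = max(candidates, key=len, default=None)
            if symbol is None:
                raise ValueError(f"unexpected character {ch!r}")
            tokens.append(symbol)
            i += len(symbol)
    return tokens
```

With this sketch, `a<=b` yields the three tokens `a`, `<=`, `b` rather than `a`, `<`, `=`, `b`, because the two-character symbol is preferred at that position.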


3. Module System

3.1 File-to-Module Mapping

Each `.sam` source file corresponds to exactly one module. The module name is derived from the file path by replacing directory separators with dots and removing the `.sam` extension.

tests/AllTests.sam       → tests.AllTests
std/option.sam          → std.option
std/tuples.sam          → std.tuples
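
A minimal Python sketch of this mapping, assuming paths are relative to the source root and use forward slashes (the helper name `module_name` is hypothetical):

```python
from pathlib import PurePosixPath


def module_name(relative_path: str) -> str:
    """Derive a dotted module name from a relative `.sam` file path."""
    path = PurePosixPath(relative_path)
    if path.suffix != ".sam":
        raise ValueError(f"not a .sam source file: {relative_path}")
    # Drop the extension, then join the path components with dots.
    return ".".join(path.with_suffix("").parts)
```

For example, `module_name("tests/AllTests.sam")` gives `tests.AllTests`.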

Module references are interned in the compiler's heap to enable efficient comparison and avoid string duplication.

3.2 Imports

Modules declare explicit imports to use names from other modules:

import { Option, TryUnwrap } from std.option;
import { Pair, Triple } from std.tuples;
import { ForTests } from tests.StdLib;

The import syntax is:

import { Name1, Name2, Name3 } from module.path

Imports must be at the top of a file and can import classes, interfaces, and their members (functions, methods, and variants of enum classes).

3.3 Standard Library

The `std/` directory contains standard library modules which are automatically included in compilation.

Standard library modules can be shadowed by user-defined modules only when `__dangerously_allow_libdef_shadowing__` is enabled in `sconfig.json`.

3.4 Visibility

The `private` keyword restricts members to the containing module. A private class, interface, or member cannot be accessed from outside its defining module.

private class NodeEnumerationHelper<K: Comparable<K>, V>(
  End,
  More(K, V, NodeEnumerationHelper<K, V>)
) {
  // Implementation hidden from other modules
}

class PublicClass {
  private function helper(): unit = { /* module-private */ }
  function publicFunction(): unit = { /* visible to all */ }
}

Private classes are commonly used as implementation details within a module (e.g., helper enum types for tree traversal).


4. Top-Level Declarations

4.1 Classes

Classes are the primary top-level declaration in samlang. There are three kinds of classes based on their type definitions.

4.1.1 Struct Classes (Product Types)

Struct classes define product types with named fields using the `val` keyword. An `init` constructor is automatically generated.

class Student(val name: Str, val age: int) {
  method getName(): Str = this.name
  method getAge(): int = this.age
}

Usage:

let s = Student.init("Alice", 21)
let n = s.getName()

Fields in struct classes may be omitted from the constructor:

class Clazz(val t: Pair<Triple<int, int, bool>, Str>) {
  function of(): Clazz = Clazz.init(((42, 2, false), ""))
}

4.1.2 Enum Classes (Sum Types)

Enum classes define sum types with variants. Variants are listed in parentheses and automatically generate variant constructors.

class Color(Red, Green, Blue, Custom(int, int, int)) {
  method toRGB(): Triple<int, int, int> =
    match this {
      Red -> Triple.init(255, 0, 0),
      Green -> Triple.init(0, 255, 0),
      Blue -> Triple.init(0, 0, 255),
      Custom(r, g, b) -> Triple.init(r, g, b),
    }
}

class Option<T>(None, Some(T)) {
  method <R> map(f: (T) -> R): Option<R> =
    match this {
      None -> Option.None(),
      Some(v) -> Option.Some(f(v)),
    }
}

Usage:

let red = Color.Red()
let custom = Color.Custom(128, 64, 32)
let some = Option.Some(42)
let none = Option.None<int>()

Variants with associated data specify types in parentheses. A variant without data (like `None`) uses `unit` internally.

4.1.3 Plain Classes

Plain classes have no type definition—only braces containing static functions and methods.

class Math {
  function plus(a: int, b: int): int = a + b
}

class FunctionExample {
  function <T> getIdentityFunction(): (T) -> T = (x: T) -> x
}

4.2 Interfaces

Interfaces define method signatures without implementations. They are used to specify contracts that classes must fulfill.

interface Comparable<T> {
  method compare(other: T): int
}

interface TryUnwrap<T> {
  method tryUnwrap(): Option<T>
}

interface GeneralTuple<E0, E1> {
  method first(): E0
  method second(): E1
}

Interface members cannot have bodies—they consist only of method declarations.

4.3 Type Parameters

Classes and interfaces can be generic with type parameters declared in angle brackets.

class List<T>(Nil, Cons(T, List<T>)) {
  method <R> map(f: (T) -> R): List<R> = ...
}

class Box<T>(val content: T) {
  method getContent(): T = this.content
}

Type parameters can have interface bounds using a colon:

class Set<V: Comparable<V>>(Empty, Leaf(V), Node(...)) { ... }
class Map<K: Comparable<K>, V>(Empty, Leaf(K, V), Node(...)) { ... }

4.4 Supertype Declarations

Classes can declare that they implement one or more interfaces by writing a colon followed by a comma-separated list of interfaces after the class name or type definition.

class Int(val value: int) : Comparable<Int> {
  method compare(other: Int): int = this.value - other.value
}

class Option<T>(None, Some(T)) : TryUnwrap<T> {
  method tryUnwrap(): Option<T> = this
}

class Pair<E0, E1>(val e0: E0, val e1: E1) : GeneralTuple<E0, E1> {
  method first(): E0 = this.e0
  method second(): E1 = this.e1
}

class BoxedInt(val i: int) : Comparable<BoxedInt>, Useless {
  method compare(other: BoxedInt): int = this.i - other.i
}

Interfaces can extend other interfaces (the colon after an interface name means "extends").

4.5 Members

Classes contain two kinds of members: functions (static) and methods (instance).

Functions

Static functions are defined with the `function` keyword:

class Math {
  function plus(a: int, b: int): int = a + b
}

Called as:

Math.plus(2, 2)

Generic functions declare type parameters before the function name:

class Option<T>(None, Some(T)) {
  function <T> getSome(d: T): Option<T> = Option.Some(d)
}

Methods

Methods are defined with the `method` keyword and operate on `this`:

class Student(val name: Str, val age: int) {
  method getName(): Str = this.name
}

Called on an instance:

let s = Student.init("Bob", 30)
s.getName()

Methods can also be generic:

class Option<T>(None, Some(T)) {
  method <R> map(f: (T) -> R): Option<R> = ...
}

Fields

Fields are declared with `val` in struct classes and are accessed via `this.fieldName` or via pattern matching. Fields are immutable; there is no `mut` or equivalent keyword.

class Person(val name: Str, val age: int) {
  method getInfo(): Str = this.name
}

Access via pattern:

let Person { name, age } = Person.init("Alice", 25)

4.6 Visibility on Members

Individual members can be marked `private` to restrict them to the containing module:

class DifferentModulesDemo {
  private function assertTrue(condition: bool, message: Str): unit =
    if condition { } else { Process.panic(message) }

  function run(): unit = ...
}

class List<T>(Nil, Cons(T, List<T>)) {
  private method reverseWithAccumulator(acc: List<T>): List<T> =
    match this {
      Nil -> acc,
      Cons(v, rest) -> rest.reverseWithAccumulator(List.Cons(v, acc)),
    }
}

Private members can only be accessed from within the same module file.


5. Type System

samlang uses a static type system with bidirectional type inference. Every expression has a statically known type, and the compiler checks type compatibility at compile time. There are no implicit type conversions, no runtime type checks (except pattern matching on enum variants), and no null values.

5.1 Primitive Types

samlang has three primitive types:

| Type | Description | Literal examples |
| ---- | ----------- | ---------------- |
| `int` | 32-bit signed integer (-2,147,483,648 to 2,147,483,647) | `0`, `42`, `-65536` |
| `bool` | Boolean value | `true`, `false` |
| `unit` | Unit type with a single value | `{ }` (empty block) |

The `unit` type is the return type of functions that perform side effects without producing a meaningful value. Its only value is written as `{ }` (an empty block expression).

5.2 Nominal Types

All non-primitive, non-function types in samlang are nominal types -- they are identified by their declared class or interface name, not by their structure. A nominal type is written as an upper-case identifier optionally followed by type arguments in angle brackets.

NominalType ::= UpperId
              | UpperId '<' Type (',' Type)* '>'

Examples:

Str                    // built-in string type
Option<int>            // generic type with one argument
Map<Str, List<int>>    // nested generic type arguments
Pair<int, bool>        // tuple-backed pair type

Two nominal types are considered the same type if and only if they have the same module reference, the same class/interface name, and structurally identical type arguments. There is no structural type equivalence.
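
The identity rule can be modeled in Python with a frozen dataclass, whose generated equality compares exactly the three components named above. This is a sketch only: the `NominalType` class, the use of plain strings for primitive type arguments, and the module name `my.option` are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NominalType:
    module: str            # module reference, e.g. "std.option"
    name: str              # class or interface name
    type_args: tuple = ()  # type arguments, compared element by element


std_option_int = NominalType("std.option", "Option", ("int",))
same_again = NominalType("std.option", "Option", ("int",))
other_module = NominalType("my.option", "Option", ("int",))   # hypothetical module
other_args = NominalType("std.option", "Option", ("bool",))

assert std_option_int == same_again      # same module, name, and arguments
assert std_option_int != other_module    # same name, different module
assert std_option_int != other_args      # same class, different type argument
```

Two classes that happen to declare the same fields would still be different `NominalType` values here, since fields are not part of the comparison.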

5.3 Function Types

Function types describe the signature of callable values (lambdas, function references, method references). A function type lists parameter types and return type.

FunctionType ::= '(' [Type (',' Type)*] ')' '->' Type

Examples:

() -> unit              // no parameters, returns unit
(int) -> int            // one int parameter, returns int
(T, T) -> bool          // two parameters of type T, returns bool
(int, Str) -> Option<int>  // two parameters, returns Option<int>
((T) -> R) -> R         // higher-order: takes a function, returns R

Function types are structurally compared: two function types are the same if they have the same number of parameter types, each corresponding parameter type is the same, and the return types are the same.

5.4 Generic Type Variables

Type variables introduced by type parameter declarations (see Section 4.3) may appear as types within their scope. A generic type variable is written as a bare upper-case identifier.

class Container<T>(val value: T) {
  method <R> transform(f: (T) -> R): R = f(this.value)
}

In this example, `T` is a generic type variable from the class-level type parameters, and `R` is a generic type variable from the method-level type parameters. Both may appear as types in the method signature and body.

Generic type variables are resolved by name within the enclosing scope. A generic type `T` is the same as another generic type `T` if and only if they have the same name (within the same scope).

5.5 Tuples

Tuples are anonymous product types created with parenthesized, comma-separated expressions. Under the hood, tuples are desugared to named struct types from the standard library:

| Tuple size | Mapped type |
| ---------- | ----------- |
| 2 | `std.tuples.Pair<E0, E1>` |
| 3 | `std.tuples.Triple<E0, E1, E2>` |
| 4 | `std.tuples.Tuple4<E0, E1, E2, E3>` |
| ... | ... |
| 16 | `std.tuples.Tuple16<E0, ..., E15>` |

Tuple construction:

let pair = (42, "hello");        // Pair<int, Str>
let triple = (1, true, "world"); // Triple<int, bool, Str>

Tuple fields are accessed by name: `e0`, `e1`, `e2`, etc., corresponding to the positional order. Tuple types can also be destructured using tuple patterns:

let (x, y) = pair;
let (a, _, c) = triple;

The minimum tuple size is 2 and the maximum is 16.
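
The size-to-type mapping can be sketched in Python as follows. The helper name `tuple_type_name` is hypothetical, and the sketch assumes that the sizes elided from the table (5 through 15) follow the same `TupleN` naming as sizes 4 and 16.

```python
def tuple_type_name(size: int) -> str:
    """Return the standard library struct that a tuple of `size` desugars to."""
    if not 2 <= size <= 16:
        raise ValueError(f"tuple size must be between 2 and 16, got {size}")
    if size == 2:
        return "std.tuples.Pair"
    if size == 3:
        return "std.tuples.Triple"
    # Assumption: every size from 4 to 16 is named Tuple<size>.
    return f"std.tuples.Tuple{size}"


def tuple_field_names(size: int) -> list[str]:
    """Fields are named e0, e1, ... in positional order."""
    tuple_type_name(size)  # reuse the size check
    return [f"e{i}" for i in range(size)]
```

For a three-element tuple this gives `std.tuples.Triple` with fields `e0`, `e1`, `e2`.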

5.6 Bounded Polymorphism

Type parameters may declare upper bounds to constrain the types that may be substituted for them. A bound is specified as `: BoundType` after the type parameter name, where `BoundType` must be a nominal type (class or interface, optionally with type arguments).

class List<T: Comparable<T>>(Nil, Cons(T, List<T>)) {
  method sort(): List<T> = ...
}

When the compiler resolves a type argument for a bounded type parameter, it checks that the argument either:

  1. Is the same type as the bound, or
  2. Is a subtype of the bound (implements the bound interface, directly or transitively).

If the type argument does not satisfy the bound, the compiler reports an error.

Bounds may reference other type parameters from the same declaration:

interface IBase<A, B> {
  method <C: A> m1(a: A, b: B): C
}

5.7 Type Inference

samlang employs bidirectional type inference, which combines two modes:

  1. Synthesis mode -- the type of an expression is determined bottom-up from the expression itself. Literals, variable references, field accesses, and binary operators synthesize their types directly.
  2. Checking mode -- a type is propagated top-down as a "hint" to an expression. This is used when the expected type is known from context, such as a return type annotation of a function or parameter types of a lambda.

Type inference works as follows:

// Type arguments inferred from context
let none = Option.None<int>();           // explicit type argument required (no context)
let some = Option.Some(42);             // T inferred as int from argument
let mapped = some.map((x) -> x + 1);   // R inferred as int from lambda return

// Lambda parameter types inferred from context
list.map((x) -> x + 1)      // x : T inferred from list's element type
list.fold((acc, v) -> acc + v, 0) // acc : int, v : T inferred from fold's signature

Type arguments can also be explicitly provided when inference is insufficient or ambiguous:

let none = Option.None<int>();
let result = Result.Ok<int, Str>(42);

5.7.1 Constraint Solving

When a generic function or method is called, the compiler solves for type arguments by unifying concrete argument types with the generic parameter types. The solver:

  1. Collects constraints by matching each concrete argument type against the corresponding generic parameter type.
  2. If a return type hint is available, it also constrains the return type.
  3. Resolves each type variable to a concrete type. If a variable remains unsolved, it is assigned a placeholder type.
  4. Substitutes the solved types back into the generic function type and verifies that the result is compatible with the concrete argument types.
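
The four steps above can be sketched in Python. This is a simplified model under stated assumptions, not the compiler's algorithm: the `Var`, `Nominal`, and `Fn` classes and the `PLACEHOLDER` value are hypothetical, primitives are represented as argument-free `Nominal` values, the first binding found for a variable wins, and the final compatibility check is plain equality.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Nominal:
    name: str
    args: tuple = ()


@dataclass(frozen=True)
class Fn:
    params: tuple
    ret: object


PLACEHOLDER = Nominal("<placeholder>")  # stands in for an unsolved variable


def collect(generic, concrete, solution):
    """Steps 1 and 2: match a generic type against a concrete one."""
    if isinstance(generic, Var):
        solution.setdefault(generic.name, concrete)  # first binding wins
    elif isinstance(generic, Nominal) and isinstance(concrete, Nominal):
        if generic.name == concrete.name and len(generic.args) == len(concrete.args):
            for g, c in zip(generic.args, concrete.args):
                collect(g, c, solution)
    elif isinstance(generic, Fn) and isinstance(concrete, Fn):
        if len(generic.params) == len(concrete.params):
            for g, c in zip(generic.params, concrete.params):
                collect(g, c, solution)
            collect(generic.ret, concrete.ret, solution)


def substitute(t, solution):
    """Steps 3 and 4: replace variables, using the placeholder if unsolved."""
    if isinstance(t, Var):
        return solution.get(t.name, PLACEHOLDER)
    if isinstance(t, Nominal):
        return Nominal(t.name, tuple(substitute(a, solution) for a in t.args))
    return Fn(tuple(substitute(p, solution) for p in t.params),
              substitute(t.ret, solution))


def solve(generic_fn, arg_types, return_hint=None):
    solution = {}
    for g, c in zip(generic_fn.params, arg_types):
        collect(g, c, solution)
    if return_hint is not None:
        collect(generic_fn.ret, return_hint, solution)
    solved = substitute(generic_fn, solution)
    if solved.params != tuple(arg_types):
        raise TypeError("arguments are incompatible with the solved signature")
    return solved
```

As a usage example, solving the signature `(T) -> Option<T>` against an `int` argument produces `(int) -> Option<int>`, while a signature `(T, T) -> T` called with an `int` and a `bool` fails the final check.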

5.8 Subtyping

samlang has a limited subtyping relation based exclusively on interface implementation. There is no structural subtyping -- two distinct classes with identical fields and methods are not interchangeable.

The subtyping rules are:

  1. Reflexivity: Every type is a subtype of itself.
  2. Interface implementation: A class `C` that implements interface `I` (directly or transitively through extended interfaces) is a subtype of `I`.
  3. Interface extension: An interface `I1` that extends interface `I2` is a subtype of `I2`.
  4. Bounded type variables: A generic type variable `T : Bound` is a subtype of `Bound`.

Subtyping is used in the following contexts:

5.8.1 Invariant Generics

Generic type arguments are invariant. Given a class `Container<T>`, `Container<Cat>` is not a subtype of `Container<Animal>` even if `Cat` is a subtype of `Animal`. This applies uniformly to all generic types.
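
The subtyping rules of Section 5.8, together with invariance, can be sketched in Python. The model is deliberately small: a type is a `(name, args)` pair, the `SUPERTYPES` table and the `Living` interface are hypothetical declarations, supertypes are assumed to be acyclic, and substitution of type arguments into generic supertypes is left out.

```python
# Hypothetical declarations: each name maps to the types it directly
# implements or extends. A bounded type variable could be listed the
# same way, with its bound as the only entry.
SUPERTYPES = {
    "Cat": [("Animal", ())],
    "Animal": [("Living", ())],
    "Living": [],
}


def is_subtype(sub, sup, supertypes=SUPERTYPES):
    """Check `sub <: sup` for types written as (name, args) pairs."""
    if sub == sup:
        # Reflexivity. For generic types this is the only way to succeed,
        # so differing type arguments never match (invariance).
        return True
    name, _args = sub
    # Direct or transitive implementation/extension.
    return any(is_subtype(parent, sup, supertypes)
               for parent in supertypes.get(name, []))


def container(arg):
    return ("Container", (arg,))
```

Here `Cat` is a subtype of `Living` through `Animal`, but `container(("Cat", ()))` is not a subtype of `container(("Animal", ()))`.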

5.9 Type Compatibility (Assignability)

Two types are compatible (assignable) if and only if they are structurally identical according to these rules:

Incompatible types produce a compile-time error with a detailed mismatch trace.

5.10 The Str Type

`Str` is a built-in nominal type representing strings. It is not a primitive type -- it is a class type defined in the root module with special compiler support. `Str` has the following built-in members:

The `::` operator concatenates two `Str` values:

let greeting = "Hello" :: " " :: "world";

String literals produce values of type `Str`. See Section 2 for string literal syntax.

5.11 The Process Type

`Process` is a built-in class type providing interaction with the runtime environment:

`Process` has no constructors and cannot be instantiated. It is an uninhabited type (empty enum) that serves purely as a namespace for its static functions.

5.12 Type Errors

The type checker reports errors in the following situations:


6. Expressions

Expressions are the building blocks of samlang programs. Every expression produces a value and has a statically determined type.

Expression ::= Literal
           | LocalId
           | ClassId
           | Tuple
           | FieldAccess
           | MethodAccess
           | UnaryExpression
           | CallExpression
           | BinaryExpression
           | IfElseExpression
           | MatchExpression
           | LambdaExpression
           | BlockExpression

6.1 Literals

Literal expressions represent constant values.

IntLiteral      ::= '-'? ('0' | [1-9][0-9]*)
BoolLiteral     ::= 'true' | 'false'
StringLiteral   ::= '"' (character | escape)* '"'
UnitLiteral     ::= '{' '}'

6.2 Variable References

A variable reference produces the value bound to that variable by the nearest enclosing `let` binding or parameter.

Variable ::= lowerId

Referencing an undefined variable produces a compile-time error.

6.3 The this Reference

The keyword `this` is a special expression that is only valid within a method body. It refers to the instance on which the method was invoked.

This ::= 'this'

Using `this` outside of a method (i.e., in a static function or top-level expression context) produces a compile-time error.

6.4 Class References

A class name used as an expression produces a value representing the class itself. Class references are used for static function calls.

ClassReference ::= UpperId

Example: `MathUtils.max(1, 2)` calls the static function `max` on the class `MathUtils`.

6.5 Tuple Construction

A tuple is an ordered collection of values. Tuple construction uses parenthesized, comma-separated expressions.

Tuple ::= '(' Expression (',' Expression)+ ')'

The minimum tuple size is 2; a parenthesized single expression is not a tuple (it's just a parenthesized expression). The maximum tuple size is 16.

let pair = (1, 2);         // type: Pair<int, int>
let triple = (1, "x", true); // type: Triple<int, Str, bool>

See Section 5.5 for the mapping of tuple sizes to underlying struct types.

6.6 Field Access

Field access retrieves a field from a value of a nominal type (class) or tuple type.

FieldAccess ::= Expression '.' lowerId

For struct classes, fields are accessed by their declared names. For tuples, fields are named `e0`, `e1`, `e2`, ..., `e15` based on their position.

let point = Point.init(10, 20);
let x = point.x;      // field access on struct

let pair = (1, 2);
let first = pair.e0;  // field access on tuple

Accessing a non-existent field produces a compile-time error.

6.7 Function and Method Calls

Function calls invoke a callable value with arguments. There are several forms of calls.

6.7.1 Static Function Calls

A static function is called on a class reference.

StaticCall ::= UpperId '.' lowerId '(' [ArgumentList] ')'

let p = Point.init(3, 4);

6.7.2 Method Calls

An instance method is called on an object expression using dot notation.

MethodCall ::= Expression '.' lowerId '(' [ArgumentList] ')'

let p = Point.init(3, 4);
let d = p.distanceSquared();

Method calls may include explicit type arguments:

instance.method<T1, T2>(args)

6.7.3 Direct Function Calls

A callable expression (variable, function reference, or lambda) is invoked directly.

DirectCall ::= Expression '(' [ArgumentList] ')'

let add = (x: int, y: int) -> x + y;
add(1, 2);                  // 3

6.7.4 Type Arguments

Generic function calls may include explicit type arguments:

Option.None<int>()
Result.Ok<int, Str>(42)
List.Cons(1, List.Nil<int>())

Type arguments can often be inferred from context:

let some = Option.Some(42);     // T inferred as int

6.7.5 Call Semantics and Evaluation Order

Arguments are evaluated left-to-right before the callee is invoked. The callee is evaluated only after all arguments.

f(a(), b(), c())               // a() evaluated first, then b(), then c()

6.8 Unary Operators

Unary operators have higher precedence than binary operators and bind to the immediately following expression:

| Operator | Meaning | Example |
| -------- | ------- | ------- |
| `!` | Logical negation | `!flag` |
| `-` | Arithmetic negation | `-42` |

The operand must have the expected type; otherwise, a type error is reported.

6.9 Binary Operators

Binary operators combine two operand expressions. They are left-associative with standard precedence rules.

BinaryExpression ::= Expression BinaryOperator Expression
BinaryOperator  ::= '*' | '/' | '%' | '+' | '-' | '::' | '<' | '<=' | '>' | '>=' | '==' | '!=' | '&&' | '||'

Arithmetic Operators

| Operator | Operand types | Result type | Description |
| -------- | ------------- | ----------- | ----------- |
| `*` | `int`, `int` | `int` | Multiplication |
| `/` | `int`, `int` | `int` | Integer division |
| `%` | `int`, `int` | `int` | Remainder (mod) |
| `+` | `int`, `int` | `int` | Addition |
| `-` | `int`, `int` | `int` | Subtraction |

1 * 2 + 3 / 4 % 5 - 6    // parsed as (((1 * 2) + ((3 / 4) % 5)) - 6)

Division by zero results in runtime behavior defined by the target platform (typically a panic or trap).

Comparison Operators

| Operator | Operand types | Result type | Description |
| -------- | ------------- | ----------- | ----------- |
| `<` | `int`, `int` | `bool` | Less than |
| `<=` | `int`, `int` | `bool` | Less than or equal |
| `>` | `int`, `int` | `bool` | Greater than |
| `>=` | `int`, `int` | `bool` | Greater than or equal |
| `==` | `T`, `T` | `bool` | Equality |
| `!=` | `T`, `T` | `bool` | Inequality |

Equality operators are structural: two values are equal if they have the same structure and all components are recursively equal. Nominal type values are equal if they are the same variant (for enums) with equal associated data.

Option.Some(42) == Option.Some(42)    // true
Option.Some(42) == Option.None()        // false

Logical Operators

| Operator | Operand types | Result type | Description |
| -------- | ------------- | ----------- | ----------- |
| `&&` | `bool`, `bool` | `bool` | Logical AND |
| `\|\|` | `bool`, `bool` | `bool` | Logical OR |

Logical operators short-circuit: the right operand is evaluated only if necessary.

true && false        // false (both evaluated)
false && panic()     // false (panic() NOT evaluated)
true || panic()      // true (panic() NOT evaluated)

String Concatenation

| Operator | Operand types | Result type | Description |
| -------- | ------------- | ----------- | ----------- |
| `::` | `Str`, `Str` | `Str` | String concatenation |

"Hello" :: " " :: "world"    // "Hello world"

6.10 If-Else Expressions

If-else expressions conditionally evaluate one of two branches based on a boolean condition.

IfElseExpression ::=
  'if' Condition Expression 'else' Expression

Condition ::= Expression | PatternGuard
PatternGuard ::= 'let' Pattern '=' Expression

6.10.1 Simple If-Else

The condition must evaluate to `bool`. Both branches must have compatible types.

if x > 0 {
  x
} else {
  -x
}

Single-line form:

if a > b { a } else { b }

6.10.2 If-Let (Guard Pattern)

An if-let expression uses a pattern to destructure and test a value. If the pattern matches, the first branch executes with the pattern bindings in scope. If it doesn't match, the else branch executes.

if let Some(x) = option {
  x + 1
} else {
  -1
}

The pattern is checked for exhaustiveness; a pattern that always matches (e.g., a variable binding) produces a warning.

6.10.3 Chained If-Else

If-else expressions can be chained by nesting `else if`:

if x < 0 {
  "negative"
} else if x == 0 {
  "zero"
} else {
  "positive"
}

6.11 Match Expressions

Match expressions provide exhaustive pattern matching on a value.

MatchExpression ::=
  'match' Expression '{' [VariantPatternToExpression (',' VariantPatternToExpression)* [',']] '}'

match option {
  None -> -1,
  Some(x) -> x
}

Match expressions must be exhaustive: all possible values of the matched type must be covered by some pattern. The compiler reports an error if a non-exhaustive match is detected, showing a counterexample.
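
A much-simplified Python sketch of this check, limited to the top-level variant names of a single enum class: it ignores nested patterns, literals, and tuples, which a real exhaustiveness checker must also handle. The function name `missing_variants` is hypothetical.

```python
def missing_variants(declared: list[str], arms: list[str]) -> list[str]:
    """Return declared variants that no arm covers, in declaration order.

    `arms` lists the top-level pattern of each match arm: a variant name,
    or "_" for a wildcard that covers everything.
    """
    if "_" in arms:
        return []
    covered = set(arms)
    return [variant for variant in declared if variant not in covered]


# A match on Option with only a `Some` arm is missing `None`,
# which can then be reported as the counterexample.
assert missing_variants(["None", "Some"], ["Some"]) == ["None"]
```

An empty result means the match is exhaustive under this simplified model.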

6.12 Lambda Expressions

Lambda expressions create anonymous function values.

LambdaExpression ::=
  '(' [ParameterList] ')' '->' Expression
ParameterList ::= OptionallyAnnotatedId (',' OptionallyAnnotatedId)*
OptionallyAnnotatedId ::= lowerId [':' Type]

Examples:

(x) -> x + 1                    // lambda taking one parameter
(x, y) -> x + y                // lambda taking two parameters
(x: int, y: int) -> x + y       // lambda with explicit parameter types
() -> 42                         // lambda taking no parameters

Lambda parameters may omit type annotations when the surrounding context provides a type hint. If no hint is available, parameter types must be explicitly annotated.

// Type hint from context
let f: (int) -> int = (x) -> x + 1;   // x : int inferred from hint

// No type hint available -- explicit annotation required
let add = (x: int, y: int) -> x + y;

Lambdas capture variables from their enclosing scope. Captured variables are read-only within the lambda body.

function makeAdder(n: int): (int) -> int = (x) -> x + n

6.13 Block Expressions

Blocks are sequences of statements followed by an optional final expression. Statements and final expressions can be freely mixed within a block. The block's value is the value of the final expression, or `unit` if there is no final expression.

BlockExpression ::=
  '{' [DeclarationStatement (';' DeclarationStatement)* [';' [Expression]]] '}'

{
  let x = 42;
  let y = x + 1;
  y * 2        // result of block
}

Blocks introduce a new scope for local variables. Variables declared in a block are only accessible within that block.

{
  let x = 42;
  let y = 2;
  { }            // block evaluates to unit
}

{
  let x = 1;
  {
    let y = x + 1;    // x is accessible
  }
}
// y is not accessible here

6.13.1 SSA and Variable Rebinding

samlang uses Static Single Assignment (SSA) semantics. Each `let` binding creates a new variable binding, even if the same identifier is reused. References are resolved to their defining binding site.

let x = 1;
let x = x + 1;    // This creates a new binding for x
                    // The right-hand x refers to the first binding
                    // The left-hand x is a new binding

This ensures that each variable is assigned exactly once (within its binding scope), enabling optimizations and simplifying reasoning about code.
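
One way to picture the rebinding rule is as a renaming pass that gives every `let` its own unique name. The Python sketch below is an illustration only: the `name_N` naming scheme, the pre-split token lists, and the single flat scope are assumptions made for the example.

```python
def rename_bindings(bindings):
    """Give each `let` a unique name and rewrite references accordingly.

    `bindings` is a list of (name, rhs_tokens) pairs in source order.
    """
    version = {}  # identifier -> number of its most recent binding
    result = []
    for name, rhs in bindings:
        # Resolve references on the right-hand side first, so that a
        # reused identifier still refers to the previous binding.
        renamed_rhs = [f"{tok}_{version[tok]}" if tok in version else tok
                       for tok in rhs]
        version[name] = version.get(name, -1) + 1
        result.append((f"{name}_{version[name]}", renamed_rhs))
    return result


# let x = 1; let x = x + 1;
renamed = rename_bindings([("x", ["1"]), ("x", ["x", "+", "1"])])
assert renamed == [("x_0", ["1"]), ("x_1", ["x_0", "+", "1"])]
```

After renaming, every name is assigned exactly once, and the right-hand `x` of the second binding visibly refers to the first.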

6.14 Expression Precedence

For reference, the complete precedence table, listed from the highest level number (loosest binding) to the lowest (tightest binding):

| Level | Expression forms |
| ----- | ---------------- |
| 12 | Lambda `->` |
| 11 | Match `match` |
| 10 | If-else `if ... else` |
| 4-9 | Binary operators (see Section 6.9) |
| 2 | Unary operators `!`, `-` |
| 1 | Field/method access `.`, call `()`, block `{}` |
| 0 | Literals, variables, class references, tuples `(...)` |

Parentheses can be used to override default precedence:

x * y + z        // parsed as (x * y) + z
x * (y + z)      // x * (y + z)

x.f(y)            // method call
(x.f(y)) + z       // (x.f(y)) + z

(x.f)(y)          // call result of x.f with y

6.15 Evaluation Order

Expression evaluation follows these rules:

  1. Literals, variables, and class references evaluate immediately.
  2. Function calls: arguments are evaluated left-to-right, then the callee is evaluated and invoked.
  3. Binary operators: left operand evaluated first, then right operand, then operator applied.
  4. Logical `&&` and `||` short-circuit (right operand may not be evaluated).
  5. Block expressions: statements are executed in order; final expression is evaluated last.
  6. If-else and match: condition/matched expression evaluated first, then only the selected branch is evaluated.

7. Statements

samlang has two statement forms: `let` binding statements and expression statements.

7.1 Let Bindings

A `let` binding introduces a new variable in the current scope.

LetStatement ::= 'let' Pattern [':' Type] '=' Expression ';'

The pattern on the left-hand side may be a variable name, a tuple pattern, a struct pattern, a variant pattern, or a wildcard (`_`). The expression on the right-hand side is evaluated, and if the pattern matches, its bindings are introduced into the scope.

The optional type annotation allows specifying the expected type of the binding. If present, the right-hand side must produce a value compatible with that type.

// Simple variable binding
let x = 42;

// Tuple pattern
let (a, b) = pair;

// Struct pattern
let { name, github } = developer;

// Variant pattern
let Some(value) = option;

// Wildcard pattern (discards value)
let _ = computeResult();

// With an explicit type annotation
let count: int = 10;

All bindings are immutable. Once bound, a variable cannot be reassigned.

7.2 Expression Statements

An expression statement evaluates an expression for its side effects and discards the result.

ExpressionStatement ::= Expression ';'

The expression is evaluated, but its value is not bound to any variable. This is useful for calling functions with side effects.

// Function call for side effects
Process.println("Hello, world!");

// Method call for side effects
list.push(42);

// Complex expression with side effects
if condition { Process.println("yes") } else { Process.println("no") };

Note: Expression statements can have any return type. The value is simply discarded.


8. Patterns

Patterns are used in `let` bindings, `if let` expressions, and `match` expressions to destructure values. Patterns are matched against values from left to right in the order they appear.

8.1 Wildcard Pattern

The wildcard pattern `_` matches any value and binds no variables.

WildcardPattern ::= '_'

match value {
  _ -> "anything",
}

8.2 Variable Pattern

A variable pattern matches any value and binds that value to a variable.

VariablePattern ::= lowerId

let x = value;

8.3 Literal Patterns

Literal patterns match against specific constant values.

LiteralPattern ::= IntLiteral | BoolLiteral

match x {
  0 -> "zero",
  1 -> "one",
  _ -> "other",
}

String literals cannot be used as patterns.

8.4 Tuple Patterns

Tuple patterns match tuple values by position.

TuplePattern ::= '(' Pattern (',' Pattern)+ ')'

let (x, y) = pair;

match triple {
  (0, 0, 0) -> "origin",
  (x, y, 0) -> "on XY plane",
  _ -> "elsewhere",
}

8.5 Struct Patterns

Struct patterns match values of struct class types by field name.

StructPattern ::= '{' FieldPattern (',' FieldPattern)* '}'
FieldPattern ::= lowerId | lowerId 'as' lowerId

A field pattern can be just a field name (which binds the field value to a variable of the same name) or field as binding (which binds the field value to a different variable).

let { name, github } = developer;
let point = Point.init(10, 20);
let { x as pX, y as pY } = point;

Fields are matched by name, not position. Omitted fields are not matched (they remain inaccessible in the pattern scope).

8.6 Variant Patterns

Variant patterns match enum values by variant name and optionally destructure the associated data.

VariantPattern ::= UpperId ['(' Pattern (',' Pattern)* ')']

match option {
  Option.None() -> "nothing",
  Option.Some(value) -> "found: " :: Str.fromInt(value),
}

8.7 Nested Patterns

Patterns can be nested within each other.

match result {
  Ok(Some(x)) -> x,
  Ok(None) -> 0,
  Error(_) -> -1,
}

8.8 Pattern Matching Semantics

Pattern matching is evaluated as follows:

  1. For a variable pattern, match succeeds and the variable is bound to the value.
  2. For a literal pattern, match succeeds if the value equals the literal.
  3. For a tuple pattern, match succeeds if the value is a tuple of the same size and each subpattern matches the corresponding element.
  4. For a struct pattern, match succeeds if the value is an instance of the specified struct class and each field pattern matches the corresponding field.
  5. For a variant pattern, match succeeds if the value is an instance of the enum class, is of the specified variant, and each subpattern matches the corresponding data field.
  6. For a wildcard pattern, match always succeeds with no bindings.

Pattern matching in match expressions is checked for exhaustiveness. The compiler ensures that for every possible value of the matched expression, at least one pattern will match.
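The six matching rules above can be modelled with a small recursive function. The following Python sketch is only an illustration, not the compiler's implementation: the tuple-encoded patterns (`("var", name)`, `("variant", tag, subpatterns)`, and so on), the `Variant` class, and the use of dicts for struct values are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Variant:
    """Hypothetical runtime shape of an enum value: a variant tag plus its data."""
    tag: str
    data: tuple

def match_pattern(pattern, value, bindings=None):
    """Return the bindings produced by matching, or None if the match fails."""
    bindings = {} if bindings is None else bindings
    kind = pattern[0]
    if kind == "wild":        # wildcard: always succeeds, binds nothing
        return bindings
    if kind == "var":         # variable: always succeeds, binds the value
        bindings[pattern[1]] = value
        return bindings
    if kind == "lit":         # literal: succeeds only on an equal value of the same type
        same = type(value) is type(pattern[1]) and value == pattern[1]
        return bindings if same else None
    if kind == "tuple":       # tuple: same size, then match element by element
        subpatterns, elements = pattern[1], value
        if not isinstance(value, tuple) or len(value) != len(subpatterns):
            return None
    elif kind == "struct":    # struct: only the named fields are inspected
        if not isinstance(value, dict) or any(f not in value for f in pattern[1]):
            return None
        subpatterns = list(pattern[1].values())
        elements = [value[f] for f in pattern[1]]
    elif kind == "variant":   # variant: tag must agree, then match the data fields
        if not isinstance(value, Variant) or value.tag != pattern[1]:
            return None
        subpatterns, elements = pattern[2], value.data
        if len(subpatterns) != len(elements):
            return None
    else:
        raise ValueError(f"unknown pattern kind: {kind}")
    for subpattern, element in zip(subpatterns, elements):  # left to right
        if match_pattern(subpattern, element, bindings) is None:
            return None
    return bindings
```

A match expression would try each arm in order with a fresh bindings dict and take the first arm for which the result is not None. Exhaustiveness checking is a separate static analysis and is not modelled here.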

8.9 Or-Patterns

samlang does not support or-patterns (e.g., A(x) | B(x)). Multiple variant patterns must be written as separate match arms.

8.10 As-Patterns

The as keyword is used to rename bindings in struct patterns, but there is no general as-pattern for aliasing an entire matched value.


9. Operator Precedence Table

The following table lists all operators and constructs in order from tightest binding (evaluated first) to loosest binding (evaluated last). Operators at the same precedence level are left-associative unless otherwise noted.

| Level | Construct | Description | Associativity |
| ----- | --------- | ----------- | ------------- |
| 0 | Literals, identifiers, `this`, tuple construction | Atoms | N/A |
| 1 | `.` field access, `expr(...)` function call, `{...}` block | Postfix | Left |
| 2 | `-expr`, `!expr` | Unary operators (prefix) | N/A |
| 3 | N/A | (reserved) | N/A |
| 4 | `*`, `/`, `%` | Multiplicative | Left |
| 5 | `+`, `-`, `::` | Additive, string concat | Left |
| 6 | `<`, `<=`, `>`, `>=`, `==`, `!=` | Comparison | Left |
| 7 | `&&` | Logical AND | Left |
| 8 | `\|\|` | Logical OR | Left |
| 9 | N/A | (reserved) | N/A |
| 10 | `if ... else`, `if let ... else` | Conditional | N/A |
| 11 | `match` | Pattern matching | N/A |
| 12 | `(params) -> expr` | Lambda | N/A |
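To see how the binary levels in the table above interact, here is a minimal Python sketch of a precedence-based parser for levels 4 through 8. It is not the samlang parser: the flat token list, the `parse` function, and the `(op, left, right)` tree shape are assumptions for illustration, and unary operators, calls, and the conditional forms are left out.

```python
# Binary operator levels copied from the table: a lower level binds tighter.
LEVELS = {"*": 4, "/": 4, "%": 4, "+": 5, "-": 5, "::": 5,
          "<": 6, "<=": 6, ">": 6, ">=": 6, "==": 6, "!=": 6,
          "&&": 7, "||": 8}

def parse(tokens):
    """Parse a flat token list into nested (op, left, right) tuples."""
    pos = 0

    def atom():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def binary(level):
        nonlocal pos
        if level < 4:                      # below the binary levels: just an atom
            return atom()
        left = binary(level - 1)           # operands come from the tighter level
        while pos < len(tokens) and LEVELS.get(tokens[pos]) == level:
            op = tokens[pos]
            pos += 1
            right = binary(level - 1)
            left = (op, left, right)       # folding to the left gives left associativity
        return left

    return binary(8)
```

For instance, `parse([1, "+", 2, "*", 3])` groups the multiplication first, and `parse([1, "-", 2, "-", 3])` groups the first subtraction first.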

Notes:


10. Built-in Types and Functions

samlang provides several built-in types and functions that are available without explicit import.

10.1 The Str Type

The Str type represents immutable string values.

Static Methods:

Instance Methods:

String Concatenation:

Strings can be concatenated using the :: operator:

let greeting = "Hello" :: " " :: "World";  // Results in "Hello World"

10.2 Process Functions

The Process module provides runtime interactions.

Functions:

10.3 Auto-generated Constructors

For user-defined classes, the compiler automatically generates constructors:

Struct Classes:

For a class with only fields (no variants), a constructor ClassName.init(...) is automatically generated:

class Person(val name: Str, val age: int) {}

let p = Person.init("Alice", 30);

Enum Classes:

For a class with variants, constructors are generated for each variant:

class Color(Red, Green, Blue, Custom(Str)) {}

let red = Color.Red();
let custom = Color.Custom("#ff0000");

11. Standard Library

The standard library is located in the std/ directory and provides commonly used data structures and utilities. All modules must be explicitly imported using import { ... } from std.moduleName.

11.1 std.interfaces

Provides interface definitions for type-based operations.

Comparable Interface:

interface Comparable<T> {
  method compare(other: T): int
}

The compare method returns:

TryUnwrap Interface:

interface TryUnwrap<T> {
  method tryUnwrap(): Option<T>
}

11.2 std.boxed

Boxed wrappers for primitive types that implement Comparable.

Int Class:

class Int(val value: int) : Comparable<Int>

Bool Class:

class Bool(val value: bool) : Comparable<Bool>

11.3 std.option

Represents optional values, similar to Option in Rust or Maybe in Haskell.

class Option<T>(None, Some(T))

Static Methods:

Instance Methods:

11.4 std.result

Represents results that may fail, with separate success and error types.

class Result<T, E>(Ok(T), Error(E))

Static Methods:

Instance Methods:

11.5 std.list

Immutable singly-linked list providing functional operations.

class List<T>(Nil, Cons(T, List<T>))

Static Methods:

Instance Methods:

11.6 std.map

Immutable balanced binary search tree map with ordered keys.

class Map<K: Comparable<K>, V>(Empty, Leaf(K, V), Node(int, K, V, Map<K, V>, Map<K, V>))

Static Methods:

Instance Methods:

11.7 std.set

Immutable balanced binary search tree set with ordered elements.

class Set<V: Comparable<V>>(Empty, Leaf(V), Node(int, V, Set<V>, Set<V>))

Static Methods:

Instance Methods:

11.8 std.tuples

Tuple types for grouping values together.

GeneralTuple Interface:

interface GeneralTuple<E0, E1> {
  method first(): E0
  method second(): E1
}

Tuple Classes:

All tuple classes from Pair through Tuple16 implement GeneralTuple<E0, E1>:

All tuple classes provide:


12. Compilation Pipeline

The samlang compiler transforms source code through a series of intermediate representations (IRs) before producing final output. The compilation pipeline supports two backends: WebAssembly and TypeScript.

12.1 Overview

Source (.sam) → HIR → MIR → LIR → WASM
                                → TypeScript

IR Stages

  1. Source: Parse samlang source files into a typed AST
  2. HIR (High-Level IR): Direct lowering from typed AST, preserves generics
  3. MIR (Mid-Level IR): Generics specialized, enum representations optimized
  4. LIR (Low-Level IR): Types abstracted, GC-specific instructions
  5. WASM/TS: Final code generation

12.2 Source to HIR

The High-Level IR is a direct lowering from the typed AST. Generics are preserved in their polymorphic form; specialization happens later in the pipeline.

HIR AST Design

The HIR AST in samlang-ast/src/hir.rs represents the structure of a samlang program at a high level. The key HIR node types are:

Source → HIR Transformations

Key transformations performed when lowering from Source AST to HIR:

  1. Method → Static Function: Instance methods are converted to static functions with an explicit _this parameter as the first parameter. The receiver is passed as the first argument.

    // Source
    class Point(val x: int, val y: int) {
      method distanceSquared(): int = ...
    }

    // HIR
    class Point(val x: int, val y: int) {
      function distanceSquared(_this: Point): int = ...
    }
  2. Struct Constructors: Lowered to StructInit statements that allocate and initialize struct fields. For a struct with val fields f1, f2, ..., fn, the constructor becomes:

    StructInit(Point, x, y) { ... }

  3. Enum Constructors: Lowered to EnumInit statements that create enum values with the appropriate variant tag and data fields. For an enum with variants V1(T1), ..., Vn(Tn), each variant Vi(Ti) becomes:

    EnumInit(Vi, args...) { ... }

  4. Pattern Matching: match expressions are lowered to nested ConditionalDestructure statements that test variant tags and extract data fields. The condition expression uses tag comparison (e.g., this.tag == Variant1).

  5. Lambdas: Anonymous functions are converted to named synthetic functions and wrapped in ClosureInit values. Each lambda (x) -> body becomes:

    ClosureInit(fn, [captured_vars...], context) { ... }

    The lambda body is extracted as a static function and the captured environment is stored in the closure struct.

  6. Method References: References to instance methods become closures capturing the receiver. For this.method(args), the transformation creates:

    ClosureInit(method, [this, captured_vars...], context) { ... }

  7. Tuple Types: Synthesized as named struct types (Pair, Triple, etc.) from the standard library. Tuple construction (e0, e1, e2) becomes a StructInit for the appropriate tuple class.

  8. String Literals: Converted to global string constants referenced by name. Each unique string literal is assigned a name like _Str_42 and referenced globally.

  9. Control Flow: SSA-style control flow using IfElse with final_assignments (phi nodes) to merge values from different branches. This enables efficient SSA-based optimizations.

SSA in HIR

HIR uses Static Single Assignment (SSA) semantics. Each binding introduces a fresh variable, even if the same identifier is reused in the source. References are resolved to their defining binding site.
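The renaming idea can be sketched in a few lines of Python. This is an illustration only: the `(name, referenced_names)` input shape and the `name_index` naming scheme are assumptions, not the compiler's actual representation.

```python
def rename_to_ssa(bindings):
    """bindings: list of (name, referenced_names) in source order.

    Returns the same list with every binding given a unique name and every
    reference rewritten to the binding that was in scope at that point.
    """
    next_index = {}   # source identifier -> how many times it has been bound
    current = {}      # source identifier -> unique name currently in scope
    renamed = []
    for name, references in bindings:
        # Resolve uses first: the right-hand side sees the previous binding.
        resolved = [current.get(ref, ref) for ref in references]
        index = next_index.get(name, 0)
        next_index[name] = index + 1
        unique = f"{name}_{index}"
        current[name] = unique
        renamed.append((unique, resolved))
    return renamed
```

With the input `[("x", []), ("x", ["x"]), ("y", ["x"])]`, the second `x` becomes a new variable whose right-hand side refers to the first one, and `y` refers to the second.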

12.3 HIR to MIR

The Mid-Level IR applies several transformations to prepare code for optimization.

12.3.1 Generic Specialization

A demand-driven monomorphization process specializes generic types from entry points:

  1. Demand Collection: Starting from entry points, the compiler walks the call graph.
  2. Type Instantiation: When a generic type is encountered with concrete type arguments, a specialized version is created. For example, List<int> becomes a concrete type List__int in MIR.
  3. Name Mangling: Specialized type names incorporate the type arguments. For example, Foo<int> becomes Foo__int.
  4. Caching: Specialized versions are cached to avoid re-computation across the call graph.
  5. Incremental Specialization: The process repeats until no new specializations are needed.

This transformation enables downstream optimizations to work on concrete types without carrying generic overhead.
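The demand-driven loop described in the steps above can be sketched as a worklist algorithm. The Python below is a simplified model under stated assumptions: types are `(name, args)` tuples, `functions` maps a function name to its type parameters and the generic calls it makes, and the way multiple or nested type arguments are joined in `mangle` is a guess that only matches the document for the single-argument case (`Foo<int>` to `Foo__int`).

```python
def mangle(name, type_args):
    """Build a specialized name, e.g. ("List", [("int", ())]) -> "List__int"."""
    if not type_args:
        return name
    return name + "__" + "_".join(mangle(*arg) for arg in type_args)

def substitute(ty, env):
    """Replace type parameters in `ty` with the concrete types in `env`."""
    name, args = ty
    if name in env and not args:
        return env[name]
    return (name, tuple(substitute(arg, env) for arg in args))

def specialize(functions, entry_points):
    """functions: name -> (type_params, [(callee, type_args), ...]).

    Returns the set of mangled names of every specialization reachable
    from the entry points.
    """
    done = set()                                    # cache of finished specializations
    worklist = [(name, ()) for name in entry_points]
    while worklist:                                 # repeat until no new demand appears
        name, type_args = worklist.pop()
        mangled = mangle(name, type_args)
        if mangled in done:
            continue
        done.add(mangled)
        type_params, calls = functions[name]
        env = dict(zip(type_params, type_args))
        for callee, callee_args in calls:           # walk the call graph
            concrete = tuple(substitute(arg, env) for arg in callee_args)
            worklist.append((callee, concrete))
    return done
```

Only instantiations that are actually demanded are produced, which is why a generic function that is never called from an entry point contributes nothing to the output.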

12.3.2 Enum Representation Optimization

Each enum variant is classified into one of three representation strategies to optimize memory and performance:

| Strategy | When Used | Description |
| -------- | --------- | ----------- |
| Int31 | 0 data fields | Stored directly as a raw `ref.i31` value. No heap allocation required. |
| Unboxed | 1 data field (primitive) | Uses an identity cast. The value is stored without a wrapper struct. |
| Boxed | Multiple data fields or 1 complex data field | Default strategy: uses a heap-allocated struct with a tag field and data fields. |

The compiler analyzes enum patterns and usage across the codebase to select the optimal representation for each variant:

This optimization significantly reduces memory allocation and improves performance for common enum patterns.

12.3.3 Type Deduplication

Structurally identical specialized types are merged to reduce code size. If two specializations produce the same field types, they share a single type definition.

For example:
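As a rough illustration (a Python sketch, not the compiler's algorithm), merging can be modelled by keying each type definition on its field types and keeping the first definition seen for each shape. The type names and the `deduplicate` helper are hypothetical, and the sketch ignores the case where field types themselves refer to types that get merged.

```python
def deduplicate(type_defs):
    """type_defs: name -> tuple of field types.

    Returns (kept, rename): the surviving definitions and a map from every
    original name to the name of the definition that now represents it.
    """
    canonical_by_shape = {}   # field-type tuple -> first name seen with that shape
    rename = {}
    for name, fields in type_defs.items():
        rename[name] = canonical_by_shape.setdefault(fields, name)
    kept = {name: fields for name, fields in type_defs.items() if rename[name] == name}
    return kept, rename
```

Every use of a merged type is then rewritten through the `rename` map so that it points at the shared definition.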

12.3.4 Constant Parameter Elimination

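Parameters that are always passed the same constant value are eliminated: the constant is substituted into the function body and the argument is dropped from every call site, reducing parameter passing overhead.

// Before: every call site passes the same constant for both parameters
foo(5, 5);
// After: the constants live in foo's body and the parameters are removed
foo();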
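Finding which parameters qualify is a simple scan over the call sites. The Python sketch below is an assumption-laden model: arguments are either int constants or variable names given as strings, and `constant_parameters` is a hypothetical helper name.

```python
def constant_parameters(param_count, call_sites):
    """call_sites: one argument list per call; ints are constants, strs are variables.

    Returns {parameter_index: constant} for every parameter that receives
    the same int constant at every call site.
    """
    result = {}
    for index in range(param_count):
        values = {args[index] for args in call_sites}
        if len(values) == 1:
            (value,) = values
            if isinstance(value, int):   # a variable passed everywhere does not qualify
                result[index] = value
    return result
```

A function with no call sites yields an empty result, since there is no evidence that any parameter is constant.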

12.4 MIR Optimization Passes

The MIR optimizer runs four rounds of per-function optimization combined with function inlining and global dead code elimination. Each round consists of the following passes:

12.4.1 Conditional Constant Propagation (CCP)

Folds constant expressions, performs algebraic simplification, and eliminates dead branches:

The pass tracks which variables are constant across the function and propagates this knowledge to eliminate redundant computations.
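A minimal Python sketch of the idea follows. It works on a toy expression tree rather than MIR, so the expression encoding (`(op, lhs, rhs)` tuples, `("if", cond, then, else)`, strings for variables) is an assumption, only a handful of operators and two algebraic identities are handled, and 32-bit overflow is not modelled.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul,
       "<": operator.lt, "==": operator.eq}

def is_const(value):
    return isinstance(value, (int, bool))

def fold(expr, env):
    """Simplify `expr` given `env`, a map from variable names to known constants."""
    if isinstance(expr, str):                 # variable: use its constant if known
        return env.get(expr, expr)
    if not isinstance(expr, tuple):           # literal
        return expr
    if expr[0] == "if":
        _, cond, then_branch, else_branch = expr
        cond = fold(cond, env)
        if cond is True:                      # dead else-branch is dropped
            return fold(then_branch, env)
        if cond is False:                     # dead then-branch is dropped
            return fold(else_branch, env)
        return ("if", cond, fold(then_branch, env), fold(else_branch, env))
    op, lhs, rhs = expr[0], fold(expr[1], env), fold(expr[2], env)
    if is_const(lhs) and is_const(rhs):       # constant folding
        return OPS[op](lhs, rhs)
    if op == "+" and type(rhs) is int and rhs == 0:   # x + 0 -> x
        return lhs
    if op == "*" and type(rhs) is int and rhs == 1:   # x * 1 -> x
        return lhs
    return (op, lhs, rhs)
```

The "conditional" part is visible in the `if` case: once the condition folds to a constant, only the live branch is analyzed further.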

12.4.2 Loop Optimizations

These transformations improve performance by reducing loop overhead and enabling better register allocation.

12.4.3 Common Subexpression Elimination (CSE)

Eliminates redundant computations by identifying structurally identical expressions:

This pass operates at the expression level and identifies duplicated computations that can be replaced with a single computed value.
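One common way to implement this on SSA code is to key each pure operation on its operator and operands. The Python sketch below assumes a flat list of `(target, op, arg1, arg2)` statements with side-effect-free operators; it illustrates the technique and is not taken from the samlang optimizer.

```python
def eliminate_common_subexpressions(statements):
    """statements: list of (target, op, arg1, arg2) in SSA form.

    Returns (kept_statements, replaced), where `replaced` maps each removed
    variable to the earlier variable that already holds the same value.
    """
    available = {}   # (op, arg1, arg2) -> variable that computed it first
    replaced = {}
    kept = []
    for target, op, arg1, arg2 in statements:
        # Rewrite operands so later duplicates are recognised after earlier merges.
        arg1 = replaced.get(arg1, arg1)
        arg2 = replaced.get(arg2, arg2)
        key = (op, arg1, arg2)
        if key in available:
            replaced[target] = available[key]
        else:
            available[key] = target
            kept.append((target, op, arg1, arg2))
    return kept, replaced
```

Because the operands are rewritten before the lookup, a chain of duplicates collapses in a single pass.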

12.4.4 Local Value Numbering (LVN)

Performs scoped deduplication within basic blocks, identifying repeated computations at a finer granularity than CSE:

12.4.5 Dead Code Elimination (DCE)

Performs backward liveness analysis to eliminate unused bindings and unreachable code:

This pass reduces code size and eliminates unnecessary computations.
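For straight-line code, backward liveness amounts to one reverse scan. The following Python sketch assumes statements of the form `(target, used_variables, has_side_effect)` and a set of variables that are live at the end of the block; both are simplifications chosen for the example.

```python
def eliminate_dead_code(statements, live_out):
    """Drop bindings whose result is never used and that have no side effect."""
    live = set(live_out)
    kept = []
    for target, uses, has_side_effect in reversed(statements):
        if target in live or has_side_effect:
            live.discard(target)    # the definition satisfies the demand for target
            live.update(uses)       # its operands become live above this point
            kept.append((target, uses, has_side_effect))
    kept.reverse()
    return kept
```

Walking backward is what lets a whole chain of bindings disappear: once the last use is dropped, the values feeding it are never marked live.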

12.4.6 Cross-Function Passes

Between optimization rounds, the compiler performs:

12.5 MIR to LIR

The Low-Level IR introduces type erasure and backend-specific instructions.

12.5.1 Type Erasure

Introduces AnyPointer for enums with Int31 variants to enable uniform handling:

// MIR: enum E<A, B> { V1(A), ..., Vn(An) }
// LIR: enum E<A, B> { V1(AnyPointer), ..., Vn(AnyPointer) }

This allows all variants to be stored uniformly and handled with simple pointer comparisons rather than variant-specific logic.

12.5.2 Closure Expansion

ClosureInit is lowered to a StructInit that creates a two-element struct [fn_ptr, context]:

// MIR
ClosureInit(fn, captured_vars..., context)

// LIR
StructInit(Closure, fn_ptr, context) {
  fn_ptr = ...;    // Function pointer
  context = ...;  // Captured environment
}

Each closure struct contains:

12.5.3 Indirect Call Expansion

Closure calls are expanded to extract the function pointer and context from the closure struct, followed by a call_indirect instruction:

// Before expansion
let result = closure(arg)

// After expansion
let fn_ptr = closure.fn_ptr;    // Function pointer read from the closure struct
let context = closure.context;  // Captured environment
let result = call_indirect(fn_ptr, context, arg)  // Indirect call

This enables the implementation of first-class functions with proper closure semantics.
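The closure struct and the indirect call can be modelled in Python as follows. Everything here is a stand-in chosen for illustration: `FUNCTION_TABLE` plays the role of the table that call_indirect indexes into, `register` and `closure_init` are hypothetical helpers, and the lifted function takes the captured context as its first parameter.

```python
FUNCTION_TABLE = []   # stand-in for the table that call_indirect indexes into

def register(function):
    """Add a lifted function to the table and return its index (the fn_ptr)."""
    FUNCTION_TABLE.append(function)
    return len(FUNCTION_TABLE) - 1

def closure_init(fn_ptr, captured):
    """Build the two-element closure struct [fn_ptr, context]."""
    return [fn_ptr, tuple(captured)]

def call_indirect(fn_ptr, context, *args):
    """Look up the function and pass the context as its first argument."""
    return FUNCTION_TABLE[fn_ptr](context, *args)

def call_closure(closure, *args):
    """Expanded closure call: unpack the struct, then call indirectly."""
    fn_ptr, context = closure
    return call_indirect(fn_ptr, context, *args)

# The lambda (x) -> x + offset, lifted to a named function with context first.
def lifted_add_offset(context, x):
    (offset,) = context
    return x + offset

ADD_OFFSET = register(lifted_add_offset)
add_ten = closure_init(ADD_OFFSET, [10])
```

Calling `call_closure(add_ten, 5)` unpacks the struct and invokes `lifted_add_offset` with the captured `offset` of 10. Two closures built from the same lambda share one table entry and differ only in their context.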

12.5.4 Subtype Hierarchies for Enums

For enum types, a parent type is created with an extensible tag field to enable uniform dispatch:

// MIR: enum E<T> has variants V1, ..., Vn
// LIR creates parent type
enum Parent<T> {
  tag: int,
}

Pattern matching uses ref.test to check the tag and ref.cast to obtain the parent type, then dispatches through the parent's vtable.

12.5.5 Closure Function Signatures

The first parameter of all closure functions is forced to AnyPointer to represent the captured context:

// All closure functions have this signature
fn closure(ctx: AnyPointer, args...): T

This uniform signature enables the same closure type to be used regardless of what each closure captures.

12.6 LIR to WebAssembly

The WebAssembly backend uses the WasmGC proposal:

Memory Management

Data Type Representations

Integer Variants (Int31)

Whole-Program Compilation

Type Mappings

The LIR integer type maps to Wasm types:

| LIR Type | WASM Type | Description |
| -------- | --------- | ----------- |
| `Int32` | `i32` | 32-bit integer |
| `Int31` | `i31` | Enum variant tag |
| `AnyPointer` | `i32` | Function pointer |

12.7 LIR to TypeScript

The TypeScript backend uses the same LIR but emits TypeScript syntax:

Structs

Converted to tuple types for compatibility with JavaScript:

// LIR: struct Point { x: int, y: int }
// TypeScript
type _Point = [number, number];

Type Mappings

Division

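Integer division is emitted as Math.floor(a / b). Math.floor rounds toward negative infinity, so the result agrees with WebAssembly's truncating signed division only when the quotient is non-negative.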
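Flooring and truncating differ only for negative quotients. The check below uses Python's math.floor and math.trunc as stand-ins for JavaScript's Math.floor and Math.trunc, which round the same way on these inputs; the helper names are hypothetical.

```python
import math

def floor_div(a, b):
    """Round the quotient toward negative infinity, like Math.floor(a / b)."""
    return math.floor(a / b)

def trunc_div(a, b):
    """Round the quotient toward zero, like a truncating signed division."""
    return math.trunc(a / b)

# The two agree on non-negative quotients and differ on negative ones.
positive_case = (floor_div(7, 2), trunc_div(7, 2))    # (3, 3)
negative_case = (floor_div(-7, 2), trunc_div(-7, 2))  # (-4, -3)
```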

Comparisons

Comparison results are coerced via Number(a op b), which yields 0 or 1 instead of a boolean.

Module Structure

Each entry module gets its own .ts file with a trailing _Module_Main$main() call.

Runtime Helpers

The TypeScript prolog provides runtime functions:


13. Limits and Constraints

13.1 Struct Limits

Struct definitions may contain at most 16 fields. This limit applies to the total number of field declarations within a single struct type definition.

13.2 Tuple Limits

Tuple types and tuple literals may contain at most 16 elements. This limit applies to both type declarations and value expressions.

13.3 Integer Range

Integer values must be within the signed 32-bit range, from -2147483648 to 2147483647.

Integer literals outside this range result in a compilation error. Integer arithmetic operations that overflow are not guaranteed to wrap or trap; behavior is implementation-defined.
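The literal range check can be pictured with a short Python sketch. The function name and the use of ValueError to stand in for a compilation error are assumptions for the example.

```python
INT_MIN = -2147483648   # -(2 ** 31)
INT_MAX = 2147483647    # 2 ** 31 - 1

def check_int_literal(text):
    """Return the literal's value, or raise if it is outside the 32-bit signed range."""
    value = int(text)
    if not INT_MIN <= value <= INT_MAX:
        raise ValueError(f"integer literal out of range: {text}")
    return value
```

Note that the magnitude 2147483648 is only valid when negated, which is why the minimum value needs to be recognized as a special case.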

13.4 String Limits

String literals cannot span multiple lines. Multi-line strings must be constructed through string concatenation or other runtime operations.

13.6 Identifier Length

Identifiers may be of arbitrary length, subject to memory constraints of the compilation environment.

13.7 Recursion Depth

The language does not enforce a maximum recursion depth. Programs with deep recursion may exhaust runtime stack resources; tail-call optimization is not guaranteed.

13.8 Module Nesting

There is no enforced limit on module nesting depth, as modules are identified by dot-separated qualified names.

13.9 Type Parameter Limits

Type parameters may be applied to any number of types. There is no explicit limit on the number of type parameters a generic definition may declare.


14. Intentional Omissions

14.1 No Mutable Variables or Assignment

samlang does not provide mutable variable declarations or assignment operators. All bindings are immutable. State changes are achieved through function calls that return new values rather than in-place modifications.

14.2 No Loops

There are no looping constructs (no while, for, do, or loop keywords). Iteration is performed through recursion or through higher-order functions provided by the standard library (e.g., List.map, List.fold).

14.3 No Null/Nullable Types

samlang does not have a null value or nullable type constructors. Optional values are represented using the Option type from the standard library, with variants Option.Some(value) and Option.None().

14.4 No Exceptions

There are no exception types, throw statements, or try-catch blocks. Error handling is performed using the Result type from the standard library, with variants Result.Ok(value) and Result.Error(error). Runtime panics are triggered through Process.panic(message).

14.5 No Class Inheritance

There is no class-based inheritance. Types can implement any number of interfaces, and interfaces can extend other interfaces. Code reuse is achieved through composition and higher-order functions.

14.6 No Method or Function Overloading

Functions and methods cannot be overloaded. Each function name within a scope must refer to a single function definition. Polymorphism is achieved through generics and pattern matching.

14.7 No Implicit Conversions

samlang does not perform implicit type conversions between distinct types, including numeric types. All conversions must be explicit through constructor functions or conversion utilities provided by the standard library.

14.8 No Global Variables or Top-Level Expressions

All state must be encapsulated within functions. There are no global variable declarations, and top-level expressions are not permitted. Module-level definitions are limited to type declarations and function definitions.

14.9 No Array/List Literal Syntax

There is no syntax for array or list literals. Lists are constructed using the List module functions, such as List.empty(), List.singleton(value), and List.cons(head, tail).

14.10 No Switch Statements

Pattern matching serves as the primary branching mechanism. There is no separate switch statement syntax. All multi-way branching is expressed through match expressions.

14.11 No Bitwise Operators

The language does not provide bitwise operators (&, |, ^, ~, <<, >>). Bit-level operations are not part of the core language.

14.12 No Floating-Point Types

samlang does not provide floating-point number types. Only signed 32-bit integers are provided as the numeric primitive type.

14.13 No Union Types

There is no union type syntax (string | int). Discriminated unions are expressed through enums with variants.

14.14 No Intersection Types

There is no intersection type syntax. Types cannot be combined through intersection.