package parse

import "github.com/cloudboss/unobin/pkg/lang/parse"

Functions

func LineStarts

func LineStarts(src []byte) []int

LineStarts returns the byte offset of the first byte of each line.
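
As a rough illustration of what such a line-start table looks like, the sketch below uses a hypothetical `lineStarts` helper, not the package's own implementation. It assumes that a line begins at offset 0 and after every `\n` byte, so a trailing newline opens a final empty line.

```go
package main

import "fmt"

// lineStarts is a hypothetical stand-in for LineStarts. Assumption: a line
// starts at offset 0 and at the byte following every '\n'.
func lineStarts(src []byte) []int {
	starts := []int{0}
	for i, c := range src {
		if c == '\n' {
			starts = append(starts, i+1)
		}
	}
	return starts
}

func main() {
	fmt.Println(lineStarts([]byte("ab\ncd\n"))) // prints [0 3 6]
}
```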

func Parse

func Parse(filename string, b []byte, opts ...Option) (any, error)

Parse parses the data from b using filename as information in the error messages.

func ParseFile

func ParseFile(filename string, opts ...Option) (i any, err error)

ParseFile parses the file identified by filename.

func ParseReader

func ParseReader(filename string, r io.Reader, opts ...Option) (any, error)

ParseReader parses the data from r using filename as information in the error messages.

func PascalToKebab

func PascalToKebab(s string) string

PascalToKebab converts a PascalCase Go identifier to its kebab-case UB form. A '-' goes before a capital that follows a lowercase letter or digit, and before the last capital in a stretch of capitals that precedes a lowercase letter (so HTTPSProxy becomes https-proxy). Non-letter, non-digit bytes pass through unchanged.
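
The rule above can be sketched as follows. This is a minimal re-implementation written from the description only, under the assumption that input is ASCII and that letters are lowercased on output; the lowercase name `pascalToKebab` and its helpers are hypothetical and may differ from the package's code.

```go
package main

import (
	"fmt"
	"strings"
)

func isUpper(c byte) bool { return c >= 'A' && c <= 'Z' }
func isLower(c byte) bool { return c >= 'a' && c <= 'z' }
func isDigit(c byte) bool { return c >= '0' && c <= '9' }

// pascalToKebab is a hypothetical sketch of the documented rule.
func pascalToKebab(s string) string {
	var b strings.Builder
	for i := 0; i < len(s); i++ {
		c := s[i]
		if isUpper(c) && i > 0 {
			prev := s[i-1]
			nextLower := i+1 < len(s) && isLower(s[i+1])
			// Dash after a lowercase letter or digit, or before the last
			// capital of a run of capitals that precedes a lowercase letter.
			if isLower(prev) || isDigit(prev) || (isUpper(prev) && nextLower) {
				b.WriteByte('-')
			}
		}
		if isUpper(c) {
			c += 'a' - 'A'
		}
		b.WriteByte(c) // other bytes pass through unchanged
	}
	return b.String()
}

func main() {
	fmt.Println(pascalToKebab("HTTPSProxy")) // prints https-proxy
}
```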

Types

type ArrayLit

type ArrayLit struct {
    S        Span
    Elements []Expr
}

ArrayLit is a bracket-delimited list: `[ v1 v2 v3 ]`. Elements are whitespace separated.

func (*ArrayLit) Span

func (n *ArrayLit) Span() Span

type BoolLit

type BoolLit struct {
    S     Span
    Value bool
}

BoolLit is `true` or `false`.

func (*BoolLit) Span

func (n *BoolLit) Span() Span

type Call

type Call struct {
    S       Span
    Callee  *Ident // Simple name; nil if Library is set.
    Library *Ident // Library alias (e.g., "lib") when callee is `lib.foo`.
    Func    *Ident // Function name in library-qualified form.
    Args    []Expr
}

Call is a function call: `format('%s-%s' a b)`. Args are whitespace-separated. The callee is either a bare identifier (built-in: `range`, `format`, etc.) or a qualified dotted function name. For now we model the callee as its raw text; the resolver disambiguates.

func (*Call) Span

func (n *Call) Span() Span

type Cloner

type Cloner interface {
    Clone() any
}

Cloner is implemented by any value that has a Clone method, which returns a copy of the value. This is mainly used for types which are not passed by value (e.g. map, slice, chan) or structs that contain such types.

This is used in conjunction with the global state feature to create proper copies of the state to allow the parser to properly restore the state in the case of backtracking.
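
A minimal sketch of a state value that satisfies this interface might look like the following. The `counters` type is hypothetical; the point is that Clone copies the map rather than sharing it, so a snapshot taken before backtracking is unaffected by later writes.

```go
package main

import "fmt"

// cloner mirrors the shape of the Cloner interface for this sketch.
type cloner interface{ Clone() any }

// counters is a hypothetical state value holding a reference type.
type counters struct{ hits map[string]int }

// Clone returns a deep copy so the snapshot does not alias the live map.
func (c counters) Clone() any {
	cp := make(map[string]int, len(c.hits))
	for k, v := range c.hits {
		cp[k] = v
	}
	return counters{hits: cp}
}

func main() {
	live := counters{hits: map[string]int{"rule": 1}}
	var c cloner = live
	snapshot := c.Clone().(counters)
	live.hits["rule"]++
	fmt.Println(live.hits["rule"], snapshot.hits["rule"]) // prints 2 1
}
```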

type Comment

type Comment struct {
    S    Span
    Text string
}

Comment is a single `# ...` line comment as recorded during parsing. The Text field includes the leading `#` and runs to (but does not include) the terminating newline.

type Comprehension

type Comprehension struct {
    S      Span
    Kind   ComprehensionKind
    Names  []string
    Source Expr
    Key    Expr
    Value  Expr
    Group  bool
    Filter Expr
}

Comprehension is a list or map comprehension over an iterable. Names holds one or two bound identifiers: one binds each element (list) or value (map); two binds index+element (list source) or key+value (map source), resolved by the source type at check time. The bound names are a new dot-path root class scoped to the body, so the reference walker excludes them from the dependency graph.

For CompList, Value is the produced element and Key is nil. For CompMap, Key and Value are the produced pair, and Group reports a trailing `...` that collects same-key values into a list. Filter is the `when` predicate, nil when absent.

func (*Comprehension) Span

func (n *Comprehension) Span() Span

type ComprehensionKind

type ComprehensionKind int

ComprehensionKind distinguishes the list form `[ for x in xs : elem ]` from the map form `{ for x in xs : key => val }`.

func (ComprehensionKind) String

func (k ComprehensionKind) String() string

String returns the constant's identifier. Used by codegen to emit a human-readable kind constant in generated source.

type Conditional

type Conditional struct {
    S    Span
    Cond Expr
    Then Expr
    Else Expr
}

Conditional is `if cond then a else b`. The else branch is mandatory; chaining `else if` falls out of Else holding another Conditional. The runtime evaluates Cond, then only the taken branch (short circuit), so the dead branch never runs. The static dependency set is the union of refs in both branches, since the graph is built before Cond is known.
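
The two behaviours described above, evaluating only the taken branch versus collecting references from both, can be sketched on a hypothetical miniature AST. The `node`, `ref`, and `conditional` types below are illustrative only and are not the package's types; including references from the condition in the dependency set is an assumption of this sketch.

```go
package main

import "fmt"

// node is a hypothetical miniature AST; the package's Expr interface differs.
type node interface{}

type ref struct{ name string } // reference to a named value

type conditional struct{ cond, then, els node } // if cond then a else b

// deps collects references statically, walking both branches because the
// condition's value is unknown when the graph is built.
func deps(n node, out map[string]bool) {
	switch v := n.(type) {
	case ref:
		out[v.name] = true
	case conditional:
		deps(v.cond, out)
		deps(v.then, out)
		deps(v.els, out)
	}
}

// eval evaluates the condition, then only the taken branch.
func eval(n node, env map[string]any) any {
	switch v := n.(type) {
	case ref:
		return env[v.name]
	case conditional:
		if ok, _ := eval(v.cond, env).(bool); ok {
			return eval(v.then, env)
		}
		return eval(v.els, env)
	}
	return n // anything else is treated as a literal
}

func main() {
	// if a then x else if b then y else 'none'
	chain := conditional{ref{"a"}, ref{"x"}, conditional{ref{"b"}, ref{"y"}, "none"}}
	env := map[string]any{"a": false, "b": true, "y": "from-y"}
	fmt.Println(eval(chain, env)) // prints from-y

	set := map[string]bool{}
	deps(chain, set)
	fmt.Println(len(set)) // prints 4
}
```

Note that `x` appears in the dependency set even though its branch is never evaluated for this environment.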

func (*Conditional) Span

func (n *Conditional) Span() Span

type DotPath

type DotPath struct {
    S        Span
    Root     *Ident
    Segments []DotSegment
}

DotPath is a dot-separated address like `input.region`, `resource.app.id`, or `data-source.ami.id`. Segments after the root navigate by name (`.id`), by a string key or integer position (`["alpha"]`, `[0]`), or project over a list with a splat (`[*]`). The first segment (Root) is one of the reserved address roots: input, data-source, resource, action, @each.

func (*DotPath) Span

func (n *DotPath) Span() Span

type DotSegment

type DotSegment struct {
    S     Span
    Name  string // Set when this segment is `.name` or `?.name`.
    Index Expr   // Set when this segment is `[expr]`, otherwise nil.
    Splat bool   // Set when this segment is `[*]`.
    // Guarded is set when this segment is `?.name`: a null value
    // stops the navigation and the whole path reads as null, so the
    // path's type is optional.
    Guarded bool
}

DotSegment is one piece of a DotPath following the root: a name (`.foo`), an index (`["alpha"]` or `[0]`), or a splat (`[*]`) that projects the segments to its right over each element of a list.
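
To illustrate how a guarded segment (`?.name`) differs from a plain one, here is a sketch that walks named segments over nested `map[string]any` values. The `segment` type and `navigate` function are hypothetical, and the error returned for an unguarded read of null is an assumption; the documentation only states that a guarded segment makes the whole path read as null.

```go
package main

import "fmt"

// segment is a hypothetical, name-only stand-in for DotSegment.
type segment struct {
	name    string
	guarded bool // true for `?.name`
}

// navigate follows segs through nested objects. A null reached at a guarded
// segment ends navigation with a null result.
func navigate(v any, segs []segment) (any, error) {
	for _, s := range segs {
		if v == nil {
			if s.guarded {
				return nil, nil
			}
			return nil, fmt.Errorf("cannot read .%s of null", s.name)
		}
		m, ok := v.(map[string]any)
		if !ok {
			return nil, fmt.Errorf("cannot read .%s of a non-object", s.name)
		}
		v = m[s.name]
	}
	return v, nil
}

func main() {
	root := map[string]any{"app": nil}
	v, err := navigate(root, []segment{{"app", false}, {"id", true}})
	fmt.Println(v, err) // prints <nil> <nil>
}
```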

type Error

type Error struct {
    Kind ErrorKind
    Pos  Position
    Msg  string
    // Hint is an optional second line offering a fix or pointer.
    Hint string
}

Error is a single diagnostic. It always carries a Position so output formatters can produce file:line:col prefixes consistently.

func Errorf

func Errorf(kind ErrorKind, pos Position, format string, args ...any) *Error

Errorf constructs an Error with a formatted message.

func (*Error) Error

func (e *Error) Error() string

type ErrorKind

type ErrorKind int

ErrorKind tags an Error so callers can branch on its kind. The set is deliberately small - finer-grained classification is the message's job.

func (ErrorKind) String

func (k ErrorKind) String() string

type ErrorList

type ErrorList struct {
    // contains filtered or unexported fields
}

ErrorList collects diagnostics from a compilation step (parsing, type checking, etc.). Callers append errors and continue until the budget is exceeded, then inspect Len() to decide whether to advance to the next step or surface the errors.

func NewErrorList

func NewErrorList(budget int) *ErrorList

NewErrorList returns an ErrorList that stops collecting after `budget` errors. Use 0 for unlimited.

func (*ErrorList) Add

func (l *ErrorList) Add(e *Error)

Add appends e. If the budget has been reached, the error is dropped.

func (*ErrorList) Addf

func (l *ErrorList) Addf(kind ErrorKind, pos Position, format string, args ...any)

Addf is a convenience for Errorf + Add.

func (*ErrorList) Err

func (l *ErrorList) Err() error

Err returns nil if there are no errors, the single error if exactly one, and an aggregate error otherwise. Useful as the return value of a compilation step that wants to surface its accumulated diagnostics.

func (*ErrorList) Error

func (l *ErrorList) Error() string

func (*ErrorList) Errors

func (l *ErrorList) Errors() []*Error

Errors returns the collected errors in source order (file path, then line, then column). The returned slice aliases internal storage; callers should treat it as readonly.

func (*ErrorList) Len

func (l *ErrorList) Len() int

Len returns the number of collected errors.

func (*ErrorList) Messages

func (l *ErrorList) Messages() []string

Messages returns the bare Msg of each error in source order, without the file:line:col prefix. Use it when a test pins message content but not exact positions. The result is nil when there are no errors.

func (*ErrorList) Strings

func (l *ErrorList) Strings() []string

Strings returns the fully rendered form of each error in source order, including the file:line:col prefix, kind, and any hint. The result is nil when there are no errors.

type Expr

type Expr interface {
    Node
    // contains filtered or unexported methods
}

Expr is any node that produces a value. Object and array literals are expressions, as are dotted paths, calls, infix and prefix operations, type expressions, and bare identifiers (which act as enum values).

func ParseExpr

func ParseExpr(path string, b []byte) (Expr, error)

ParseExpr parses b as a single UB expression and returns its AST. path labels Position.File on each node. Trailing content past the expression is rejected: parsing happens through a synthetic single-field file so the grammar's EOF rule rejects leftovers. Position columns are reported one greater than the input column because of the wrapping prefix; callers showing source context to users may want to adjust.

type Field

type Field struct {
    S     Span
    Key   FieldKey
    Value Expr
    Decl  *SelectorBody
}

Field is one entry in an ObjectLit. It is either a value field (`key: value`) or a selector-body declaration (`name: selector { ... }` or `selector { ... }`).

func (*Field) Span

func (n *Field) Span() Span

type FieldKey

type FieldKey struct {
    S      Span
    Kind   FieldKeyKind
    Name   string   // Identifier text, including any leading `@`.
    String string   // Raw string content, when Kind == FieldString.
    Path   []string // Dotted segments, when Kind == FieldPath.
}

FieldKey distinguishes the key forms an object field can have.

  • Kind == FieldIdent: bare identifier (possibly @-prefixed).
  • Kind == FieldString: quoted string literal.
  • Kind == FieldPath: dotted identifier path, such as aws.iam-role.it. Only a resource, data-source, or action declaration head uses this form.

The post-parse pass is responsible for deciding whether a given key is permitted at its position (e.g., closed set enum identifier vs free form string vs meta key).

func (FieldKey) IsMeta

func (k FieldKey) IsMeta() bool

IsMeta reports whether the key is a `@`-prefixed meta key.

type FieldKeyKind

type FieldKeyKind int

type File

type File struct {
    S    Span
    Kind FileKind
    Path string
    Body *ObjectLit // The top level body is always an object.
    // Comments is every `#` line comment captured during parsing, in
    // source order. The format renderer interleaves them by position;
    // the runtime and type checker ignore them.
    Comments []Comment
}

File is the top level container for a parsed .ub source file. The Kind is determined after parsing by inspecting which top level keys are present and matching them against the known file types (stack, library, exported type, config). Until that classification step runs, a File's Kind is FileUnknown.

func ParseSource

func ParseSource(path string, b []byte) (*File, error)

ParseSource reads .ub source from b and returns the parsed File. The path populates Position.File on each AST node. Pass an empty string when parsing in-memory input. File.Kind is left at its zero value; callers that need it use pkg/lang's wrapper which classifies by filename.

On parse failure, the returned error wraps pigeon's diagnostics. Callers that want structured errors should switch on the underlying type.

func (*File) Span

func (f *File) Span() Span

type FileKind

type FileKind int

FileKind is the classification tag for a parsed file.

func (FileKind) String

func (k FileKind) String() string

type Ident

type Ident struct {
    S    Span
    Name string
}

Ident is a bare identifier appearing at value position. Its meaning depends on context: an enum value (e.g., `type: string`), a closed set constraint kind (`kind: required-together`), or a field name reference inside a `fields:` list. The parser doesn't disambiguate; the type checker / schema validator does.

func (*Ident) Span

func (n *Ident) Span() Span

type Infix

type Infix struct {
    S     Span
    Op    string
    Left  Expr
    Right Expr
}

Infix is a binary operation: a + b, a == b, a && b. The Op field is the raw operator text.

func (*Infix) Span

func (n *Infix) Span() Span

type InterpolatedPart

type InterpolatedPart struct {
    S    Span
    Lit  string
    Expr Expr
    Verb string
}

InterpolatedPart is one segment of an InterpolatedString. When Expr is nil it is a literal run carried in Lit. Otherwise it is a `{{ Expr }}` slot, rendered through the Go printf verb in Verb (e.g. "%03d") when Verb is non-empty, or with the default rendering when Verb is empty.
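
A sketch of how parts might be rendered once slot values are known is shown below. The `part` type is a hypothetical flattening of InterpolatedPart in which the slot carries an already-evaluated value, and `fmt.Fprint` stands in for the "default rendering", which is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// part is a hypothetical, pre-evaluated form of InterpolatedPart.
type part struct {
	lit   string // literal text when slot is false
	slot  bool   // true for a `{{ expr }}` slot
	value any    // evaluated slot value
	verb  string // optional printf verb, e.g. "%03d"
}

// render concatenates literal runs and formatted slot values left to right.
func render(parts []part) string {
	var b strings.Builder
	for _, p := range parts {
		switch {
		case !p.slot:
			b.WriteString(p.lit)
		case p.verb != "":
			fmt.Fprintf(&b, p.verb, p.value)
		default:
			fmt.Fprint(&b, p.value)
		}
	}
	return b.String()
}

func main() {
	out := render([]part{
		{lit: "id-"},
		{slot: true, value: 7, verb: "%03d"},
	})
	fmt.Println(out) // prints id-007
}
```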

type InterpolatedString

type InterpolatedString struct {
    S     Span
    Parts []InterpolatedPart
    Form  StringForm
}

InterpolatedString is an interpolated string from the `$'...'` form. Parts run left to right, alternating literal text and `{{ expr }}` slots. Form records the underlying string form so the formatter re-emits the right delimiter. A slot's value must evaluate to a scalar; the type checker enforces that.

func (*InterpolatedString) Span

func (n *InterpolatedString) Span() Span

type Node

type Node interface {
    Span() Span
}

Node is the root of the AST hierarchy. Every node knows its source span; the End may be the zero Position when only a starting point is known.

type NullLit

type NullLit struct {
    S Span
}

NullLit is the `null` keyword.

func (*NullLit) Span

func (n *NullLit) Span() Span

type NumberLit

type NumberLit struct {
    S           Span
    Value       string
    IsFloat     bool
    ParsedInt   int64
    ParsedFloat float64
}

NumberLit covers both integers and floats - the type system narrows to `integer` when a constraint demands it. Value is the canonical text from source (preserving trailing zeros etc.). ParsedFloat / ParsedInt hold the numeric form. IsFloat distinguishes them.
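
The relationship between the fields can be illustrated with a small sketch. The `numberLit` type and `parseNumber` function are hypothetical, and treating any text containing `.`, `e`, or `E` as a float is an assumption about the number grammar rather than something the documentation states.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// numberLit is a hypothetical mirror of NumberLit without the span.
type numberLit struct {
	value       string // source text, kept verbatim
	isFloat     bool
	parsedInt   int64
	parsedFloat float64
}

// parseNumber fills in the numeric form while preserving the source text.
func parseNumber(text string) (numberLit, error) {
	n := numberLit{value: text}
	if strings.ContainsAny(text, ".eE") {
		f, err := strconv.ParseFloat(text, 64)
		if err != nil {
			return n, err
		}
		n.isFloat, n.parsedFloat = true, f
		return n, nil
	}
	i, err := strconv.ParseInt(text, 10, 64)
	if err != nil {
		return n, err
	}
	n.parsedInt = i
	return n, nil
}

func main() {
	n, _ := parseNumber("1.50")
	fmt.Println(n.value, n.isFloat, n.parsedFloat) // prints 1.50 true 1.5
}
```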

func (*NumberLit) Span

func (n *NumberLit) Span() Span

type ObjectLit

type ObjectLit struct {
    S      Span
    Fields []*Field
    Source []byte
}

ObjectLit is a brace delimited map: `{ key1: value1 key2: value2 }`. Fields are whitespace separated (newlines or spaces); the language has no commas. Keys preserve source order (operators and humans both rely on that).

func (*ObjectLit) Span

func (n *ObjectLit) Span() Span

type Option

type Option func(*parser) Option

Option is a function that can set an option on the parser. It returns the previous setting as an Option.
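
The "returns the previous setting as an Option" shape is the self-restoring functional option pattern. The sketch below reproduces it on a hypothetical one-field `parser`; the real parser type is unexported and has more state.

```go
package main

import "fmt"

// parser is a hypothetical stand-in with a single setting.
type parser struct{ debug bool }

// option sets a value on the parser and returns an option that undoes it.
type option func(*parser) option

// debug returns an option that sets the flag and yields the restoring option.
func debug(b bool) option {
	return func(p *parser) option {
		old := p.debug
		p.debug = b
		return debug(old)
	}
}

func main() {
	p := &parser{}
	restore := debug(true)(p)
	fmt.Println(p.debug) // prints true
	restore(p)
	fmt.Println(p.debug) // prints false
}
```

Applying the returned option puts the parser back in its prior state, which makes temporary overrides straightforward.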

func AllowInvalidUTF8

func AllowInvalidUTF8(b bool) Option

AllowInvalidUTF8 creates an Option to allow invalid UTF-8 bytes. Every invalid UTF-8 byte is treated as a utf8.RuneError (U+FFFD) by character class matchers and is matched by the any matcher. The returned matched value, c.text and c.offset are NOT affected.

The default is false.

func Debug

func Debug(b bool) Option

Debug creates an Option to set the debug flag to b. When set to true, debugging information is printed to stdout while parsing.

The default is false.

func Entrypoint

func Entrypoint(ruleName string) Option

Entrypoint creates an Option to set the rule name to use as entrypoint. The rule name must have been specified in the -alternate-entrypoints if generating the parser with the -optimize-grammar flag, otherwise it may have been optimized out. Passing an empty string sets the entrypoint to the first rule in the grammar.

The default is to start parsing at the first rule in the grammar.

func GlobalStore

func GlobalStore(key string, value any) Option

GlobalStore creates an Option to set a key to a certain value in the globalStore.

func InitState

func InitState(key string, value any) Option

InitState creates an Option to set a key to a certain value in the global "state" store.

func MaxExpressions

func MaxExpressions(maxExprCnt uint64) Option

MaxExpressions creates an Option to stop parsing after the provided number of expressions have been parsed. If the value is 0, the parser will parse for as many steps as needed (possibly an infinite number).

The default for maxExprCnt is 0.

func Memoize

func Memoize(b bool) Option

Memoize creates an Option to set the memoize flag to b. When set to true, the parser will cache all results so each expression is evaluated only once. This guarantees linear parsing time even for pathological cases, at the expense of more memory and slower times for typical cases.

The default is false.

func Recover

func Recover(b bool) Option

Recover creates an Option to set the recover flag to b. When set to true, this causes the parser to recover from panics and convert them to errors. Setting it to false can be useful while debugging to access the full stack trace.

The default is true.

func Statistics

func Statistics(stats *Stats, choiceNoMatch string) Option

Statistics adds a user provided Stats struct to the parser to allow the user to process the results after the parsing has finished. Also the key for the "no match" counter is set.

Example usage:

input := "input"
stats := Stats{}
_, err := Parse("input-file", []byte(input), Statistics(&stats, "no match"))
if err != nil {
    log.Panicln(err)
}
b, err := json.MarshalIndent(stats.ChoiceAltCnt, "", "  ")
if err != nil {
    log.Panicln(err)
}
fmt.Println(string(b))

type Position

type Position struct {
    File   string
    Line   int
    Column int
    Offset int
}

Position locates a byte in a source file.

Line and Column are 1-based; Offset is 0-based bytes from start of file. File is the path supplied to the parser (may be empty for in-memory inputs).

func (Position) IsZero

func (p Position) IsZero() bool

IsZero reports whether p has not been set.

func (Position) String

func (p Position) String() string

type Prefix

type Prefix struct {
    S    Span
    Op   string
    Expr Expr
}

Prefix is a unary operation: !a, -a.

func (*Prefix) Span

func (n *Prefix) Span() Span

type Selector

type Selector struct {
    S     Span
    Parts []Ident
}

Selector is one or more identifier parts separated by dots.

type SelectorBody

type SelectorBody struct {
    S        Span
    Default  bool
    Selector Selector
    Body     *ObjectLit
}

SelectorBody is a declaration whose body is classified by a selector. Default is true for selector defaults such as `greet { ... }`, where the selector itself is the declaration head.

func (*SelectorBody) Span

func (n *SelectorBody) Span() Span

type SourceFile

type SourceFile struct {
    File       string
    LineStarts []int
}

SourceFile maps byte offsets in one file to source positions.

func NewSourceFile

func NewSourceFile(file string, lineStarts []int) SourceFile

NewSourceFile creates a file position helper from a line-start table.

func (SourceFile) Position

func (f SourceFile) Position(offset int) Position

Position returns the 1-based line and column for offset.
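
One common way to do this lookup is a binary search over the line-start table. The sketch below is a hypothetical free function, not the package's method, and it assumes columns are counted in bytes.

```go
package main

import (
	"fmt"
	"sort"
)

// position converts a 0-based byte offset into 1-based line and column
// numbers using a table of line-start offsets. Columns count bytes here,
// which is an assumption of this sketch.
func position(lineStarts []int, offset int) (line, col int) {
	// Find the first line that starts after offset; the line before it
	// contains the offset.
	i := sort.Search(len(lineStarts), func(i int) bool {
		return lineStarts[i] > offset
	})
	if i == 0 {
		return 0, 0 // offset precedes the first line start
	}
	return i, offset - lineStarts[i-1] + 1
}

func main() {
	starts := []int{0, 3, 6}         // line starts for "ab\ncd\n"
	fmt.Println(position(starts, 4)) // prints 2 2
}
```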

func (SourceFile) Span

func (f SourceFile) Span(start, end int) Span

Span returns the source span between two byte offsets.

type Span

type Span struct {
    Start Position
    End   Position
}

Span is a half open byte range from Start (inclusive) to End (exclusive). Both ends share the File from Start. End may be the zero value when only a point is known.

type Stats

type Stats struct {
    // ExprCnt counts the number of expressions processed during parsing.
    // This value is compared to the maximum number of expressions allowed
    // (set by the MaxExpressions option).
    ExprCnt uint64

    // ChoiceAltCnt counts, for each ordered choice expression,
    // how many times each alternative is used.
    // These numbers make it possible to optimize the order of the alternatives
    // in an ordered choice expression to improve the performance of the parser.
    //
    // The outer key of ChoiceAltCnt is composed of the name of the rule as well
    // as the line and the column of the ordered choice.
    // The inner key of ChoiceAltCnt is the number (one-based) of the matching alternative.
    // For each alternative the number of matches are counted. If an ordered choice does not
    // match, a special counter is incremented. The name of this counter is set with
    // the parser option Statistics.
    // For an alternative to be included in ChoiceAltCnt, it has to match at least once.
    ChoiceAltCnt map[string]map[string]int
}

Stats stores statistics gathered during parsing.

type StringForm

type StringForm int

StringForm distinguishes the source form a StringLit was parsed from and tells the formatter which form to re-emit. The zero value is StringSingleQuoted.

func (StringForm) IsMultiLine

func (f StringForm) IsMultiLine() bool

IsMultiLine reports whether the form occupies multiple source lines. It returns true for the six sigil-bearing triple-quote forms and false for single-quoted and single-line triple-quote.

func (StringForm) String

func (f StringForm) String() string

String returns the constant's identifier. Used by codegen to emit a human-readable form constant in generated source.

type StringLit

type StringLit struct {
    S     Span
    Value string
    Form  StringForm
}

StringLit is a string literal. Three families of source form produce this node, tracked on Form:

  • Single quoted: 'hello\nworld'. Backslash escapes are processed during parsing, so Value holds the decoded content. Double quotes are not a string delimiter.
  • Single-line triple-quoted (delimited by three single quotes). The body has no newline, no escape processing, and is verbatim.
  • Multi-line triple-quoted with a sigil that selects mode (literal / folded / joined) and chomp (clip / strip). Value holds the dedented and mode-processed content.

The formatter dispatches on Form to choose the source form when re-emitting; the runtime and type-checker only read Value.

func (*StringLit) Span

func (n *StringLit) Span() Span

type TypeAtomic

type TypeAtomic struct {
    S    Span
    Name string
}

TypeAtomic names a primitive: string, number, integer, boolean, null, opaque.

func (*TypeAtomic) Span

func (n *TypeAtomic) Span() Span

type TypeExpr

type TypeExpr interface {
    Expr
    // contains filtered or unexported methods
}

TypeExpr is an expression in the type sub-grammar. Type expressions appear wherever a type is declared (input schema's `type:` field, object field types, etc.). They aren't usable as runtime values - the type checker rejects them outside type-position.

func ParseType

func ParseType(path string, b []byte) (TypeExpr, error)

ParseType parses b as a UB type expression and returns its AST.

func ParseTypeAt

func ParseTypeAt(path string, b []byte, base Position) (TypeExpr, error)

ParseTypeAt parses b as a UB type expression whose first byte starts at base in the source file.

type TypeLibraryConfig

type TypeLibraryConfig struct {
    S    Span
    Path *StringLit
}

TypeLibraryConfig is `library-config('...')`; the quoted argument is held in Path.

func (*TypeLibraryConfig) Span

func (n *TypeLibraryConfig) Span() Span

type TypeList

type TypeList struct {
    S    Span
    Elem TypeExpr
}

TypeList is `list(T)`.

func (*TypeList) Span

func (n *TypeList) Span() Span

type TypeMap

type TypeMap struct {
    S    Span
    Elem TypeExpr
}

TypeMap is `map(T)`. Keys are always strings.

func (*TypeMap) Span

func (n *TypeMap) Span() Span

type TypeObject

type TypeObject struct {
    S      Span
    Open   bool
    Fields []*TypeObjectField
}

TypeObject is `object({ field1: T1 field2: T2 ... })`. Open is true when the type is wrapped in `open(...)`: a value may then hold fields beyond the declared ones, which pass through unread.

func (*TypeObject) Span

func (n *TypeObject) Span() Span

type TypeObjectField

type TypeObjectField struct {
    S    Span
    Name string
    Type TypeExpr
    // Decl is set when the field's right-hand side is an input declaration
    // (an object literal) rather than a bare type expression. The two are
    // mutually exclusive.
    Decl *ObjectLit
}

TypeObjectField is one field inside a TypeObject. The type may be a plain type expression or - when the field is declared in an `inputs:` block - a full input declaration (an object literal with `type:`, modifiers, etc.). At AST level we keep both possibilities; the schema validator disambiguates.

type TypeOptional

type TypeOptional struct {
    S    Span
    Elem TypeExpr
}

TypeOptional is `optional(T)`.

Optionality implies nullability - wrapping with optional() allows null values; bare types do not.

func (*TypeOptional) Span

func (n *TypeOptional) Span() Span

type TypeTuple

type TypeTuple struct {
    S        Span
    Elements []TypeExpr
}

TypeTuple is `tuple(T1, T2, ...)`.

func (*TypeTuple) Span

func (n *TypeTuple) Span() Span