Lace-030 · Datalog and Lacegrams

Tags: datalog, engine, lacegram

Purpose

Lace needs one portable rule language for record-set selection, support closure, authority evidence, query exposure, and exchange policy. This spec defines the Datalog profile used by Lace: syntax, safety, stratification, evaluation, built-ins, canonical lacegrams, and resource limits.

Record facts and origin-aware record-fact views are defined by 020. Exchange policy, peer advertisement facts, and generated exchange predicates are defined by 040 and 050. This document defines the rule language over fact sources supplied by those profiles.

This document is for Datalog engine implementors and profile authors. After reading it, an implementor should be able to parse, validate, canonicalize, and evaluate Lace rule programs over explicit finite fact sources with portable semantics.

Defines

Values, terms, fact classes, rule syntax, safety, shadowing, stratification, evaluation, built-ins, canonical lacegrams, identifiers, limits, and incremental evaluation.

Values

All fact values are UTF-8 NFC text strings. An implementation MUST reject a fact value that is not valid UTF-8 NFC or that exceeds configured tuple limits.

Hash texts, verifier texts, resource paths, decimal numbers, source labels, exchange-plan identifiers, and rule identifiers are all represented as text values when used in facts.

Fact classes

The engine evaluates two fact classes:

Base facts: facts supplied by a profile for this evaluation snapshot. Base facts include record facts from 020, peer advertisement facts from 050, runtime facts from 050 and ILTP, query-view facts from 020/040, and profile-defined local facts.
Derived facts: facts produced by rules.

Runtime facts are base facts supplied by a runtime, binding, profile, or exchange driver for the current snapshot.

A Datalog implementation MUST NOT read hidden wall-clock time, fetch record bytes from peers, or consult mutable external state except through explicit base facts supplied to the current evaluation snapshot.

Derived predicates are ordinary rule heads. A profile may designate some derived predicates as outputs read by a runtime, but core Datalog gives no predicate name special exchange behavior.

Fact sources and views

Datalog evaluation is parameterized by fact sources. A fact source may expose all facts of a predicate, queries over a predicate, and counts over a predicate. Profiles define which fact sources are available to which modules.

For record facts, 020 defines origin-aware record-fact resolution. A peer-origin module evaluated on a local machine sees local record facts only through the local query view for that peer. This applies equally to positive atoms, negated atoms, and cardinality atoms.

Advertisement facts and public runtime facts are ordinary base facts when supplied by the exchange profile. Advertisement facts are claims used for discovery; they are not record facts.

Rule syntax

A lacegram is UTF-8 NFC text. Each non-empty line is one rule:

Head(Term, ...) :- BodyAtom, BodyAtom.

Zero-arity predicates are written with empty parentheses, for example:

TransportEncrypted()

A fact-like rule with no body uses true:

KnownVerifier('V.example.H3') :- true.

Public predicate names match:

[A-Za-z][A-Za-z0-9_~-]*

Predicate names beginning with _, matching _[A-Za-z][A-Za-z0-9_~-]*, are reserved for local-only or profile-defined predicates such as diagnostics. They are not part of the standard shared exchange surface unless a profile explicitly permits them.

Named variables match:

[A-Z][A-Za-z0-9_]*

_ is the anonymous term. It matches any value and never binds. Each _ occurrence is independent; repeated _ terms do not require equal values. Underscore-prefixed names such as _Name are not variables and are invalid in term position.

Constants are single-quoted UTF-8 NFC strings. Inside a constant, \\ encodes a backslash and \' encodes a single quote. Constants MUST NOT contain LF or CR.

Valid body atoms are:

Form	Meaning
`Predicate(Term, ...)` or `Predicate()`	positive atom
`not Predicate(Term, ...)`	stratified negation
`X != Y`	inequality
`IntCompare(A, Op, B)`	decimal integer comparison
`LexCompare(A, Op, B)`	bytewise lexicographic comparison
`TextShape(Text, Start, Delims, End)`	anchored text-shape test
`Cardinality(Predicate(Term, ...), Op, N)`	stratified cardinality test
`true`	always true

Rule bodies are conjunctions. Multiple rules with the same head predicate are union. A predicate with no base facts and no rules has an empty extension.

Safety

Rules are range-restricted:

every variable in the head MUST appear in a positive non-builtin body atom;
_ is not a variable for safety and MUST NOT appear in a rule head;
every variable in a negated atom MUST appear in a positive non-builtin body atom;
every variable in !=, IntCompare, LexCompare, or TextShape MUST appear in a positive non-builtin body atom;
_ MUST NOT appear in !=, IntCompare, LexCompare, or TextShape;
Cardinality does not bind variables. A variable that appears both inside and outside a Cardinality atom MUST satisfy safety outside the Cardinality atom. Variables that appear only inside the counted atom are local to the cardinality test.

_ MAY appear inside positive, negated, and cardinality predicate atoms as an anonymous, non-binding wildcard.

true, inequality, comparison, text-shape, and cardinality atoms are built-ins and do not satisfy the positive non-builtin binding requirement.

Reserved and shadowed predicates

A derived predicate MUST NOT have the same name and arity as any base predicate defined by the profile evaluating the lacegram.

Profiles MAY reserve predicate names or prefixes for generated rules, internal fact views, query exposure, annotations, or lowering. A lacegram MUST NOT define a reserved predicate unless the profile explicitly permits it.

Predicate names beginning with _ are reserved for local-only or profile-defined predicates. Standard shared selector operands MUST NOT expose leading-underscore predicates through operand merge, exchange-plan identity, advertisements, request lists, or peer-visible fact blocks unless the active profile explicitly defines that behavior.

The standard record fact predicates from 020 are reserved base predicates:

Have/1
Field/4
RecordLink/5
BlobHash/2
PlexHash/2

The standard exchange profile reserves MaySend/1, MayRequest/1, SelectHave/1, SelectAdvertised/2, CanQueryRecord/2, generated operand-scope prefixes, and any internal Allow* predicates it defines.

Stratification

A predicate Q depends on P when a rule for Q contains positive P. It negatively depends on P when a rule for Q contains not P. It cardinality-depends on P when a rule for Q contains Cardinality(P(...), Op, N).

A valid lacegram MUST be stratified: no predicate may transitively depend on itself through a negative edge or cardinality edge. The predicate counted by a Cardinality atom is evaluated in a lower stratum than the rule containing that atom.

Negation is closed-world over the current evaluation snapshot and fact view only. Lacegrams that need a bounded universe SHOULD define it explicitly, for example with Have(P) or with profile-defined runtime facts.

Evaluation

Evaluation makes derived facts explainable from explicit inputs: base facts, runtime facts, and rules. No hidden clock, peer fetch, or mutable outside state is part of a result.

The engine evaluates bottom-up to a least fixed point:

Load the base fact sources for the current evaluation snapshot and module view.
Load runtime facts supplied for the current evaluation snapshot.
Validate safety, reserved-predicate rules, shadowing, and stratification.
Assign strata so negative and cardinality dependencies point to lower strata.
Evaluate strata from low to high until each stratum reaches a fixed point.

Implementations MAY use any strategy that produces the same facts as full bottom-up evaluation over the same resolved fact sources.

Rule evaluation is side-effect free over the current snapshot. Implementations MUST NOT fetch record bytes from peers, create record facts from unvalidated claims, or read hidden clocks or mutable external state during fact lookup. Lazy local indexes and iterators are allowed only when they expose the same facts that the current snapshot already contains.

Within a stratum, facts are added monotonically and the active domain is finite, so evaluation terminates unless a resource limit is exceeded.

Built-ins

`IntCompare(A, Op, B)`

After substitution, A and B are decimal integer strings. Op is one of '<', '<=', '>', '>='. Comparison is numeric.

`LexCompare(A, Op, B)`

After substitution, A and B are text strings. Op is one of '<', '<=', '>', '>='. Comparison is bytewise lexicographic over UTF-8 bytes.

TAI text and equal-length B64A hash payloads are designed so bytewise lexical comparison matches their intended numeric byte order.

`TextShape(Text, Start, Delims, End)`

TextShape is a boolean test. It does not bind variables. Delims MUST be a quoted constant in the current profile. Text, Start, and End are ordinary terms whose variables must already be safe-bound by positive non-builtin atoms.

All comparisons are exact UTF-8 text comparisons. Delims is interpreted as a set of Unicode scalar values; duplicate delimiter characters have no extra effect.

When Delims is the empty string, TextShape(Text,Start,'',End) is true iff there exists a middle string such that:

Text = Start + middle + End

middle may be empty, but Start and End do not overlap. Equivalently, after matching Start, the remaining suffix must be at least as long as End and end with End. TextShape(K,'links/','','') is a prefix test; TextShape(K,'','','.json') is a suffix test; TextShape(K,'links/','','.md') is a combined prefix/suffix test.

When Delims is non-empty, TextShape(Text,Start,Delims,End) is true iff:

Text starts with Start;
the first Unicode scalar value after Start that is in Delims exists;
the skipped text before that delimiter is non-empty;
the complete suffix after that one delimiter is exactly End.

Delimiters inside Start are ignored. Delimiters inside End are ordinary text. A delimiter immediately after Start fails because the skipped segment is empty. If End is empty, the first delimiter after a non-empty skipped segment must be the final scalar value in Text.

`Cardinality(Predicate(Term, ...), Op, N)`

Cardinality counts distinct facts for one predicate atom under the current outer binding and resolved fact view. Variables already bound outside the Cardinality atom are substituted. Variables not bound outside the atom are local wildcards. Op is one of '<', '<=', '>', '>='. N is a decimal integer string.

Cardinality is a boolean test. It does not bind variables and is not a general aggregation mechanism.

For origin-aware record facts, cardinality counts the record facts visible to the module origin. A peer-origin module cannot use cardinality to count raw local record facts outside its query view.

Equality is not Lace Datalog

= is not a Lace Datalog body atom. Core Lace parsers, canonicalizers, annotated-source readers, and resource validators MUST reject source text that contains = as a body atom. They MUST NOT normalize equality constraints into canonical lacegram text.

!= remains a canonical boolean builtin. It is not an equality-normalization constraint.

A DSL outside the core Lace library can provide its own syntax and compile that syntax to valid Lace Datalog, but that source syntax is not Lace Datalog and is not accepted by core Lace Datalog APIs.

Canonical lacegrams

Lacegrams exchanged or identified by hash MUST use canonical text:

UTF-8 NFC;
LF line endings only;
one rule per line;
no blank lines;
no comments;
exactly one space before and after :-;
exactly one space after each comma that separates body atoms;
no spaces inside atom argument lists;
constants use the shortest valid escaping;
rules are sorted by head predicate name, then full rule bytes;
duplicate rules are removed.

The current lacegram identifier is:

R.<b64a>

where <b64a> encodes BLAKE3-256(canonical-lacegram-bytes). R.* values identify canonical lacegram text only; they are not record hashes and MUST NOT be used as record identifiers. The R. prefix identifies the current canonical lacegram text format and digest construction.

Faceted exchange plans defined by 040 use their own canonical transcript or plan identifier. A plan identifier is not interchangeable with a plain canonical lacegram identifier unless the profile explicitly says so.

Annotated lacegram source

Canonical lacegrams do not contain comments or annotations. Tooling MAY support an annotated authoring format that is stripped before canonicalization and emits annotation metadata keyed by canonical rule id.

Annotated source supports two comment-line forms:

# human-only comment
#:json {"label":"member proof","tags":["evidence"],"why":"Authority-signed member records admit content."}

Rules:

a # comment line is human-only source text and is ignored by canonicalization;
a #:json line contains one JSON object encoded as UTF-8 NFC text;
structured annotation lines apply to the next rule line;
multiple #:json lines before one rule merge as JSON object fields, with later duplicate keys replacing earlier keys;
an annotation line with no following rule is invalid annotated source;
comments and annotations are never part of canonical lacegram text or R.<b64a> lacegram identifiers;
annotation metadata is not a runtime fact unless a profile explicitly imports it.

A rule identifier is:

U.<b64a(BLAKE3("lace-rule/v1" || canonical-rule-line-bytes))>

Canonicalization of annotated source emits:

canonical 030 lacegram text;
the canonical lacegram identifier;
an annotation map from rule identifier to merged JSON object.

Merge, lowering, and tracing tools SHOULD preserve provenance from generated rules to source rule identifiers when practical.

Resource limits

An implementation MUST enforce configurable limits. Defaults MUST be at least:

Limit	Minimum supported value
base facts	2^20
runtime facts	2^20
derived facts per predicate	2^18
rule count	256
stratum iterations	1000
tuple arity	8
value bytes	1024

The engine MUST stop with an error rather than exceed a limit.

Fact-source interface

A fact source exposes facts to the Datalog engine:

iterate(predicate_name) -> Iterator<(values...)>
query(predicate_name, value1, value2, ...) -> Iterator<(values...)>
count(predicate_name) -> integer

A query with fewer arguments than the predicate arity returns matching facts with remaining positions unconstrained.

A fact source MAY be a storage index, peer advertisement snapshot, runtime tuple set, materialized relation, query-view wrapper, or generated rule output. It MUST expose the same facts as the profile’s current evaluation snapshot.

Incremental evaluation

Implementations MAY evaluate incrementally when base facts, runtime facts, or rules change. Incremental evaluation MUST produce the same result as full evaluation of the new snapshot and resolved fact views.

If facts are removed, implementations MUST use a deletion strategy that handles stratified negation correctly, such as counting derivations or full re-evaluation of affected strata.

Examples

Local records under a Group/App/Name prefix

Selected(P) :- Have(P), Field(P,'Group',_,'u'), Field(P,'App',_,'ding'), Field(P,'Name',_,K), TextShape(K,'links/','','').

When this rule is local-origin on the local machine, Have and Field resolve against the full local record view. When the same rule is peer-origin on another machine, they resolve against that machine’s query view for the rule origin.

Follow linked support records visible in the current view

Support(Target) :- Have(P), RecordLink(P,'+Link',_,'evidence',Target).

Latest 100 records in the current view

Before(P,Q) :- Candidate(P,T), Candidate(Q,T), LexCompare(Q,'>',P).
Before(P,Q) :- Candidate(P,TP), Candidate(Q,TQ), LexCompare(TQ,'>',TP).
Candidate(P,T) :- Have(P), Field(P,'Group',_,'u'), Field(P,'App',_,'ding'), Field(P,'Name',_,K), TextShape(K,'links/','',''), Field(P,'TAI',_,T).
Selected(P) :- Candidate(P,T), Cardinality(Before(P,Q),'<','100').

Match one skipped segment

Selected(P) :- Have(P), Field(P,'Name',_,K), TextShape(K,'links/','./','msg').

This matches links/bob/msg and links/bob.msg, but rejects links/msg, links/bob/alice/msg, and links/.msg.

Deny blocked signers

Blocked(V) :- Have(P), Field(P,'Group',_,'keys'), Field(P,'App',_,'blocked'), Field(P,'Name',_,V).
Selected(P) :- Have(P), Field(P,'Signed-By',_,V), not Blocked(V).

Rejection examples

Selected(P,V) :- Field(P,'Group',_,'u'). has unbound V in the head.
Selected(_) :- Have(P). uses anonymous _ in a rule head.
Selected(P) :- Have(P), TextShape(_,'links/','',''). uses anonymous _ in a builtin.
Selected(P) :- Have(P), Field(P,'Group',_Name,'u'). uses invalid _Name in term position.
Have(P) :- true. shadows a record fact predicate.
A(P) :- not B(P). plus B(P) :- not A(P). is unstratified.
A(P) :- Candidate(P), Cardinality(A(Q),'<','100'). is unstratified.
A canonical lacegram containing Prefix(K,'links/') is invalid; use TextShape(K,'links/','','').
A lacegram containing X = Y is invalid; = is not Lace Datalog.
A canonical lacegram with comments, CRLF, blank lines, duplicate unsorted rules, or non-canonical escaping is not canonical. The annotated source format accepts # and #:json lines only before canonicalization.