# Lace-030 · Datalog and Lacegrams
Tags: datalog, engine, lacegram

## Purpose

Lace needs one portable rule language for record-set selection, support closure,
authority evidence, query exposure, and exchange policy. This spec defines the
Datalog profile used by Lace: syntax, safety, stratification, evaluation,
built-ins, canonical lacegrams, and resource limits.

Record facts and origin-aware record-fact views are defined by 020. Exchange
policy, peer advertisement facts, and generated exchange predicates are defined by 040
and 050. This document defines the rule language over fact sources supplied by
those profiles.

This document is for Datalog engine implementors and profile authors. After
reading it, an implementor should be able to parse, validate, canonicalize, and
evaluate Lace rule programs over explicit finite fact sources with portable
semantics.

## Defines

Values, terms, fact classes, rule syntax, safety, shadowing, stratification,
evaluation, built-ins, canonical lacegrams, identifiers, limits, and incremental
evaluation.

## Values

All fact values are UTF-8 NFC text strings. An implementation MUST reject a fact
value that is not valid UTF-8 NFC or that exceeds configured tuple limits.

Hash texts, verifier texts, resource paths, decimal numbers, source labels,
exchange-plan identifiers, and rule identifiers are all represented as text
values when used in facts.

## Fact classes

The engine evaluates two fact classes:

1. **Base facts**: facts supplied by a profile for this evaluation snapshot.
   Base facts include record facts from 020, peer advertisement facts from 050, runtime
   facts from 050 and ILTP, query-view facts from 020/040, and profile-defined local
   facts.
2. **Derived facts**: facts produced by rules.

Runtime facts are base facts supplied by a runtime, binding, profile, or exchange
driver for the current snapshot.

A Datalog implementation MUST NOT read hidden wall-clock time, fetch record bytes
from peers, or consult mutable external state except through explicit base facts
supplied to the current evaluation snapshot.

Derived predicates are ordinary rule heads. A profile may designate some derived
predicates as outputs read by a runtime, but core Datalog gives no predicate name
special exchange behavior.

## Fact sources and views

Datalog evaluation is parameterized by fact sources. A fact source may expose all
facts of a predicate, queries over a predicate, and counts over a predicate.
Profiles define which fact sources are available to which modules.

For record facts, 020 defines origin-aware record-fact resolution. A peer-origin
module evaluated on a local machine sees local record facts only through the
local query view for that peer. This applies equally to positive atoms, negated
atoms, and cardinality atoms.

Advertisement facts and public runtime facts are ordinary base facts when supplied by the
exchange profile. Advertisement facts are claims used for discovery; they are not record
facts.

## Rule syntax

A lacegram is UTF-8 NFC text. Each non-empty line is one rule:

```text
Head(Term, ...) :- BodyAtom, BodyAtom.
```

Zero-arity predicates are written with empty parentheses, for example:

```text
TransportEncrypted()
```

A fact-like rule with no body uses `true`:

```text
KnownVerifier('V.example.H3') :- true.
```

Public predicate names match:

```text
[A-Za-z][A-Za-z0-9_~-]*
```

Predicate names beginning with `_`, matching `_[A-Za-z][A-Za-z0-9_~-]*`, are
reserved for local-only or profile-defined predicates such as diagnostics. They
are not part of the standard shared exchange surface unless a profile explicitly
permits them.

Named variables match:

```text
[A-Z][A-Za-z0-9_]*
```

`_` is the anonymous term. It matches any value and never binds. Each `_`
occurrence is independent; repeated `_` terms do not require equal values.
Underscore-prefixed names such as `_Name` are not variables and are invalid in
term position.

Constants are single-quoted UTF-8 NFC strings. Inside a constant, `\\` encodes a
backslash and `\'` encodes a single quote. Constants MUST NOT contain LF or CR.

Valid body atoms are:

| Form | Meaning |
|---|---|
| `Predicate(Term, ...)` or `Predicate()` | positive atom |
| `not Predicate(Term, ...)` | stratified negation |
| `X != Y` | inequality |
| `IntCompare(A, Op, B)` | decimal integer comparison |
| `LexCompare(A, Op, B)` | bytewise lexicographic comparison |
| `TextShape(Text, Start, Delims, End)` | anchored text-shape test |
| `Cardinality(Predicate(Term, ...), Op, N)` | stratified cardinality test |
| `true` | always true |

Rule bodies are conjunctions. Multiple rules with the same head predicate are
union. A predicate with no base facts and no rules has an empty extension.

## Safety

Rules are range-restricted:

- every variable in the head MUST appear in a positive non-builtin body atom;
- `_` is not a variable for safety and MUST NOT appear in a rule head;
- every variable in a negated atom MUST appear in a positive non-builtin body
  atom;
- every variable in `!=`, `IntCompare`, `LexCompare`, or
  `TextShape` MUST appear in a positive non-builtin body atom;
- `_` MUST NOT appear in `!=`, `IntCompare`, `LexCompare`, or `TextShape`;
- `Cardinality` does not bind variables. A variable that appears both inside and
  outside a `Cardinality` atom MUST satisfy safety outside the `Cardinality`
  atom. Variables that appear only inside the counted atom are local to the
  cardinality test.

`_` MAY appear inside positive, negated, and cardinality predicate atoms as an
anonymous, non-binding wildcard.

`true`, inequality, comparison, text-shape, and cardinality atoms are
built-ins and do not satisfy the positive non-builtin binding requirement.

## Reserved and shadowed predicates

A derived predicate MUST NOT have the same name and arity as any base predicate
defined by the profile evaluating the lacegram.

Profiles MAY reserve predicate names or prefixes for generated rules, internal
fact views, query exposure, annotations, or lowering. A lacegram MUST NOT define
a reserved predicate unless the profile explicitly permits it.

Predicate names beginning with `_` are reserved for local-only or
profile-defined predicates. Standard shared selector operands MUST NOT expose
leading-underscore predicates through operand merge, exchange-plan identity,
advertisements, request lists, or peer-visible fact blocks unless the active
profile explicitly defines that behavior.

The standard record fact predicates from 020 are reserved base predicates:

```text
Have/1
Field/4
RecordLink/5
BlobHash/2
PlexHash/2
```

The standard exchange profile reserves `MaySend/1`, `MayRequest/1`, `SelectHave/1`,
`SelectAdvertised/2`, `CanQueryRecord/2`, generated operand-scope prefixes, and any
internal `Allow*` predicates it defines.

## Stratification

A predicate `Q` depends on `P` when a rule for `Q` contains positive `P`. It
negatively depends on `P` when a rule for `Q` contains `not P`. It
cardinality-depends on `P` when a rule for `Q` contains
`Cardinality(P(...), Op, N)`.

A valid lacegram MUST be stratified: no predicate may transitively depend on
itself through a negative edge or cardinality edge. The predicate counted by a
`Cardinality` atom is evaluated in a lower stratum than the rule containing that
atom.

Negation is closed-world over the current evaluation snapshot and fact view only.
Lacegrams that need a bounded universe SHOULD define it explicitly, for example
with `Have(P)` or with profile-defined runtime facts.

## Evaluation

Evaluation makes derived facts explainable from explicit inputs: base facts,
runtime facts, and rules. No hidden clock, peer fetch, or mutable outside state is
part of a result.

The engine evaluates bottom-up to a least fixed point:

1. Load the base fact sources for the current evaluation snapshot and module
   view.
2. Load runtime facts supplied for the current evaluation snapshot.
3. Validate safety, reserved-predicate rules, shadowing, and stratification.
4. Assign strata so negative and cardinality dependencies point to lower strata.
5. Evaluate strata from low to high until each stratum reaches a fixed point.

Implementations MAY use any strategy that produces the same facts as full
bottom-up evaluation over the same resolved fact sources.

Rule evaluation is side-effect free over the current snapshot. Implementations
MUST NOT fetch record bytes from peers, create record facts from unvalidated
claims, or read hidden clocks or mutable external state during fact lookup. Lazy
local indexes and iterators are allowed only when they expose the same facts that
the current snapshot already contains.

Within a stratum, facts are added monotonically and the active domain is finite,
so evaluation terminates unless a resource limit is exceeded.

## Built-ins

### `IntCompare(A, Op, B)`

After substitution, `A` and `B` are decimal integer strings. `Op` is one of
`'<', '<=', '>', '>='`. Comparison is numeric.

### `LexCompare(A, Op, B)`

After substitution, `A` and `B` are text strings. `Op` is one of `'<', '<=',
'>', '>='`. Comparison is bytewise lexicographic over UTF-8 bytes.

`TAI` text and equal-length B64A hash payloads are designed so bytewise lexical
comparison matches their intended numeric byte order.

### `TextShape(Text, Start, Delims, End)`

`TextShape` is a boolean test. It does not bind variables. `Delims` MUST be a
quoted constant in the current profile. `Text`, `Start`, and `End` are ordinary
terms whose variables must already be safe-bound by positive non-builtin atoms.

All comparisons are exact UTF-8 text comparisons. `Delims` is interpreted as a
set of Unicode scalar values; duplicate delimiter characters have no extra
effect.

When `Delims` is the empty string, `TextShape(Text,Start,'',End)` is true iff
there exists a middle string such that:

```text
Text = Start + middle + End
```

`middle` may be empty, but `Start` and `End` do not overlap. Equivalently, after
matching `Start`, the remaining suffix must be at least as long as `End` and end
with `End`. `TextShape(K,'links/','','')` is a prefix test;
`TextShape(K,'','','.json')` is a suffix test;
`TextShape(K,'links/','','.md')` is a combined prefix/suffix test.

When `Delims` is non-empty, `TextShape(Text,Start,Delims,End)` is true iff:

1. `Text` starts with `Start`;
2. the first Unicode scalar value after `Start` that is in `Delims` exists;
3. the skipped text before that delimiter is non-empty;
4. the complete suffix after that one delimiter is exactly `End`.

Delimiters inside `Start` are ignored. Delimiters inside `End` are ordinary text.
A delimiter immediately after `Start` fails because the skipped segment is empty.
If `End` is empty, the first delimiter after a non-empty skipped segment must be
the final scalar value in `Text`.

### `Cardinality(Predicate(Term, ...), Op, N)`

`Cardinality` counts distinct facts for one predicate atom under the current
outer binding and resolved fact view. Variables already bound outside the
`Cardinality` atom are substituted. Variables not bound outside the atom are
local wildcards. `Op` is one of `'<', '<=', '>', '>='`. `N` is a decimal integer
string.

`Cardinality` is a boolean test. It does not bind variables and is not a general
aggregation mechanism.

For origin-aware record facts, cardinality counts the record facts visible to the
module origin. A peer-origin module cannot use cardinality to count raw local
record facts outside its query view.


## Equality is not Lace Datalog

`=` is not a Lace Datalog body atom. Core Lace parsers, canonicalizers,
annotated-source readers, and resource validators MUST reject source text that
contains `=` as a body atom. They MUST NOT normalize equality constraints into
canonical lacegram text.

`!=` remains a canonical boolean builtin. It is not an equality-normalization
constraint.

A DSL outside the core Lace library can provide its own syntax and compile that
syntax to valid Lace Datalog, but that source syntax is not Lace Datalog and is
not accepted by core Lace Datalog APIs.

## Canonical lacegrams

Lacegrams exchanged or identified by hash MUST use canonical text:

- UTF-8 NFC;
- LF line endings only;
- one rule per line;
- no blank lines;
- no comments;
- exactly one space before and after `:-`;
- exactly one space after each comma that separates body atoms;
- no spaces inside atom argument lists;
- constants use the shortest valid escaping;
- rules are sorted by head predicate name, then full rule bytes;
- duplicate rules are removed.

The current lacegram identifier is:

```text
R.<b64a>
```

where `<b64a>` encodes `BLAKE3-256(canonical-lacegram-bytes)`. `R.*` values
identify canonical lacegram text only; they are not record hashes and MUST NOT be
used as record identifiers. The `R.` prefix identifies the current canonical
lacegram text format and digest construction.

Faceted exchange plans defined by 040 use their own canonical transcript or plan
identifier. A plan identifier is not interchangeable with a plain canonical
lacegram identifier unless the profile explicitly says so.

## Annotated lacegram source

Canonical lacegrams do not contain comments or annotations. Tooling MAY support
an annotated authoring format that is stripped before canonicalization and emits
annotation metadata keyed by canonical rule id.

Annotated source supports two comment-line forms:

```text
# human-only comment
#:json {"label":"member proof","tags":["evidence"],"why":"Authority-signed member records admit content."}
```

Rules:

- a `#` comment line is human-only source text and is ignored by
  canonicalization;
- a `#:json ` line contains one JSON object encoded as UTF-8 NFC text;
- structured annotation lines apply to the next rule line;
- multiple `#:json` lines before one rule merge as JSON object fields, with
  later duplicate keys replacing earlier keys;
- an annotation line with no following rule is invalid annotated source;
- comments and annotations are never part of canonical lacegram text or
  `R.<b64a>` lacegram identifiers;
- annotation metadata is not a runtime fact unless a profile explicitly imports
  it.

A rule identifier is:

```text
U.<b64a(BLAKE3("lace-rule/v1" || canonical-rule-line-bytes))>
```

Canonicalization of annotated source emits:

- canonical 030 lacegram text;
- the canonical lacegram identifier;
- an annotation map from rule identifier to merged JSON object.

Merge, lowering, and tracing tools SHOULD preserve provenance from generated
rules to source rule identifiers when practical.

## Resource limits

An implementation MUST enforce configurable limits. Defaults MUST be at least:

| Limit | Minimum supported value |
|---|---:|
| base facts | 2^20 |
| runtime facts | 2^20 |
| derived facts per predicate | 2^18 |
| rule count | 256 |
| stratum iterations | 1000 |
| tuple arity | 8 |
| value bytes | 1024 |

The engine MUST stop with an error rather than exceed a limit.

## Fact-source interface

A fact source exposes facts to the Datalog engine:

```text
iterate(predicate_name) -> Iterator<(values...)>
query(predicate_name, value1, value2, ...) -> Iterator<(values...)>
count(predicate_name) -> integer
```

A query with fewer arguments than the predicate arity returns matching facts with
remaining positions unconstrained.

A fact source MAY be a storage index, peer advertisement snapshot, runtime tuple set,
materialized relation, query-view wrapper, or generated rule output. It MUST
expose the same facts as the profile's current evaluation snapshot.

## Incremental evaluation

Implementations MAY evaluate incrementally when base facts, runtime facts, or
rules change. Incremental evaluation MUST produce the same result as full
evaluation of the new snapshot and resolved fact views.

If facts are removed, implementations MUST use a deletion strategy that handles
stratified negation correctly, such as counting derivations or full
re-evaluation of affected strata.

## Examples

### Local records under a Group/App/Name prefix

```text
Selected(P) :- Have(P), Field(P,'Group',_,'u'), Field(P,'App',_,'ding'), Field(P,'Name',_,K), TextShape(K,'links/','','').
```

When this rule is local-origin on the local machine, `Have` and `Field` resolve
against the full local record view. When the same rule is peer-origin on another
machine, they resolve against that machine's query view for the rule origin.

### Follow linked support records visible in the current view

```text
Support(Target) :- Have(P), RecordLink(P,'+Link',_,'evidence',Target).
```

### Latest 100 records in the current view

```text
Before(P,Q) :- Candidate(P,T), Candidate(Q,T), LexCompare(Q,'>',P).
Before(P,Q) :- Candidate(P,TP), Candidate(Q,TQ), LexCompare(TQ,'>',TP).
Candidate(P,T) :- Have(P), Field(P,'Group',_,'u'), Field(P,'App',_,'ding'), Field(P,'Name',_,K), TextShape(K,'links/','',''), Field(P,'TAI',_,T).
Selected(P) :- Candidate(P,T), Cardinality(Before(P,Q),'<','100').
```

### Match one skipped segment

```text
Selected(P) :- Have(P), Field(P,'Name',_,K), TextShape(K,'links/','./','msg').
```

This matches `links/bob/msg` and `links/bob.msg`, but rejects `links/msg`,
`links/bob/alice/msg`, and `links/.msg`.

### Deny blocked signers

```text
Blocked(V) :- Have(P), Field(P,'Group',_,'keys'), Field(P,'App',_,'blocked'), Field(P,'Name',_,V).
Selected(P) :- Have(P), Field(P,'Signed-By',_,V), not Blocked(V).
```

## Rejection examples

- `Selected(P,V) :- Field(P,'Group',_,'u').` has unbound `V` in the head.
- `Selected(_) :- Have(P).` uses anonymous `_` in a rule head.
- `Selected(P) :- Have(P), TextShape(_,'links/','','').` uses anonymous `_` in a builtin.
- `Selected(P) :- Have(P), Field(P,'Group',_Name,'u').` uses invalid `_Name` in term position.
- `Have(P) :- true.` shadows a record fact predicate.
- `A(P) :- not B(P).` plus `B(P) :- not A(P).` is unstratified.
- `A(P) :- Candidate(P), Cardinality(A(Q),'<','100').` is unstratified.
- A canonical lacegram containing `Prefix(K,'links/')` is invalid; use
  `TextShape(K,'links/','','')`.
- A lacegram containing `X = Y` is invalid; `=` is not Lace Datalog.
- A canonical lacegram with comments, CRLF, blank lines, duplicate unsorted
  rules, or non-canonical escaping is not canonical. The annotated source format
  accepts `#` and `#:json` lines only before canonicalization.