Self-Host Canonical Migration Status
Current status date: April 6, 2026
This document is the canonical status sheet for the in-tree self-host compiler work.
Authority and boundary
- The Rust workspace under `crates/` remains the authoritative implementation for compiler behavior, runtime ABI, CLI behavior, packaging, and tests.
- The in-tree self-host compiler lives under `compiler/`.
- The `compiler/` tree is real code, not a placeholder directory, but it is still subordinate to the Rust workspace until parity is proven stage by stage.
- `src/` remains reserved for the docs site and is not a compiler implementation tree.
- The current executable self-host path is the hidden Rust bootstrap command `drat selfhost-stage0`, which builds and runs the `compiler/` tree through Rust-owned tooling in `crates/drat/src/commands/selfhost_stage0.rs`.
Current stage summary
Phase 1 parity contract freeze
The current Phase 1 outcome is a frozen oracle surface for the stages that already exist in stage0. This is a parity-contract milestone, not a frontend-complete or backend-complete milestone.
- `drat selfhost-stage0 lex`, `parse`, `typeck`, and `build` now target one versioned envelope shape: `draton.selfhost.stage0/v1`.
- The frozen envelope fields are `schema`, `stage`, `input_path`, `bridge`, `success`, `result`, and `error`.
- The frozen stage artifacts are:
  - lexer: token stream plus lex diagnostics
  - parser: lex diagnostics plus parse diagnostics, parse warnings, and AST program payload
  - typechecker: lex diagnostics, parse diagnostics, parse warnings, type diagnostics, type warnings, and typed program payload
  - build: output artifact paths plus machine-readable build failure payload
- This does not mean parser, typechecker, ownership, or backend parity is complete. It only means the Rust-authoritative oracle surface for the current stage0 path is now explicit and machine-checkable.
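For orientation, the frozen envelope can be pictured as a small record plus a contract check. The sketch below is not the code in `crates/drat/src/commands/selfhost_stage0.rs`: the field names and schema string come from this status sheet, while every field type, the `Stage0Envelope` and `check_envelope` names, and the assumed rule that `error` is present exactly when `success` is false are illustrative assumptions.

```rust
/// Schema identifier named in this status sheet.
const SCHEMA_V1: &str = "draton.selfhost.stage0/v1";

/// Stages that currently target the envelope.
const STAGES: [&str; 4] = ["lex", "parse", "typeck", "build"];

/// Hypothetical in-memory model of the frozen envelope.
/// Field names follow the status sheet; every field type is an assumption.
#[derive(Debug, Clone, PartialEq)]
struct Stage0Envelope {
    schema: String,
    stage: String,
    input_path: String,
    bridge: Option<String>,
    success: bool,
    result: Option<String>, // stage payload kept as raw JSON text (assumption)
    error: Option<String>,
}

/// Returns the contract violations found; an empty list means the envelope
/// looks well-formed under the assumed rules.
fn check_envelope(env: &Stage0Envelope) -> Vec<String> {
    let mut problems = Vec::new();
    if env.schema != SCHEMA_V1 {
        problems.push(format!("unexpected schema: {}", env.schema));
    }
    if !STAGES.contains(&env.stage.as_str()) {
        problems.push(format!("unknown stage: {}", env.stage));
    }
    // Assumed invariant: `error` is set exactly when `success` is false.
    if env.success && env.error.is_some() {
        problems.push("success envelope carries an error".to_string());
    }
    if !env.success && env.error.is_none() {
        problems.push("failure envelope has no error payload".to_string());
    }
    problems
}

fn main() {
    let env = Stage0Envelope {
        schema: SCHEMA_V1.to_string(),
        stage: "lex".to_string(),
        input_path: "example.dt".to_string(), // hypothetical input file
        bridge: None,
        success: true,
        result: Some("{\"tokens\":[]}".to_string()), // hypothetical payload
        error: None,
    };
    println!(
        "{} on {} (bridge: {:?}, result: {:?})",
        env.stage, env.input_path, env.bridge, env.result
    );
    for problem in check_envelope(&env) {
        println!("contract violation: {}", problem);
    }
}
```

A check of this kind only validates envelope shape; it says nothing about whether the payload inside `result` matches the Rust oracle.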
Lexer parity
- Current status: partially real in Draton; Rust is still authoritative.
- Source of truth: `crates/draton-lexer`, especially `crates/draton-lexer/tests/selfhost_parity.rs`.
- What is already real:
  - `compiler/lexer/lexer.dt`, `compiler/lexer/token.dt`, `compiler/lexer/errors.dt`, and `compiler/lexer/result.dt` contain a real lexer rewrite.
  - `compiler/driver/pipeline.dt` implements `lex_json` in Draton and does not call a `host_lex_json` bridge.
  - `compiler/main.dt` exposes the `lex` stage0 entrypoint.
  - `crates/drat/src/commands/selfhost_stage0.rs` normalizes `lex` output into the frozen `draton.selfhost.stage0/v1` envelope so CI can diff the lexer oracle without shape fallbacks.
- What still bridges to host Rust:
  - Bootstrap and execution still depend on `crates/drat/src/commands/selfhost_stage0.rs`, which builds the stage0 binary with the Rust toolchain.
  - The compiled stage0 binary still runs on the Rust-owned codegen/runtime stack from `crates/`.
- Blockers:
  - `crates/drat/src/commands/selfhost_stage0.rs` still owns stage0 build orchestration.
  - `.github/workflows/ci.yml` only exercises a small lex/typeck/build smoke surface, not broad lexer parity coverage.
  - `crates/draton-lexer/tests/selfhost_parity.rs` remains the Rust-authoritative parity oracle.
- Exit criteria:
  - Lexer parity fixtures cover representative repository inputs and fail on the first semantic drift.
  - Stage0 lex stays bridge-free at the pipeline layer and is exercised broadly enough to trust it as a parity surface.
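The "fail on the first semantic drift" criterion amounts to a first-difference comparison between the Rust oracle's token stream and the self-host one. The following is a minimal sketch of that idea, not the logic in `crates/draton-lexer/tests/selfhost_parity.rs`; the `Tok` shape, the token kind names, and the `first_diff` helper are all hypothetical.

```rust
/// Hypothetical token shape for illustration only; the real token models
/// live in `compiler/lexer/token.dt` and `crates/draton-lexer`.
#[derive(Debug, Clone, PartialEq)]
struct Tok {
    kind: String,
    lexeme: String,
}

fn tok(kind: &str, lexeme: &str) -> Tok {
    Tok { kind: kind.to_string(), lexeme: lexeme.to_string() }
}

/// Index of the first position where the two streams disagree, or `None`
/// if they are identical. A length mismatch counts as drift at the end of
/// the shorter stream.
fn first_diff<T: PartialEq>(expected: &[T], actual: &[T]) -> Option<usize> {
    let common = expected.len().min(actual.len());
    for i in 0..common {
        if expected[i] != actual[i] {
            return Some(i);
        }
    }
    if expected.len() != actual.len() {
        Some(common)
    } else {
        None
    }
}

fn main() {
    // Both streams are made-up sample data.
    let rust_oracle = vec![tok("Ident", "x"), tok("Eq", "="), tok("Int", "1")];
    let selfhost = vec![tok("Ident", "x"), tok("Eq", "="), tok("Float", "1")];

    match first_diff(&rust_oracle, &selfhost) {
        Some(i) => println!(
            "first drift at token {}: expected {:?}, got {:?}",
            i,
            rust_oracle.get(i),
            selfhost.get(i)
        ),
        None => println!("token streams match"),
    }
}
```

Reporting only the first divergence keeps failures short and points at the earliest position where the two lexers disagree, which is usually where debugging should start.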
Parser parity
- Current status: stage0 parser output now comes from the self-host lexer/parser path, but Rust remains the authoritative parity oracle.
- Source of truth: `crates/draton-parser`, especially `crates/draton-parser/tests/selfhost_parity.rs`.
- What is already real:
  - `compiler/parser/parser.dt` and the parser subtrees under `compiler/parser/parse/` contain an in-tree parser rewrite.
  - `compiler/ast/` contains the self-host AST model used by the rewrite.
- What is already real in stage0:
  - `compiler/driver/pipeline.dt` now routes `parse_json` through `compiler/driver/parse_stage.dt` instead of `host_parse_json`.
  - `compiler/driver/parse_stage.dt` runs the self-host lexer and parser, then serializes the parser payload into the frozen Rust-shaped stage0 contract.
  - `crates/drat/src/commands/selfhost_stage0.rs` now reports the parser bridge as `selfhost` with no builtin bridge name.
- What still depends on Rust authority:
  - `crates/drat/src/commands/selfhost_stage0.rs` still owns stage0 bootstrap, caching, and envelope normalization.
  - `crates/draton-parser/tests/selfhost_parity.rs` remains the authoritative parser oracle.
  - The self-host parser JSON serializer under `compiler/driver/parse_stage.dt` must keep matching the Rust `serde` shape to preserve envelope v1.
- Blockers:
  - Parser parity still has to stay aligned on warnings, spans, recovery behavior, and exact Rust-shaped AST JSON.
  - Stage0 bootstrap still goes through Rust-owned build orchestration and runtime/codegen infrastructure.
  - `crates/draton-parser/tests/selfhost_parity.rs` remains the authoritative parser oracle for first-diff reporting.
- Exit criteria:
  - `compiler/driver/pipeline.dt` no longer calls `host_parse_json`.
  - Stage0 parse JSON comes from the Draton parser path under `compiler/parser/`.
  - Diagnostics, spans, warnings, and recovery behavior match the Rust parser on selected parity fixtures.
Typechecker parity
- Current status: the self-host typechecker tree under `compiler/typeck/` is growing real ownership metadata, but the hidden `drat selfhost-stage0 typeck` command still normalizes the Rust `host_type_json` bridge by default.
- Source of truth: `crates/draton-typeck`.
- What is already real:
  - `compiler/typeck/infer/`, `compiler/typeck/types/`, and `compiler/typeck/typed/` contain a real self-host typechecker tree.
  - `compiler/typeck/typed/program.dt` and related files define typed-program structures in the self-host tree.
- What is already real in the self-host tree:
  - `compiler/driver/typeck_stage.dt` contains a self-host lexer/parser/typechecker entrypoint and now serializes typed bodies, ownership summaries, and selected `use_effect` metadata.
  - `compiler/typeck/infer/result.dt` now threads the post-inference ownership pass through the self-host typed program.
- What is already real in stage0:
  - `crates/drat/tests/selfhost_stage0.rs` now compares Rust-oracle `use_effect` metadata on selected call/return sites so the target ownership metadata shape is gated in tests.
- What still depends on Rust authority:
  - `crates/drat/src/commands/selfhost_stage0.rs` still owns stage0 bootstrap, caching, and envelope normalization.
  - `crates/drat/src/commands/selfhost_stage0.rs` still dispatches hidden stage0 `typeck` through `host_type_json` by default.
  - `crates/draton-typeck/src/check.rs` and `crates/draton-typeck/src/ownership.rs` remain the authoritative semantic and ownership oracle.
  - The self-host typechecker JSON serializer under `compiler/driver/typeck_stage.dt` is still a secondary/raw path rather than the default hidden stage0 contract source.
- Blockers:
  - Hidden stage0 `typeck` still bridges through `host_type_json`, so default stage0 output is not yet direct evidence for the self-host typechecker implementation.
  - The self-host typed-program serializer still needs broader Rust-shape parity if it becomes the default hidden stage0 contract source.
  - `crates/draton-typeck/src/check.rs` and `crates/draton-typeck/src/ownership.rs` remain the authoritative semantic and ownership logic that the self-host tree still has to match.
- Exit criteria:
  - `compiler/driver/pipeline.dt` no longer calls `host_type_json`.
  - Stage0 typecheck JSON comes from the Draton typechecker path under `compiler/typeck/`.
  - Type diagnostics, warnings, and typed-program envelopes match the Rust authority on parity fixtures.
Ownership parity
- Current status: partially real in the self-host tree; `compiler/typeck/infer/ownership.dt` now writes ownership summaries and selected expression `use_effect` metadata into the self-host typed program, but hidden stage0 `typeck` still exposes the Rust oracle by default.
- Source of truth: `crates/draton-typeck/src/ownership.rs` and `docs/runtime/inferred-ownership-spec.md`.
- What is already real:
  - `compiler/typeck/typed/ownership.dt` exists and establishes where ownership behavior belongs in the self-host tree.
  - Ownership-aware typed data structures already exist alongside the self-host typed-program model.
  - `compiler/typeck/infer/ownership.dt` now performs a self-host ownership-summary pass after HM inference and writes `ownership_summary` into the typed program for stage0 output.
  - `compiler/typeck/infer/ownership.dt` now also populates selected `use_effect` metadata on typed expressions using self-host desired-effect rules for lets, returns, calls, method calls, field/index reads, and common container literals.
  - `compiler/driver/typeck_stage.dt` now serializes typed function bodies and per-expression `use_effect` metadata on its raw self-host typecheck path.
- What still bridges to host Rust:
  - `compiler/driver/pipeline.dt` still routes `build_json` through `host_build_json`, so the production build path still relies on Rust ownership behavior.
  - `crates/draton-runtime/src/lib.rs` still reaches the Rust `drat build` path for production build fallback behavior.
- Blockers:
  - Hidden `drat selfhost-stage0 typeck` still goes through `host_type_json`, so the default stage0 command does not yet execute the self-host `use_effect` path directly.
  - The new self-host `use_effect` population currently covers a selected, high-value subset of expression forms rather than the full `crates/draton-typeck/src/ownership.rs` matrix.
  - Ownership diagnostics still come from the Rust authority; the self-host path does not yet match `crates/draton-typeck/src/ownership.rs` on borrow/move error reporting.
  - `crates/draton-runtime/src/lib.rs` remains the production-path bridge through `host_build_json`.
  - `crates/draton-typeck/src/ownership.rs` remains the authoritative ownership implementation that the self-host tree has not yet matched.
  - `docs/runtime/inferred-ownership-spec.md` remains ahead of any proven self-host ownership parity claim.
- Exit criteria:
  - Ownership summaries and diagnostics are emitted from the self-host typechecker path.
  - Ownership behavior matches the Rust authority on selected programs.
  - No safe-code lowering claim depends on Rust-only ownership behavior.
Backend parity
- Current status: not at parity; the self-host backend tree exists, but the build path is still bridged to the Rust host compiler and several LLVM-layer files are placeholder stubs.
- Source of truth: `crates/draton-codegen` and the Rust runtime/link flow used by `drat build`.
- What is already real:
  - `compiler/codegen/` contains a broad rewrite tree for codegen structure, monomorphization, vtables, layout, and emission scaffolding.
  - `compiler/codegen/core/`, `compiler/codegen/emit/`, `compiler/codegen/mono/`, `compiler/codegen/typemap/`, and `compiler/codegen/vtable/` are populated with real in-tree Draton files.
- What still bridges to host Rust:
  - `compiler/driver/pipeline.dt` implements `build_json` by calling `host_build_json(path, output, mode, strict_flag(strict_syntax), target)`.
  - `crates/draton-runtime/src/lib.rs` implements `host_build_json_path`, `runtime_ensure_host_drat`, and `host_build_source_impl`.
  - `crates/draton-runtime/src/lib.rs` can build or reuse the Rust `drat` binary and then invokes `drat build` as the fallback compiler path.
- Blockers:
  - `compiler/driver/pipeline.dt` still hard-codes the `host_build_json` bridge.
  - `crates/draton-runtime/src/lib.rs` still shells out to the Rust `drat` build path from `host_build_source_impl`.
  - `compiler/codegen/llvm/builder.dt`, `compiler/codegen/llvm/context.dt`, `compiler/codegen/llvm/module.dt`, `compiler/codegen/llvm/pass.dt`, `compiler/codegen/llvm/target.dt`, `compiler/codegen/llvm/types.dt`, and `compiler/codegen/llvm/values.dt` still expose placeholder or stub behavior.
- Exit criteria:
  - `compiler/driver/pipeline.dt` no longer calls `host_build_json`.
  - The default self-host build path emits real backend output from `compiler/codegen/`.
  - The backend no longer depends on placeholder LLVM wrapper behavior for normal compilation.
Bootstrap parity
- Current status: bootstrap and rescue layers exist and are exercised, but they are still Rust-owned fallback infrastructure rather than self-host independence.
- Source of truth: `crates/drat/src/commands/selfhost_stage0.rs`, `crates/drat/tests/selfhost_stage0.rs`, and `.github/workflows/ci.yml`.
- What is already real:
  - `crates/drat/src/commands/selfhost_stage0.rs` builds and runs the `compiler/` tree through the hidden `drat selfhost-stage0` command.
  - `crates/drat/tests/selfhost_stage0.rs` validates machine-readable `lex`, `parse`, `typeck`, and `build` envelopes, including stable build-failure payloads.
  - `.github/workflows/ci.yml` includes a workflow-dispatch path that runs stage0 commands and uploads artifacts.
  - `crates/draton-runtime/src/lib.rs` provides a fallback/rescue path that can build or reuse the Rust `drat` binary.
- What still bridges to host Rust:
  - Stage0 binary construction still goes through Rust `build::run` in `crates/drat/src/commands/selfhost_stage0.rs`.
  - Stage0 build output still uses `host_build_json`, which can recurse into the Rust `drat build` path.
  - The bootstrap path still assumes a working Rust toolchain and matching LLVM environment.
- Blockers:
  - `crates/drat/src/commands/selfhost_stage0.rs` still owns stage0 bootstrap and cache layout.
  - `crates/draton-runtime/src/lib.rs` still owns the host fallback compiler path.
  - `.github/workflows/ci.yml` keeps parser parity as an opt-in heavier remote slice; the parser stage itself is now self-hosted, but the parity run remains expensive enough to stay outside the default fast path.
  - `docs/benchmarks/gc-results-2026-03-17.md` records the current bootstrap workload as blocked by `LLVM ERROR: unknown special variable`.
- Exit criteria:
  - Stage0 commands expose deterministic parity envelopes for every intended frontend stage.
  - The bootstrap story distinguishes clearly between parity checking, rescue mode, and true self-rebuild.
  - A self-rebuild path exists without presenting Rust fallback as the normal compiler path.
Host bridges currently in use
- `host_build_json`
  - Called from `compiler/driver/pipeline.dt`.
  - Implemented by `crates/draton-runtime/src/lib.rs` through `host_build_json_path` and `draton_host_build_json`.
  - The build fallback ultimately shells out to the Rust CLI path in `crates/draton-runtime/src/lib.rs` through `runtime_ensure_host_drat` and `host_build_source_impl`.
Known placeholder areas
These paths exist in-tree but must not be described as production-ready backend implementation yet:
- `compiler/codegen/llvm/builder.dt`
- `compiler/codegen/llvm/context.dt`
- `compiler/codegen/llvm/module.dt`
- `compiler/codegen/llvm/pass.dt`
- `compiler/codegen/llvm/target.dt`
- `compiler/codegen/llvm/types.dt`
- `compiler/codegen/llvm/values.dt`
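One way to keep this list enforceable is to hold it as data that a docs or review check can query. The sketch below is purely illustrative and assumes no such check exists in the repository today: the path list is copied from this section, while `PLACEHOLDER_PATHS`, `is_placeholder`, and `placeholders_in` are hypothetical names.

```rust
/// Paths this status sheet lists as placeholder or stub LLVM wrappers.
const PLACEHOLDER_PATHS: [&str; 7] = [
    "compiler/codegen/llvm/builder.dt",
    "compiler/codegen/llvm/context.dt",
    "compiler/codegen/llvm/module.dt",
    "compiler/codegen/llvm/pass.dt",
    "compiler/codegen/llvm/target.dt",
    "compiler/codegen/llvm/types.dt",
    "compiler/codegen/llvm/values.dt",
];

/// True when `path` is one of the listed placeholder files.
fn is_placeholder(path: &str) -> bool {
    PLACEHOLDER_PATHS.contains(&path)
}

/// Returns the placeholder files among `touched`, preserving input order.
/// A hypothetical lint could use this to flag production-readiness claims.
fn placeholders_in<'a>(touched: &[&'a str]) -> Vec<&'a str> {
    touched.iter().copied().filter(|p| is_placeholder(p)).collect()
}

fn main() {
    // Paths taken from this status sheet; the first is real lexer code,
    // the second is a listed placeholder.
    let touched = ["compiler/lexer/lexer.dt", "compiler/codegen/llvm/pass.dt"];
    for path in placeholders_in(&touched) {
        println!("still a placeholder: {}", path);
    }
}
```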
What must not be claimed yet
- Do not claim the self-host compiler is the authoritative implementation.
- Do not claim full typechecker parity merely because the `host_type_json` bridge is gone; Rust still defines the oracle.
- Do not claim ownership parity complete while self-host `use_effect` and ownership diagnostics still lag `crates/draton-typeck/src/ownership.rs`, even though summary emission is now partially real.
- Do not claim backend independence or a production-ready self-host backend while `compiler/driver/pipeline.dt` still calls `host_build_json`.
- Do not claim `drat selfhost-stage0 build` proves self-host backend completion; today it still goes through Rust fallback infrastructure.
- Do not claim Rust is optional for bootstrap or recovery.
Next actions
Phase 0 to Phase 1 handoff should do the following, in order:
- Keep this status file current whenever a bridge, blocker, or parity claim changes.
- Expand deterministic parity fixtures for `drat selfhost-stage0 lex`, `parse`, `typeck`, and `build`.
- Expand focused typechecker parity coverage now that `typeck_json` no longer bridges through Rust.
typeck_jsonno longer bridges through Rust. - Treat parser, typechecker, ownership, backend, and bootstrap as separate parity tracks instead of one generic “self-host complete” milestone.