Self-Host Canonical Migration Status
This document tracks the canonical-syntax migration state of the self-host mirror under src/.
Scope
Audited self-host areas:
src/ast/src/parser/src/typeck/src/codegen/src/lexer/src/mono/
Canonical syntax targeted:
let x = ...- explicit
return import { item } from module.path@type { name: Type }at file, class, layer, interface, and function scope
Deprecated syntax targeted:
let x: T = valuefn f(a: T)fn f(...) -> T
Migrated In This Pass
The following remaining mechanical self-host files were canonicalized safely in this pass:
src/parser/parse/item.dtsrc/parser/parse/stmt.dtsrc/mono/collector.dtsrc/parser/parser.dt
Migration techniques used:
- converted file-scope
@typecontracts from legacyfn ...members to binding formname: (...) -> ... - removed inline parameter and return annotations from executable function definitions
- moved typed local bindings and typed empty-array initializers into function-scope
@type - moved class field annotations into class-scope
@type - kept parser, typechecker, and monomorphization behavior unchanged
Previously Migrated Core
The self-host core had already been canonicalized before this pass across:
- core AST support modules
- parser expression/type helpers
- type environment, exhaustiveness, substitution, and unification helpers
- expression and statement inference helpers
- backend entry, emit/layout, and closure/helper codegen slices
- major lexer and support utilities
That earlier work also resolved the semantic parity gap for canonical contracts:
- function-scope
@type - interface-scope
@type - canonical contract flow through the self-host frontend mirror
- canonical contract flow through the self-host backend entry and emit paths
Remaining Blocked Files
The remaining unmigrated self-host files are now explicit and non-executable.
| File | Blocker | Classification |
|---|---|---|
src/ast/dump.dt | Very large printer module with broad low-value mechanical churn; migrating it now would add review noise without affecting parser/typechecker/codegen behavior. | Deferred cleanup |
src/typeck/dump.dt | Similar large pretty-printer module; canonicalization is straightforward but noisy and low priority compared with executable self-host paths. | Deferred cleanup |
No remaining src/ blocker in this audit is a frontend/backend semantic-parity gap or an executable bootstrap issue.
Final Executable Blocker Resolved
src/typeck/infer/item.dt was the last executable self-host file left in compatibility form.
Root cause of the earlier bootstrap regression:
- the previous direct rewrite removed inline helper-function signatures without restoring equivalent file-scope canonical contracts
- two internal helpers,
predeclare_fn_schemeandpredeclare_class, were missing from the file-level contract block - once those helper schemes disappeared, bootstrap lost stable type information in the self-host inference pass and regressed
Safe fix implemented:
- converted the file-level
@typeblock to canonical binding form - added canonical file-scope contracts for the previously missing helpers
- moved typed local bindings to function-scope
@type - removed deprecated inline parameter and return syntax from executable function definitions
Current result:
src/typeck/infer/item.dtnow builds cleanly in canonical form- self-host bootstrap passes again
- compatibility warnings from that file are gone
- the strict self-host CI exclusion list no longer includes it
Files Completed Across The Final Phase
The final-phase blocker set has now been reduced as follows:
- completed semantic/frontend parity:
src/ast/item.dtsrc/parser/parse/expr.dtsrc/parser/parse/item.dtsrc/parser/parse/stmt.dtsrc/typeck/infer/expr.dtsrc/typeck/infer/stmt.dtsrc/mono/collector.dt
- completed backend parity and canonicalization:
src/codegen/codegen.dtsrc/codegen/emit/expr.dtsrc/codegen/emit/item.dtsrc/codegen/emit/stmt.dtsrc/codegen/layout/class.dtsrc/codegen/layout/vtable.dtsrc/codegen/closure/capture.dtsrc/codegen/closure/emit.dtsrc/codegen/closure/env.dt
Current Readiness
The self-host mirror is now effectively at canonical syntax parity for executable/compiler paths:
- parser, typechecker, monomorphization, and backend/codegen slices are canonicalized where safe
- all executable self-host files are now canonicalized where safe
- the remaining debt is confined to two non-executable dump/printer modules
- a focused strict-canonical self-host CI subset is now practical
Practical strict-canonical subset:
- canonical compile fixtures
- selected GC / lambda / interface strict builds
What is not yet true:
- the entire
src/tree is not fully strict-clean because two deferred dump/printer modules still rely on compatibility-form syntax - the self-host bootstrap path is still tracked separately because
drat build src/main.dtcan hitLLVM ERROR: unknown special variable - full-tree strict self-host CI should wait until
src/ast/dump.dtandsrc/typeck/dump.dtare migrated, and until the separate bootstrap-path LLVM blocker is resolved or intentionally retired
Strict-Canonical CI Subset
The repository now enforces a focused self-host strict-canonical subset in CI.
Checker:
python3 tools/check_selfhost_strict_subset.py
What it does:
- scans the migrated self-host mirror under
src/ - fails if any non-excluded file reintroduces deprecated inline type syntax
- fails if one of the tracked exclusions no longer needs to be excluded, so the list stays reviewable
Intentionally excluded files:
src/ast/dump.dtsrc/typeck/dump.dt
This subset is intentional. It gives regression coverage for the migrated self-host tree without claiming that full-tree strict self-host support is complete or that the current bootstrap path is stable enough to be a hard CI gate.
CI Readiness
A focused strict-canonical self-host CI subset is now practical and enabled:
- parser/typecheck regression tests cover the Rust frontend/tooling path
tools/check_selfhost_strict_subset.pyguards the migratedsrc/subset against compatibility-form regressions- CI also runs one representative strict canonical fixture build
- CI also runs one tracked self-host bootstrap probe, but that probe warns instead of failing the subset job when it hits the known
LLVM ERROR: unknown special variableblocker
What would still be required for full-tree strict self-host CI:
- canonicalize or intentionally retire
src/ast/dump.dt - canonicalize or intentionally retire
src/typeck/dump.dt - resolve the separate self-host bootstrap-path blocker currently surfacing as
LLVM ERROR: unknown special variable
Final Readiness
executable/compiler-path self-host canonical migration is complete.
Current repository state:
- the migrated self-host compiler path is covered by strict-canonical subset CI
- the only remaining exclusions are
src/ast/dump.dtandsrc/typeck/dump.dt - those files are deferred printer cleanup, not executable compiler blockers
That means contributors can now treat canonical syntax as the normal rule for executable self-host code. Full-tree strict self-host CI would still require canonicalizing or intentionally retiring the two remaining dump modules and resolving the separate bootstrap-path LLVM blocker.
Verification Run
Historical focused verification from earlier canonical passes included:
cargo test -p draton-parser --test items -p draton-typeck --test errors- strict canonical builds for interface, lambda, generic-class, and GC fixtures
- repeated self-host-facing
cargo run -p drat -- build src/main.dt ...bootstrap checks
Focused verification for this final mechanical pass:
cargo test -p draton-parser --test items -p draton-typeck --test errorscargo run -p drat -- build --strict-syntax tests/programs/compile/52_lambda_nested.dt -o /tmp/draton_mech_final_strict/tmp/draton_mech_final_strictcargo run -p drat -- build src/main.dt -o /tmp/draton_selfhost_mech_final
Expected current behavior during bootstrap:
- self-host bootstrap remains CPU-bound and may be slow
- warning output now comes only from the deferred dump/printer modules
- the build can still hit the tracked LLVM-side blocker
LLVM ERROR: unknown special variable - this is not a deadlock; the main remaining cost is normal bootstrap work plus residual warning volume plus the unresolved LLVM blocker above
Recommended Next Step
Remaining cleanup order:
- Canonicalize
src/ast/dump.dt - Canonicalize
src/typeck/dump.dt - Then enable full-tree strict self-host CI if desired