Forge: Executive Summary
Read this first. Everything else is supporting documentation.
The One-Line Pitch
Opinionated toolchain that makes AI-generated code actually work by using primitives instead of frameworks.
The Problem (30 seconds)
React-era web development is too complex for AI to reason about reliably:
- 45% of AI-generated code has security vulnerabilities (Veracode 2025)
- Hooks, effects, hydration, client state = too many abstractions
- Senior engineers burn out fixing "vibe coded" spaghetti
- Nobody has proven conventions that make AI code generation reliable
The Thesis (1 minute)
AI struggles with React because the pattern is complex:
User action → useState → useEffect → fetch → JSON → state update → re-render
But AI understands primitives:
User action → HTTP request → Server handler → HTML response → Browser renders
If you give AI simpler patterns, it generates more reliable code.
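The primitive flow above can be sketched with nothing but the Go standard library. This is a minimal illustration, not rally-hq code: the `/teams` route, the `name` form field, and the `renderTeamRow` and `handleCreateTeam` names are assumptions, and the string-built fragment stands in for what a Templ component would render.

```go
package main

import (
	"fmt"
	"html"
	"net/http"
	"net/http/httptest"
	"strings"
)

// renderTeamRow builds the HTML fragment returned to the browser.
// The name is escaped so user input cannot inject markup.
func renderTeamRow(name string) string {
	return `<li class="team">` + html.EscapeString(name) + `</li>`
}

// handleCreateTeam is a hypothetical handler: form post in, HTML fragment out.
// There is no JSON layer and no client-side state to keep in sync.
func handleCreateTeam(w http.ResponseWriter, r *http.Request) {
	name := strings.TrimSpace(r.FormValue("name"))
	if name == "" {
		http.Error(w, "team name is required", http.StatusUnprocessableEntity)
		return
	}
	w.Header().Set("Content-Type", "text/html; charset=utf-8")
	fmt.Fprint(w, renderTeamRow(name))
}

func main() {
	// Simulate the request that a form such as
	// <form hx-post="/teams" hx-target="#teams" hx-swap="beforeend">
	// would send, then print the status code and the fragment.
	req := httptest.NewRequest(http.MethodPost, "/teams", strings.NewReader("name=Falcons"))
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	rec := httptest.NewRecorder()
	handleCreateTeam(rec, req)
	fmt.Println(rec.Code, rec.Body.String())
}
```

The whole interaction is one function that reads a form value and writes HTML, which is the property the thesis depends on.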
The Stack (90 seconds)
Core (80% of app)
- Go 1.23+ — Fast, typed, single binary
- HTMX 2.x — HTML-over-the-wire (~14kb)
- Templ — Type-safe HTML templates
- sqlc — SQL → type-safe Go code
Islands (20% escape hatch)
- Svelte 5 — For drag-drop, rich editors (~5kb per island)
Why This Works
- Go stdlib = 15+ year old proven primitive (AI trained on a large body of examples)
- HTMX = Browser-native HTTP + HTML (no framework abstractions)
- SQL = 50-year-old language, standardized since 1986 (AI knows SQL better than ORMs)
- Svelte islands = Opt-in complexity, not default
The Approach (2 minutes)
Phase 1: Foundation (Current)
Build rally-hq by hand following primitive-first conventions.
- No AI automation yet
- Prove conventions work
- Track every code generation attempt
Phase 2: Extraction
Extract forge new/test/deploy from what worked.
- Scaffold projects with conventions baked in
- Property-based tests from types
- One-command deploy
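The "property-based tests from types" item above could look something like the following sketch, which uses the standard library's `testing/quick` package to generate random inputs from a function's argument types. The `Match` type, the `Winner` function, and the property itself are hypothetical examples, not part of Forge.

```go
package main

import (
	"fmt"
	"testing/quick"
)

// Match is a hypothetical domain type; quick fills its exported fields with random values.
type Match struct {
	HomeScore uint8
	AwayScore uint8
}

// Winner reports which side won, or "draw" when the scores are equal.
func Winner(m Match) string {
	switch {
	case m.HomeScore > m.AwayScore:
		return "home"
	case m.AwayScore > m.HomeScore:
		return "away"
	default:
		return "draw"
	}
}

// swapSidesFlipsWinner is the property under test: swapping the two scores
// must flip the winner and leave a draw unchanged.
func swapSidesFlipsWinner(m Match) bool {
	swapped := Match{HomeScore: m.AwayScore, AwayScore: m.HomeScore}
	flipped := map[string]string{"home": "away", "away": "home", "draw": "draw"}
	return Winner(swapped) == flipped[Winner(m)]
}

func main() {
	// quick.Check inspects the argument type of the property and generates inputs for it.
	if err := quick.Check(swapSidesFlipsWinner, nil); err != nil {
		fmt.Println("property failed:", err)
		return
	}
	fmt.Println("property held for all generated matches")
}
```

Because the inputs are derived from the `Match` type by reflection, adding a field to the type widens the test inputs without anyone editing the test.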
Phase 3: Generation
Add forge feature "<description>" for AI-assisted development.
- AI generates against known patterns
- Tests verify correctness
- Tight feedback loop
Phase 4: Polish
Make it usable by others.
- Documentation
- Example apps
- Blog post / launch
The Proof (What We're Building)
rally-hq — Tournament management app (existing Next.js version being rebuilt)
Why rebuild?
- Known functional requirements
- Direct comparison: Next.js vs Go+HTMX
- Real app, not toy example
- Extract patterns that work
Features:
- Tournament CRUD
- Team registration
- Bracket generation (Svelte island)
- Match scoring
- Live updates (SSE)
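For the live updates feature, a server-sent events endpoint needs only a long-lived response and the `text/event-stream` wire format. The sketch below is an assumption about how this might be done with the Go standard library. The `score` event name, the `/matches/1/events` path, and the `formatSSE` and `streamScores` names are illustrative.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// formatSSE encodes one event in the text/event-stream format.
// Each line of data gets its own "data:" field, and a blank line ends the event.
func formatSSE(event, data string) string {
	var b strings.Builder
	fmt.Fprintf(&b, "event: %s\n", event)
	for _, line := range strings.Split(data, "\n") {
		fmt.Fprintf(&b, "data: %s\n", line)
	}
	b.WriteString("\n")
	return b.String()
}

// streamScores returns a handler that forwards HTML fragments from updates
// until the channel is closed or the client disconnects.
func streamScores(updates <-chan string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		flusher, ok := w.(http.Flusher)
		if !ok {
			http.Error(w, "streaming unsupported", http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "text/event-stream")
		w.Header().Set("Cache-Control", "no-cache")
		for {
			select {
			case <-r.Context().Done():
				return
			case fragment, open := <-updates:
				if !open {
					return
				}
				fmt.Fprint(w, formatSSE("score", fragment))
				flusher.Flush() // push the event to the client immediately
			}
		}
	}
}

func main() {
	// Two buffered updates, then close so the handler returns.
	updates := make(chan string, 2)
	updates <- "<li>Falcons 3 - 1 Hawks</li>"
	updates <- "<li>Falcons 4 - 1 Hawks</li>"
	close(updates)

	rec := httptest.NewRecorder()
	streamScores(updates)(rec, httptest.NewRequest(http.MethodGet, "/matches/1/events", nil))
	fmt.Print(rec.Body.String())
}
```

The payload is an HTML fragment rather than JSON, so the browser side can swap it into the page without a client-side rendering step.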
The Hypotheses (Being Validated)
| ID | Hypothesis | Target | Tracking |
|---|---|---|---|
| H1 | Go + HTMX → more reliable AI code | >85% compile success | .forge/ai-generations/ |
| H2 | HTML simpler than JSON→JS→DOM | Fewer bugs vs Next.js | Code comparison |
| H3 | Islands needed for <20% of features | <20% island usage | Feature audit |
| H4 | SSE handles real-time without WebSockets | <100ms latency | Performance tests |
| H5 | sqlc more AI-friendly than ORMs | >90% correct queries | Generation tracking |
| H6 | Bundle stays under 50kb | <50kb most pages | Build analysis |
Context Engineering Integration (New: Dec 23, 2025)
We analyzed Agent Skills for Context Engineering and found it highly aligned with Forge's thesis.
Key Validations
Tool Consolidation — "If a human can't pick the right tool, AI can't either"
- Forge: Finite component registries (10-20 tools max)
Primitive Exposure — "Models understand proven abstractions deeply"
- Forge: Go stdlib + HTMX + SQL (not frameworks)
Progressive Disclosure — "Load info only when needed"
- Forge: File-based structure (not config files)
Context Degradation — "Smaller token count = better results"
- HTMX: ~50% fewer tokens than React for same interaction
What We Implemented
- File-system-as-memory — Track AI generations in .forge/ai-generations/YYYY-MM-DD/
- Tool Design Guidelines — 4-question framework for AI-friendly tools
- Enhanced descriptions — All tools answer: What, When, Inputs, Outputs
- Hypothesis tracking — forge validate H1 checks success rate
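As a rough idea of what a success-rate check over tracked generations might compute, here is a minimal sketch. The `Generation` record shape, its JSON field names, and the `successRate` and `meetsTarget` helpers are assumptions for illustration. The source does not specify the on-disk format or how `forge validate` is implemented.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Generation is a hypothetical record of one tracked AI generation attempt.
type Generation struct {
	Hypothesis string `json:"hypothesis"` // e.g. "H1"
	Compiled   bool   `json:"compiled"`   // whether the generated code compiled
}

// successRate returns the fraction of attempts for one hypothesis that compiled,
// plus the number of attempts counted. With no attempts it returns (0, 0).
func successRate(gens []Generation, hypothesis string) (float64, int) {
	var total, compiled int
	for _, g := range gens {
		if g.Hypothesis != hypothesis {
			continue
		}
		total++
		if g.Compiled {
			compiled++
		}
	}
	if total == 0 {
		return 0, 0
	}
	return float64(compiled) / float64(total), total
}

// meetsTarget reports whether rate is strictly above target (H1 uses 0.85).
func meetsTarget(rate, target float64) bool { return rate > target }

func main() {
	// Stand-in for records that would be read from .forge/ai-generations/YYYY-MM-DD/.
	sample := `[
		{"hypothesis": "H1", "compiled": true},
		{"hypothesis": "H1", "compiled": true},
		{"hypothesis": "H1", "compiled": false},
		{"hypothesis": "H5", "compiled": true}
	]`
	var gens []Generation
	if err := json.Unmarshal([]byte(sample), &gens); err != nil {
		fmt.Println("decode error:", err)
		return
	}
	rate, n := successRate(gens, "H1")
	fmt.Printf("H1: %.0f%% of %d attempts compiled, target met: %v\n", rate*100, n, meetsTarget(rate, 0.85))
}
```

Keeping the records as plain files means the same data can be inspected by hand, diffed, or fed back to a model as context.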
The Six Conventions
These make AI-generated code reliable:
- Finite Component Registry — AI selects from known components, doesn't invent
- Server-First State — No client state management to confuse AI
- Typed Primitives — All functions have schemas AI can reason about
- Workflow State Machines — Explicit states, not implicit control flow
- Property-Based Tests — Derived from types automatically
- File-Based Discovery — Structure in filesystem, not config
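To make the "Workflow State Machines" convention concrete, the sketch below models a match lifecycle as an explicit transition table. The state names, event names, and the `Transition` function are hypothetical and are not taken from rally-hq.

```go
package main

import (
	"errors"
	"fmt"
)

// State is one explicit step in a hypothetical match workflow.
type State string

const (
	Scheduled  State = "scheduled"
	InProgress State = "in_progress"
	Completed  State = "completed"
	Cancelled  State = "cancelled"
)

// ErrInvalidTransition is returned when an event is not allowed in the current state.
var ErrInvalidTransition = errors.New("invalid transition")

// transitions lists every allowed move. Anything not listed here is rejected,
// so the full behavior is visible in one table.
var transitions = map[State]map[string]State{
	Scheduled:  {"start": InProgress, "cancel": Cancelled},
	InProgress: {"finish": Completed, "cancel": Cancelled},
}

// Transition returns the next state for an event, or the unchanged state and an error.
func Transition(from State, event string) (State, error) {
	next, ok := transitions[from][event]
	if !ok {
		return from, fmt.Errorf("%w: %q from %q", ErrInvalidTransition, event, from)
	}
	return next, nil
}

func main() {
	state := Scheduled
	for _, event := range []string{"start", "finish", "start"} {
		next, err := Transition(state, event)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", state, next)
		state = next
	}
}
```

With the table as the single source of truth, generated code only has to add a row rather than thread a new branch through existing control flow.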
Current Status (Dec 23, 2025)
| Milestone | Status |
|---|---|
| Research phase | ✅ Done |
| Stack decisions | ✅ Done (Go + HTMX + Svelte islands) |
| Context engineering adoption | ✅ Done |
| rally-hq initialization | 🔲 Next |
| First feature (tournament CRUD) | 🔲 Planned |
| Deploy to Fly.io | 🔲 Planned |
| forge new extraction | 🔲 Phase 2 |
| forge feature AI generation | 🔲 Phase 3 |
Success Criteria
Phase 1 succeeds if:
- rally-hq deployed and functional
- H1-H6 hypotheses validated with data
- Svelte islands used for <20% of features
- Bundle size <50kb for most pages
- Compile success rate >85% (H1)
The project fails if:
- Go + HTMX blocks 2+ features with no workaround
- AI error rate exceeds SvelteKit equivalent
- Island integration proves too complex
The Anti-Patterns We're Avoiding
| Don't Do This | Why Not | Do This Instead |
|---|---|---|
| Build a new framework | Too much work, not the problem | Use existing tools with opinions |
| Solve for every framework | Can't be opinionated | Pick one stack, optimize it |
| Start with spec | Leads to ivory tower designs | Build real app, extract patterns |
| Promise timeline | Software is unpredictable | Show progress, no dates |
| Support everything | Dilutes the thesis | Opinionated = make choices |
Why This Matters
If Forge works:
- AI code generation becomes reliable enough for production
- Developers spend less time debugging AI output
- "AI-assisted development" graduates from toy demos to real tools
If Forge fails:
- Document why in LEARNINGS.md
- Share lessons learned
- Still contributed to the field
Quick Start (For Newcomers)
1. Understand the thesis → Read this document (you're doing it!)
2. See the stack decision → DECISION-LOG.md (why Go + HTMX)
3. Understand the architecture → ARCHITECTURE.md (how it works)
4. See what we're building → RALLY-HQ.md (the proof app)
5. Track progress → PROGRESS.md (what's done, what's next)
Want deeper understanding?
- AI agent patterns → 12-FACTOR-AGENTS.md
- Tool design → TOOL-DESIGN.md
- Context engineering → research/CONTEXT-ENGINEERING-ANALYSIS.md
The Origin Story (Optional)
This started as research into "AI-native web frameworks" (December 2025).
Key insight: Don't write specs first. Build something real, extract what works.
rally-hq exists as a Next.js app. Rebuilding it with primitives provides:
- Known functional requirements
- Controlled comparison
- Real patterns to extract
See research/AI-NATIVE-WEB-THESIS.md for the full backstory.
FAQ
Q: Why not just use Next.js?
A: We're testing if primitives produce more reliable AI code. Can't test that without trying primitives.
Q: Why Go instead of TypeScript?
A: Go is closer to HTTP primitives, has no framework churn, and compiles to a single binary. See DECISION-LOG.md.
Q: Why rebuild instead of building new?
A: Known requirements eliminate "what should we build?" and let us focus on "how should AI build it?"
Q: What if this doesn't work?
A: We document learnings in LEARNINGS.md and share them. Still valuable research.
Q: Can I use Forge now?
A: Not yet. Phase 1 is proving conventions. Phase 2 extracts tooling. Phase 4 makes it public.
Q: How can I follow progress?
A: Check PROGRESS.md regularly. Updated as we build.
The Bottom Line
Forge is an experiment to see if opinionated conventions can make AI code generation reliable enough for production.
We're building rally-hq by hand first, tracking every generation attempt, and extracting patterns that work. If it works, we'll share tooling. If not, we'll share learnings.
Current phase: Foundation (proving conventions)
Next milestone: rally-hq deployed to Fly.io
Timeline: No promises. Progress updates in PROGRESS.md
One-Page Version
For an even shorter overview, here's the elevator pitch:
The Problem: AI-generated code fails because React patterns are too complex.
The Solution: Use primitives (Go + HTMX + SQL) that AI actually understands.
The Proof: Rebuild rally-hq with primitives, track success rate, extract patterns.
The Bet: Primitive-first conventions will produce >85% compile success vs industry's 55%.
The Status: Phase 1 (proving) — building rally-hq by hand with tracked generations.
The Output: If it works, forge new tool. If not, published learnings.
That's Forge.
Last Updated: December 23, 2025
Status: Foundation Phase
Progress: PROGRESS.md
Full Docs: INDEX.html