Forge: Executive Summary

Read this first. Everything else is supporting documentation.

The One-Line Pitch

Opinionated toolchain that makes AI-generated code actually work by using primitives instead of frameworks.

The Problem (30 seconds)

React-era web development is too complex for AI to reason about reliably:

45% of AI-generated code has security vulnerabilities (Veracode 2025)
Hooks, effects, hydration, client state = too many abstractions
Senior engineers burn out fixing "vibe coded" spaghetti
No one has yet demonstrated conventions that make AI code generation reliable

The Thesis (1 minute)

AI struggles with React because the pattern is complex:

User action → useState → useEffect → fetch → JSON → state update → re-render

But AI understands primitives:

User action → HTTP request → Server handler → HTML response → Browser renders

If you give AI simpler patterns, it generates more reliable code.

The Stack (90 seconds)

Core (80% of app)

Go 1.23+ — Fast, typed, single binary
HTMX 2.x — HTML-over-the-wire (~14kb)
Templ — Type-safe HTML templates
sqlc — SQL → type-safe Go code

Islands (20% escape hatch)

Svelte 5 — For drag-drop, rich editors (~5kb per island)

Why This Works

Go stdlib = 15+ year old proven primitive (public since 2009, stable under the Go 1 compatibility promise since 2012, and widely represented in AI training data)
HTMX = Browser-native HTTP + HTML (no framework abstractions)
SQL = roughly 50-year-old language, ANSI-standardized since 1986 (our bet, tested as H5: AI writes SQL more reliably than ORM code)
Svelte islands = Opt-in complexity, not default

The Approach (2 minutes)

Phase 1: Foundation (Current)

Build rally-hq by hand following primitive-first conventions.

No AI automation yet
Prove conventions work
Track every code generation attempt

Phase 2: Extraction

Extract forge new/test/deploy from what worked.

Scaffold projects with conventions baked in
Property-based tests from types
One-command deploy

Phase 3: Generation

Add forge feature "<description>" for AI-assisted development.

AI generates against known patterns
Tests verify correctness
Tight feedback loop

Phase 4: Polish

Make it usable by others.

Documentation
Example apps
Blog post / launch

The Proof (What We're Building)

rally-hq — Tournament management app (existing Next.js version being rebuilt)

Why rebuild?

Known functional requirements
Direct comparison: Next.js vs Go+HTMX
Real app, not toy example
Extract patterns that work

Features:

Tournament CRUD
Team registration
Bracket generation (Svelte island)
Match scoring
Live updates (SSE)

The Hypotheses (Being Validated)

ID	Hypothesis	Target	Tracking
H1	Go + HTMX → more reliable AI code	>85% compile success	`.forge/ai-generations/`
H2	HTML simpler than JSON→JS→DOM	Fewer bugs vs Next.js	Code comparison
H3	Islands needed for <20% of features	<20% island usage	Feature audit
H4	SSE handles real-time without WebSockets	<100ms latency	Performance tests
H5	sqlc more AI-friendly than ORMs	>90% correct queries	Generation tracking
H6	Bundle stays under 50kb	<50kb most pages	Build analysis

Context Engineering Integration (New: Dec 23, 2025)

We analyzed Agent Skills for Context Engineering and found it highly aligned with Forge's thesis.

Key Validations

Tool Consolidation — "If a human can't pick the right tool, AI can't either"
- Forge: Finite component registries (10-20 tools max)
Primitive Exposure — "Models understand proven abstractions deeply"
- Forge: Go stdlib + HTMX + SQL (not frameworks)
Progressive Disclosure — "Load info only when needed"
- Forge: File-based structure (not config files)
Context Degradation — "Smaller token count = better results"
- HTMX: ~50% fewer tokens than React for same interaction

What We Implemented

File-system-as-memory — Track AI generations in .forge/ai-generations/YYYY-MM-DD/
Tool Design Guidelines — 4-question framework for AI-friendly tools
Enhanced descriptions — All tools answer: What, When, Inputs, Outputs
Hypothesis tracking — forge validate H1 checks success rate

The Six Conventions

These make AI-generated code reliable:

Finite Component Registry — AI selects from known components, doesn't invent
Server-First State — No client state management to confuse AI
Typed Primitives — All functions have schemas AI can reason about
Workflow State Machines — Explicit states, not implicit control flow
Property-Based Tests — Derived from types automatically
File-Based Discovery — Structure in filesystem, not config

Current Status (Dec 23, 2025)

Milestone	Status
Research phase	✅ Done
Stack decisions	✅ Done (Go + HTMX + Svelte islands)
Context engineering adoption	✅ Done
rally-hq initialization	🔲 Next
First feature (tournament CRUD)	🔲 Planned
Deploy to Fly.io	🔲 Planned
`forge new` extraction	🔲 Phase 2
`forge feature` AI generation	🔲 Phase 3

Success Criteria

Phase 1 succeeds if:

✅ rally-hq deployed and functional
✅ H1-H6 hypotheses validated with data
✅ Svelte islands used for <20% of features
✅ Bundle size <50kb for most pages
✅ Compile success rate >85% (H1)

The project fails if:

Go + HTMX blocks 2+ features with no workaround
AI error rate exceeds that of the Next.js equivalent
Island integration proves too complex

The Anti-Patterns We're Avoiding

Don't Do This	Why Not	Do This Instead
Build a new framework	Too much work, not the problem	Use existing tools with opinions
Solve for every framework	Can't be opinionated	Pick one stack, optimize it
Start with spec	Leads to ivory tower designs	Build real app, extract patterns
Promise timeline	Software is unpredictable	Show progress, no dates
Support everything	Dilutes the thesis	Opinionated = make choices

Why This Matters

If Forge works:

AI code generation becomes reliable enough for production
Developers spend less time debugging AI output
"AI-assisted development" graduates from toy demos to real tools

If Forge fails:

Document why in LEARNINGS.md
Share lessons learned
Still contributed to the field

Quick Start (For Newcomers)

1. Understand the thesis → Read this document (you're doing it!)

2. See the stack decision → DECISION-LOG.md (why Go + HTMX)

3. Understand the architecture → ARCHITECTURE.md (how it works)

4. See what we're building → RALLY-HQ.md (the proof app)

5. Track progress → PROGRESS.md (what's done, what's next)

Want deeper understanding?

AI agent patterns → 12-FACTOR-AGENTS.md
Tool design → TOOL-DESIGN.md
Context engineering → research/CONTEXT-ENGINEERING-ANALYSIS.md

The Origin Story (Optional)

This started as research into "AI-native web frameworks" (December 2025).

Key insight: Don't write specs first. Build something real, extract what works.

rally-hq exists as a Next.js app. Rebuilding it with primitives provides:

Known functional requirements
Controlled comparison
Real patterns to extract

See research/AI-NATIVE-WEB-THESIS.md for the full backstory.

FAQ

Q: Why not just use Next.js?
A: We're testing whether primitives produce more reliable AI code. We can't test that without trying primitives.

Q: Why Go instead of TypeScript?
A: Go is closer to HTTP primitives, has no framework churn, and compiles to a single binary. See DECISION-LOG.md.

Q: Why rebuild instead of building new?
A: Known requirements eliminate "what should we build?" and let us focus on "how should AI build it?"

Q: What if this doesn't work?
A: We document learnings in LEARNINGS.md and share them. Still valuable research.

Q: Can I use Forge now?
A: Not yet. Phase 1 is proving conventions. Phase 2 extracts tooling. Phase 4 makes it public.

Q: How can I follow progress?
A: Check PROGRESS.md regularly. It is updated as we build.

The Bottom Line

Forge is an experiment to see if opinionated conventions can make AI code generation reliable enough for production.

We're building rally-hq by hand first, tracking every generation attempt, and extracting patterns that work. If it works, we'll share tooling. If not, we'll share learnings.

Current phase: Foundation (proving conventions)
Next milestone: rally-hq deployed to Fly.io
Timeline: No promises. Progress updates in PROGRESS.md

One-Page Version

For an even shorter overview, here's the elevator pitch:

The Problem: AI-generated code fails because React patterns are too complex.

The Solution: Use primitives (Go + HTMX + SQL) that AI actually understands.

The Proof: Rebuild rally-hq with primitives, track success rate, extract patterns.

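The Bet: Primitive-first conventions will produce >85% compile success, versus the roughly 55% of AI-generated code that avoids security vulnerabilities today (Veracode 2025; a security metric rather than a compile metric, so treat it as a rough baseline).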

The Status: Phase 1 (proving) — building rally-hq by hand with tracked generations.

The Output: If it works, forge new tool. If not, published learnings.

That's Forge.

Last Updated: December 23, 2025
Status: Foundation Phase
Progress: PROGRESS.md
Full Docs: INDEX.html