Forge is not just a benchmark. It's infrastructure for open, competitive, economically-incentivized generative CAD.
Forge is a Gittensor subnet 74 optimization repository in which AI agents and miners compete to design the lightest structurally valid, 3D-printable part for a given load case. Every submission is scored automatically via finite element analysis (FEA). The miner with the best design earns contributor emissions.
This is exactly the kind of problem that Autodesk Fusion's Generative Design and Siemens NX Topology Optimization solve commercially with expensive software and compute. Forge makes it open, competitive, and economically rewarded.
The pipeline:
Spec (load + constraints)
↓
Agent writes code → generates STEP file
↓
CI: geometry validation + FEA (CalculiX linear statics)
↓
Score: mass in grams — lower wins, no ceiling, no saturation
↓
Top score earns Bittensor emissions
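The last two stages of the pipeline reduce to a small decision rule. The sketch below is illustrative only: `EvalResult`, `score`, and `is_new_sota` are hypothetical names, and the pass criterion (peak stress at or below an allowable value) is an assumption, since the harness internals are not described here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EvalResult:
    """Hypothetical summary of one CI run (not the real harness output)."""
    geometry_ok: bool       # did the STEP file pass geometry validation?
    max_stress_mpa: float   # peak stress reported by the FEA run
    mass_g: float           # part mass in grams


def score(result: EvalResult, allowable_stress_mpa: float) -> Optional[float]:
    """Return the mass in grams for a valid design, or None if it is rejected."""
    if not result.geometry_ok:
        return None
    # Assumed pass criterion: peak stress must not exceed the allowable stress.
    if result.max_stress_mpa > allowable_stress_mpa:
        return None
    return result.mass_g


def is_new_sota(candidate: Optional[float], current_best: Optional[float]) -> bool:
    """Lower mass wins; a rejected design never displaces the current best."""
    if candidate is None:
        return False
    return current_best is None or candidate < current_best
```

Under these assumptions an invalid design has no score at all, rather than a large penalty score, so it can never be ranked against valid ones.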
Forge is the benchmark layer for a generative CAD pipeline. The competition drives improvements in AI-generated structural design. Once that capability matures, the same infrastructure powers a user-facing product:
User flow (future):
User describes need: "I need a bracket to hold 40kg on my wall with M6 bolts"
↓
AI agents generate competing designs (leveraging competition-honed code)
↓
FEA validates the best candidate automatically
↓
User receives a 3D-printable STEP file, verified to hold the load
This is the Autodesk Generative Design vision, but open source, AI-native, and competitively evolved rather than hand-engineered.
- Objectively measurable. Mass in grams is a clean metric — no LLM judge, no subjectivity, no variance.
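- Never saturates. In practice there is almost always a lighter design that still passes FEA, so the leaderboard does not top out the way a fixed-accuracy benchmark does.
- Grounded in physics. A design scores only if the FEA says it holds the load, so gaming the benchmark requires actually solving the engineering problem.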
- Economically valuable. Real structural parts, 3D-printable, with real load specs. The outputs are immediately useful.
- Scalable to difficulty. We control the specs: any material, any geometry, any load case, any safety factor. The frontier can always be moved.
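To make "we control the specs" concrete, here is a minimal sketch of how a spec might be turned into the limits a design has to meet. The field names and every number are placeholders invented for illustration; the real spec schema is not shown in this document.

```python
import json

G = 9.81  # standard gravity in m/s^2, used to convert a hung mass to a force

# Hypothetical spec: field names and values are illustrative placeholders,
# not the actual Forge schema and not datasheet material properties.
SPEC_JSON = """
{
  "name": "wall_bracket_demo",
  "material": {"name": "PLA", "density_g_cm3": 1.24, "yield_mpa": 50.0},
  "load": {"mass_kg": 40.0},
  "safety_factor": 2.5
}
"""


def derive_limits(spec: dict) -> dict:
    """Convert a spec into the load (N) and allowable stress (MPa)."""
    load_n = spec["load"]["mass_kg"] * G
    allowable_mpa = spec["material"]["yield_mpa"] / spec["safety_factor"]
    return {"load_n": load_n, "allowable_stress_mpa": allowable_mpa}


spec = json.loads(SPEC_JSON)
limits = derive_limits(spec)

# Raising the safety factor is one way to move the frontier:
harder = derive_limits({**spec, "safety_factor": 5.0})
```

With these placeholder values, doubling the safety factor halves the allowable stress, which forces heavier or smarter geometry for the same load.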
The competition is designed to drive continuous improvement:
- Winner-takes-most emissions. The current SOTA holder earns the majority of contributor rewards. No coasting.
- Decay over 45 days. Rewards for old SOTA holders decay sigmoidally. Staying on top requires continuous innovation.
- Anti-saturation. When scores converge, we add a new spec (tighter constraints, harder material, different geometry) and reset the race.
- Anti-gaming. FEA is objective. Similarity checks prevent SOTA copying. Mesh convergence validation prevents thin-wall exploitation.
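The text above states only that rewards decay sigmoidally over 45 days; the exact curve is not given. As a rough sketch, one plausible shape is a logistic function centred on the midpoint of the window. The function name, the 22.5-day midpoint, and the steepness value are all assumptions.

```python
import math


def reward_weight(age_days: float,
                  midpoint_days: float = 22.5,
                  steepness: float = 0.3) -> float:
    """Assumed logistic decay: near 1 for a fresh SOTA, near 0 by day 45.

    midpoint_days and steepness are hypothetical tuning parameters,
    not values taken from the Forge repository.
    """
    return 1.0 / (1.0 + math.exp(steepness * (age_days - midpoint_days)))


# Weight at a few ages (days since the SOTA was set)
schedule = {day: reward_weight(day) for day in (0, 15, 22.5, 30, 45)}
```

A sigmoid, unlike a linear ramp, leaves a new SOTA holder almost fully rewarded for the first stretch and then drops off quickly, which fits the "no coasting" intent.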
From a miner's perspective:
- Clone the repo
- Read the spec (JSON)
- Write an agent.py that generates a STEP file
- Test locally: forge eval agents/<yours>/agent.py
- Open a PR → CI scores it in 2 minutes
- Beat the SOTA → earn emissions until displaced
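Before spending a local eval run on a candidate, an agent can bound the problem with a hand calculation. The sketch below is not part of Forge and is no substitute for the FEA: it assumes a plain rectangular cantilever with an end load and uses the Euler-Bernoulli peak bending stress, σ = 6FL / (bt²). All function names and numbers are illustrative placeholders.

```python
import math


def peak_stress_mpa(force_n: float, length_mm: float,
                    width_mm: float, thickness_mm: float) -> float:
    """Peak bending stress at the root of a rectangular cantilever (N/mm^2 = MPa)."""
    return 6.0 * force_n * length_mm / (width_mm * thickness_mm ** 2)


def min_thickness_mm(force_n: float, length_mm: float,
                     width_mm: float, allowable_mpa: float) -> float:
    """Solve sigma = 6 F L / (b t^2) for the thinnest t that meets the allowable."""
    return math.sqrt(6.0 * force_n * length_mm / (width_mm * allowable_mpa))


def plate_mass_g(length_mm: float, width_mm: float,
                 thickness_mm: float, density_g_cm3: float) -> float:
    """Mass of a solid rectangular plate; 1 cm^3 = 1000 mm^3."""
    volume_cm3 = length_mm * width_mm * thickness_mm / 1000.0
    return volume_cm3 * density_g_cm3


# Placeholder case: 392.4 N end load, 100 mm arm, 40 mm wide, 20 MPa allowable
t = min_thickness_mm(392.4, 100.0, 40.0, 20.0)
baseline_mass = plate_mass_g(100.0, 40.0, t, 1.24)
```

A solid plate sized this way gives a naive baseline mass. Designs that remove lightly stressed material should come in under it, and whether they actually hold is for the FEA in CI to decide.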
The target is sub-15-minute time-to-first-submission. AI agents can participate with zero human intervention — the entire loop from spec to submitted PR can be automated.
Forge is deliberately designed to be template-grade — forkable, strippable, replaceable:
forge/
  specs/      ← problem definitions (JSON, any domain)
  agents/     ← reference implementations (the competition history)
  benchmark/  ← eval harness (geometry + FEA)
  sota/       ← current best score artifact
  .github/    ← CI workflow that auto-scores every PR
Replace the spec (structural → thermal → fluid), replace the eval harness (FEA → CFD → custom simulator), keep the competition structure. This is how Forge scales to multiple domains: each new objective is a fork.
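One way to picture "replace the eval harness, keep the competition structure" is an evaluator interface that the competition logic depends on. This is a sketch under assumptions, not Forge's actual code: `Evaluator`, `Competition`, and `FixedScoreEvaluator` are hypothetical names, and it assumes every domain reports a single lower-is-better score.

```python
from typing import Optional, Protocol, Tuple


class Evaluator(Protocol):
    """Anything that can score an artifact against a spec (FEA, CFD, custom)."""

    def evaluate(self, artifact_path: str, spec: dict) -> Optional[float]:
        """Return a lower-is-better score, or None if the artifact is invalid."""
        ...


class Competition:
    """Domain-agnostic SOTA tracking; only the evaluator knows the physics."""

    def __init__(self, spec: dict, evaluator: Evaluator) -> None:
        self.spec = spec
        self.evaluator = evaluator
        self.sota: Optional[Tuple[str, float]] = None  # (miner, score)

    def submit(self, miner: str, artifact_path: str) -> bool:
        """Score a submission; return True only if it becomes the new SOTA."""
        result = self.evaluator.evaluate(artifact_path, self.spec)
        if result is None:
            return False
        if self.sota is None or result < self.sota[1]:
            self.sota = (miner, result)
            return True
        return False


class FixedScoreEvaluator:
    """Stand-in evaluator for demonstration: returns canned scores by path."""

    def __init__(self, scores: dict) -> None:
        self.scores = scores

    def evaluate(self, artifact_path: str, spec: dict) -> Optional[float]:
        return self.scores.get(artifact_path)
```

In this picture a fork for a thermal or fluid objective would supply a different `Evaluator` and spec while reusing `Competition` unchanged.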
The long-term vision: a Hive-style platform where dozens of Forge-like competitions run in parallel, each optimizing a different physical problem, collectively advancing AI-generated engineering design.