Summary
Migrate Born's WebGPU backend from go-webgpu/webgpu (Rust FFI via wgpu-native shared library) to gogpu/wgpu (pure Go). Full replacement, not dual backend.
Context
Parent: Backend Strategy: GoGPU Integration Research #20
Decision: ADR-005 — use Core API, not HAL-direct (available on request; feel free to ask about specific points in comments)
Research: GOGPU_WGPU_ARCHITECTURE_2026-04-10 (internal)
gogpu/wgpu v0.24.6 stable, pure Go
Eliminates runtime dependency on wgpu_native shared library (.dll/.so/.dylib)
True single binary deployment
gogpu/naga supports DXIL (Rust naga does NOT)
WGSL shaders stay unchanged
Milestone: v0.8.0
Technical Approach
Use gogpu/wgpu Core API (root wgpu package):
Provides encoder pooling, staging belt, deferred destruction
Dispatch overhead negligible for ML (nanoseconds vs microsecond GPU kernels)
We maintain both libraries — can optimize Core API for ML workloads as needed
Do NOT use HAL-direct (except potentially for tensor arena allocator in the future).
Scope
Files to modify
internal/backend/webgpu/backend.go — device init, lifecycle
internal/backend/webgpu/compute.go — compute dispatch, encoder/pass
internal/backend/webgpu/gpu_ops.go — GPU tensor operations
internal/backend/webgpu/gpu_tensor.go — buffer management
internal/backend/webgpu/buffer_pool.go — buffer pooling
internal/backend/webgpu/lazy_compute.go — lazy mode
internal/backend/webgpu/gpu_creation.go — tensor creation
internal/backend/webgpu/ops.go, ops_extended.go — minor import updates
go.mod — swap dependency
Key changes
Replace go-webgpu/webgpu imports → gogpu/wgpu
Adapt device initialization to Core API flow (Instance → Adapter → Device)
Leverage Core API's staging belt for buffer uploads
Leverage Core API's encoder pooling (saves 64KB DX12 allocator per frame)
Use Core API's DestroyQueue for safe resource lifecycle
Update shader module creation API
WGSL shaders (shaders.go) — NO changes needed
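To make the "safe resource lifecycle" point concrete, here is a minimal sketch of the general deferred-destruction pattern: a resource is queued with the index of the last submission that used it and is released only once the GPU has completed that submission. This is not the gogpu/wgpu implementation. `Releasable`, `Defer`, `Collect`, and `fakeBuffer` are hypothetical names for illustration, and the `DestroyQueue` type below only borrows the name used above.

```go
package main

import "fmt"

// Releasable is a hypothetical stand-in for any GPU resource
// (buffer, bind group, pipeline) that must not be freed while in flight.
type Releasable interface {
	Release()
}

// pending pairs a resource with the last submission that used it.
type pending struct {
	res        Releasable
	submission uint64
}

// DestroyQueue is an illustrative deferred-destruction queue.
// It is NOT the gogpu/wgpu type; the real API may differ.
type DestroyQueue struct {
	items []pending
}

// Defer schedules res for release once the given submission has completed.
func (q *DestroyQueue) Defer(res Releasable, submission uint64) {
	q.items = append(q.items, pending{res: res, submission: submission})
}

// Collect releases every resource whose submission index is <= completed
// and returns how many were released. Everything else stays queued.
func (q *DestroyQueue) Collect(completed uint64) int {
	kept := q.items[:0] // filter in place, reusing the backing array
	released := 0
	for _, p := range q.items {
		if p.submission <= completed {
			p.res.Release()
			released++
		} else {
			kept = append(kept, p)
		}
	}
	// Zero the unused tail so released resources are not kept reachable.
	for i := len(kept); i < len(q.items); i++ {
		q.items[i] = pending{}
	}
	q.items = kept
	return released
}

// fakeBuffer records whether Release was called (test double only).
type fakeBuffer struct {
	released bool
}

func (b *fakeBuffer) Release() { b.released = true }

func main() {
	q := &DestroyQueue{}
	a, b := &fakeBuffer{}, &fakeBuffer{}
	q.Defer(a, 1) // last used by submission 1
	q.Defer(b, 2) // last used by submission 2

	// Only submission 1 has finished, so only a may be released.
	n := q.Collect(1)
	fmt.Println(n, a.released, b.released)
}
```

In the backend, the buffer pool and tensor finalizers would hand resources to such a queue instead of freeing them immediately, and the queue would be drained after each completed submission.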
Acceptance Criteria
go build ./... passes with gogpu/wgpu
No go-webgpu imports remain
No .dll/.so dependency
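The criterion that no go-webgpu imports remain could be checked mechanically. The sketch below uses the standard `go/parser` package to read only the import section of a Go source file and report any import under the old binding. The prefix `github.com/go-webgpu/` and the sample path `github.com/go-webgpu/webgpu/wgpu` are assumptions about the module's full import path, and `legacyImports` is a hypothetical helper, not part of Born.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strconv"
	"strings"
)

// legacyPrefix is an ASSUMED import path prefix for the old binding.
const legacyPrefix = "github.com/go-webgpu/"

// legacyImports parses only the import section of src and returns every
// import path that starts with legacyPrefix.
func legacyImports(filename, src string) ([]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, filename, src, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	var found []string
	for _, imp := range f.Imports {
		// Import paths are stored as quoted string literals.
		path, err := strconv.Unquote(imp.Path.Value)
		if err != nil {
			return nil, err
		}
		if strings.HasPrefix(path, legacyPrefix) {
			found = append(found, path)
		}
	}
	return found, nil
}

func main() {
	// Sample file content; the import path is an assumption for illustration.
	src := `package webgpu

import (
	"fmt"
	"github.com/go-webgpu/webgpu/wgpu"
)
`
	found, err := legacyImports("backend.go", src)
	if err != nil {
		panic(err)
	}
	fmt.Println(found)
}
```

A CI step could walk internal/backend/webgpu, run this check on each .go file, and fail the build if any file returns a non-empty result.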